AmdConsortium


Revision as of 08:13, 17 September 2010

The AMD Consortium within EVI-Genoret Database

The Genoret Database aims to host all data for the AMD Consortium.

See the AMD Consortium Welcome Page

Integration of AMD_COLL_xxx files containing all genotypes 2010/06/18

On https://beaune.cng.fr/amd/ I found the files

 AMD_COLL_AUDO.tar	17-Jun-2010 11:50 	35M
 AMD_COLL_LOTERY.tar	17-Jun-2010 11:49 	128M
 AMD_COLL_SCHOLL.tar	15-Jun-2010 15:29 	70M
 AMD_COLL_SOUIED.tar	17-Jun-2010 11:52 	70M
 AMD_COLL_ZACK.tar	17-Jun-2010 11:53 	41M
 results_AMD.txt.gz	17-Sep-2009 10:40 	11M
  • results_AMD.txt.gz was the same file I already had.
  • I ran tar -xvf AMD_COLL_xxxx.tar on each archive, creating the corresponding directory (e.g. Audo), which contains:
64465487 2008-09-01 10:46 Final_dat_AMD_COLL_AUDO.txt
     717 2008-08-12 10:01 report.txt
26731497 2008-08-12 09:52 success_mark.txt
    3139 2008-08-12 09:52 success_samp.txt
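
The extraction step can also be scripted. This is a minimal sketch using Python's standard tarfile module; the function name and the destination directory are hypothetical, and the archives are assumed to be plain (uncompressed) tar files, as the .tar extension suggests.

```python
import tarfile
from pathlib import Path


def extract_archive(tar_path, dest_dir="."):
    """Extract one AMD_COLL_xxx.tar into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path, "r") as archive:
        names = archive.getnames()  # e.g. Final_dat_..., report.txt, success_*.txt
        archive.extractall(dest)
    return names
```

Calling extract_archive("AMD_COLL_AUDO.tar") should then have the same effect as the tar -xvf call above, assuming the archive sits in the current directory.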

The Final_dat_AMD_COLL_xxx.txt files contain the following columns:

 Family Individual Father Mother Sex Status rs12354060 rs12184279 rs12564807 rs3115860 ... ... ... (with 332224 rs)

I created a small file called Audo.txt with the first 10 columns, ran gscope CngGenoPheno and got the CngGenoPheno list, which is available on the AmdConsortium welcome page under 'Phenotypic Information'.
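
Cutting the first 10 columns out of a file with more than 332,000 columns can be sketched as follows. The function name is hypothetical, and the code assumes the fields are separated by whitespace and that no field contains a blank; the real delimiter of the Final_dat files is not stated here. Writing tab-separated output is also an assumption.

```python
def keep_first_columns(src_path, dst_path, n_columns=10):
    """Copy src_path to dst_path, keeping only the first n_columns fields per line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            # maxsplit stops early, so the remaining genotype columns are never split
            fields = line.split(None, n_columns)[:n_columns]
            dst.write("\t".join(fields) + "\n")
```

For example, keep_first_columns("Final_dat_AMD_COLL_AUDO.txt", "Audo.txt") would produce the small file described above under these assumptions.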

  • Most of Baltimore's references are unknown to me !!!
  • Only a few references from Bonn are still missing

But why do most Individual values look like 'ZM23041920'? (I found them because the coll_comment field contained 'Individual was ZM23041920 , now H-208-2-MZ'.)

  • Paris, Creteil and Southampton are ok
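
The old and new IDs can be pulled out of coll_comment with a regular expression. This is only a sketch: the pattern is modelled on the single comment quoted above, the function name is hypothetical, and other comment layouts may exist in the data.

```python
import re

# Modelled on 'Individual was ZM23041920 , now H-208-2-MZ'; other layouts are not handled.
RENAME_RE = re.compile(r"Individual was\s+(\S+)\s*,\s*now\s+(\S+)")


def parse_rename(coll_comment):
    """Return (old_id, new_id) if the comment records a rename, else None."""
    match = RENAME_RE.search(coll_comment or "")
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A dictionary built from these pairs would let the old CNG identifiers be looked up against the current Bonn IDs.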

Integration of the CNG Status 2008/10/01

  1. Reading the file AMD_verif_statuts_Leveillard_envoye_1oct08.xls
    1. I normalized the headers (no blanks, no /)
    2. I set sex to M or F instead of 1 or 2
    3. I replaced the centre values with the names Bonn, Creteil, ... respectively
    4. I replaced NAT2-xyz with xyz for Creteil (some sex values don't correspond)
    5. I removed TL to obtain the Paris ID
    6. I took the beginning of the Bonn IDs (L-060-GA => L-060-), but some IDs are ambiguous: L-060-1-BH or L-060-2-KS?
  2. Reading the files snplist_NXNL2.xlt snplist_TXNL6.xlt genotypeNXNL2.xls genotypeTXNL6.xls
    1. I converted them to .csv
    2. I removed the comment lines at the beginning
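
The normalization rules from step 1 can be sketched as small helper functions. All function names are hypothetical. Using '_' as the replacement character in headers is an assumption, as is the mapping 1 = M and 2 = F (the usual pedigree coding). The Paris step is left out because the position of TL within the ID is not given above.

```python
import re

SEX_CODES = {"1": "M", "2": "F"}  # assumed: 1 = male, 2 = female


def normalize_header(header):
    """Replace runs of blanks and '/' in a column header with '_' (assumed replacement)."""
    return re.sub(r"[\s/]+", "_", header.strip())


def normalize_sex(value):
    """Map 1/2 to M/F; any other value is returned unchanged for manual review."""
    return SEX_CODES.get(str(value).strip(), value)


def creteil_id(raw_id):
    """NAT2-xyz -> xyz; IDs without the prefix are returned unchanged."""
    return raw_id[len("NAT2-"):] if raw_id.startswith("NAT2-") else raw_id


def bonn_prefix(raw_id):
    """L-060-GA -> L-060- (family prefix only, so L-060-1-BH and L-060-2-KS collide)."""
    match = re.match(r"^([A-Z]-\d+-)", raw_id)
    return match.group(1) if match else raw_id
```

The collision in bonn_prefix is exactly the ambiguity noted in step 1.6: the prefix alone cannot tell two members of the same family apart.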

Integration of the Phenotyping Data by Raymond 2008/02/11

  • Baltimore January version
  1. Betsy Campochiaro sent several Excel files corresponding to an Access database
  2. Creation of a .csv file containing nearly all fields pointed to as secondary tables (Genoret Tcl program)
  3. Integration of this file in the csvschema table csvt8
  4. Detection of errors with the Genoret Tcl program
  5. Corrections and updates of small errors
  6. I added ped to father_id and mother_id (2008/01/11)
  7. I replaced the numbers corresponding to the diagnosis with the text of the diagnosis (2008/01/10)
  • Bonn
  1. Direct connection to the Phenotyping Database in Tuebingen (2007-11)
  2. Save the display of all patients as .csv file
  3. Integration of this file in the csvschema table csvt16
  4. There are 3 sub-tables in our table ... the 3rd one should be ok
  5. It seems that now a connection to Tuebingen gives only the 3rd table ...
  6. Keep only the year of birth (Genoret Tcl program)
  7. Keep only the centre Bonn or the FamStudy (ok)
  8. Some values were stored as booleans and couldn't be displayed correctly (they showed as a small square). I replaced null with NO and not null with YES
  • Creteil
  1. Eric Souied sent an Excel file
  2. I removed birthdays and created a csvt table
  3. As the file contains only CNV patients (but without this diagnosis being recorded), I added CNV to the AmdDiagOdOs column in the common table.
  • Paris CIC
  1. Isabelle Audo sent an Excel file
  • Jerusalem
  1. Itay Chowers sent a new file 2007/12/24 and a .doc file db_codes
    1. I deleted all empty rows especially at the end
    2. and I added the missing empty columns in the rows at the end
    3. I got the db_codes and I replaced the initial_stage_fellow_eye_areds (2 with J=DRY-2, etc.) in the commonview (2008/02/05).
  2. 2008/02/15 I merged different columns to get an integrated diagnosis. It depends now on the firsteyewithamd, etc. See the AmdConsortium Welcome page.
  • London
  1. I got the data from Andrew Webster as an Excel file corresponding to his Access database.
  2. I removed names and birthdates
  3. Guillaume created a local SQL database for these level2 data.
  4. He also created a level1 database (as defined by Tuebingen) with the data from Montpellier and Tuebingen.
  5. I'll extract the London data to create a csvt table (not yet done)
  • Southampton
  1. Angela sent the .xls file
  2. Raymond uploaded it to the AmdConsortium Gallery
  3. Open Excel
  4. Delete the 'nearly empty' columns on the right
  5. Rename the duplicated column Project no.2
  6. Save as .csv
  7. Integrate it as csvt27
  8. Modify birthday to keep only the year (some dates were written as dd.mm.yy)
  9. I merged the cohort, diag_dry, diag_wet_amd, amd_, consolidated_areds into one value (2008/02/05)
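
Several of the per-centre clean-up steps above recur: keeping only the year of birth (Bonn, Southampton), replacing null and not null with NO and YES (Bonn), and dropping empty rows while padding short ones (Jerusalem). The sketch below is not the Genoret Tcl program; it is a hypothetical Python equivalent. It assumes that two-digit years belong to the 1900s and that an empty string counts as null.

```python
import re


def birth_year(value, century=1900):
    """Return a 4-digit year string from a date-like value, or '' if none is found.

    Handles a 4-digit year anywhere in the text, plus dd.mm.yy.
    Mapping yy to century + yy is an assumption.
    """
    text = str(value).strip()
    match = re.search(r"(?<!\d)(\d{4})(?!\d)", text)
    if match:
        return match.group(1)
    match = re.fullmatch(r"\d{1,2}\.\d{1,2}\.(\d{2})", text)
    if match:
        return str(century + int(match.group(1)))
    return ""


def null_to_yes_no(value):
    """Bonn rule: null -> NO, not null -> YES (empty string treated as null)."""
    return "NO" if value is None or str(value).strip() == "" else "YES"


def drop_empty_and_pad(rows, width):
    """Jerusalem rule: drop fully empty rows and pad short rows to width columns."""
    cleaned = []
    for row in rows:
        if not any(cell.strip() for cell in row):
            continue  # row holds only blank cells
        cleaned.append(list(row) + [""] * (width - len(row)))
    return cleaned
```

Values that birth_year cannot interpret come back as an empty string, so they can be reviewed by hand rather than silently guessed.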