AmdConsortium
From Wiki
Line 27: | Line 27: | ||
Family Individual Father Mother Sex Status rs12354060 rs12184279 rs12564807 rs3115860 ... ... ... (with 332224 rs) | Family Individual Father Mother Sex Status rs12354060 rs12184279 rs12564807 rs3115860 ... ... ... (with 332224 rs) | ||
- | I created a small one called | + | I created a small one called xxxx.txt (Audo, Lotery, Scholl, Souied, Zack) with the first 10 columns, run '''gscope CngGenoPheno''' and got the CngGenoPheno list available on the AmdConsortium welcome page 'Phenotypic Information. |
- | |||
* Only a few references from Bonn are still missing | * Only a few references from Bonn are still missing | ||
but why are most of Individual like 'ZM23041920' (I found them because I had the coll_comment field containing 'Individual was ZM23041920 , now H-208-2-MZ' ) | but why are most of Individual like 'ZM23041920' (I found them because I had the coll_comment field containing 'Individual was ZM23041920 , now H-208-2-MZ' ) | ||
- | * Paris, Creteil and Southampton are ok | + | * Paris, Creteil and Southampton are ok, Baltimore seems to be ok but there are still multiple diagnosis ... |
==Integration of the CNG Status 2008/10/01== | ==Integration of the CNG Status 2008/10/01== | ||
Line 48: | Line 47: | ||
==Integration of the Quality Control by Raymond Ripp 2008/04== | ==Integration of the Quality Control by Raymond Ripp 2008/04== | ||
2010/10/06 I have to integrate QC_results_AMD_Don_Zack_complete_sent_7april08.xls in the AllTogether table | 2010/10/06 I have to integrate QC_results_AMD_Don_Zack_complete_sent_7april08.xls in the AllTogether table | ||
+ | |||
+ | With this qc_alltogether table I could feed the AmdCommon in which we mist a lot Baltimore | ||
Comparing diagnosis from sep 2007 with apr 2008 I did minor corrections in Apr 2008 (upper case lower cae etc,) and major differences are listed below | Comparing diagnosis from sep 2007 with apr 2008 I did minor corrections in Apr 2008 (upper case lower cae etc,) and major differences are listed below |
Revision as of 15:39, 8 October 2010
The AMD Consortium within EVI-Genoret Database
The Genoret Database aims to host all data for the Amd Consortium.
See the AMD Consortium Welcome Page
Contents |
Integration of AMD_COLL_xxx files containing all genotypes 2010/06/18
On https://beaune.cng.fr/amd/ I found the files
- AMD_COLL_AUDO.tar 17-Jun-2010 11:50 35M AMD_COLL_LOTERY.tar 17-Jun-2010 11:49 128M AMD_COLL_SCHOLL.tar 15-Jun-2010 15:29 70M AMD_COLL_SOUIED.tar 17-Jun-2010 11:52 70M AMD_COLL_ZACK.tar 17-Jun-2010 11:53 41M results_AMD.txt.gz 17-Sep-2009 10:40 11M
- result_AMD.txt.gz was the same I already had.
- I did tar -xvf AMD_COLL_xxxx.tar creating the correspondig directory containing (ie Audo)
64465487 2008-09-01 10:46 Final_dat_AMD_COLL_AUDO.txt 717 2008-08-12 10:01 report.txt 26731497 2008-08-12 09:52 success_mark.txt 3139 2008-08-12 09:52 success_samp.txt
The Final_dat_AMD_COLL_xxx contain
Family Individual Father Mother Sex Status rs12354060 rs12184279 rs12564807 rs3115860 ... ... ... (with 332224 rs)
I created a small one called xxxx.txt (Audo, Lotery, Scholl, Souied, Zack) with the first 10 columns, run gscope CngGenoPheno and got the CngGenoPheno list available on the AmdConsortium welcome page 'Phenotypic Information.
- Only a few references from Bonn are still missing
but why are most of Individual like 'ZM23041920' (I found them because I had the coll_comment field containing 'Individual was ZM23041920 , now H-208-2-MZ' )
- Paris, Creteil and Southampton are ok, Baltimore seems to be ok but there are still multiple diagnosis ...
Integration of the CNG Status 2008/10/01
- Reading the file AMD_verif_statuts_Leveillard_envoye_1oct08.xls
- I normalized the headers (without blank, no /)
- I set sex to M or F instead of 1 or 2
- I replaced the centre with respectively Bonn, Creteil ...
- I replaced NAT2-xyz by xyz for Creteil (some sex values don't correpond)
- I remove TL to obtain the Paris ID
- I took the beginning of the Bonn's Ids (L-060-GA => L-060- but some Ids are ambiguous L-060-1-BH or L-060-2-KS ?
- Reading the files snplist_NXNL2.xlt snplist_TXNL6.xlt genotypeNXNL2.xls genotypeTXNL6.xls
- convert to .csv
- remove the comment lines at beginning
Integration of the Quality Control by Raymond Ripp 2008/04
2010/10/06 I have to integrate QC_results_AMD_Don_Zack_complete_sent_7april08.xls in the AllTogether table
With this qc_alltogether table I could feed the AmdCommon in which we mist a lot Baltimore
Comparing diagnosis from sep 2007 with apr 2008 I did minor corrections in Apr 2008 (upper case lower cae etc,) and major differences are listed below
2 004001 DiagSep=" " DiagApr2008="High Risk AMD" 3 004001 DiagSep=" " DiagApr2008="High Risk AMD" 31 034002 DiagSep=" " DiagApr2008="Not confirmed" 32 034002 DiagSep=" " DiagApr2008="Not confirmed" 42 043002 DiagSep=" " DiagApr2008="Unaffected/No Pathology" 50 049001 DiagSep="Advanced GA" DiagApr2008=" " 52 052004 DiagSep=" " DiagApr2008="Not confirmed" 88 107010 DiagSep="Low risk AMD" DiagApr2008="Unaffected/No Pathology" 129 139002 DiagSep="Exudative AMD" DiagApr2008="Advanced GA and Exudative AMD" 206 185002 DiagSep=" " DiagApr2008="Exudative AMD" 301 247020 DiagSep=" " DiagApr2008="Not confirmed" 300 247020 DiagSep=" " DiagApr2008="Not confirmed" 325 258001 DiagSep="Can't Confirm Dx" DiagApr2008="Can't Grade" 457 316002 DiagSep=" " DiagApr2008="Exudative AMD" 586 406002 DiagSep=" " DiagApr2008="Not confirmed" 631 435006 DiagSep=" " DiagApr2008="Not confirmed" 630 435006 DiagSep=" " DiagApr2008="Not confirmed" 666 447002 DiagSep=" " DiagApr2008="High Risk AMD" 678 452003 DiagSep=" " DiagApr2008="Not confirmed" 677 452003 DiagSep=" " DiagApr2008="Not confirmed"
I set Advanced GA for 049001 because it was already for Pk 49
Important changes are 107010 139002 258001 (written 2010/09/20) I got following files
2007-12-12 10:08 QualityControlCreteil.xls 2007-12-12 10:20 QualityControlCreteil.csv 2007-12-12 10:36 QualityControlBaltimore.csv 2007-12-12 10:40 QualityControlBonn.csv 2007-12-12 10:46 QualityControl.csv 2008-04-07 13:44 QC_results_AMD_Audo_sent_7april08.xls 2008-04-07 14:52 QC_results_AMD_Don_Zack_complete_sent_7april08.xls (this was not integrated before 2010/10/06) 2008-05-14 09:23 QualityControlBonn.xls 2008-05-14 09:23 QualityControlBaltimore.xls 2008-05-15 18:08 AMD_QC_results_sent_batch2_may08.zip 2009-04-07 14:29 QualityControlParis.csv
and created the Quality Control tables (see 'List all available Quality Controls' and per center on the AmdConsortium welcome page)
Integration of the Phenotyping Data by Raymond 2008/02/11
- Baltimore January version
- Betsy Campochiaro sent several Excel files corresponding to an Access database
- Creation of a .csv file containing nearby all fields pointed as secondary tables (Genoret Tcl program)
- Integration of this file in the csvschema table csvt8
- Detection of errors with the Genoret Tcl program
- Corrections and updates of small errors
- Add ped to father_id and mother_id (2008/01/11)
- I replaced the numbers corresponding to the diagnosis with the text of the diagnosis (2008/01/10)
- Bonn
- Direct connection to the Phenotyping Database in Tuebingen (2007-11)
- Save the display of all patients as .csv file
- Integration of this file in the csvschema table csvt16
- They are 3 sub_table in our table ... the 3rd oe should be ok
- It seems that now a connection to Tuebingen gives only the 3rd table ...
- Keep only the year of birth (Genoret Tcl program)
- Keep only the centre Bonn or the FamStudy (ok)
- Some values were stored as boolean and couldn't be displayed correctly (a small square). I replaced nul by NO and not null by YES
- Creteil
- Eric Souied sent an Excel file
- I removed birthdays and created a csvt table
- As the file contains only CNV patients (without this diagnosis) I add CNV to the AmdDiagOdOs column in the common table.
- Paris CIC
- Isabelle Audo sent an Excel file
- Jerusalem
- Itay Chowers sent a new file 2007/12/24 and a .doc file db_codes
- I deleted all empty rows especially at the end
- and I added the missing empty columns in the rows at the end
- I got the db_codes and I replaced the initial_stage_fellow_eye_areds (2 with J=DRY-2, etc.) in the commonview (2008/02/05).
- 2008/02/15 I merged different columns to get an integrated diagnosis. It depends now on the firsteyewithamd, etc. See the AmdConsortium Welcome page.
- London
- I got the data from Andrew Webster as Excel file corresponding to his Access database.
- I removed names and birthdates
- Guillaume created a local SQL database for these level2 data.
- He created also a level1 database (as defiened by Tuebingen) with the data from Montpellier and Tuebingen.
- I'll extract the London data to create a csvt table (not yet done)
- Southampton
- Angela sent the .xls file
- Raymond upload it in AmdConsortium Gallery
- Open Excel
- Delete the 'nearly empty' columns on the right
- Rename the duplicated column Project no.2
- Save as .csv
- Integrate it as csvt27
- Modify birthday to keep only the year. And some dates were written as dd.mm.yy
- I merged the cohort, diag_dry, diag_wet_amd, amd_, consolidated_areds in one value (2008/02/05)