package for published de novo datasets in rare disease

jeremymcrae/dnm_cohorts

dnm_cohorts

This small package collects sample information and de novo mutations (DNMs) from published studies of rare disease.

It also reconciles individuals and variants that appear in more than one study, so that nobody is counted twice.
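As a rough illustration of what reconciling individuals across studies means, the sketch below merges records that share a person ID and tracks which studies reported each person. The record layout, the IDs, the study names and the `merge_individuals` helper are all hypothetical. This is not the package's actual implementation, which may match individuals by other means.

```python
# Hypothetical records: (person_id, study, phenotype). These are invented
# for illustration and are not taken from the package's data files.
records = [
    ("fam1.p1", "study_A", "autism spectrum disorder"),
    ("fam1.p1", "study_B", "autism spectrum disorder"),
    ("fam2.s1", "study_A", "unaffected"),
]


def merge_individuals(records):
    """Collapse records sharing a person_id into one entry per individual."""
    merged = {}
    for person_id, study, phenotype in records:
        # Create the entry the first time we see this person.
        entry = merged.setdefault(
            person_id, {"phenotype": phenotype, "studies": []}
        )
        # Record each contributing study once, in order of appearance.
        if study not in entry["studies"]:
            entry["studies"].append(study)
    return merged


unique = merge_individuals(records)
print(len(unique), "unique individuals from", len(records), "records")
```

Keeping the list of contributing studies on each merged entry makes it possible to trace an individual back to every publication that reported them.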

All of the information is taken from the supplementary data tables of each journal publication. Where a study collected healthy controls (e.g. unaffected siblings), those controls are given the phenotype 'unaffected'.

Install

pip install git+https://github.com/jeremymcrae/dnm_cohorts.git

Usage

The data files are included in the package and can be loaded in Python with:

from dnm_cohorts import de_novos, cohort

Build data files

# create a table of all individuals in the cohorts
dnm_cohorts --cohorts --output cohorts.txt

# create a table of all de novo mutations found in those individuals
dnm_cohorts --de-novos --output de_novos.txt

The package contains three versions of the de novo dataset: one on each study's original genome build (the default), one lifted over to GRCh37, and one lifted over to GRCh38.

Cohorts

| Reference | Year | Unique individuals | Phenotype | Assay | Deprecated |
| --- | --- | --- | --- | --- | --- |
| De Ligt et al. N Engl J Med 367:1921-1929 | 2012 | 100 | intellectual disability | exome | no |
| Iossifov et al. Neuron 74:285-299 | 2012 | 686 (343 asd, 343 unaffected) | autism spectrum disorder, unaffected siblings | exome | no |
| O'Roak et al. Nature 485:246-250 | 2012 | 206 | autism spectrum disorder | exome | no |
| Rauch et al. Lancet 380:1674-1682 | 2012 | 51 | intellectual disability | exome | no |
| Sanders et al. Nature 485:237-241 | 2012 | 452 (238 asd, 214 unaffected) | autism spectrum disorder, unaffected siblings | exome | no |
| Epi4K Consortium. Nature 501:217-221 | 2013 | 264 | epilepsy | exome | no |
| EuroEPINOMICS-RES Consortium. AJHG 95:360-370 | 2014 | 92 | epilepsy | exome | no |
| Gilissen et al. Nature 511:344-347 | 2014 | 0 | intellectual disability | exome | no, but only extends De Ligt et al |
| De Rubeis et al. Nature 515:209-215 | 2014 | 1604 (1443 asd, 161 unaffected) | autism spectrum disorder, unaffected siblings | exome | no |
| Iossifov et al. Nature 515:216-221 | 2014 | 3095 (1726 asd, 1369 unaffected) | autism spectrum disorder, unaffected siblings | exome | no |
| Sanders et al. Neuron 87:1215-1233 | 2015 | 750 (314 asd, 436 unaffected) | autism spectrum disorder, unaffected siblings | exome | no |
| Homsy et al. Science 350:1262-1266 | 2015 | 1213 | congenital heart disease | exome | yes, superseded by Jin et al cohort |
| Lelieveld et al. Nature Neuroscience 19:1194-1196 | 2016 | 820 | intellectual disability | exome | yes, superseded by Kaplanis et al cohort |
| McRae et al. Nature 542:433-438 | 2017 | 4293 | intellectual disability | exome | yes, superseded by Kaplanis et al cohort |
| Jónsson et al. Nature 549:519-522 | 2017 | 1458 | unaffected | genome (no chrX in males) | yes |
| Jin et al. Nature Genetics 49:1593-1601 | 2017 | 2465 | congenital heart disease | exome | no |
| Yuen et al. Nature Neuroscience 20:602-611 | 2017 | 5265 | autism spectrum disorder | genome | no |
| An et al. Science 362:eaat6576 | 2018 | 1902 affected, 1902 unaffected | autism spectrum disorder, unaffected siblings | genome | no |
| Halldorsson et al. Science | 2019 | 2976 | unaffected, supersedes Jónsson et al | genome | no |
| Kaplanis et al. Nature | 2019 | 31058 | intellectual disability, supersedes Lelieveld et al and McRae et al cohorts | exome | no |
| Fu et al. Nature Genetics 54:1320-1331 | 2022 | 42607 | autism spectrum disorder | exome (no chrX) | no, but Fu et al and Zhou et al mostly overlap |
| Zhou et al. Nature Genetics 54:1305-1319 | 2022 | 42607 | autism spectrum disorder | exome | yes, superseded by Fu et al, because Fu et al include all sample IDs |
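The deprecated column matters when choosing which cohorts to analyse: an entry starting with "yes" marks a cohort that a later publication has superseded. The sketch below shows one way to apply that rule to a few rows copied from the table above. The tuple layout and the `is_deprecated` helper are assumptions made for illustration and are not part of the package's API.

```python
# A few rows copied from the cohort table: (reference, year, deprecated note).
# The tuple layout is an assumption made for this example only.
COHORTS = [
    ("Homsy et al.", 2015, "yes, superseded by Jin et al cohort"),
    ("Jin et al.", 2017, "no"),
    ("McRae et al.", 2017, "yes, superseded by Kaplanis et al cohort"),
    ("Kaplanis et al.", 2019, "no"),
    ("Gilissen et al.", 2014, "no, but only extends De Ligt et al"),
]


def is_deprecated(note):
    """Treat any note beginning with 'yes' as marking a deprecated cohort."""
    return note.strip().lower().startswith("yes")


# Keep only the cohorts that have not been superseded.
active = [ref for ref, year, note in COHORTS if not is_deprecated(note)]
print(active)
```

Dropping superseded cohorts this way avoids counting the same individuals once under the older publication and again under the newer one.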

Excluded cohorts

We exclude some published cohorts with de novo mutations, for the reasons below:

  • Helbig et al. in Genetics in Medicine 18:898-905. This study reported only the likely pathogenic de novo mutations.
  • Goldmann et al. in Nature Genetics 48:935-939. This study includes monozygotic twins, so some de novo mutations are not independent, and it does not provide the sample or family IDs needed to exclude the twins.
  • Goldmann et al. in Nature Genetics 50:487-492. This study reported only clustered DNMs, so it is not representative of all coding DNMs.
