Skip to content

Latest commit

 

History

History
59 lines (34 loc) · 2.28 KB

README.md

File metadata and controls

59 lines (34 loc) · 2.28 KB

Alzheimer's Disease prediction from exRNA in blood

The goal of this project is to predict Alzheimer Disease (AD) using blood extracellular RNA sequencing data. Currently, no gene panel is found to be able to reproducibly classify AD and control subjects across multiple research cohorts.

We have 3 studies available:

  • Data from our SILVER-seq technique.
  • Data from Toden et al. published on Bioproject at this link.
  • Data from Burgos et al. uploaded on dbGaP and downloadable under official request.

We aim to obtain a gene set that can be used together with personal information that are shared among the cohorts, such as sex, age, ApoE status, etc. to increase the power of the prediction. We might decide to discard some samples if any of the features chosen in our model is not available for them.

Data were processed consistenly using STAR as aligner with --sjdbGTFfile Homo_sapiens.GRCh38.84.chr.gtf and then featureCounts to count uniquely mapped reads (MAPQ = 255) over exons. Gene IDs are consistent among the 3 datasets and are ENSEMBL IDs.

Per each dataset we provide:

  • Metadata with sample information.
  • Raw count table where genes are on the rows and samples are on the columns.

SILVER-seq

Converter samples are available but not considered here.

115 samples in total:

  • 41 Normal (N) from 9 donors
  • 74 Alzheimer Disease (AD) from 15 donors

Data:

Toden et al.

One sample from the pool is discarded being control water. Other 3 samples of patient 1031 are discarded because this one appears as a converter (1 sample NCI, 2 samples AD).

334 samples in total:

  • 164 Normal (NCI) from 114 donors
  • 170 Alzheimer Disease (AD) from 126 donors

Data:

Burgos et al.

267 samples in total:

  • 138 Normal (Control) from 74 donors
  • 129 Alzheimer Disease (Alzheimer's Disease) from 70 donors

Data: