Dimitri Papadopoulos Orfanos edited this page Feb 26, 2021 · 64 revisions

We discuss software infrastructure for databank operations of the c-VEDA project.

Recruitment and acquisition centres

The c-VEDA project is a longitudinal study with an accelerated, planned-missingness design. Participants are organized into 3 age bands, which may be administered different instruments:

| Age band ID | Age band name | Age range (years) |
|-------------|---------------|-------------------|
| C1          | Children      | 6-11              |
| C2          | Adolescents   | 12-17             |
| C3          | Adults        | 18-23             |
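
As an illustration, the age bands in the table above can be expressed as a small lookup. This is only a sketch: the function name `age_band` is hypothetical, and it assumes ages are given in completed years with both bounds of each range inclusive.

```python
from typing import Optional

# Age bands from the table above: (ID, name, minimum age, maximum age).
# Assumption: ages are in completed years and both bounds are inclusive.
AGE_BANDS = [
    ("C1", "Children", 6, 11),
    ("C2", "Adolescents", 12, 17),
    ("C3", "Adults", 18, 23),
]


def age_band(age_in_years: int) -> Optional[str]:
    """Return the age band ID for an age, or None if it is outside all bands."""
    for band_id, _name, low, high in AGE_BANDS:
        if low <= age_in_years <= high:
            return band_id
    return None
```

Returning `None` for out-of-range ages, rather than raising, leaves it to the caller to decide how to handle participants who fall outside the three bands.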

The multiple recruitment and acquisition centres represent the socio-cultural diversity of India:

  • 5 geographical regions
    • Punjab and adjoining states (PGIMER)
    • Eastern Coalfields (KOLKATA)
    • Northeast India (IMPHAL)
    • Bengaluru & Mysuru (MYSORE, NIMHANS, SJRI)
    • Chittoor (RISHIVALLEY)
  • urban and rural areas

| ID | Centre      | Site                                                                             | Location                     | MRI scanner                     |
|----|-------------|----------------------------------------------------------------------------------|------------------------------|---------------------------------|
| 11 | PGIMER      | Postgraduate Institute of Medical Education & Research                           | Chandigarh                   | Siemens Verio                   |
| 12 | IMPHAL      | Regional Institute of Medical Sciences                                           | Imphal, Manipur              |                                 |
| 13 | KOLKATA     | Regional Occupational Health Centre (Eastern), Indian Council of Medical Research | Kolkata, West Bengal         | Siemens Trio                    |
| 14 | RISHIVALLEY | Rishi Valley Rural Health Centre                                                 | Rishi Valley, Andhra Pradesh |                                 |
| 15 | MYSORE      | CSI Holdsworth Memorial Hospital                                                 | Mysuru, Karnataka            | Philips Ingenia                 |
| 16 | NIMHANS     | NIMHANS                                                                          | Bengaluru, Karnataka         | Siemens Skyra / Philips Ingenia |
| 17 | SJRI        | St. John's Research Institute                                                    | Bengaluru, Karnataka         |                                 |

From acquisition centres to databank

Acquisition centres send pseudonymized data to the databank team:

  • Acquisition centres collect clinical and environmental data using Psytools. The databank team download CSV files pseudonymized with PSC1 identifiers directly from the Delosis server.
  • Acquisition centres pseudonymize biological samples at the source using PSC1 identifiers. The c-VEDA biobank team collect samples and collate genetic data before sending them to the databank.
  • Acquisition centres anonymize DICOM files before sending them to the databank. Again, subjects are identified only by their PSC1 identifier.

The databank team, acting as a trusted third party, pseudonymize the data a second time by converting dates to ages and PSC1 identifiers to PSC2 identifiers. We provide a list of valid identifiers to help end-users detect and investigate possible identifier errors.
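
The second pseudonymization step can be sketched roughly as follows. This is not the actual c-VEDA code: the function names, the record layout (`psc1` and `date` keys), the choice of days as the unit of age and the example identifiers are all assumptions made for illustration. The sketch replaces an absolute date with the age at that date, replaces PSC1 with PSC2, and flags any identifier missing from the list of valid identifiers.

```python
from datetime import date
from typing import Dict, Optional


def age_in_days(birth: date, event: date) -> int:
    """Replace an absolute date with the subject's age in days at that date."""
    return (event - birth).days


def pseudonymize(record: dict,
                 psc2_from_psc1: Dict[str, str],
                 birth_dates: Dict[str, date]) -> Optional[dict]:
    """Return a record keyed by PSC2 with an age instead of a date.

    Returns None when the PSC1 identifier is not a known valid identifier,
    so that the caller can report it for investigation instead of guessing.
    """
    psc1 = record["psc1"].strip()  # tolerate stray whitespace around the code
    if psc1 not in psc2_from_psc1 or psc1 not in birth_dates:
        return None
    return {
        "psc2": psc2_from_psc1[psc1],
        "age_days": age_in_days(birth_dates[psc1], record["date"]),
    }


# Hypothetical identifiers, for illustration only.
mapping = {"PSC1-0001": "PSC2-9001"}
births = {"PSC1-0001": date(2005, 1, 1)}
known = pseudonymize({"psc1": " PSC1-0001 ", "date": date(2006, 1, 1)}, mapping, births)
unknown = pseudonymize({"psc1": "PSC1-0002", "date": date(2006, 1, 1)}, mapping, births)
```

Here `known` holds the PSC2 code together with an age of 365 days, while `unknown` is `None` because the identifier is not in the list of valid identifiers.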

Databank operations

System software has been installed on the c-VEDA server as described in file /cveda/databank/framework/INSTALL.txt.

Application software has been deployed on the c-VEDA server under /cveda/databank/framework/git:

cd /cveda/databank/framework/git
git clone https://github.com/cveda/cveda_databank.git
git clone https://github.com/cveda/cveda_misc.git
git clone https://github.com/cveda/cveda_r.git

Databank operations, based on the above application software, are described in specific pages: