Data model and migration: high-level overview

Current data model

Data model after migration - ER-Diagram

Entities

Trial - the main entity, a clinical study (16k)
Source - the source of the data, e.g. "Cochrane" (1)
Condition - illness, injury, impairment, etc. (256)
Method - trial type, methodology (45)
Drug - a chemical substance used for treatment (0)
Review - a publication about a trial (20k)
Document - a blob document that can be associated with any entity (0)

Relations

Every Trial has a Source (where the data came from).

A Trial can be associated with one or more of the following (m2m relation):

Condition
Method
Drug
Review
Document

A Review can be associated with one or more of the following (m2m relation):

Document
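
The entities and relations above can be sketched as plain Python dataclasses. This is a minimal illustration, not the actual schema: the field names (name, title, content) and the use of in-memory lists for the m2m relations are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Source:
    name: str  # e.g. "Cochrane"


@dataclass
class Condition:
    name: str


@dataclass
class Method:
    name: str


@dataclass
class Drug:
    name: str


@dataclass
class Document:
    name: str
    content: bytes = b""  # blob payload (hypothetical field)


@dataclass
class Review:
    title: str
    # m2m: a Review can have one or more Documents
    documents: List[Document] = field(default_factory=list)


@dataclass
class Trial:
    title: str
    source: Source  # every Trial has a Source
    # m2m relations, modelled here as simple lists
    conditions: List[Condition] = field(default_factory=list)
    methods: List[Method] = field(default_factory=list)
    drugs: List[Drug] = field(default_factory=list)
    reviews: List[Review] = field(default_factory=list)
    documents: List[Document] = field(default_factory=list)


cochrane = Source(name="Cochrane")
trial = Trial(title="Example trial", source=cochrane)
trial.conditions.append(Condition(name="Example condition"))
trial.reviews.append(Review(title="Example review"))
```

In a relational database each list would normally be a join table or an array column; the lists only show which side owns which relation.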

What Cochrane is missing

The Cochrane database lacks:

structured data about Drug (only free-text interventions mixed with other data)
codified data about Condition (only free text)
a long scientific title for Trial.scientific_title
sponsor data for Trial.source_of_funding
trial date range information for Trial.date_from, Trial.date_to
exclusion criteria data for Trial.exclusion_criteria
concrete data about participants' age for Trial.age_from, Trial.age_to
codified data about interventions for Trial.interventions (only free text)
codified data about outcomes for Trial.outcomes (only free text)

What the current data model is missing

Important Cochrane data we don't use:

data about reviews of a Review, such as the date and reviewer
data about reviewers (306 items in Cochrane)

Structuring data

So far, the migration has not added any logical structure to the data. We've converted some m2m tables to arrays, added an enum for sex, and so on, but that does not amount to adding new structure.
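
The two conversions mentioned above can be sketched as follows. This is an illustration under assumptions: the join-table row shape (trial_id, related_id), the helper name fold_m2m, and the Sex enum members are hypothetical and are not taken from the actual migration code.

```python
from collections import defaultdict
from enum import Enum


class Sex(Enum):
    # Hypothetical members; the real enum values are not listed in this overview.
    MALE = "male"
    FEMALE = "female"
    BOTH = "both"


def fold_m2m(rows):
    """Collapse (trial_id, related_id) join-table rows into one array per trial."""
    arrays = defaultdict(list)
    for trial_id, related_id in rows:
        arrays[trial_id].append(related_id)
    return dict(arrays)


# Hypothetical join-table rows: trial 1 has two related ids, trial 2 has one.
rows = [(1, 10), (1, 11), (2, 10)]
folded = fold_m2m(rows)
```

Both steps change how existing values are stored; neither extracts information that was not already present, which is the point made above.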

Notes

The Cochrane database has less structured data than https://clinicaltrials.gov/. By acquiring (and learning from) more datasets, we can improve our data model and make it more structured. This means the Cochrane data can also be improved once we have other datasets. For example, we can parse trial-drug relations once we have a drug list.
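
As a rough sketch of that last idea, trial-drug relations could be derived by matching a drug list against the free-text interventions. The function name find_drugs, the drug list, and the sample text are all made up for illustration; real matching would likely need synonyms and brand names, which this does not handle.

```python
import re


def find_drugs(interventions, drug_names):
    """Return the drug names that occur as whole words in the text, ignoring case."""
    found = []
    for name in drug_names:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, interventions, flags=re.IGNORECASE):
            found.append(name)
    return found


# Hypothetical drug list and made-up interventions text.
drug_list = ["aspirin", "ibuprofen", "warfarin"]
text = "Aspirin 75 mg daily versus placebo; warfarin as rescue therapy"
matches = find_drugs(text, drug_list)
```

Each match would become one row in the Trial-Drug m2m relation.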
