Skip to content

STARR OMOP Useful SQL Queries

Michael Wornow edited this page Mar 31, 2023 · 6 revisions

STARR OMOP Sanity Checks

Statistics

Statistics from STARR OMOP som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.

Total # of...

  • Unique patients: 3,668,615 patients
SELECT COUNT(*) FROM `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.person`
  • Unique patients with an in-patient visit: 246,357 patients
SELECT COUNT(DISTINCT(person_id)) FROM `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.visit_occurrence` 
-- Get "Visit/IP" events
WHERE visit_concept_id=9201 
-- Filter out visits that occur at a single point in time
AND visit_start_DATETIME < visit_end_DATETIME
  • Unique patients with an out-patient visit: 1,060,684 patients
SELECT COUNT(DISTINCT(person_id)) FROM `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.visit_occurrence` 
-- Get "Visit/OP" events
WHERE visit_concept_id=9202
-- Filter out visits that occur at a single point in time
AND visit_start_DATETIME < visit_end_DATETIME
  • In-patient visits: 354,077 visits
SELECT COUNT(*) FROM `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.visit_occurrence` 
-- Get "Visit/IP" events
WHERE visit_concept_id=9201 
-- Filter out visits that occur at a single point in time
AND visit_start_DATETIME < visit_end_DATETIME
  • Outpatient visits: 8,656,948 visits
SELECT COUNT(*) FROM `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.visit_occurrence` 
-- Get "Visit/OP" events
WHERE visit_concept_id=9202
-- Filter out visits that occur at a single point in time
AND visit_start_DATETIME < visit_end_DATETIME
  • Deaths: 37,192 total deaths (Note: all of them have a NULL value for death_DATETIME -- use death_DATE instead)
SELECT COUNT(*) FROM `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.death`
  • Deaths during inpatient visit: 3,731 deaths
SELECT d.person_id, vo.visit_start_DATETIME, d.death_DATE, vo.visit_end_DATETIME, vo.visit_concept_id
FROM `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.death` as d
  LEFT JOIN `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.visit_occurrence` as vo
    ON vo.person_id = d.person_id
-- Get "Visit/IP" events
WHERE vo.visit_concept_id = 9201
-- Filter out visits that occur at a single point in time
AND visit_start_DATETIME < visit_end_DATETIME
-- Ensure that death occurs during visit
AND vo.visit_start_DATETIME <= d.death_DATE AND d.death_DATE <= vo.visit_end_DATETIME
  • Deaths during ICU visits: 290 visits
SELECT d.person_id, vo.visit_start_DATETIME, d.death_DATE, vo.visit_end_DATETIME, vo.visit_concept_id
FROM `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.death` as d
  LEFT JOIN `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.visit_occurrence` as vo
    ON vo.person_id = d.person_id
  LEFT JOIN `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.care_site` as cs
    ON vo.care_site_id = cs.care_site_id
-- Get "ICU" visits
WHERE cs.care_site_name LIKE "%ICU%"
-- Filter out visits that occur at a single point in time
AND visit_start_DATETIME < visit_end_DATETIME
-- Ensure that death occurs during visit
AND vo.visit_start_DATETIME <= d.death_DATE AND d.death_DATE <= vo.visit_end_DATETIME
  • ICU visits: 90,505 visits (visit_detail seems more comprehensive than visit_occurrence for ICU queries)
SELECT vo.person_id, vo.visit_detail_start_DATETIME, vo.visit_detail_end_DATETIME
  FROM `som-nero-nigam-starr.starr_omop_cdm5_deid_lite_2023_02_08.visit_detail` as vo
  LEFT JOIN `som-nero-nigam-starr.starr_omop_cdm5_deid_lite_2023_02_08.care_site` as cs
    ON vo.care_site_id = cs.care_site_id
-- Limit to just ICU visits
WHERE cs.care_site_name LIKE "%ICU%"
and vo.visit_detail_start_DATETIME < vo.visit_detail_end_DATETIME

Sanity Checks

Death dates

Align each patients with their last visit date and recorded death date:

-- Get last visit for each patient
WITH last_visit_datetime AS (
  SELECT MAX(visit_end_DATETIME) as last_visit_time, COUNT(*) as total_number_of_visits, person_id 
  FROM  `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.visit_occurrence` 
  -- Needed to filter out single time point visits (which weirdly occur frequently after death dates)
  WHERE visit_start_DATETIME < visit_end_DATETIME
  GROUP BY person_id
)
SELECT d.person_id, 
        last_visit_datetime.last_visit_time, 
        d.death_DATE, 
        last_visit_datetime.total_number_of_visits,
        CASE WHEN d.death_DATE <= last_visit_datetime.last_visit_time THEN True ELSE False END AS death_occurs_before_visit_end,
        DATETIME_DIFF(d.death_DATE, last_visit_datetime.last_visit_time, DAY) AS days_after_visit_end_till_death
FROM `som-rit-phi-starr-prod.starr_omop_cdm5_deid_2023_02_08.death` as d
LEFT JOIN last_visit_datetime 
  ON d.person_id = last_visit_datetime.person_id
  ORDER BY person_id

Note that there are many patients with visits after their death date, sometimes even years after (see patient_id = 29924381 or patient_id = 29924660 for example). These after-death visits are all single time point events, so we need to filter them out.

Days between death and last visit ends

Birth dates

Distribution of patient birth years (jittered)

Birth Years

Gotchas

Deaths

Most of the deaths in the STARR OMOP dataset are "in-hospital" aka "in-patient" mortality. They are not all-cause, b/c all-cause requires linkage to the Social Security Death Index (which we don't do). They are also not all inhospital, see:

Screen Shot 2023-03-23 at 12 43 19 PM