Local environment #993

Open
wants to merge 17 commits into
base: master
19 changes: 0 additions & 19 deletions working/DevV5_DDL.sql
@@ -240,23 +240,12 @@ CREATE TABLE concept_synonym_manual (
language_concept_id int4 NOT NULL
);

-/*
-the next four columns are for our internal use, you don't need to include them in your work environment:
-created
-created_by
-modified
-modified_by
-*/
--Create a base table for manual relationships, it stores all manual relationships from all vocabularies
DROP TABLE IF EXISTS base_concept_relationship_manual;
CREATE TABLE base_concept_relationship_manual (
LIKE concept_relationship_manual,
concept_id_1 INT4 NOT NULL,
concept_id_2 INT4 NOT NULL,
-created TIMESTAMPTZ NOT NULL,
-created_by INT4 NOT NULL REFERENCES admin_pack.virtual_user(user_id),
-modified TIMESTAMPTZ,
-modified_by INT4 REFERENCES admin_pack.virtual_user(user_id),
CONSTRAINT idx_pk_base_crm PRIMARY KEY (
concept_code_1,
concept_code_2,
@@ -271,10 +260,6 @@ DROP TABLE IF EXISTS base_concept_manual CASCADE;
CREATE TABLE base_concept_manual (
LIKE concept_manual,
concept_id INT4 NOT NULL,
-created TIMESTAMPTZ NOT NULL,
-created_by INT4 NOT NULL REFERENCES admin_pack.virtual_user(user_id),
-modified TIMESTAMPTZ,
-modified_by INT4 REFERENCES admin_pack.virtual_user(user_id),
CONSTRAINT idx_pk_base_cm PRIMARY KEY (
concept_code,
vocabulary_id
@@ -286,10 +271,6 @@ DROP TABLE IF EXISTS base_concept_synonym_manual CASCADE;
CREATE TABLE base_concept_synonym_manual (
LIKE concept_synonym_manual,
concept_id INT4 NOT NULL,
-created TIMESTAMPTZ NOT NULL,
-created_by INT4 NOT NULL REFERENCES admin_pack.virtual_user(user_id),
-modified TIMESTAMPTZ,
-modified_by INT4 REFERENCES admin_pack.virtual_user(user_id),
CONSTRAINT idx_pk_base_csm PRIMARY KEY (
synonym_vocabulary_id,
synonym_name,
8 changes: 4 additions & 4 deletions working/generic_update.sql
@@ -28,7 +28,7 @@ BEGIN
END $$;

--1.3 Start logging manual work
-PERFORM admin_pack.LogManualChanges();
+/*PERFORM admin_pack.LogManualChanges();*/

--1.4 Clear concept_id's just in case
UPDATE concept_stage
@@ -243,8 +243,8 @@ BEGIN
SELECT concept_id, LEAD (concept_id) OVER (ORDER BY concept_id) next_id FROM
(
SELECT concept_id FROM concept
-UNION ALL
-SELECT concept_id FROM devv5.concept_blacklisted --blacklisted concept_id's (AVOF-2395)
+/*UNION ALL
+SELECT concept_id FROM devv5.concept_blacklisted*/ --blacklisted concept_id's (AVOF-2395)
) AS i
WHERE concept_id >= 581480 AND concept_id < 500000000
) AS t
@@ -1083,7 +1083,7 @@ BEGIN
ANALYZE concept_synonym;

--36. Update concept_id fields in the "basic" manual tables for storing in audit
-PERFORM admin_pack.UpdateManualConceptID();
+/*PERFORM admin_pack.UpdateManualConceptID();*/

--QA (should return NULL)
--SELECT * FROM QA_TESTS.GET_CHECKS();
133 changes: 133 additions & 0 deletions working/local_environment/README.md
@@ -0,0 +1,133 @@
# Preparing a Local Environment for the Vocabulary Development Process

# Description

This document describes how to prepare a local database for the vocabulary development process.

# Prerequisites

PostgreSQL 14 or higher.
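One way to confirm the prerequisite from a psql session is to query the server version. This is a minimal sketch: the alias `meets_prerequisite` is illustrative, and `140000` is the integer form PostgreSQL uses for version 14.0.

```sql
-- Human-readable server version string
SHOW server_version;

-- server_version_num encodes the version as an integer (14.0 -> 140000),
-- so this expression is true on PostgreSQL 14 or higher
SELECT current_setting('server_version_num')::int >= 140000 AS meets_prerequisite;
```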

# Creating Extensions

Pgcrypto and Tablefunc extensions should be installed for some functions to work correctly.
Create extensions:
>CREATE EXTENSION pg_trgm;
>CREATE EXTENSION tablefunc;
>CREATE EXTENSION plpython3u;
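A re-runnable variant of the commands above, followed by a check of what is installed, might look like the sketch below. It covers only the three extensions named in the commands. Note that plpython3u is an untrusted language, so creating it normally requires superuser rights and the PL/Python package on the server.

```sql
-- IF NOT EXISTS makes the statements safe to re-run on a database
-- where some of the extensions are already present
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS tablefunc;
CREATE EXTENSION IF NOT EXISTS plpython3u;

-- List which of the three extensions exist in the current database
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('pg_trgm', 'tablefunc', 'plpython3u')
ORDER BY extname;
```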

# Creating Schemas

As an initial step, create the required schemas in the database.
Give the schema used for vocabulary development an appropriate name (e.g. **dev_hemonc** for the HemOnc vocabulary).
>-- replace <dev_schema_name> with an actual schema name:
>CREATE SCHEMA <dev_schema_name>;
>CREATE SCHEMA devv5;
>CREATE SCHEMA sources;
>CREATE SCHEMA qa_tests;
>CREATE SCHEMA vocabulary_pack;
>CREATE SCHEMA admin_pack;
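As an illustration, the same step with the placeholder filled in for the HemOnc example could look like this. It is a sketch under the assumption that `dev_hemonc` is the chosen development schema name. The final query only verifies that the schemas exist.

```sql
-- dev_hemonc is the example development schema name; substitute your own
CREATE SCHEMA IF NOT EXISTS dev_hemonc;
CREATE SCHEMA IF NOT EXISTS devv5;
CREATE SCHEMA IF NOT EXISTS sources;
CREATE SCHEMA IF NOT EXISTS qa_tests;
CREATE SCHEMA IF NOT EXISTS vocabulary_pack;
CREATE SCHEMA IF NOT EXISTS admin_pack;

-- Verify that all six schemas are present
SELECT nspname AS schema_name
FROM pg_namespace
WHERE nspname IN ('dev_hemonc', 'devv5', 'sources',
                  'qa_tests', 'vocabulary_pack', 'admin_pack')
ORDER BY nspname;
```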

# Creating Functions

This section lists the functions and procedures that should be installed in the database, grouped by schema.
All functions except GenericUpdate() (which is in the **local_environment** branch) are in the **master** branch.

## devv5

- GenericUpdate() (a version of the GenericUpdate function for a local environment) [Vocabulary-v5.0/working/generic_update.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/local_environment/working/generic_update.sql)
- FastRecreateSchema() [Vocabulary-v5.0/working/fast_recreate_schema.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/fast_recreate_schema.sql)
- GetPrimaryRelationshipID() [Vocabulary-v5.0/working/packages/admin_pack/GetPrimaryRelationshipID.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/admin_pack/GetPrimaryRelationshipID.sql)
- Functions from [Vocabulary-v5.0/working/packages/DevV5_additional_functions](https://github.com/OHDSI/Vocabulary-v5.0/tree/master/working/packages/DevV5_additional_functions)

## vocabulary_pack

- DropFKConstraints() [Vocabulary-v5.0/working/packages/vocabulary_pack/DropFKConstraints.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/DropFKConstraints.sql)
- SetLatestUpdate() [Vocabulary-v5.0/working/packages/vocabulary_pack/SetLatestUpdate.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/SetLatestUpdate.sql)
- ProcessManualSynonyms() [Vocabulary-v5.0/working/packages/vocabulary_pack/ProcessManualSynonyms.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/ProcessManualSynonyms.sql)
- CheckManualSynonyms() [Vocabulary-v5.0/working/packages/vocabulary_pack/CheckManualSynonyms.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/CheckManualSynonyms.sql)
- ProcessManualRelationships() [Vocabulary-v5.0/working/packages/vocabulary_pack/ProcessManualRelationships.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/ProcessManualRelationships.sql)
- CheckReplacementMappings() [Vocabulary-v5.0/working/packages/vocabulary_pack/CheckReplacementMappings.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/CheckReplacementMappings.sql)
- AddFreshMAPSTO() [Vocabulary-v5.0/working/packages/vocabulary_pack/AddFreshMAPSTO.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/AddFreshMAPSTO.sql)
- GetActualConceptInfo() [Vocabulary-v5.0/working/packages/vocabulary_pack/GetActualConceptInfo.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/GetActualConceptInfo.sql)
- DeprecateWrongMapsTo() [Vocabulary-v5.0/working/packages/vocabulary_pack/DeprecateWrongMapsTo.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/DeprecateWrongMapsTo.sql)
- DeleteAmbiguousMapsTo() [Vocabulary-v5.0/working/packages/vocabulary_pack/DeleteAmbiguousMapsTo.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/vocabulary_pack/DeleteAmbiguousMapsTo.sql)

## qa_tests

- Functions from [Vocabulary-v5.0/working/packages/QA_TESTS](https://github.com/OHDSI/Vocabulary-v5.0/tree/master/working/packages/QA_TESTS)
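After running the scripts, a catalog query such as the following can help confirm which functions ended up in which schema. This is a sketch, not part of the repository. PostgreSQL folds unquoted identifiers to lower case, so GenericUpdate() is expected to appear as `genericupdate`.

```sql
-- List every function and procedure installed in the schemas used here
SELECT n.nspname AS schema_name,
       p.proname AS function_name
FROM pg_proc AS p
JOIN pg_namespace AS n ON n.oid = p.pronamespace
WHERE n.nspname IN ('devv5', 'vocabulary_pack', 'qa_tests', 'admin_pack')
ORDER BY schema_name, function_name;
```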

# Creating Tables

## devv5

If the table structure already exists in the database, this part can be skipped.

When the tables are created for the first time, execute everything in the script except the statements that create constraints and indexes, so that they do not interfere with the import of vocabularies from Athena.

- DevV5_DDL.sql (create only the table structures, without any constraints or indexes)
[Vocabulary-v5.0/working/DevV5_DDL.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/local_environment/working/DevV5_DDL.sql)
- Prepare_manual_tables.sql [Vocabulary-v5.0/working/packages/admin_pack/prepare_manual_tables.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/admin_pack/prepare_manual_tables.sql)

## development schema

- DevV5_DDL.sql without the created, created_by, modified, and modified_by fields
[Vocabulary-v5.0/working/DevV5_DDL.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/DevV5_DDL.sql)
- Prepare_manual_tables.sql [Vocabulary-v5.0/working/packages/admin_pack/prepare_manual_tables.sql](https://github.com/OHDSI/Vocabulary-v5.0/blob/master/working/packages/admin_pack/prepare_manual_tables.sql)

## Update Vocabulary table structure

If the table structure was created before 2024-03-04, run the following script:
- 2024-03-04.sql (working/manual_changes/2024/2024-03-04.sql)

## sources

Create the source tables according to the instructions for your vocabulary.

# Import Vocabulary Data from Athena

To initially populate the vocabulary tables, download the necessary vocabularies from [Athena (ohdsi.org)](https://athena.ohdsi.org/vocabulary/list) and import them using the COPY command in psql.

The following vocabularies have to be installed:
- SNOMED
- (TBD)

A video walkthrough of the import using DBeaver is also available:
[Demo: Getting Vocabularies Into My OMOP CDM (Michael Kallfelz • Nov. 9 OHDSI Community Call) (youtube.com)](https://www.youtube.com/watch?v=FCHxAQOBptE)

Import the data into the tables in the **devv5** schema.
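The COPY commands in this section use unqualified table names, so the target schema depends on the session's search path. One way to direct them at **devv5** is sketched below. Qualifying each table name explicitly is an equally valid alternative.

```sql
-- Put devv5 first on the search path for this session, so that
-- unqualified names such as CONCEPT resolve to devv5.concept
SET search_path TO devv5;

-- Confirm the setting before running the import
SHOW search_path;
```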

Before importing vocabularies from the CSV files, set the client encoding to UTF8 (psql):
>SET client_encoding = 'UTF8';

Use the COPY command to import the data from the CSV files (psql):
>COPY DRUG_STRENGTH FROM '&lt;path_to_csv_file&gt;\\DRUG_STRENGTH.csv' WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b';
>
>COPY CONCEPT FROM '&lt;path_to_csv_file&gt;\\CONCEPT.csv' WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b';
>
>COPY CONCEPT_RELATIONSHIP FROM '&lt;path_to_csv_file&gt;\\CONCEPT_RELATIONSHIP.csv' WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b';
>
>COPY CONCEPT_ANCESTOR FROM '&lt;path_to_csv_file&gt;\\CONCEPT_ANCESTOR.csv' WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b';
>
>COPY CONCEPT_SYNONYM FROM '&lt;path_to_csv_file&gt;\\CONCEPT_SYNONYM.csv' WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b';
>
>COPY VOCABULARY FROM '&lt;path_to_csv_file&gt;\\VOCABULARY.csv' WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b';
>
>COPY RELATIONSHIP FROM '&lt;path_to_csv_file&gt;\\RELATIONSHIP.csv' WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b';
>
>COPY CONCEPT_CLASS FROM '&lt;path_to_csv_file&gt;\\CONCEPT_CLASS.csv' WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b';
>
>COPY DOMAIN FROM '&lt;path_to_csv_file&gt;\\DOMAIN.csv' WITH DELIMITER E'\\t' CSV HEADER QUOTE E'\\b';

After all the data has been imported, create the constraints and indexes from the DevV5_DDL.sql script mentioned above.
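A quick sanity check after the import is to count the rows in each loaded table and compare the numbers with the line counts of the corresponding CSV files (minus the header line). The query below is a sketch that assumes the tables live in **devv5** as described above.

```sql
-- Row counts for the nine tables loaded by the COPY commands
SELECT 'concept' AS table_name, COUNT(*) AS row_count FROM devv5.concept
UNION ALL SELECT 'concept_ancestor',     COUNT(*) FROM devv5.concept_ancestor
UNION ALL SELECT 'concept_class',        COUNT(*) FROM devv5.concept_class
UNION ALL SELECT 'concept_relationship', COUNT(*) FROM devv5.concept_relationship
UNION ALL SELECT 'concept_synonym',      COUNT(*) FROM devv5.concept_synonym
UNION ALL SELECT 'domain',               COUNT(*) FROM devv5.domain
UNION ALL SELECT 'drug_strength',        COUNT(*) FROM devv5.drug_strength
UNION ALL SELECT 'relationship',         COUNT(*) FROM devv5.relationship
UNION ALL SELECT 'vocabulary',           COUNT(*) FROM devv5.vocabulary
ORDER BY table_name;
```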

# Import Source Data

Fill the source tables according to the instructions for the vocabulary you are going to work on.

# Start the Vocabulary Development Process

1. **devv5**.FastRecreateSchema()
2. Load_Stage in **dev_schema_name**
3. **devv5**.GenericUpdate()
4. **qa_tests**.get_\* functions
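The four steps might be run from a psql session roughly as follows. This is a sketch under several assumptions: that FastRecreateSchema() and GenericUpdate() are functions callable without arguments (as the empty parentheses in this README suggest), that the session works in the development schema (`dev_hemonc` is the example name), and that `qa_tests.get_checks()` is the QA entry point referenced in generic_update.sql.

```sql
-- Assumption: the development schema comes first on the search path
SET search_path TO dev_hemonc, devv5;

-- 1. Recreate the development schema contents
SELECT devv5.FastRecreateSchema();

-- 2. Run the vocabulary's Load_Stage script here
--    (for example with the psql \i meta-command)

-- 3. Apply the generic update
SELECT devv5.GenericUpdate();

-- 4. Run the QA checks; generic_update.sql notes that this "should return NULL"
SELECT * FROM qa_tests.get_checks();
```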