MedSpaCy Dutch

This repository contains the resources for extracting concepts and their context using MedSpaCy.

For more information on MedSpaCy and installation instructions, please visit the MedSpaCy GitHub page.

Concept Extraction

To extract concepts from Dutch biomedical or clinical text, you need a reference dataset that contains every concept to be extracted together with its terms. Please note that we cannot provide this reference dataset directly, because it includes the UMLS vocabularies and the Dutch SNOMED CT vocabulary, both of which are licensed.

Before downloading the UMLS, you will need to obtain a license from the National Library of Medicine. Similarly, for access to the Dutch SNOMED CT vocabulary, you will need to obtain a license from NICTIZ and follow their instructions.

The "QuickUMLS_resources" folder contains a Jupyter notebook that takes the UMLS and Dutch SNOMED CT files as input to build a concept reference database.
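
To give a rough idea of what "building a concept reference" from UMLS files involves, the sketch below collects Dutch terms from rows in the MRCONSO.RRF layout (pipe-delimited, with the language in the LAT column and the term string in STR). It is only an illustration and not the logic of the notebook in this repository: the function name `load_terms` and the sample rows are made up for the example, and the Dutch SNOMED CT files are not covered.

```python
import csv
from collections import defaultdict

# Column order of MRCONSO.RRF in the UMLS Metathesaurus release format.
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]

# Made-up rows in MRCONSO layout, for illustration only (not real UMLS content).
SAMPLE_ROWS = [
    "C0000001|DUT|P|L0000001|PF|S0000001|Y|A0000001||||MDRDUT|PT|10000001|Longontsteking|3|N||",
    "C0000001|ENG|P|L0000002|PF|S0000002|Y|A0000002||||MSH|MH|D000001|Pneumonia|0|N||",
]


def load_terms(lines, language="DUT"):
    """Map each lowercased term in the given language to the set of CUIs it names.

    Hypothetical helper: keeps only rows whose LAT matches `language`
    and whose SUPPRESS flag is "N".
    """
    terms = defaultdict(set)
    # RRF files are pipe-delimited and unquoted; each row ends with a trailing "|".
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        record = dict(zip(MRCONSO_COLUMNS, row))
        if record.get("LAT") == language and record.get("SUPPRESS") == "N":
            terms[record["STR"].lower()].add(record["CUI"])
    return dict(terms)


dutch_terms = load_terms(SAMPLE_ROWS)
print(dutch_terms)
```

With a licensed UMLS download, the same function could be called on an open file handle for MRCONSO.RRF instead of the sample list.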

Context Detection

The Dutch rules for detecting context information about a concept are listed in the "Concept_resources" folder.

Setting up the MedSpaCy Dutch Pipeline

The pipeline can be set up in a similar manner to the English pipeline, but with links to the language-specific resources. You can find an example notebook here.

Studies using the Dutch MedSpaCy pipeline: