From 15953b6d8cd684637845401cfa9fc8d1723c6826 Mon Sep 17 00:00:00 2001
From: Florian Borchert
Date: Fri, 12 Jul 2024 09:12:07 +0200
Subject: [PATCH] Update README.md

---
 README.md | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index 6d02971..8b940dd 100644
--- a/README.md
+++ b/README.md
@@ -40,17 +40,10 @@ dataset = load_dataset("distemist", "distemist_linking_bigbio_kb")
 
 To use xMEN with existing NER pipelines, you can also create a dataset at runtime.
 
-#### [spaCy](https://spacy.io/)
-
-```python
-from xmen.data import from_spacy
-docs = ... # list of spaCy docs with entity spans
-dataset = from_spacy(docs)
-```
+#### Span-based Formats
 
-for an example, see: [examples/02_spaCy_German.ipynb](examples/02_spaCy_German.ipynb)
-
-#### [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER)
+Any span-based annotation format (e.g., based on character offsets), can be converted to a xMEN-compatible dataset.
+For instance, using [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) predictions:
 
 ```python
 from span_marker import SpanMarkerModel
@@ -62,6 +55,17 @@ from xmen.data import from_spans
 dataset = from_spans(preds, sentences)
 ```
 
+#### [spaCy](https://spacy.io/)
+
+```python
+from xmen.data import from_spacy
+docs = ... # list of spaCy docs with entity spans
+dataset = from_spacy(docs)
+```
+
+for an example, see: [examples/02_spaCy_German.ipynb](examples/02_spaCy_German.ipynb)
+
+
 ## 🔧 Configuration and CLI
 
 xMEN provides a convenient command line interface to prepare entity linking pipelines by creating target dictionaries and pre-computing indices to link to concepts in them.