This repository benchmarks different embeddings for Named Entity Recognition (NER) on German text. As reported in my bachelor thesis, stacking BERT embeddings with Flair embeddings yields a new state-of-the-art result on the GermEval-14 NER dataset (F1-score of 86.62).
Flair Experiments: Flair and Google Colab
BERT Experiments: Transformers and Google Colab
Both the BIOES and the BIO/IOB tagging formats are considered in the evaluation.
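For reference, the two tagging schemes differ only in how single-token and entity-final tokens are marked. The following is a minimal sketch of a BIO/IOB2-to-BIOES converter (the function name `bio_to_bioes` is my own, not part of this repository's code); the standard lossless conversion turns a single-token entity's `B-` into `S-` and an entity-final `I-` into `E-`:

```python
def bio_to_bioes(tags):
    """Convert one sentence's BIO/IOB2 tags to BIOES.

    A B- tag not followed by a matching I- becomes S- (single-token entity);
    an I- tag not followed by a matching I- becomes E- (entity end).
    """
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append("O")
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        entity_continues = next_tag == f"I-{label}"
        if prefix == "B":
            bioes.append(f"B-{label}" if entity_continues else f"S-{label}")
        else:  # prefix == "I"
            bioes.append(f"I-{label}" if entity_continues else f"E-{label}")
    return bioes


# Example: a two-token PER entity and a single-token LOC entity
print(bio_to_bioes(["B-PER", "I-PER", "O", "B-LOC"]))
# → ['B-PER', 'E-PER', 'O', 'S-LOC']
```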
The datasets used in this benchmark are CoNLL-03 and GermEval-14. Additionally, I compared several embeddings on a complaint dataset in my bachelor thesis. Unfortunately, that dataset is not public, but the resulting scores can be found in the thesis.