This repo contains chain velds encapsulating a spaCy NER training setup on APIS data.
- git
- docker compose (note: older docker compose versions require running docker-compose instead of docker compose)
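To find out which of the two invocations applies on a given machine, a small check along the following lines may help. This is only a sketch: the variable name COMPOSE is hypothetical and not something the repo defines.

```shell
# Pick whichever Compose invocation is available on this machine.
# COMPOSE is a hypothetical helper variable, not part of this repo.
if docker compose version >/dev/null 2>&1; then
  COMPOSE="docker compose"      # Compose available as a docker subcommand
elif command -v docker-compose >/dev/null 2>&1; then
  COMPOSE="docker-compose"      # older standalone binary
else
  COMPOSE=""
  echo "neither 'docker compose' nor 'docker-compose' was found" >&2
fi
echo "compose command: ${COMPOSE:-none}"
```

If the result is docker-compose, substitute it for docker compose in the commands below.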
Clone this repo with all its submodules:
git clone --recurse-submodules https://github.com/veldhub/veld_chain__train_spacy_apis_ner.git
The following chain velds were used. Open the respective veld yaml file for more information.
Cleans the JSON data and converts it into spaCy's DocBin format.
docker compose -f veld_convert.yaml up
Creates a spaCy training config from the passed arguments. See https://spacy.io/usage/training/#config for the target format.
docker compose -f veld_create_config.yaml up
A NER training setup, utilizing spaCy 3's config system.
docker compose -f veld_train.yaml up
Analyses out-of-vocabulary occurrences in the training data.
docker compose -f veld_analysis.yaml up
Pushes the spaCy model to Hugging Face.
docker compose -f veld_publish_to_hf.yaml up
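The chain velds above can also be run one after another. The following is a minimal sketch that assumes the listing order (convert, create config, train, analysis, publish) is also the intended execution order, which the repo does not state explicitly. DRY_RUN and PLAN are hypothetical helper variables; by default the loop only prints the commands.

```shell
# Run the chain velds in the order they are listed above (assumed order).
# DRY_RUN defaults to 1 and only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
PLAN=""
for veld in veld_convert veld_create_config veld_train veld_analysis veld_publish_to_hf; do
  cmd="docker compose -f ${veld}.yaml up"
  # Record each command so the full plan can be inspected afterwards.
  PLAN="${PLAN}${cmd}
"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    # Stop at the first failing step, since later steps likely depend on earlier output.
    $cmd || { echo "step failed: ${veld}" >&2; break; }
  fi
done
```

Review the printed commands first, then rerun with DRY_RUN=0 once the veld yaml files are configured as needed.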