macairececile/terminology_project

terminology_project

Created by Cécile MACAIRE & Ludivine ROBERT

Aim

A Python script that extracts and annotates terms from a corpus of articles about text-to-speech systems.

Context

The goal of this project is to develop a term identification system for a specific domain. It was developed for the Terminology course, as part of the Natural Language Processing master's degree at Université de Lorraine (Nancy).

Instructions

Requirements

pandas

spaCy

Running process

The script to run is termsprocessing.py. It takes the following arguments:

  • -n: type of annotation:
    • 1 = term annotation
    • 2 = term annotation and IOB tags
    • 3 = term annotation, IOB tags and part-of-speech (POS) tags
  • -i: input file to annotate, in .txt format
  • -o: name of the output file (also in .txt format)
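To illustrate what IOB tags mean in general, here is a minimal sketch that marks each token as B (beginning of a term), I (inside a term) or O (outside any term). It is not taken from termsprocessing.py: the function name `iob_tags`, the sample sentence and the term list are all hypothetical, and the script's actual matching logic and output format may differ.

```python
def iob_tags(tokens, terms):
    """Return one IOB tag per token, given a list of known terms (hypothetical helper)."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Try longer terms first so multi-word terms win over their sub-terms.
    for term in sorted(terms, key=lambda t: -len(t.split())):
        words = term.lower().split()
        n = len(words)
        if n == 0:
            continue  # ignore empty terms
        for start in range(len(tokens) - n + 1):
            window_free = all(tag == "O" for tag in tags[start:start + n])
            if lowered[start:start + n] == words and window_free:
                tags[start] = "B"
                tags[start + 1:start + n] = ["I"] * (n - 1)
    return tags


# Made-up example data, not from the project's corpus.
tokens = "The speech synthesis system uses a vocoder".split()
terms = ["speech synthesis system", "vocoder"]
for token, tag in zip(tokens, iob_tags(tokens, terms)):
    print(f"{token}\t{tag}")
```

In this sketch, "speech" receives B, "synthesis" and "system" receive I, "vocoder" receives B, and every other token receives O.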

Finally, enter the following command in your terminal:

python termsprocessing.py -n 1 -i input.txt -o output.txt
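For readers who want to see how such a command line could be parsed, here is a minimal sketch using Python's standard argparse module. Only the flags -n, -i and -o come from the command above; the function name `build_parser`, the attribute names and the help strings are assumptions, and the real script may handle its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser for the three flags shown above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="termsprocessing.py")
    parser.add_argument(
        "-n", dest="annotation_type", type=int, choices=[1, 2, 3], required=True,
        help="1 = terms, 2 = terms + IOB tags, 3 = terms + IOB + POS tags",
    )
    parser.add_argument("-i", dest="input_file", required=True,
                        help="input .txt file to annotate")
    parser.add_argument("-o", dest="output_file", required=True,
                        help="output .txt file")
    return parser


# Parse the same arguments as in the command above.
args = build_parser().parse_args(["-n", "1", "-i", "input.txt", "-o", "output.txt"])
print(args.annotation_type, args.input_file, args.output_file)
```

Restricting -n with `choices` makes argparse reject any value other than 1, 2 or 3 with a usage message.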
