Uses Stanza, Stanford's Python NLP module, to convert scraped reviews (TripAdvisor airlines) into CoNLL-formatted .tsv files for a sequence labelling task.

PeterCaine/CoreNLP_Scraped_Data

Processing files with Stanford CoreNLP - Python module, Stanza

The data used here was scraped from the TripAdvisor airlines section (see https://github.com/PeterCaine/Trip-Advisor_Scrape.git for details).

This uses Stanza, the Stanford CoreNLP preprocessing module, to annotate each token with its lemma, POS tags (xpos and upos), dependency relation and dependency head. The annotated reviews are then written to a CoNLL-formatted .tsv file for use in a sequence labelling task.
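The conversion described above could look roughly like the sketch below. It is not the repository's actual code: the column order in `COLUMNS`, the helper names `annotate` and `doc_to_conll`, the use of `_` for missing values, and the choice of the default English pipeline are all assumptions made for illustration. The attribute names (`id`, `text`, `lemma`, `upos`, `xpos`, `head`, `deprel`) are those exposed on Stanza's word objects.

```python
# Assumed column layout for the output file; the real project may order
# or name its columns differently.
COLUMNS = ["id", "text", "lemma", "upos", "xpos", "head", "deprel"]


def annotate(text, lang="en"):
    """Run the default Stanza pipeline over one review (hypothetical helper).

    Stanza is imported here so the rest of the sketch can be used
    without it. Models must be fetched once beforehand, for example
    with stanza.download("en").
    """
    import stanza

    nlp = stanza.Pipeline(lang)
    return nlp(text)


def doc_to_conll(doc):
    """Format an annotated document as tab-separated CoNLL-style text.

    One line per word, one column per entry in COLUMNS, and a blank
    line after each sentence. Missing values are written as "_".
    """
    lines = []
    for sentence in doc.sentences:
        for word in sentence.words:
            values = (getattr(word, name, None) for name in COLUMNS)
            lines.append("\t".join("_" if v is None else str(v) for v in values))
        lines.append("")  # blank line marks the sentence boundary
    return "\n".join(lines) + "\n"
```

A review would then be converted with something like `open("review.tsv", "w", encoding="utf-8").write(doc_to_conll(annotate(review_text)))`, where `review_text` and the file name are placeholders. Building the pipeline once and reusing it across reviews is preferable in practice, since loading the models is the slow step.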
