Web scraping and Named Entity Recognition in Python

Nathanpamart/text-mining

 
 

Text Mining Challenge

Given a list of webpage URLs in sources.txt, the challenge is to scrape each page and perform Named Entity Recognition on its text.
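The scrape-then-recognize flow described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names (html_to_text, read_sources, fetch_text, extract_entities) are hypothetical, the scraping uses only the standard library, and the choice of spaCy with the en_core_web_sm model for the NER step is an assumption, since the README does not say which library the project uses.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    _SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Reduce an HTML document to a single string of visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def read_sources(path="sources.txt"):
    """Return the non-empty lines of the URL list, one URL per line."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def fetch_text(url, timeout=10):
    """Download one page and return its visible text."""
    request = Request(url, headers={"User-Agent": "text-mining-sketch"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))


def extract_entities(text, model="en_core_web_sm"):
    """Return (entity text, entity label) pairs.

    Assumption: spaCy and the named model are installed; the project
    may use a different NER library.
    """
    import spacy  # imported lazily so the scraping helpers work without it

    nlp = spacy.load(model)
    return [(ent.text, ent.label_) for ent in nlp(text).ents]
```

A typical run would loop over read_sources(), pass each URL to fetch_text(), and hand the resulting text to extract_entities(). Loading the model once outside the loop would be faster for a long URL list.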

Install the requirements into a Python 3.7 virtual environment with conda env create -f environment.yml

The project also requires a connection to a Neo4j database. Follow the instructions here: https://neo4j.com/developer/get-started/
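As a rough sketch of what storing results in the database could look like, the snippet below builds a Cypher statement that links a page to the entities found on it and sends it through the official neo4j Python driver. The graph model (Page and Entity nodes joined by a MENTIONS relationship), the function names, and the connection details are all assumptions made for illustration; the README does not describe the schema the project actually uses.

```python
def entity_merge_query(url, entities):
    """Build a Cypher statement and parameters linking a page to its entities.

    `entities` is an iterable of (name, label) pairs. The Page/Entity/MENTIONS
    schema is hypothetical.
    """
    query = (
        "MERGE (p:Page {url: $url}) "
        "WITH p UNWIND $entities AS e "
        "MERGE (n:Entity {name: e.name, label: e.label}) "
        "MERGE (p)-[:MENTIONS]->(n)"
    )
    params = {
        "url": url,
        "entities": [{"name": name, "label": label} for name, label in entities],
    }
    return query, params


def store_entities(uri, user, password, url, entities):
    """Write one page's entities to Neo4j.

    Assumption: the `neo4j` driver package is installed and a database is
    reachable at `uri` (for a local install this is often bolt://localhost:7687).
    """
    from neo4j import GraphDatabase  # imported lazily; only needed when writing

    query, params = entity_merge_query(url, entities)
    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            session.run(query, params).consume()
    finally:
        driver.close()
```

Using MERGE rather than CREATE keeps the write idempotent, so re-running the notebook on the same URLs does not duplicate nodes or relationships.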

Then work through the Jupyter Notebook, running the cells in order.

Most of the functionality is also available as a Streamlit application. Start it with streamlit run app.py

