Text Mining Challenge

Given a list of webpage URLs in sources.txt, the challenge is to scrape each page and perform Named Entity Recognition (NER) on the extracted text.
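The scraping step can be sketched with the Python standard library alone. This is a minimal illustration, not the code in mining.ipynb or app.py: it assumes sources.txt holds one URL per line, and the names read_urls, html_to_text, fetch_text and TextExtractor are hypothetical helpers introduced here.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import Request, urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def read_urls(path):
    """Return the non-empty lines of the URL list (assumes one URL per line)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def html_to_text(html):
    """Reduce an HTML document to its visible text, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def fetch_text(url, timeout=10):
    """Download one page and return its visible text (requires network access)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return html_to_text(html)
```

A typical loop would call fetch_text(url) for each entry returned by read_urls("sources.txt") and pass the resulting text to whichever NER library the notebook uses; this README does not name that library, so it is left out of the sketch.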

Install the requirements into a Python 3.7 virtual environment by running: conda env create -f environment.yml

The project also requires a connection to a Neo4j database. Follow the setup instructions here: https://neo4j.com/developer/get-started/
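As a rough illustration of how recognized entities might be written to Neo4j, the sketch below uses the official neo4j Python driver. Everything specific in it is an assumption rather than the project's actual schema: the Entity and Page node labels, the MENTIONED_IN relationship, the bolt://localhost:7687 address, the placeholder credentials, and the helper names entity_merge_query and store_entities.

```python
def entity_merge_query(name, label, source_url):
    """Build a parameterized Cypher statement linking an entity to its page.

    The node labels and relationship type are assumptions for illustration.
    """
    query = (
        "MERGE (e:Entity {name: $name, label: $label}) "
        "MERGE (p:Page {url: $url}) "
        "MERGE (e)-[:MENTIONED_IN]->(p)"
    )
    params = {"name": name, "label": label, "url": source_url}
    return query, params


def store_entities(entities, uri="bolt://localhost:7687",
                   user="neo4j", password="password"):
    """Write (name, label, url) tuples to Neo4j.

    The URI and credentials are placeholders; replace them with your own.
    """
    # Imported lazily so the query builder works without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            for name, label, url in entities:
                query, params = entity_merge_query(name, label, url)
                session.run(query, params)
    finally:
        driver.close()
```

MERGE is used instead of CREATE so that rerunning the pipeline on the same pages does not duplicate nodes or relationships.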

Then work through the Jupyter notebook (mining.ipynb), running the cells in order.

Most of the functionality is also available as a Streamlit application. To launch it, run: streamlit run app.py
