Given a list of webpage URLs in sources.txt, the challenge is to scrape the websites and perform Named Entity Recognition (NER) on their content.
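As a rough illustration of the first half of that task, the sketch below reads the URL list and reduces a page's HTML to plain text using only the Python standard library. It assumes sources.txt holds one URL per line; the names read_sources, TextExtractor and html_to_text are hypothetical and are not taken from this repository. Fetching the pages and running NER on the resulting text are left to whichever libraries the notebook uses.

```python
from html.parser import HTMLParser
from pathlib import Path


def read_sources(path="sources.txt"):
    """Return the URLs in the file, assuming one per line; blank lines are skipped."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML document as one space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The text returned by html_to_text is what would be handed to an NER model; keeping the extraction step separate makes it easy to inspect what the model actually sees.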
Install the requirements into a Python 3.7 virtual environment with conda: conda env create -f environment.yml
The project also requires a connection to a Neo4j database. Follow the instructions here: https://neo4j.com/developer/get-started/
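For orientation, here is a minimal sketch of how extracted entities could be written to the database with the official neo4j Python driver. The schema (a Page node, one node per entity, a MENTIONED_IN relationship) and the function names are assumptions made for illustration and may differ from what the notebook does. The URI, user and password are whatever your own Neo4j instance uses.

```python
def entity_merge_query(label):
    """Build a Cypher MERGE statement for one entity label (hypothetical schema).

    Cypher cannot take a node label as a query parameter, so the label is
    validated and interpolated; the name and URL are passed as parameters.
    """
    if not label.isidentifier():
        raise ValueError(f"unsafe label: {label!r}")
    return (
        f"MERGE (e:{label} {{name: $name}}) "
        "MERGE (p:Page {url: $url}) "
        "MERGE (e)-[:MENTIONED_IN]->(p)"
    )


def store_entities(uri, user, password, entities):
    """Write (name, label, url) triples to Neo4j; needs the `neo4j` package."""
    # Imported here so entity_merge_query can be used without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            for name, label, url in entities:
                session.run(entity_merge_query(label), name=name, url=url)
    finally:
        driver.close()
```

MERGE is used instead of CREATE so that running the sketch twice over the same pages does not duplicate nodes or relationships.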
Then work through the Jupyter Notebook, running the cells in order.
Most of the functionality is also available as a Streamlit application. To launch it, run: streamlit run app.py