Final Year Project Group RAYW4: A Knowledge Graph-based Recommendation Website for Computer Science Learners
The crawling code follows the official Selenium WebDriver documentation at https://www.selenium.dev/documentation/webdriver/
Crawling_final.ipynb: Scrapes the article HTML source code from the website using Selenium
html2txt.ipynb: Extracts text and headings from the HTML source code into .txt files
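The text and heading extraction step can be illustrated with a minimal sketch. It uses only Python's standard `html.parser` module and is not the code in html2txt.ipynb; the class name `TextAndHeadingExtractor`, the function `html_to_text`, and the sample HTML are all hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
SKIP_TAGS = {"script", "style"}  # content that is never visible text


class TextAndHeadingExtractor(HTMLParser):
    """Hypothetical helper: collects headings and visible text from HTML."""

    def __init__(self):
        super().__init__()
        self.headings = []      # text of each <h1>..<h6> element
        self.text_parts = []    # visible text chunks in document order
        self._in_heading = False
        self._heading_buf = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in HEADING_TAGS:
            self._in_heading = True
            self._heading_buf = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in HEADING_TAGS and self._in_heading:
            self.headings.append(" ".join(self._heading_buf))
            self._in_heading = False

    def handle_data(self, data):
        chunk = data.strip()
        if not chunk or self._skip_depth > 0:
            return
        self.text_parts.append(chunk)
        if self._in_heading:
            self._heading_buf.append(chunk)


def html_to_text(html):
    """Return (headings, text) for one HTML source string."""
    parser = TextAndHeadingExtractor()
    parser.feed(html)
    parser.close()
    return parser.headings, "\n".join(parser.text_parts)


# Made-up sample page, standing in for one crawled article.
sample = (
    "<html><head><style>p {color: red}</style></head><body>"
    "<h1>Binary Search</h1><p>Binary search halves the range.</p>"
    "<script>var x = 1;</script></body></html>"
)
headings, text = html_to_text(sample)
```

In a pipeline like the one described, the returned text would then be written to a .txt file per article; the actual notebook may rely on a different HTML library.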
Follow the instructions at https://github.com/stanfordnlp/stanza to install the stanza library with pip, then run the code. This subfolder includes two .py files:
corenlp.py: Extraction of OpenIE triples and their wiki entities
corenlp_wiki.py: Tokenization of the article and extraction of the tokens' wiki entities
This subfolder contains two schemes for obtaining knowledge graph embeddings, with reference to https://github.com/thunlp/Fast-TransX
wikidata_server.py: Server for requesting OpenKE WikiData KG entity embeddings
kg_preprocess.py: Preprocesses the data stream for training
Fast-TransX: THU C++ implementation for KG training
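For readers unfamiliar with translation-based KG embeddings, the sketch below shows the TransE scoring idea, one of the models in the Fast-TransX family. This README does not state which model the project trained, so TransE is an assumption here, and the entity names, relation vector, and 2-D toy embeddings are made up for illustration.

```python
import math


def transe_score(head, relation, tail):
    """L2 distance ||h + r - t||; a lower score means a more plausible triple."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin ranking loss: pushes true triples below corrupted ones."""
    return max(0.0, margin + pos_score - neg_score)


# Toy embeddings (assumed values, not trained).
embeddings = {
    "python": [1.0, 0.0],
    "language": [1.0, 1.0],
    "graph": [-1.0, 0.0],
}
is_a = [0.0, 1.0]  # hypothetical relation vector

# True triple (python, is_a, language): here h + r equals t exactly.
pos = transe_score(embeddings["python"], is_a, embeddings["language"])
# Corrupted triple (python, is_a, graph): the tail is replaced.
neg = transe_score(embeddings["python"], is_a, embeddings["graph"])
loss = margin_loss(pos, neg)
```

Training adjusts the vectors so that true triples score lower than corrupted ones by at least the margin; the C++ implementation parallelizes this over many triples.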
data_loader.py: Data loader for the TensorFlow version
DKN.py: TensorFlow implementation of the DKN model
train.py: Training function of the TensorFlow implementation
main.py: Microsoft Recommenders version of the training flow
See the readme in the subfolder for further instructions on running the code.
We wrote the frontend in the Wix online editor. The code here is copied from the editor, with each page saved as a separate .js file.
home.js: HOME page
search.js: SEARCH page
collection.js: COLLECTION page
interests.js: INTERESTS page
signup.js: REGISTER page
login.js: LOGIN page