__pycache__/
csvs/
txts/
README.md
chunk_candidate_relations_triples.py
corenlp_chunk_candidate_relations_triples.py
data_spider.py
pre_process_text.py
scenarios.py
tags_spider.py
test.py

README.md

The implementation of HDSKG Chunking

1. Run `python tags_spider.py --start 1 --end 5` to crawl tags from pages 1 to 5 of the Stack Overflow tag list, ranked by popularity.
2. Run `python .\corenlp_chunk_candidate_relations_triples.py` to crawl data from each tag's page, preprocess the text (resolving coreferences), and chunk candidate relation triples.
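
As a rough illustration of the `--start` / `--end` options used in the first command, here is a minimal sketch of how such a page range could be parsed. It is not the code in `tags_spider.py`: the function name `parse_page_range`, the default values, and the assumption that `--end` is inclusive are all hypothetical.

```python
import argparse


def parse_page_range(argv=None):
    """Parse --start/--end and return the page numbers to crawl.

    Assumption: the range is inclusive on both ends, so
    --start 1 --end 5 means pages 1, 2, 3, 4 and 5.
    """
    parser = argparse.ArgumentParser(
        description="Hypothetical page-range parser for a tag spider"
    )
    # The defaults below are illustrative, not taken from the repository.
    parser.add_argument("--start", type=int, default=1, help="first page to crawl")
    parser.add_argument("--end", type=int, default=1, help="last page to crawl (inclusive)")
    args = parser.parse_args(argv)

    # Reject ranges that would crawl nothing or start before page 1.
    if args.start < 1 or args.end < args.start:
        parser.error("expected 1 <= --start <= --end")

    return list(range(args.start, args.end + 1))


# Passing the arguments explicitly mirrors the command line shown above.
pages = parse_page_range(["--start", "1", "--end", "5"])
print(pages)  # [1, 2, 3, 4, 5]
```

Returning the full list of pages keeps the crawling loop trivial (`for page in pages: ...`) and makes the inclusive-range assumption easy to check.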
After these two steps:
- The `txts` folder contains `tags.txt`, generated by the tag spider.
- The `csvs` folder contains `input_sentence.csv`, generated by text preprocessing, and `candidate_relation_triples.csv`, generated by chunking candidate relation triples.
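
To confirm that both steps produced their output, a small check like the following may help. It is only a sketch: the helper names `missing_outputs` and `preview_csv` are hypothetical, the paths are assumed to be relative to this directory, and the CSV files are assumed to be UTF-8 encoded. No column names are assumed, since the README does not state them.

```python
import csv
from pathlib import Path

# Output files named in the README, relative to this directory (assumed).
EXPECTED_OUTPUTS = [
    Path("txts") / "tags.txt",
    Path("csvs") / "input_sentence.csv",
    Path("csvs") / "candidate_relation_triples.csv",
]


def missing_outputs(root="."):
    """Return the expected output files that do not exist under root."""
    root = Path(root)
    return [p.as_posix() for p in EXPECTED_OUTPUTS if not (root / p).is_file()]


def preview_csv(path, limit=3):
    """Return up to `limit` raw rows of a CSV file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as handle:
        # zip stops after `limit` rows, so large files are not read in full.
        return [row for _, row in zip(range(limit), csv.reader(handle))]


for path in missing_outputs():
    print("missing:", path)
```

If nothing is reported missing, `preview_csv("csvs/candidate_relation_triples.csv")` shows the first few rows without assuming anything about their layout.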