ZooBeasts/NLP_keyword_Summarization_for_physics_paper: This part has been integrated into the Nanophotonics_design_command_interactive_chatbot project.

Keyword extraction for short physics letters using TF-IDF. Extractive summarization will use TextRank, and abstractive summarization will use a pre-trained model that will later be applied to a QA chatbot. 28/09/23
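
As a rough illustration of TF-IDF keyword extraction, here is a minimal sketch using only the standard library. It is not the code in this repository: the function names, the regex tokenizer, and the plain `log(N / df)` weighting are assumptions chosen to keep the example short.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase first so "Light" and "light" count as the same term.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(docs, index, top_k=6):
    """Return the top_k TF-IDF terms of docs[index] (hypothetical helper)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    tf = Counter(tokenized[index])
    total = sum(tf.values())
    scores = {
        word: (count / total) * math.log(n_docs / df[word])
        for word, count in tf.items()
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


docs = [
    "the metasurface bends the light",
    "the cat sat on the mat",
    "the dog chased the cat",
]
print(tfidf_keywords(docs, 0, top_k=3))
```

A term that occurs in every document (here "the") gets an IDF of zero, so it drops out of the ranking without an explicit stop-word list. With a single-document corpus every score would be zero, so a reference corpus is needed.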

For details of what changed, please see Development_blog.txt.

The task changed, so this project now uses NLTK to split sentences and words for better results. The NetworkX package is used for TextRank because the self-written textrank.py has an issue where it returns an empty list.
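
The TextRank idea can be sketched as follows: build a graph whose nodes are sentences, weight the edges by sentence similarity, and rank the nodes with PageRank. This is only a dependency-free illustration, not the repository's code. It assumes a naive regex splitter in place of NLTK and a hand-written power iteration in place of `networkx.pagerank`, and all function names are hypothetical.

```python
import math
import re


def split_sentences(text):
    # Naive splitter on ., ! and ?; NLTK would handle abbreviations better.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    # Shared-word count normalised by the log of the sentence lengths.
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, damping=0.85, iterations=50):
    n = len(sentences)
    if n == 0:
        return []  # Guard: never rank an empty sentence list.
    tokens = [tokenize(s) for s in sentences]
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, top_k=2):
    sentences = split_sentences(text)
    scores = textrank(sentences)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(best)]


text = (
    "Metasurfaces control light at the nanoscale. "
    "Inverse design of metasurfaces uses neural networks. "
    "The weather was nice yesterday. "
    "Neural networks speed up the design of metasurfaces."
)
print(summarize(text, top_k=2))
```

With NetworkX available, the inner loop could be replaced by building a weighted `networkx.Graph` and calling `networkx.pagerank(graph, weight="weight")`.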

Development blog:

Added text summarization with Maximal Marginal Relevance (MMR). 28/09/23
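
For readers unfamiliar with MMR, the sketch below shows the selection rule: at each step pick the sentence that maximises `lam * relevance - (1 - lam) * redundancy`, where redundancy is the highest similarity to any sentence already selected. It is an illustration under assumptions, not the repository's implementation: bag-of-words cosine similarity stands in for whatever representation the project uses, the whole document serves as the query, and the names `bow`, `cosine`, and `mmr_select` are hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    # Bag-of-words vector as a Counter of lowercased tokens.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mmr_select(sentences, top_k=2, lam=0.7):
    doc = bow(" ".join(sentences))  # The whole document acts as the query.
    vecs = [bow(s) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))

    while candidates and len(selected) < top_k:

        def mmr_score(i):
            relevance = cosine(vecs[i], doc)
            # default=0.0 covers the first pick, when nothing is selected yet.
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)

    return [sentences[i] for i in selected]


sentences = [
    "TFIDF extracts keywords from physics letters.",
    "TFIDF extracts keywords from physics letters.",
    "TextRank ranks sentences with a graph.",
]
print(mmr_select(sentences, top_k=2, lam=0.5))
```

A lower `lam` penalises redundancy more strongly, so the duplicated sentence is skipped in favour of the TextRank one; with `lam=1.0` the rule reduces to ranking by relevance alone.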

TextRank summarization is uploaded and usable for extractive summarization of the BBC news dataset: https://www.kaggle.com/datasets/pariza/bbc-news-summary. The self-written textrank.py works for Chinese; I am not sure why it returns an empty list for English and will continue investigating. The word embedding used is glove.6B.50d.txt: https://www.kaggle.com/datasets/adityajn105/glove6b50d (28/09/23)
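
GloVe text files store one word per line followed by its space-separated vector components. A minimal sketch of loading such a file and averaging word vectors into a sentence vector is given below. How the repository actually combines the embeddings is not stated here, so the averaging step and the names `load_glove` and `sentence_vector` are assumptions.

```python
def load_glove(lines):
    """Parse GloVe-format lines ("word v1 v2 ...") into a dict of vectors."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # Skip blank or malformed lines.
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def sentence_vector(tokens, vectors, dim):
    """Average the vectors of known tokens; zero vector if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


# Tiny in-memory stand-in for the real file. With the downloaded file:
#   with open("glove.6B.50d.txt", encoding="utf-8") as f:
#       vectors = load_glove(f)
vectors = load_glove(["light 0.5 1.0", "wave 1.5 3.0"])
print(sentence_vector(["light", "wave", "unknown"], vectors, dim=2))
```

Tokens must be lowercased before lookup, since the glove.6B vocabulary is lowercase.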

It seems that NLTK's splitting has a few flaws, but it is still able to extract 4 important words. 27/09/23 (Resolved at the end: the problem for M was caused by not applying lower() to the content.)
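
One way missing lowercasing can distort term statistics is shown in the small sketch below. It is a generic illustration with a hypothetical `count_terms` helper, not a reproduction of the actual bug.

```python
import re
from collections import Counter


def count_terms(text, lowercase=True):
    # Without lowercasing, "Metasurface" and "metasurface" are separate terms.
    if lowercase:
        text = text.lower()
    return Counter(re.findall(r"[A-Za-z]+", text))


sample = "Metasurface design. The metasurface is optimized."
print(count_terms(sample, lowercase=False)["metasurface"])
print(count_terms(sample)["metasurface"])
```

Because the sentence-initial form is counted separately when case is preserved, the term frequency of a keyword can be split across two entries and its rank lowered.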

Still an error in TextRank. Added a simple part-of-speech tagger, pos_tagging.py. TF-IDF can extract the correct keywords with a length of 6. The result below shows the keywords extracted from my paper. 22/09/23

Problem with TextRank: it raises ValueError: max() arg is an empty sequence. I don't know why an empty sequence is passed to min and max, and it does not pass the unit test. (TF-IDF is working perfectly, though. Will add LDA later on.) 21/09/23
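
This ValueError is what Python raises whenever `min()` or `max()` receives an empty iterable without a `default`. The sketch below shows a generic guard around a min-max normalisation step. The real cause in textrank.py is not known from this log, and `normalize` is a hypothetical helper used only for illustration.

```python
def normalize(scores):
    """Min-max normalise scores; return [] instead of raising on empty input."""
    if not scores:
        return []
    low, high = min(scores), max(scores)
    if high == low:
        # All scores are equal, so avoid dividing by zero.
        return [0.0 for _ in scores]
    return [(s - low) / (high - low) for s in scores]


# max() with a default also avoids the exception for empty input.
safe_max = max([], default=0.0)

print(normalize([]))
print(normalize([1.0, 3.0]))
```

Checking for an empty token or sentence list right after the splitting step usually makes it easier to see where the empty sequence first appears.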

Folders and files (40 commits):
Summarization
keywords
test
Development_blog
LICENSE
README.md
