A backbone for a news analysis prototype.
This application scrapes the headlines from a specified 5-day period on https://www.bbc.com/zhongwen/simp and outputs a wordcloud. The starting date of the period can be specified. If no starting date is specified, the application scrapes the headlines from the past 5 days.
Note: The period must be after 2015 as the website did not exist before then.
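As a rough illustration of the date handling described above, the sketch below works out which days would be scraped. It is not the project's actual code: the function name `headline_dates`, the constants, the choice to count today as the last of the "past 5 days", and the reading of "after 2015" as "on or after 1 January 2016" are all assumptions made for the example.

```python
from datetime import date, timedelta
from typing import List, Optional

PERIOD_DAYS = 5
# Assumption: "after 2015" is read here as "on or after 2016-01-01".
EARLIEST_DATE = date(2016, 1, 1)


def headline_dates(start: Optional[date] = None,
                   today: Optional[date] = None) -> List[date]:
    """Return the 5 consecutive days whose headlines would be scraped.

    Hypothetical helper. If no start date is given, the window is the
    past 5 days, ending on (and including) `today`.
    """
    today = today or date.today()
    if start is None:
        start = today - timedelta(days=PERIOD_DAYS - 1)
    if start < EARLIEST_DATE:
        raise ValueError(f"start date must not be before {EARLIEST_DATE}")
    return [start + timedelta(days=i) for i in range(PERIOD_DAYS)]


if __name__ == "__main__":
    # Explicit start date: 2023-05-01 through 2023-05-05.
    for day in headline_dates(date(2023, 5, 1)):
        print(day.isoformat())
```

Passing `today` explicitly is only there to make the default window easy to test; in normal use it can be left out.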
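Project description: scrape a news website, or automatically scrape real-time news on a schedule. This program automatically scrapes the news headlines from BBC Chinese and organizes their keywords into a wordcloud.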
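The keyword step can be pictured as a simple frequency count over the words in each headline. The sketch below is only an illustration under stated assumptions: it expects headlines that are already split into words (for Chinese text a segmenter such as jieba is a common choice, though this README does not say which one the project uses), and the function name `keyword_frequencies`, the stopword handling, and the rule of dropping single-character tokens are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, List


def keyword_frequencies(tokenized_headlines: Iterable[List[str]],
                        stopwords: FrozenSet[str] = frozenset(),
                        top_n: int = 50) -> Dict[str, int]:
    """Count how often each keyword appears across all headlines.

    Hypothetical helper. Stopwords and single-character tokens are
    skipped (an assumed filter), and only the `top_n` most frequent
    words are kept.
    """
    counts = Counter(
        word
        for headline in tokenized_headlines
        for word in headline
        if word not in stopwords and len(word) > 1
    )
    return dict(counts.most_common(top_n))


if __name__ == "__main__":
    # Made-up, pre-segmented headlines used purely as sample input.
    sample = [["经济", "增长", "的"], ["经济", "政策"]]
    print(keyword_frequencies(sample, stopwords=frozenset({"的"})))
```

A word-to-count mapping like this is the usual input for a word cloud renderer, which draws more frequent words larger.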
Attached are examples of the wordclouds generated.
The image on the left is generated for the month of May 2023 while the image on the right is for May 2020. The major happenings around the world can easily be identified through the wordclouds. Have fun!