Project P13: Content Aggregator

Soumya Ghosh Dastidar edited this page Jun 29, 2019 · 2 revisions

Goal

To create a content aggregator in Python that collects article headlines and summaries from source websites and presents them in a visually appealing manner through a front-end web app.

Technology stack

  1. Web scraping using Python (Requests/urllib2 (urllib.request in Python 3) and BeautifulSoup)
  2. Database (SQLite)
  3. Python Flask for hosting the web app
  4. Jinja2 templates and HTML/CSS2
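
As a rough illustration of item 1 in the stack, the sketch below fetches a page with `urllib.request` and extracts headline/summary pairs with BeautifulSoup. The `<article>`/`<h2>`/`<p>` page layout, the `SAMPLE_HTML` string, the User-Agent value, and the function names `fetch_html` and `parse_articles` are assumptions made for this example; each real source website will need its own selectors.

```python
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text."""
    # Some sites reject requests that carry no User-Agent header.
    request = Request(url, headers={"User-Agent": "content-aggregator-demo"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def parse_articles(html):
    """Return a list of {"headline", "summary"} dicts found in the HTML.

    Assumes the hypothetical layout <article><h2>..</h2><p>..</p></article>.
    """
    soup = BeautifulSoup(html, "html.parser")
    articles = []
    for node in soup.find_all("article"):
        headline = node.find("h2")
        summary = node.find("p")
        if headline is None:
            continue  # skip blocks that have no headline
        articles.append({
            "headline": headline.get_text(strip=True),
            "summary": summary.get_text(strip=True) if summary else "",
        })
    return articles


# Inline sample so the parser can be tried without any network access.
SAMPLE_HTML = """
<article><h2>First post</h2><p>Short summary one.</p></article>
<article><h2> Second post </h2></article>
<article><p>No headline here.</p></article>
"""

if __name__ == "__main__":
    for item in parse_articles(SAMPLE_HTML):
        print(item["headline"], "-", item["summary"])
```

Keeping the download and the parsing in separate functions lets the parser be exercised on saved HTML, which is convenient while working out the selectors for a new site.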

Road Map

  • Creating the web scraper (2-3 days): The trainees will learn about web scraping and build a simple scraper that extracts the headlines and a short summary of posts from various websites.
  • Storing data in the DB (1 day): Once the data has been extracted from the websites, it will be stored in a simple SQLite database. The trainees will learn to work with databases and the CRUD (create, read, update, delete) operations.
  • Creating the web app (2-3 days): Finally, to display the scraped data, the trainees will create a simple Flask application and learn to use the Jinja templating engine to serve the content dynamically.
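
The storage and web app phases of the road map could be sketched as follows. This is a minimal example under stated assumptions, not a prescribed design: the `articles` table schema, the `articles.db` file name, the inline `PAGE` template, and the helper names `init_db`, `save_articles`, `load_articles`, and `create_app` are all hypothetical. It stores scraped items in SQLite and renders them through a Jinja template served by Flask.

```python
import sqlite3

from flask import Flask, render_template_string

DB_PATH = "articles.db"  # hypothetical database file name

# Inline Jinja template; a real app would likely keep this in templates/.
PAGE = """
<!doctype html>
<title>Content Aggregator</title>
<h1>Latest headlines</h1>
<ul>
{% for article in articles %}
  <li><strong>{{ article["headline"] }}</strong>: {{ article["summary"] }}</li>
{% else %}
  <li>No articles yet.</li>
{% endfor %}
</ul>
"""


def init_db(conn):
    """Create the (assumed) articles table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "headline TEXT NOT NULL UNIQUE, "
        "summary TEXT NOT NULL DEFAULT '')"
    )
    conn.commit()


def save_articles(conn, articles):
    """Insert scraped items; headlines already stored are skipped."""
    conn.executemany(
        "INSERT OR IGNORE INTO articles (headline, summary) "
        "VALUES (:headline, :summary)",
        articles,
    )
    conn.commit()


def load_articles(conn):
    """Read all stored items, newest first."""
    rows = conn.execute(
        "SELECT headline, summary FROM articles ORDER BY id DESC"
    ).fetchall()
    return [{"headline": h, "summary": s} for h, s in rows]


def create_app(db_path=DB_PATH):
    app = Flask(__name__)

    @app.route("/")
    def index():
        # Open a fresh connection per request and always close it.
        conn = sqlite3.connect(db_path)
        try:
            init_db(conn)
            articles = load_articles(conn)
        finally:
            conn.close()
        return render_template_string(PAGE, articles=articles)

    return app


if __name__ == "__main__":
    create_app().run(debug=True)
```

Running the file starts Flask's development server. The template's `{% else %}` branch covers the case where nothing has been scraped yet.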

Conclusion

On completion, the trainees will have basic knowledge of web scraping, the Flask web framework, and database concepts, which they can later apply to their own personal projects.

Resources