Web scraping and data extraction by building a robust Python application that interacts with APIs and parses dynamic web content

smissaertj/bs-web-scraper

Nature Articles Scraper

Description

This Python project scrapes articles from the Nature website. The user supplies the number of pages to scan and the article type to keep. The program then navigates through those pages, keeps only the articles of the specified type, and saves each one as a .txt file in a directory named after its page number.

Each article is saved with a cleaned-up title as its filename, and the article content is extracted and stored in the file.
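The README does not spell out the title clean-up rules, so the following is only a minimal sketch of one plausible approach. The function name `clean_title` and the specific rules (drop ASCII punctuation, join words with underscores, append `.txt`) are assumptions for illustration, not the project's confirmed behavior.

```python
import string


def clean_title(title: str) -> str:
    """Turn an article title into a filename (hypothetical clean-up rules)."""
    # Remove ASCII punctuation that is awkward or invalid in filenames.
    stripped = title.translate(str.maketrans("", "", string.punctuation))
    # Collapse any run of whitespace into a single underscore.
    return "_".join(stripped.split()) + ".txt"


if __name__ == "__main__":
    print(clean_title("  How AI is changing science: a primer?  "))
```

Stripping punctuation before joining the words keeps characters such as `:` and `?` out of the filename, which some filesystems reject.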

Features

  • Scrapes multiple pages of articles from the Nature website.
  • Filters articles based on user-specified type (e.g., News or Nature Briefing).
  • Saves the articles to separate text files, named after the article titles.
  • Organizes articles into directories corresponding to the page number.
  • Handles cases where no articles match the given type on a page by still crea
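The filtering and per-page directory behavior listed above could be sketched roughly as follows. This is an illustration under stated assumptions: the helper name `save_page`, the `Page_<n>` directory naming, and the dictionary keys (`type`, `filename`, `body`) are all hypothetical. Because the last bullet is cut off in the source, the sketch also assumes that the page directory is created even when no article matches.

```python
import tempfile
from pathlib import Path


def save_page(base_dir, page_number, articles, article_type):
    """Write matching articles into base_dir/Page_<n>; return the paths written."""
    # Assumed naming scheme: one directory per page, e.g. Page_1, Page_2.
    page_dir = Path(base_dir) / f"Page_{page_number}"
    # Create the directory up front so it exists even if nothing matches.
    page_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for article in articles:
        # Keep only articles of the requested type.
        if article["type"] != article_type:
            continue
        path = page_dir / article["filename"]
        path.write_text(article["body"], encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    # Made-up sample data standing in for one scraped page.
    sample = [
        {"type": "News", "filename": "First_story.txt", "body": "Text of the story."},
        {"type": "Editorial", "filename": "An_opinion.txt", "body": "Text of the opinion."},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_page(tmp, 1, sample, "News")
        print([p.name for p in saved])
```

Separating the file-writing step from the network and parsing code keeps this part easy to test without touching the Nature website.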
