Pravda News Extractor

Description

The Pravda News Extractor is a Python script designed to fetch news data from a specified Pravda domain, extract key details using Beautiful Soup, and save the collected data in a JSON format. The script handles different formats of HTML content and iteratively collects news items until no more are found.
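As a rough illustration of the extraction step, the sketch below parses two news blocks with Beautiful Soup and falls back between two image layouts. The markup, the class names (`news-item`, `title`, `category`), the `data-id` and `data-src` attributes, and the helper names `text_of` and `parse_news_items` are all assumptions made for this example; the HTML returned by a real Pravda domain, and the logic in pravda-extract.py, may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: real Pravda pages may use different tags and classes.
SAMPLE_HTML = """
<div class="news-item" data-id="101">
  <a href="/world/101.html"><img src="/img/101.jpg"></a>
  <span class="title">First headline</span>
  <span class="category">World</span>
  <time datetime="2024-01-15T10:00:00">10:00</time>
</div>
<div class="news-item" data-id="100">
  <a href="/local/100.html"><img data-src="/img/100.jpg"></a>
  <span class="title">Second headline</span>
  <span class="category">Local</span>
  <time datetime="2024-01-15T09:30:00">09:30</time>
</div>
"""


def text_of(node):
    """Return the stripped text of a tag, or None if the tag is missing."""
    return node.get_text(strip=True) if node else None


def parse_news_items(html):
    """Extract id, image, link, title, category and timestamp per block."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for block in soup.find_all("div", class_="news-item"):
        img = block.find("img")
        link = block.find("a")
        time_tag = block.find("time")
        # Two assumed image layouts: a plain `src`, or a lazy-loaded `data-src`.
        image_url = None
        if img is not None:
            image_url = img.get("src") or img.get("data-src")
        items.append({
            "id": block.get("data-id"),
            "image": image_url,
            "link": link.get("href") if link else None,
            "title": text_of(block.find("span", class_="title")),
            "category": text_of(block.find("span", class_="category")),
            "timestamp": time_tag.get("datetime") if time_tag else None,
        })
    return items
```

Missing elements yield `None` rather than raising, so a block with an unexpected layout still produces a record.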

Features

Fetch initial news items from a specified domain's API.
Extract news items including ID, image URL, link, title, category, and timestamp.
Continue fetching additional news items based on the last ID until all are collected.
Handle different HTML content structures for image sources.
Save extracted news items in a JSON file with a naming convention based on the domain and current date.
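
The "continue from the last ID" behaviour listed above can be sketched as a simple loop. This is a minimal sketch, not the script's actual code: `collect_all`, `fake_fetch`, and `FAKE_FEED` are hypothetical names, and the network call is replaced by an in-memory stand-in so the example runs without contacting any domain. It assumes items arrive newest first and that each request returns items older than the given ID.

```python
def collect_all(fetch_page):
    """Collect items page by page until a request yields nothing new.

    `fetch_page(last_id)` is an assumed callable that returns a list of
    item dicts; `last_id` is None for the initial request.
    """
    collected = []
    seen_ids = set()
    last_id = None
    while True:
        batch = fetch_page(last_id)
        # Drop items already collected, in case pages overlap.
        new_items = [item for item in batch if item["id"] not in seen_ids]
        if not new_items:
            break  # nothing new: all items have been collected
        collected.extend(new_items)
        seen_ids.update(item["id"] for item in new_items)
        last_id = new_items[-1]["id"]
    return collected


# In-memory stand-in for the remote API: ten items, newest first.
FAKE_FEED = [{"id": i, "title": f"Item {i}"} for i in range(10, 0, -1)]


def fake_fetch(last_id, page_size=4):
    """Return up to `page_size` items older than `last_id`."""
    if last_id is None:
        older = FAKE_FEED
    else:
        older = [item for item in FAKE_FEED if item["id"] < last_id]
    return older[:page_size]


if __name__ == "__main__":
    items = collect_all(fake_fetch)
    print(len(items))
```

Passing the fetch function as a parameter keeps the loop testable; a real version would wrap a Requests call in place of `fake_fetch`.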

Installation

Ensure Python 3 is installed on your system.
Install the Beautiful Soup 4 and Requests libraries:
```
pip install beautifulsoup4 requests
```
Download pravda-extract.py from this repository.

Usage

Run the script with the domain as an argument:

python3 pravda-extract.py [DOMAIN]

For example:

python3 pravda-extract.py pravda-fi.com

This command will fetch news from 'pravda-fi.com' and save it in a JSON file named pravda-fi.com_DD-MM-YY.json.
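
The naming convention can be illustrated with a short sketch. It assumes that DD-MM-YY corresponds to the `strftime` pattern `%d-%m-%y`; the helper names `output_filename` and `save_items` are hypothetical and not taken from the script.

```python
import json
from datetime import date
from pathlib import Path


def output_filename(domain, today=None):
    """Build '<domain>_DD-MM-YY.json' (assumed to match the script's convention)."""
    today = today or date.today()
    return f"{domain}_{today.strftime('%d-%m-%y')}.json"


def save_items(items, domain, directory=".", today=None):
    """Write the collected items as JSON and return the path that was written."""
    path = Path(directory) / output_filename(domain, today)
    with open(path, "w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps non-ASCII titles readable in the file.
        json.dump(items, fh, ensure_ascii=False, indent=2)
    return path
```

For instance, `output_filename("pravda-fi.com", date(2024, 1, 15))` returns `pravda-fi.com_15-01-24.json`.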

Dependencies

Python 3
Beautiful Soup 4
Requests

Contributing

Contributions, issues, and feature requests are welcome. Feel free to check the issues page if you want to contribute.

License

This project is licensed under the MIT License - see the LICENSE file for details.

CheckFirstHQ/Pravda-links-extractor
