A .NET 8.0 C# WPF desktop application for web scraping data into structured databases with a modern UI, comprehensive logging and optimized high performance.

bitArtisan1/netDigger

netDigger

netDigger is a web scraping application built with .NET 8.0 and C# WPF in Visual Studio. It collects many types of files and data from web pages and exports them into organized, structured databases. The application has a modern UI and detailed logging.

A Powerful In-Depth Web Scraping Application.

Features

  • Asynchronous Web Scraping: Scrapes web pages with asynchronous tasks to keep latency low, and uses multi-threading for parallel processing.
  • Data Collection: Collects data such as PDFs, CSVs, DOCX, XLS, PPTX, TXT, Images, Videos, JSON, DBSQL, XML, HTML, PHP, JS, Archives, and Miscellaneous files.
  • Comprehensive JSON, XML, and HTML Parsing: Extracts information from JSON, XML, and HTML documents, including data in hidden elements and metadata.
  • Database Integration: Organizes scraped URLs into SQLite databases based on their file types.
  • Modern UI Design: User-friendly WPF interface with rich text logging.
  • Detailed Logging: Comprehensive log messages with timestamps, log levels, and thread IDs.
  • Export Options: Export scraped data to database files, CSV, and TXT formats.
  • Multi OS Support: Compatible with Windows x64/x86/ARM, Linux, and macOS.
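
The idea of organizing scraped URLs by file type can be sketched roughly as follows. This is only an illustration, not netDigger's actual code: the `FileTypeClassifier` class, its `Classify` method, and the extension table are hypothetical, and the table covers just a subset of the categories listed above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of netDigger): maps a URL to a coarse
// file-type category based on the extension of its path.
public static class FileTypeClassifier
{
    // Illustrative subset of categories; the real application may use a different table.
    private static readonly Dictionary<string, string> Categories =
        new(StringComparer.OrdinalIgnoreCase)
        {
            [".pdf"] = "PDF",
            [".csv"] = "CSV",
            [".docx"] = "DOCX",
            [".json"] = "JSON",
            [".xml"] = "XML",
            [".html"] = "HTML",
            [".png"] = "Images",
            [".jpg"] = "Images",
            [".mp4"] = "Videos",
            [".zip"] = "Archives",
        };

    public static string Classify(string url)
    {
        // Anything that is not an absolute URL falls into the catch-all category.
        if (!Uri.TryCreate(url, UriKind.Absolute, out var uri))
            return "Miscellaneous";

        // Use only the path, so query strings and fragments do not affect the extension.
        string extension = Path.GetExtension(uri.AbsolutePath);
        return Categories.TryGetValue(extension, out var category)
            ? category
            : "Miscellaneous";
    }
}
```

With this sketch, `FileTypeClassifier.Classify("https://example.com/docs/report.PDF?x=1")` returns `"PDF"`, and a URL with an unknown or missing extension returns `"Miscellaneous"`. Each category could then be written to its own database or table.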

Technologies Used

  • .NET 8.0
  • C#
  • WPF (Windows Presentation Foundation)
  • AngleSharp for HTML parsing
  • PuppeteerSharp: A headless browser automation library for .NET.
  • Newtonsoft.Json (Json.NET): A popular library for working with JSON in .NET.
  • SQLite for database management
  • Concurrent Collections for thread-safe operations
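
As a rough illustration of how asynchronous tasks and concurrent collections can work together, the sketch below fetches many URLs in parallel with a cap on concurrency. It uses only the .NET base class library. The `ThrottledFetcher` class, the `fetch` delegate, and the `maxConcurrency` parameter are assumptions made for this example and do not describe netDigger's internals.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of netDigger): runs an async fetch over many
// URLs while limiting how many fetches are in flight at once.
public static class ThrottledFetcher
{
    public static async Task<ConcurrentDictionary<string, string>> FetchAllAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> fetch,   // e.g. url => httpClient.GetStringAsync(url)
        int maxConcurrency = 4)
    {
        // Thread-safe store: tasks can write results without explicit locking.
        var results = new ConcurrentDictionary<string, string>();
        using var gate = new SemaphoreSlim(maxConcurrency);

        var tasks = urls.Distinct().Select(async url =>
        {
            await gate.WaitAsync();          // wait for a free slot
            try
            {
                results[url] = await fetch(url);
            }
            finally
            {
                gate.Release();              // free the slot even if fetch throws
            }
        }).ToList();

        await Task.WhenAll(tasks);
        return results;
    }
}
```

Passing the fetch step as a delegate keeps the sketch independent of any particular HTTP or browser library; in a real scraper it might wrap an `HttpClient` call or a headless browser page load.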

Prerequisites

  • .NET 8.0 Desktop Runtime, or the .NET 8.0 SDK.
  • Visual Studio 2022 (only needed if you want to build it yourself).

Installation

  1. Clone the repository:

    git clone https://github.com/your-username/netDigger.git
    cd netDigger
  2. Open the solution file (netDigger.sln) in Visual Studio.

  3. Build the project: select Build > Build Solution.

  4. Run the application: select Debug > Start Debugging or press F5.

Contribution

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature-branch).
  3. Commit your changes (git commit -m 'Add new feature').
  4. Push to the branch (git push origin feature-branch).
  5. Create a new Pull Request.

License

This project is licensed under the GNU Affero General Public License v3.0. See the LICENSE file for more details.

Support Me

If you find netDigger useful, consider supporting me by:

  • Starring the repository on GitHub
  • Sharing the tool with others
  • Providing feedback and suggestions
  • Following me for more :)


For any issues or feature requests, please open an issue on GitHub. Happy coding!
