netDigger is a web scraping application built with .NET 8.0 and C# WPF in Visual Studio. It collects various types of data and exports it to organized, structured databases. The application features a modern UI and detailed logging.
- Asynchronous Web Scraping: Scrapes web pages with asynchronous tasks to reduce latency, and uses multi-threading to process pages in parallel.
- Data Collection: Collects data such as PDFs, CSVs, DOCX, XLS, PPTX, TXT, Images, Videos, JSON, DBSQL, XML, HTML, PHP, JS, Archives, and Miscellaneous files.
- Comprehensive JSON, XML, and HTML Parsing: Extracts information from JSON, XML, and HTML documents, including hidden element data and metadata.
- Database Integration: Organizes scraped URLs into SQLite databases based on their file types.
- Modern UI Design: User-friendly WPF interface with rich text logging.
- Detailed Logging: Comprehensive log messages with timestamps, log levels, and thread IDs.
- Export Options: Export scraped data to database files, CSV, and TXT formats.
- Multi-OS Support: Compatible with Windows x64/x86/ARM, Linux, and macOS.
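
To illustrate how scraped URLs might be grouped by file type before being written to per-type databases, here is a minimal sketch. It is not taken from the netDigger source: the `FileTypeClassifier` class, the `Classify` method, and the extension-to-category table are all assumptions made for illustration, and the real grouping may differ.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the netDigger codebase.
public static class FileTypeClassifier
{
    // Assumed mapping from file extension to category name.
    // Lookups ignore case, so ".PDF" and ".pdf" land in the same category.
    private static readonly Dictionary<string, string> Categories =
        new(StringComparer.OrdinalIgnoreCase)
        {
            [".pdf"] = "PDF",
            [".csv"] = "CSV",
            [".docx"] = "DOCX",
            [".xls"] = "XLS",
            [".pptx"] = "PPTX",
            [".txt"] = "TXT",
            [".json"] = "JSON",
            [".xml"] = "XML",
            [".html"] = "HTML",
            [".png"] = "Images",
            [".jpg"] = "Images",
            [".mp4"] = "Videos",
            [".zip"] = "Archives",
        };

    // Returns the category for an absolute URL, or "Miscellaneous"
    // when the URL is invalid or its extension is not in the table.
    public static string Classify(string url)
    {
        if (!Uri.TryCreate(url, UriKind.Absolute, out var uri))
            return "Miscellaneous";

        // AbsolutePath excludes the query string and fragment,
        // so "report.pdf?dl=1" still yields ".pdf".
        var extension = Path.GetExtension(uri.AbsolutePath);

        return Categories.TryGetValue(extension, out var category)
            ? category
            : "Miscellaneous";
    }
}
```

With this sketch, `FileTypeClassifier.Classify("https://example.com/files/report.PDF?dl=1")` would fall into the `PDF` group, and a URL with no recognizable extension would fall into `Miscellaneous`.
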
- .NET 8.0
- C#
- WPF (Windows Presentation Foundation)
- AngleSharp for HTML parsing
- PuppeteerSharp: A headless browser automation library for .NET.
- Newtonsoft.Json (Json.NET): A popular library for working with JSON in .NET.
- SQLite for database management
- Concurrent Collections for thread-safe operations
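
As a rough sketch of how asynchronous tasks and concurrent collections can work together in a scraper, the example below fetches several URLs in parallel while capping the number of requests in flight. It is an illustration under stated assumptions, not the project's actual code: `BoundedFetcher`, `FetchAllAsync`, and the `maxParallel` parameter are hypothetical names, and the download itself is passed in as a delegate so the sketch stays independent of any particular HTTP or browser library.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the netDigger codebase.
public static class BoundedFetcher
{
    // Runs `fetch` for every URL, allowing at most `maxParallel`
    // fetches at a time, and collects the results by URL.
    public static async Task<ConcurrentDictionary<string, string>> FetchAllAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> fetch,
        int maxParallel = 4)
    {
        // Thread-safe store: tasks may finish on different threads.
        var results = new ConcurrentDictionary<string, string>();

        // The semaphore acts as a gate that limits concurrency.
        using var gate = new SemaphoreSlim(maxParallel);

        var tasks = urls.Select(async url =>
        {
            await gate.WaitAsync();
            try
            {
                results[url] = await fetch(url);
            }
            finally
            {
                // Always free the slot, even if the fetch throws.
                gate.Release();
            }
        }).ToList(); // ToList starts every task immediately.

        await Task.WhenAll(tasks);
        return results;
    }
}
```

A caller could pass `url => httpClient.GetStringAsync(url)` as the `fetch` delegate, where `httpClient` is a shared `System.Net.Http.HttpClient` instance. Taking the delegate as a parameter also makes the helper easy to exercise without network access.
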
- .NET 8.0 Desktop Runtime or .NET 8.0 SDK.
- Visual Studio 2022 (only needed if you want to build the application yourself).
- Clone the repository:
      git clone https://github.com/your-username/netDigger.git
      cd netDigger
- Open the solution file (netDigger.sln) in Visual Studio.
- Build the project: select Build > Build Solution.
- Run the application: select Debug > Start Debugging or press F5.
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Commit your changes (git commit -m 'Add new feature').
- Push to the branch (git push origin feature-branch).
- Create a new Pull Request.
This project is licensed under the GNU Affero General Public License v3.0. See the LICENSE file for more details.
If you find netDigger useful, consider supporting me by:
- Starring the repository on GitHub
- Sharing the tool with others
- Providing feedback and suggestions
- Following me for more :)
For any issues or feature requests, please open an issue on GitHub. Happy coding!