From d8a8af8d05ee0b040e5e047b2c32d2c84449fbd8 Mon Sep 17 00:00:00 2001
From: Karim shoair
Date: Sat, 16 Nov 2024 21:29:05 +0200
Subject: [PATCH] Updating the README to reflect new changes

---
 README.md | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 29e6c6b..34b4068 100644
--- a/README.md
+++ b/README.md
@@ -6,9 +6,9 @@ Dealing with failing web scrapers due to anti-bot protections or website changes
 Scrapling is a high-performance, intelligent web scraping library for Python that automatically adapts to website changes while significantly outperforming popular alternatives. For both beginners and experts, Scrapling provides powerful features while maintaining simplicity.
 
 ```python
->> from scrapling import Fetcher, StealthyFetcher, PlayWrightFetcher
+>> from scrapling.default import Fetcher, StealthyFetcher, PlayWrightFetcher
 # Fetch websites' source under the radar!
->> page = StealthyFetcher().fetch('https://example.com', headless=True, network_idle=True)
+>> page = StealthyFetcher.fetch('https://example.com', headless=True, network_idle=True)
 >> print(page.status)
 200
 >> products = page.css('.product', auto_save=True) # Scrape data that survives website design changes!
@@ -211,12 +211,21 @@ python -m browserforge update
 ```
 
 ## Fetching Websites Features
-All fetcher-type classes are imported in the same way
+You might be a little bit confused by now so let me clear things up. All fetcher-type classes are imported in the same way
 ```python
 from scrapling import Fetcher, StealthyFetcher, PlayWrightFetcher
 ```
 And all of them can take these initialization arguments: `auto_match`, `huge_tree`, `keep_comments`, `storage`, `storage_args`, and `debug` which are the same ones you give to the `Adaptor` class.
 
+If you don't want to pass arguments to the generated `Adaptor` object and want to use the default values, you can use import instead for cleaner code:
+```python
+from scrapling.default import Fetcher, StealthyFetcher, PlayWrightFetcher
+```
+then use it right away without initializing like:
+```python
+page = StealthyFetcher.fetch('https://example.com')
+```
+
 Also, the `Response` object returned from all fetchers is the same as `Adaptor` object except it has these added attributes: `status`, `reason`, `cookies`, `headers`, and `request_headers`. All `cookies`, `headers`, and `request_headers` are always of type `dictionary`.
 > [!NOTE]
 > The `auto_match` argument is enabled by default which is the one you should care about the most as you will see later.
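
Note on the two call styles in this patch: `StealthyFetcher().fetch(...)` initializes a fetcher first, while the `scrapling.default` import allows `StealthyFetcher.fetch(...)` with no initialization. The sketch below shows one plausible way such a "defaults" namespace could work, using only the standard library. `DemoFetcher` and `defaults` are hypothetical names invented for illustration; this is an assumption about the pattern, not a description of how Scrapling implements it.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class DemoFetcher:
    """Hypothetical stand-in for a fetcher class (not Scrapling's real code)."""
    auto_match: bool = True
    debug: bool = False

    def fetch(self, url: str) -> dict:
        # A real fetcher would perform a request; this only echoes its settings.
        return {"url": url, "auto_match": self.auto_match, "debug": self.debug}


# Hypothetical "defaults" namespace: it exposes a ready-made instance under the
# class's own name, so callers can write `DemoFetcher.fetch(...)` without
# parentheses after the name.
defaults = SimpleNamespace(DemoFetcher=DemoFetcher())

# Style 1: initialize explicitly when the default settings need overriding.
custom_page = DemoFetcher(debug=True).fetch("https://example.com")

# Style 2: use the pre-built instance when the defaults are good enough.
default_page = defaults.DemoFetcher.fetch("https://example.com")
```

Under this assumption, the first style is the one to reach for when any initialization argument has to change, and the second is a shorthand for the all-defaults case.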