Skip to content

Commit

Permalink
Updating the README to reflect new changes
Browse files Browse the repository at this point in the history
  • Loading branch information
D4Vinci committed Nov 16, 2024
1 parent b60ac7f commit d8a8af8
Showing 1 changed file with 12 additions and 3 deletions.
15 changes: 12 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,9 @@ Dealing with failing web scrapers due to anti-bot protections or website changes
Scrapling is a high-performance, intelligent web scraping library for Python that automatically adapts to website changes while significantly outperforming popular alternatives. For both beginners and experts, Scrapling provides powerful features while maintaining simplicity.

```python
>> from scrapling import Fetcher, StealthyFetcher, PlayWrightFetcher
>> from scrapling.default import Fetcher, StealthyFetcher, PlayWrightFetcher
# Fetch websites' source under the radar!
>> page = StealthyFetcher().fetch('https://example.com', headless=True, network_idle=True)
>> page = StealthyFetcher.fetch('https://example.com', headless=True, network_idle=True)
>> print(page.status)
200
>> products = page.css('.product', auto_save=True) # Scrape data that survives website design changes!
Expand Down Expand Up @@ -211,12 +211,21 @@ python -m browserforge update
```
## Fetching Websites Features
All fetcher-type classes are imported in the same way
You might be a little bit confused by now so let me clear things up. All fetcher-type classes are imported in the same way
```python
from scrapling import Fetcher, StealthyFetcher, PlayWrightFetcher
```
And all of them can take these initialization arguments: `auto_match`, `huge_tree`, `keep_comments`, `storage`, `storage_args`, and `debug` which are the same ones you give to the `Adaptor` class.
If you don't want to pass arguments to the generated `Adaptor` object and want to use the default values, you can use import instead for cleaner code:
```python
from scrapling.default import Fetcher, StealthyFetcher, PlayWrightFetcher
```
then use it right away without initializing like:
```python
page = StealthyFetcher.fetch('https://example.com')
```

Also, the `Response` object returned from all fetchers is the same as `Adaptor` object except it has these added attributes: `status`, `reason`, `cookies`, `headers`, and `request_headers`. All `cookies`, `headers`, and `request_headers` are always of type `dictionary`.
> [!NOTE]
> The `auto_match` argument is enabled by default which is the one you should care about the most as you will see later.
Expand Down

0 comments on commit d8a8af8

Please sign in to comment.