Commit

Merge branch 'master' into update-actors-in-store
TC-MO committed Mar 6, 2024
2 parents 5e65aba + 74bee05 commit 3bb89c0
Showing 2 changed files with 8 additions and 0 deletions.
@@ -83,4 +83,6 @@ Each of the items in the dataset will represent a scraped offer, and will have t

From this course, you should have all the knowledge to build this scraper by yourself. Give it a try, then come back to compare your scraper with our solution.

The challenge can be completed using either [CheerioCrawler](https://crawlee.dev/api/cheerio-crawler/class/CheerioCrawler) or [PlaywrightCrawler](https://crawlee.dev/api/playwright-crawler/class/PlaywrightCrawler). Playwright is significantly slower but doesn't get blocked as much. You will learn the most by implementing both.

Let's start off this section easy by [initializing and setting up](./initializing_and_setting_up.md) our project with the Crawlee CLI (don't worry, no additional install is required).
@@ -33,6 +33,12 @@ const { keyword } = await KeyValueStore.getInput();

const crawler = new CheerioCrawler({
    requestHandler: router,

    // If you have access to Apify Proxy, you can use residential proxies and
    // high retry count which helps with blocking
    // If you don't, your local IP address will likely be fine for a few requests if you scrape slowly.
    // proxyConfiguration: await Actor.createProxyConfiguration({ groups: ['RESIDENTIAL'] }),
    // maxRequestRetries: 10,
});

log.info('Starting the crawl.');
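
The commented-out options above are meant to be enabled only when a proxy is available. One way to make that switch explicit is sketched below. The helper name `buildCrawlerOptions` is hypothetical and not part of this commit. The retry count of 10 is taken from the commit's comment, and without a proxy the helper leaves the crawler's default retry count untouched.

```javascript
// Hypothetical helper (an assumption, not part of the commit): builds the
// options object that would be passed to `new CheerioCrawler(...)`.
function buildCrawlerOptions({ requestHandler, proxyConfiguration } = {}) {
    const options = { requestHandler };

    if (proxyConfiguration) {
        // A proxy is available: route requests through it and allow more
        // retries, as the commit's comment suggests.
        options.proxyConfiguration = proxyConfiguration;
        options.maxRequestRetries = 10;
    }

    // No proxy: only the request handler is set, so the crawler falls back
    // to its defaults and the local IP address.
    return options;
}

// Usage sketch, assuming `router`, `Actor`, and `CheerioCrawler` are in scope
// as in the file above, and that the account has access to Apify Proxy:
//
// const proxyConfiguration = await Actor.createProxyConfiguration({ groups: ['RESIDENTIAL'] });
// const crawler = new CheerioCrawler(
//     buildCrawlerOptions({ requestHandler: router, proxyConfiguration }),
// );
```

Keeping the two proxy-related options together avoids raising the retry count while still scraping from a single local IP address, where repeated retries are less likely to get past blocking.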