fix: Update Firefox User-Agent string #4546

Merged
merged 2 commits into mealie-recipes:mealie-next
Nov 17, 2024

Conversation

ryanwohara
Contributor

@ryanwohara ryanwohara commented Nov 12, 2024

Without this update, foodnetwork.com returns a 403 to our requests.

Full details are available in the issue:
#4024

What type of PR is this?

  • bug

What this PR does / why we need it:

To pull recipes from foodnetwork

Which issue(s) this PR fixes:

Fixes #4024

Testing

Import any recipe from foodnetwork.com. Presently the import fails, but with this PR it succeeds.

>>> import requests
>>> requests.get("https://www.foodnetwork.com/recipes/ina-garten/garlic-roasted-potatoes-recipe-1913067", headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"})
<Response [200]>
>>> requests.get("https://www.foodnetwork.com/recipes/ina-garten/garlic-roasted-potatoes-recipe-1913067", headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/123.0"})
<Response [403]>
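The manual check above can be repeated for several User-Agent strings at once. The following is only a sketch: `status_for` and `compare_user_agents` are hypothetical helper names, not part of Mealie, and it uses the standard library's `urllib` instead of `requests` so that it runs without extra dependencies.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def status_for(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET sent with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code is still the result we want.
        return error.code


def compare_user_agents(
    url: str,
    user_agents: Iterable[str],
    fetch: Callable[[str, str], int] = status_for,
) -> Dict[str, int]:
    """Map each User-Agent string to the status code returned for the URL."""
    return {user_agent: fetch(url, user_agent) for user_agent in user_agents}
```

Calling `compare_user_agents(url, [old_ua, new_ua])` with the two strings from the session above would show whether the site still treats them differently. The `fetch` parameter exists only so the comparison can be exercised without network access.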

Collaborator

@Kuchenpirat Kuchenpirat left a comment


Hey, thanks for your PR.

After running some automated tests on additional recipe pages, it appears that the updated User-Agent works better on some pages. Importantly, I did not find a single page where the previous User-Agent worked but the updated one did not, so I will approve the changes. Note, however, that this string is only used as a fallback if importing the user agent provided by the scraper library fails. Usually, Mealie scrapes using the user agent below, where the version is the recipe_scrapers version.

"User-Agent": f"Mozilla/5.0 (compatible; Windows NT 10.0; Win64; x64; rv:{__version__}) recipe-scrapers/{__version__}"
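The fallback behaviour described here can be sketched roughly as follows. This is an illustration under assumptions, not Mealie's actual code: `resolve_user_agent` and `FALLBACK_USER_AGENT` are hypothetical names, and the version is looked up through `importlib.metadata` rather than by importing anything from the scraper library itself.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: the fallback is the Firefox string from the PR's test session.
FALLBACK_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"
)


def resolve_user_agent(
    package: str = "recipe-scrapers",
    fallback: str = FALLBACK_USER_AGENT,
) -> str:
    """Build the library-style User-Agent, or return the fallback if lookup fails."""
    try:
        pkg_version = version(package)
    except PackageNotFoundError:
        # The library (or its metadata) is unavailable, so use the static string.
        return fallback
    return (
        f"Mozilla/5.0 (compatible; Windows NT 10.0; Win64; x64; rv:{pkg_version}) "
        f"recipe-scrapers/{pkg_version}"
    )
```

With this shape, the Firefox string only matters on installations where the library lookup fails, which matches the note that the change affects a fallback path.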

Also, it seems that scraping from foodnetwork.com is working again with the current setup, which you can verify on the demo.

@Kuchenpirat Kuchenpirat merged commit db47890 into mealie-recipes:mealie-next Nov 17, 2024
14 checks passed
Development

Successfully merging this pull request may close these issues.

[SCRAPER] - foodnetwork.com returns 403 due to user-agent string
2 participants