Scrapy HTTP authentication credentials potentially leaked to target websites

Moderate severity GitHub Reviewed Published Oct 6, 2021 in scrapy/scrapy • Updated Nov 15, 2023

Package

Scrapy (pip)

Affected versions

< 1.8.1
>= 2.0.0, < 2.5.1

Patched versions

1.8.1
2.5.1

Description

Impact

If you use HttpAuthMiddleware (i.e. the http_user and http_pass spider attributes) for HTTP authentication, every request exposes your credentials to its target, whatever domain that target is on.

This includes requests generated by Scrapy components, such as the robots.txt requests that Scrapy sends when the ROBOTSTXT_OBEY setting is True, as well as requests reached through redirects.
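The exposure described above can be illustrated with a small standard-library simulation. It is only a sketch of the behavior, not Scrapy's code: the function `unpatched_headers`, the `.example` hosts, and the placeholder credential string are all hypothetical.

```python
from urllib.parse import urlsplit


def unpatched_headers(url, auth_value):
    """Mimic the pre-patch behavior: the target URL is never inspected."""
    return {"Authorization": auth_value}


# Hypothetical requests that a single crawl might issue.
crawl_requests = [
    "https://intranet.example/reports",     # the page the spider targets
    "https://intranet.example/robots.txt",  # generated when ROBOTSTXT_OBEY is True
    "https://tracker.example/landing",      # reached through a redirect
]

# Collect every host that would receive the Authorization header.
leaked_to = sorted(
    {
        urlsplit(url).hostname
        for url in crawl_requests
        if "Authorization" in unpatched_headers(url, "Basic <credentials>")
    }
)
print(leaked_to)  # ['intranet.example', 'tracker.example']
```

In this simulation the third-party host reached through the redirect receives the same header as the intended site.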

Patches

Upgrade to Scrapy 2.5.1 and use the new http_auth_domain spider attribute to control which domains are allowed to receive the configured HTTP authentication credentials.

If you are using Scrapy 1.8 or lower and upgrading to Scrapy 2.5.1 is not an option, you may upgrade to Scrapy 1.8.1 instead.
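As a rough illustration of what restricting credentials to a domain means, the following standard-library sketch gates the Authorization header on the request host. The helper names `host_matches` and `gated_headers` are hypothetical, and the matching rule (the domain itself plus its subdomains) is an assumption about how such a check could work; consult the Scrapy documentation for the exact semantics of http_auth_domain.

```python
from urllib.parse import urlsplit


def host_matches(url, domain):
    """Return True if the URL's host is `domain` or one of its subdomains.

    Assumption: subdomain-inclusive matching, chosen for this sketch.
    """
    host = (urlsplit(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def gated_headers(url, auth_value, http_auth_domain):
    """Attach credentials only when the target host is allowed."""
    if host_matches(url, http_auth_domain):
        return {"Authorization": auth_value}
    return {}


allowed = gated_headers(
    "https://intranet.example/robots.txt", "Basic <credentials>", "intranet.example"
)
blocked = gated_headers(
    "https://tracker.example/landing", "Basic <credentials>", "intranet.example"
)
print(allowed)  # {'Authorization': 'Basic <credentials>'}
print(blocked)  # {}
```

In a real spider the corresponding configuration would be the http_auth_domain attribute set alongside http_user and http_pass, for example `http_auth_domain = "intranet.example"` (a placeholder host).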

Workarounds

If you cannot upgrade, set your HTTP authentication credentials on a per-request basis instead of defining them globally through HttpAuthMiddleware. For example, use the w3lib.http.basic_auth_header function to convert your credentials into a value that you can assign to the Authorization header of each request.
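A minimal sketch of the per-request approach follows. To stay self-contained it builds the header value with the standard library rather than w3lib; `basic_auth_value` is a hypothetical stand-in for w3lib.http.basic_auth_header, not its implementation, and `request_kwargs`, the URL, and the credentials are placeholders.

```python
import base64


def basic_auth_value(username, password, encoding="utf-8"):
    """Build an HTTP Basic Authorization value (stand-in for the w3lib helper)."""
    token = base64.b64encode(f"{username}:{password}".encode(encoding))
    return "Basic " + token.decode("ascii")


def request_kwargs(url, username, password):
    """Keyword arguments for one request that carries its own credentials."""
    return {
        "url": url,
        "headers": {"Authorization": basic_auth_value(username, password)},
    }


kwargs = request_kwargs("https://intranet.example/reports", "user", "pass")
print(kwargs["headers"]["Authorization"])  # Basic dXNlcjpwYXNz
```

Inside a spider, a mapping like this could be passed to `scrapy.Request(**kwargs)` only for requests aimed at the site that should receive the credentials, with http_user and http_pass left unset so the middleware adds nothing on its own.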

For more information

If you have any questions or comments about this advisory:

References

@Gallaecio published to scrapy/scrapy Oct 6, 2021
Reviewed Oct 6, 2021
Published to the GitHub Advisory Database Oct 6, 2021
Published by the National Vulnerability Database Oct 6, 2021
Last updated Nov 15, 2023

Severity

Moderate

CVSS overall score

This score rates overall vulnerability severity from 0 to 10 and is based on the Common Vulnerability Scoring System (CVSS).

CVSS v3 base metrics

Attack vector: Network
Attack complexity: Low
Privileges required: Low
User interaction: Required
Scope: Unchanged
Confidentiality: High
Integrity: None
Availability: None

Attack vector: More severe the more remote (logically and physically) an attacker can be in order to exploit the vulnerability.
Attack complexity: More severe for the least complex attacks.
Privileges required: More severe if no privileges are required.
User interaction: More severe when no user interaction is required.
Scope: More severe when a scope change occurs, e.g. one vulnerable component impacts resources in components beyond its security scope.
Confidentiality: More severe when loss of data confidentiality is highest, measuring the level of data access available to an unauthorized user.
Integrity: More severe when loss of data integrity is the highest, measuring the consequence of data modification possible by an unauthorized user.
Availability: More severe when the loss of impacted component availability is highest.
CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:U/C:H/I:N/A:N
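The vector string above can be turned into a numeric base score with the equations and metric weights published in the CVSS v3.1 specification. The sketch below is a simplified calculator that handles only Scope: Unchanged vectors such as this one; the names `WEIGHTS`, `roundup`, and `base_score` are chosen for illustration.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score of a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only covers Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:U/C:H/I:N/A:N")
print(score)  # 5.7
```

A result of 5.7 falls in the 4.0 to 6.9 band, which is consistent with the Moderate severity shown for this advisory.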

EPSS score

0.366%
(73rd percentile)

CVE ID

CVE-2021-41125

GHSA ID

GHSA-jwqp-28gf-p498

Source code
