Get_filings method fails with 403 Forbidden error when fetching N-PX filings. #208
Comments
Hey, |
Jacob, is the problem still occurring? The initial timeout followed by the HTTP 403 error looks like a transient backend server failure on SEC EDGAR. I don't think it has to do with identity, which you say is set properly.
I am no longer getting 403 errors, but I'm now encountering different HTTP errors related to the
The error trace suggests it is happening in the network layer. I believe you may be right about it being a server failure, as it appears to be a network-level timeout rather than an API rejection. Is this a known issue with larger downloads? I successfully downloaded all N-PX filings from 2023, but I'm only able to download 2-3GB before the connection times out. For context, the 2023 filings were approximately 13GB of data.
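For timeouts like the ones described above, one common mitigation is to retry the failing call with exponential backoff. The following is only a minimal sketch under stated assumptions: `retry_with_backoff` and `flaky_download` are hypothetical names, not part of the library, and the built-in `TimeoutError` stands in for whatever timeout exception the HTTP client actually raises.

```python
import time

def retry_with_backoff(func, retries=4, base_delay=1.0,
                       retry_on=(TimeoutError,), sleep=time.sleep):
    """Call func(); on a retryable error wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)

# Demo with a stand-in for a flaky download (hypothetical, no network involved)
calls = {"n": 0}

def flaky_download():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated read timeout")
    return "ok"

delays = []  # record the waits instead of actually sleeping
result = retry_with_backoff(flaky_download, sleep=delays.append)
```

In real use, `retry_on` would be set to the client's own timeout exception class and `sleep` left at its default.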
Due to the sheer number of HTTP requests, you will get ReadTimeouts. For your use case you might be better off using LocalStorage and downloading all filings. It takes about a minute to download all filings for a given day, and it's 2 HTTP requests per day:

# Download for a single day
download_filings("2025-01-24")
# Download for a range
download_filings("2024-01-01:2024-12-31")

The drawback is that it downloads all forms, so it requires a lot of storage. It occurred to me today that we could download the bulk files then use the
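If a full-year range is too large for the available storage, the range could be split into smaller pieces. This is a rough sketch, assuming the `start:end` date-range string format shown in the comment above; `monthly_ranges` is a hypothetical helper, not a library function.

```python
import calendar
from datetime import date

def monthly_ranges(year):
    """Return one 'YYYY-MM-DD:YYYY-MM-DD' string per calendar month of `year`."""
    ranges = []
    for month in range(1, 13):
        # monthrange returns (weekday of the 1st, number of days in the month)
        last_day = calendar.monthrange(year, month)[1]
        start = date(year, month, 1)
        end = date(year, month, last_day)
        ranges.append(f"{start.isoformat()}:{end.isoformat()}")
    return ranges

ranges = monthly_ranges(2024)
# Each string could then be passed to the bulk download call one at a time,
# e.g. download_filings(ranges[0]), processing each month before the next.
```

Working one month at a time keeps the peak disk usage lower, at the cost of more bookkeeping between chunks.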
Thanks for the suggestion. However, this approach might be too resource-intensive for my hardware. I noticed there's a filing_date parameter in get_filings(), but I'm having trouble implementing it. I tried to download specific filings day by day using:

filing = get_filings(filing_date="2024-01-01", form="N-PX")
filing.attachments.download(path)

But I get an error. I'm still learning the library and its classes. I'm beginning to understand some components, but the library is quite comprehensive. Appreciate the help!
The correct code is something like this:

from pathlib import Path

filings = get_filings(filing_date="2024-01-01", form="N-PX")
if filings is not None:
    for filing in filings:
        # One subdirectory per filing, named by its accession number
        path = Path("base_path") / filing.accession_number
        filing.attachments.download(path)

There are no filings on 2024-01-01 (New Year's Day), so that's why None was returned.
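A day-by-day loop can apply the same None check across a date range. Below is a sketch only: `weekdays`, `collect`, and `fake_fetch` are hypothetical names, and `fake_fetch` is a stand-in for the real lookup so the example runs without network access. Weekends are skipped up front, while holidays are handled by the None check.

```python
from datetime import date, timedelta

def weekdays(start, end):
    """Yield ISO date strings for Monday-Friday between start and end, inclusive."""
    day = start
    while day <= end:
        if day.weekday() < 5:  # 0-4 are Monday-Friday
            yield day.isoformat()
        day += timedelta(days=1)

def collect(days, fetch):
    """Call fetch(day) for each day, skipping days where it returns None."""
    found = {}
    for day in days:
        result = fetch(day)
        if result is not None:
            found[day] = result
    return found

# Stand-in for the real lookup: returns None on the New Year's Day holiday
def fake_fetch(day):
    return None if day == "2024-01-01" else [f"filing-{day}"]

days = list(weekdays(date(2024, 1, 1), date(2024, 1, 7)))
found = collect(days, fake_fetch)
```

In practice `fetch` would wrap the actual per-day query, and the download step would run on each non-empty result.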
Description
When attempting to download N-PX filings for 2023, the script encounters two sequential failures:
httpx.ConnectionTimeout exception
Reproduction Steps:
I ran the following python script:
Error Details
httpx.HTTPStatusError: Client error '403 Forbidden' for url 'https://www.sec.gov/Archives/edgar/full-index/2023/QTR1/form.gz.'
Environment
Context and Questions
The set_identity() method appears to be properly configured.