Troubleshooting

Andromeda Yelton edited this page Jun 13, 2019 · 8 revisions

Initial Steps

When fetching pages on the site, please keep in mind that these requests must:

  • follow the rate limits for the type of page you're requesting, or else be authenticated (via API authentication for scripted requests, or logging in to the site for manual requests)
  • not come from an IP that's been blocked
  • use a custom user agent if you're fetching pages through a script; stock user agents such as curl, libcurl, and scrapertron5000 are blocked by default
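
As a minimal sketch of the last point, a Python script can set its own user agent using only the standard library. The URL, the user-agent string, and the contact address below are placeholders, not values defined by Lumen. API authentication is left out because this page does not describe how the token is sent.

```python
import urllib.request

# Hypothetical identifier for your script. Pick one that names your project
# and gives a way to contact you.
USER_AGENT = "example-research-bot/0.1 (contact: you@example.org)"


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


if __name__ == "__main__":
    # Placeholder URL; substitute the page you actually need.
    req = build_request("https://example.org/some/page")
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

Passing the resulting object to `urllib.request.urlopen` sends the request with that header instead of the library default.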

Error Pages

301 / 300 Page Redirect: This can happen if you're requesting a page from the old ChillingEffects site, or a page whose URL has otherwise changed.

404 Page Not Found: You appear to be requesting a page that doesn't exist.

429 Too Many Requests, Too Quickly: If you exceed the rate limit or do not use a custom user agent, you will receive a 429 error in response to your request.

500 Error: You may have been automatically banned by the system for excessive scraping activity or unauthorized vulnerability testing, or else you've found a bug in Lumen. Contact us to resolve this issue.

503 Service Unavailable: We are currently working on the site and it will be back online shortly. Try again in a few minutes. Thanks for your patience.
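
For scripted clients, the error pages above can be turned into a small lookup that tells the script what to do next. This is only a sketch: the function name `advise` and the wording of each suggestion are illustrative and simply restate the descriptions on this page.

```python
# Suggested next step for each status code described above.
ADVICE = {
    300: "follow the redirect; the page's URL has changed",
    301: "follow the redirect; the page's URL has changed",
    404: "check the URL; the page does not exist",
    429: "slow down and make sure a custom user agent is set",
    500: "possible ban or bug; contact the Lumen team",
    503: "site maintenance; try again in a few minutes",
}


def advise(status: int) -> str:
    """Return the suggested next step for an HTTP status code."""
    return ADVICE.get(status, "not covered on this page")


if __name__ == "__main__":
    for code in (301, 429, 503, 418):
        print(code, "->", advise(code))
```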

Rate Limits

To ensure that all visitors have equal access to the site, we've instituted the following rate limits on the Lumen Database:

  • For JSON files, up to 5 requests per day
  • For notice pages, up to 10 requests per minute, or 30 per hour
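
A script can stay under the notice-page limits by tracking its own request times. The sketch below assumes that both limits apply at once (no more than 10 requests in any minute and no more than 30 in any hour) and that a sliding window is a safe approximation; how the server actually counts requests is not specified here. The class and function names are hypothetical.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` events in any span of `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.stamps = deque()  # times of recent events, oldest first

    def wait_time(self, now: float) -> float:
        """Seconds to wait at time `now` before another event is allowed."""
        # Forget events that have left the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            return 0.0
        # The oldest remembered event must age out first.
        return self.stamps[0] + self.window - now

    def record(self, now: float) -> None:
        """Note that an event happened at time `now`."""
        self.stamps.append(now)


def next_allowed_delay(limiters, now: float) -> float:
    """Delay that satisfies every limiter at once."""
    return max(limiter.wait_time(now) for limiter in limiters)


# Limits for notice pages, taken from the list above.
notice_limits = [SlidingWindowLimiter(10, 60), SlidingWindowLimiter(30, 3600)]
```

Before each request, call `next_allowed_delay(notice_limits, time.monotonic())`, sleep for the returned number of seconds, then call `record` on each limiter with the current time.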

I have an access token but I got rate limited anyway.

If you're making requests through the API:

If you're making requests manually on the web site:

  • make sure that you have a researcher login, not just an API token
  • make sure that you are logged in with that account.

If you have an API token but do not have a researcher login, you will need to make your requests through the API, not the web site.

I exceeded the rate limit. Now what?

Wait a day (for JSON) or an hour (for web requests) and try again.

Example: Barb accidentally requested 10 JSON pages at once. The first five will succeed and the other five will fail with a 429 error. She can retry tomorrow.
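
One way to avoid Barb's situation is to split the work into daily batches before sending anything. The sketch below uses the limit of 5 JSON requests per day from the list above; the function name and the example URLs are hypothetical.

```python
def daily_batches(urls, per_day=5):
    """Split `urls` into consecutive batches of at most `per_day` items."""
    return [urls[i:i + per_day] for i in range(0, len(urls), per_day)]


if __name__ == "__main__":
    # Ten placeholder URLs, like Barb's ten JSON pages.
    wanted = [f"https://example.org/page/{n}.json" for n in range(10)]
    for day, batch in enumerate(daily_batches(wanted), start=1):
        print(f"day {day}: {len(batch)} requests")
```

With ten URLs this yields two batches of five, one for today and one for tomorrow.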

No, I really exceeded the rate limit...

We automatically ban IP addresses that far exceed the average daily requests or that conduct unauthorized vulnerability testing. These automatic bans may last longer than the rate limit periods above. Email us to resolve the ban: [email protected]

How many pages can I fetch?

All of them, but only within the rate limits defined above.

But I need millions of notices for my research project!

At a rate of up to 86,400 JSON requests per day (one per second), it will take approximately 12 days to fetch one million results.
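
The estimate can be checked, or redone for a different project size, with a few lines of arithmetic. The function name is illustrative; the default rate is the figure quoted above.

```python
import math


def days_to_fetch(total_requests: int, per_day: int = 86_400) -> int:
    """Whole days needed to make `total_requests` at `per_day` requests a day."""
    return math.ceil(total_requests / per_day)


if __name__ == "__main__":
    # 1,000,000 / 86,400 is about 11.6, which rounds up to 12 days.
    print(days_to_fetch(1_000_000))
```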

Without an API key and advance notice of your project needs, your access may be further limited. If planning a large scraping project, please contact us as far in advance as possible so that we can discuss how best to meet your research needs.

However, if you require millions of notices, we highly recommend narrowing the scope of your Lumen requests.

Can't you just give me a database dump or something?

No. For privacy reasons, we cannot release unredacted copies of the database. For practical reasons, we cannot send terabytes of data to researchers upon request. We also do not currently offer torrents or other downloads of redacted data aside from the website.

I am facing an issue that isn't mentioned here

Please email us at [email protected], and be sure to include:

  • Your API key, if any
  • Basic info about your research project, if applicable
  • Your IP address
  • Your custom user-agent