kan_* scrapers not working #1316

Open
grossir opened this issue Feb 7, 2025 · 2 comments
Contributor

grossir commented Feb 7, 2025

The site URL has changed recently. New URL: https://searchdro.kscourts.gov/

This appears to be a Cloudflare block. We may need to request permission to continue scraping.

Sentry Issue: COURTLISTENER-93E

HTTPError: 403 Client Error: Forbidden for url: https://searchdro.kscourts.gov/
(1 additional frame(s) were not displayed)
...
  File "cl/scrapers/management/commands/cl_scrape_opinions.py", line 393, in handle
    self.parse_and_scrape_site(mod, options)
  File "cl/scrapers/management/commands/cl_scrape_opinions.py", line 356, in parse_and_scrape_site
    site = mod.Site(save_response_fn=save_response).parse()
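
For triage, a 403 like the one above can be checked against a few response headers that Cloudflare commonly sets. The following is only a rough heuristic sketch: the function name `looks_like_cloudflare_block` is hypothetical, and the headers actually returned by searchdro.kscourts.gov are not shown in this issue, so the sample header values are assumptions.

```python
from typing import Mapping


def looks_like_cloudflare_block(status_code: int, headers: Mapping[str, str]) -> bool:
    """Heuristically guess whether an error response came from Cloudflare.

    Hypothetical helper, not part of the scraper code. It only inspects the
    status code and headers that Cloudflare commonly sets.
    """
    # Blocks and challenges usually arrive as 403, 429, or 503.
    if status_code not in (403, 429, 503):
        return False

    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}

    # "Server: cloudflare" or a "CF-RAY" id means Cloudflare served the response.
    served_by_cloudflare = (
        "cloudflare" in lowered.get("server", "") or "cf-ray" in lowered
    )
    # "cf-mitigated: challenge" marks a challenge page.
    challenged = lowered.get("cf-mitigated") == "challenge"

    return served_by_cloudflare or challenged


# Assumed example headers, for illustration only.
sample_headers = {"Server": "cloudflare", "CF-RAY": "0000000000000000-XXX"}
print(looks_like_cloudflare_block(403, sample_headers))
```

A positive result only suggests that Cloudflare sits in front of the site and returned the error; it does not show whether the block targets our scraper specifically or automated clients in general.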
@flooie flooie moved this to March 10 to March 21 in Case Law Sprint Feb 10, 2025
@flooie flooie moved this from March 10 to March 21 to Feb 10 to Feb 21 in Case Law Sprint Feb 10, 2025
@flooie flooie assigned quevon24 and flooie and unassigned grossir and quevon24 Feb 10, 2025
Contributor

flooie commented Feb 10, 2025

I will reach out to the court here and try to get a conversation going.

@flooie flooie moved this from Feb 10 to Feb 21 to To Do in Case Law Sprint Feb 10, 2025
@flooie flooie moved this from To Do to In progress in Case Law Sprint Feb 18, 2025
Contributor

flooie commented Feb 18, 2025

I have reached out to the reporter of decisions and am waiting to hear back. Unfortunately, it doesn't look like we have been blocked specifically; rather, they appear to have blocked headless scrapers in general. Even Selenium appears to have trouble accessing the pages.

@flooie flooie moved this from In progress to Blocked in Case Law Sprint Feb 21, 2025