IndexError for every story from chyoa.com? #105

Open
radiologicalhazard opened this issue Nov 9, 2024 · 10 comments

Comments

@radiologicalhazard

This is the error I get anytime I attempt to download from chyoa:

Traceback (most recent call last):
  File "I:\Ebook-Publisher-3.3.3\Ebook-Publisher.py", line 339, in <module>
    clas=MakeClass(i)
  File "I:\Ebook-Publisher-3.3.3\Ebook-Publisher.py", line 202, in MakeClass
    site=sites[domain](url)
  File "I:\Ebook-Publisher-3.3.3\Ebook-Publisher.py", line 21, in <lambda>
    'chyoa.com':lambda x:Chyoa.Chyoa(x),
                         ~~~~~~~~~~~^^^
  File "I:\Ebook-Publisher-3.3.3\Site\Chyoa.py", line 239, in __init__
    self.AddNextPage(self.pageQueue.pop(0))
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\Ebook-Publisher-3.3.3\Site\Chyoa.py", line 482, in AddNextPage
    prevLinkId1=urllib.parse.urlparse(prevLinkCheck2)[2].split('.')[2]
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

Other websites seem to work (I've only tested Literotica so far), so is this an issue with chyoa itself? Every story I try does this.
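
For context on the failing line in the traceback: index `[2]` of the `urlparse` result is the URL path, and the code then expects that path to contain at least three dot-separated pieces. Here is a minimal Python sketch of why that raises `IndexError` and how a guard could avoid the crash. The function name `extract_link_id` and the `example.com` URLs are hypothetical and not taken from Ebook-Publisher, and the guard only prevents the crash; it does not address whatever made the link unexpected.

```python
from urllib.parse import urlparse


def extract_link_id(url):
    """Return the third dot-separated piece of the URL path, or None if it is missing."""
    # .path is the same value as index [2] of the urlparse() result.
    parts = urlparse(url).path.split('.')
    # Indexing parts[2] directly raises IndexError when the path has
    # fewer than three pieces, which is the failure in the traceback.
    return parts[2] if len(parts) > 2 else None


# Hypothetical URLs, used only to show both outcomes.
print(extract_link_id("https://example.com/a.b.c"))    # path splits into three pieces
print(extract_link_id("https://example.com/blocked"))  # path has no dots at all
```

With a check like this, a caller could report "unexpected link format" for the affected page instead of aborting the whole download.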

@adwag-hue

I'm experiencing the same thing.

@adwag-hue

Traceback (most recent call last):
  File "C:\Users\Hp\Downloads\Ebook-Publisher-master\Ebook-Publisher-master\Ebook-Publisher.py", line 339, in <module>
    clas=MakeClass(i)
  File "C:\Users\Hp\Downloads\Ebook-Publisher-master\Ebook-Publisher-master\Ebook-Publisher.py", line 202, in MakeClass
    site=sites[domain](url)
  File "C:\Users\Hp\Downloads\Ebook-Publisher-master\Ebook-Publisher-master\Ebook-Publisher.py", line 21, in <lambda>
    'chyoa.com':lambda x:Chyoa.Chyoa(x),
                         ~~~~~~~~~~~^^^
  File "C:\Users\Hp\Downloads\Ebook-Publisher-master\Ebook-Publisher-master\Site\Chyoa.py", line 239, in __init__
    self.AddNextPage(self.pageQueue.pop(0))
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Hp\Downloads\Ebook-Publisher-master\Ebook-Publisher-master\Site\Chyoa.py", line 482, in AddNextPage
    prevLinkId1=urllib.parse.urlparse(prevLinkCheck2)[2].split('.')[2]
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

@theslavicbear
Owner

I'm pretty sure chyoa updated their site to prevent scraping. Unfortunately, there's nothing I can really do to work around it for now.

@adwag-hue

Maybe you could take a look at https://github.com/Wasmae/CHYOADownloader

Their code seems to work fine. The only problem is that an HTML page isn't as good as an EPUB file.

@radiologicalhazard
Author

radiologicalhazard commented Dec 24, 2024

> Maybe you could take a look at https://github.com/Wasmae/CHYOADownloader
>
> Their code seems to work fine. The only problem is that an HTML page isn't as good as an EPUB file.

I tried this tool, but unfortunately it errored out on one story, and when I successfully downloaded another, the links to other chapters didn't work 😓

@strauss999

strauss999 commented Dec 24, 2024

>> Maybe you could take a look at https://github.com/Wasmae/CHYOADownloader
>> Their code seems to work fine. The only problem is that an HTML page isn't as good as an EPUB file.
>
> I tried this tool, but unfortunately it errored out on one story, and when I successfully downloaded another, the links to other chapters didn't work 😓

It's been years since I checked that out, but IIRC the links only worked in IE. Maybe they work in Edge's IE mode too.

As for the error, I think it couldn't handle stories with no cover, so you had to edit the relevant part to fix that, but again, it has been far too long for me to be sure.

EDIT: The most up-to-date version seems to be https://github.com/filen1/CHYOADownloader/tree/nessian-fix, which seems to work better all around.

@theslavicbear
Owner

>>> Maybe you could take a look at https://github.com/Wasmae/CHYOADownloader
>>> Their code seems to work fine. The only problem is that an HTML page isn't as good as an EPUB file.
>>
>> I tried this tool, but unfortunately it errored out on one story, and when I successfully downloaded another, the links to other chapters didn't work 😓
>
> It's been years since I checked that out, but IIRC the links only worked in IE. Maybe they work in Edge's IE mode too.
>
> As for the error, I think it couldn't handle stories with no cover, so you had to edit the relevant part to fix that, but again, it has been far too long for me to be sure.
>
> EDIT: The most up-to-date version seems to be https://github.com/filen1/CHYOADownloader/tree/nessian-fix, which seems to work better all around.

It looks like that tool grabs the cookies directly from the browser. I could probably implement something like that.

@theslavicbear
Owner

Yeah, I tried adding the cookies both manually and with browser_cookie3, as that downloader does, but I'm still only getting 403 errors. That downloader didn't seem to work on its own for me either.

@strauss999

> Yeah, I tried adding the cookies both manually and with browser_cookie3, as that downloader does, but I'm still only getting 403 errors. That downloader didn't seem to work on its own for me either.

I tested it again, this time with a random, tiny new story so it would be fast (I had previously used it to update a few dozen stories), and it's still working here, so I have no idea why it doesn't work on your side. I don't really know enough to help narrow down the possibilities either.

The only issue is that it doesn't know to skip some links (the new chapter link and the AI thingy).

@theslavicbear
Owner

>> Yeah, I tried adding the cookies both manually and with browser_cookie3, as that downloader does, but I'm still only getting 403 errors. That downloader didn't seem to work on its own for me either.
>
> I tested it again, this time with a random, tiny new story so it would be fast (I had previously used it to update a few dozen stories), and it's still working here, so I have no idea why it doesn't work on your side. I don't really know enough to help narrow down the possibilities either.
>
> The only issue is that it doesn't know to skip some links (the new chapter link and the AI thingy).

Yeah, it could be anything on my side, like Cloudflare flagging my IP, country, region, or whatever for human validation. My plan going forward is to try something like Selenium for browser control. Thankfully, it's just a matter of getting the data off the pages and passing it to the code that already exists, though Selenium will be new to me and will also be an extra requirement for using this downloader.
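
A rough sketch of what that Selenium plan could look like, assuming the existing parsing code only needs the page HTML as a string. The names `fetch_with_selenium` and `load_page` are hypothetical and not part of Ebook-Publisher, Firefox is an arbitrary browser choice, and there is no guarantee that a driven browser gets past the human-validation check described above.

```python
from typing import Callable


def fetch_with_selenium(url: str) -> str:
    """Load the URL in a real browser and return the rendered HTML."""
    # Imported here so Selenium is only required when this fetcher is used.
    from selenium import webdriver

    driver = webdriver.Firefox()  # assumption: Firefox and its driver are installed
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if loading fails


def load_page(url: str, fetch: Callable[[str], str] = fetch_with_selenium) -> str:
    """Hypothetical hook: fetch the HTML, then hand it to the existing parser."""
    html = fetch(url)
    if not html.strip():
        raise ValueError(f"empty response for {url}")
    return html
```

Keeping the fetcher as a swappable argument would let the rest of the downloader stay unchanged and leave Selenium optional for sites that still work with plain requests.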
