Ad-hoc cookies retrieval during runtime? #400

Open
Silverbullet069 opened this issue Feb 9, 2025 · 6 comments

@Silverbullet069

Silverbullet069 commented Feb 9, 2025

This is your reply from worker_vm#2:

Is it possible to also pass cookies to ComicCrawler?

Yes, I know what cookies Colamanga used to authenticate, so I can populate them manually.

There are several ways to use cookies in ComicCrawler:

  1. Define browser and browser_profile in setting.ini. ComicCrawler will import cookies from the browser automatically. (https://github.com/eight04/ComicCrawler?tab=readme-ov-file#%E9%97%9C%E6%96%BC%E9%9C%80%E8%A6%81%E7%99%BB%E5%85%A5%E7%9A%84%E7%B6%B2%E7%AB%99)
  2. In the module, set a cookie global variable to define cookies.
  3. During runtime, get the session object and configure its cookies property. For example, see def init_api_session(): (this example configures headers, but cookies are similar).

All three require populating cookies manually, either by visiting the website once, opening DevTools, and copying the cookie, or by specifying browser= in setting.ini (by the way, method 3 is actually the last step of method 1). What if I know the API that sets the authentication cookies? Can I fetch it and retrieve the cookies from the response object? That could become a fourth way.

@eight04
Owner

eight04 commented Feb 9, 2025

grabber returns a Response object and you can get the cookie from .cookies:
https://github.com/lexiforest/curl_cffi/blob/69932267b52219ea353b1c7a3e69bd2e593d5fd8/curl_cffi/requests/models.py#L49

Note that the session object handles cookies automatically, so the next request will use the cookies set by the previous response (assuming both requests use the same session).
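The session behavior described above can be sketched with the standard library alone. This is not ComicCrawler's grabber or curl_cffi: the local server, its /login, /visit, and /page paths, and the cookie names are all hypothetical stand-ins, and a `CookieJar` plus an opener plays the role of the shared session. It only illustrates the general idea: read the cookies a response set, then let the same session send them on later requests.

```python
import http.cookiejar
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical site: some paths set cookies, every path echoes the Cookie header."""

    def do_GET(self):
        self.send_response(200)
        if self.path == "/login":
            # Stand-in for the API that sets the authentication cookie.
            self.send_header("Set-Cookie", "auth=secret-token; Path=/")
        elif self.path == "/visit":
            # A later response that sets one more cookie.
            self.send_header("Set-Cookie", "seen=1; Path=/")
        self.end_headers()
        # Echo back whatever cookies the client sent with this request.
        self.wfile.write(self.headers.get("Cookie", "").encode())

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"

    # The jar and opener act as the shared "session" (proxies disabled for localhost).
    jar = http.cookiejar.CookieJar()
    session = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),
        urllib.request.HTTPCookieProcessor(jar),
    )

    # Call the cookie-setting endpoints; the jar stores what each response sets.
    session.open(f"{base}/login").read()
    session.open(f"{base}/visit").read()
    cookies = {c.name: c.value for c in jar}

    # A later request through the same session sends the stored cookies automatically.
    echoed = session.open(f"{base}/page").read().decode()

    server.shutdown()
    server.server_close()
    return cookies, echoed


if __name__ == "__main__":
    print(run_demo())
```

With curl_cffi, the analogous step would be reading `.cookies` on the returned Response, as linked above; the stdlib jar is used here only so the sketch runs without extra dependencies.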

@Silverbullet069
Author

A chain of same-session requests will accumulate cookie values, right? Do grabhtml and grabimg also set cookies for the session?

@Silverbullet069
Author

I see that in a module, cookies can be specified in two locations: config and cookie. Can I ask about the design decision behind this?

@eight04
Owner

eight04 commented Feb 11, 2025

A chain of same-session requests will accumulate cookies values right?

Yes. It works like browser sessions.

Does grabhtml and grabimg also set cookies for the session?

Yes. They use grabber under the hood.

cookies can be specified in 2 locations: config and cookie.

In config, users can edit the value in setting.ini. In cookie they can't.

design decision

When a site requires login, users used to specify credentials in setting.ini. Nowadays, we can read cookies directly from browsers, so there is no need to set them manually anymore. Therefore you won't see cookie configs in new modules.

Another method to set cookies is to use curl configs. Still, it is more convenient to read from the browser automatically.
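The difference between config and cookie described above can be sketched as follows. This is only an illustration under stated assumptions, not ComicCrawler's actual loading code: the module name `example_site`, the `cookie_` key prefix, the sample values, and the `effective_cookies` helper are all hypothetical. It shows the idea that values in config can be overridden from setting.ini, while values in cookie stay fixed in the module.

```python
import configparser

MODULE_NAME = "example_site"  # hypothetical module name, used as the ini section

# Fixed in the module source; users cannot change it from setting.ini.
cookie = {"lang": "en"}

# Default values that a user may override in setting.ini.
# The "cookie_" prefix is an assumed convention for this sketch.
config = {"cookie_session_id": "default-session-id"}

# Stand-in for the contents of setting.ini.
SETTING_INI = """
[example_site]
cookie_session_id = value-edited-by-user
"""


def effective_cookies(ini_text):
    """Merge the fixed cookies with config-backed cookies (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)

    # Start from the module defaults, then apply the user's overrides, if any.
    merged_config = dict(config)
    if parser.has_section(MODULE_NAME):
        merged_config.update(parser[MODULE_NAME])

    # Fixed cookies first, then config entries with the prefix stripped.
    result = dict(cookie)
    for key, value in merged_config.items():
        if key.startswith("cookie_"):
            result[key[len("cookie_"):]] = value
    return result


if __name__ == "__main__":
    print(effective_cookies(SETTING_INI))
    print(effective_cookies(""))
```

Under these assumptions, editing the ini text changes only `session_id`; `lang` can be changed only by editing the module itself.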

@Silverbullet069
Author

Silverbullet069 commented Feb 12, 2025

I see, you keep config for legacy and backward-compatibility reasons.

P.S. I was wrong. It turns out that downloading the encrypted images from Colamanga doesn't require any cookie values. I wrote up my progress here; can you check it?

@Silverbullet069
Author

Do you use Discord? Sometimes I want to ask something and get a faster reply so I can continue working on the problem.
