Add --cookies option to pass in a cookie file #47
I'd love to see this implemented without the need for temporary files.
RSully, I agree, but I didn't see a way for cookielib to load a set of cookies via anything except a filename. If there is some other builtin set of libraries besides |
I don't like the cookie file as a user interface - ideally we'd accept cookie name/value pairs on the command line and use that to generate a cookie header manually (if urllib2 or cookielib can't do it then the format isn't that hard) then pass that to Also, is |
--cookie-file is the old behavior, where you pass in a filename (or a FIFO).
--cookie is used to pass in key-value pairs, either in a single string, or with multiple --cookie options:
    --cookie 'name1=value1; name2=value2; name3=value3'
...is the same as:
    --cookie name1=value1 --cookie name2=value2 --cookie name3=value3
Both --cookie and --cookie-file are allowed at the same time, in which case --cookie values are blindly appended to the end of the cookies that are parsed out of the cookie file.
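To make the equivalence of the two --cookie forms concrete, here is a minimal Python sketch of the parsing described above. It is not the code from this pull request: `make_parser`, `parse_cookie_args`, and `build_cookie_header` are hypothetical names, and `argparse` is an assumption about how the options could be declared.

```python
import argparse


def make_parser():
    # Hypothetical option declarations; the real script may declare them differently.
    parser = argparse.ArgumentParser()
    parser.add_argument("--cookie", action="append", default=[],
                        help="name=value pairs; may be repeated")
    parser.add_argument("--cookie-file", help="cookie file name (or a FIFO)")
    return parser


def parse_cookie_args(values):
    """Split every --cookie value on ';' and return (name, value) tuples."""
    pairs = []
    for value in values:
        for chunk in value.split(";"):
            chunk = chunk.strip()
            if not chunk:
                continue  # tolerate a trailing ';' or doubled separators
            name, sep, val = chunk.partition("=")
            if not sep:
                raise ValueError("expected name=value, got %r" % chunk)
            pairs.append((name.strip(), val.strip()))
    return pairs


def build_cookie_header(file_pairs, cli_pairs):
    """Join pairs into a Cookie header value, command-line pairs last."""
    combined = list(file_pairs) + list(cli_pairs)
    return "; ".join("%s=%s" % (name, val) for name, val in combined)


if __name__ == "__main__":
    args = make_parser().parse_args(
        ["--cookie", "name1=value1; name2=value2", "--cookie", "name3=value3"])
    print(build_cookie_header([], parse_cookie_args(args.cookie)))
```

The command-line pairs are simply concatenated after the file's pairs, with no de-duplication, which mirrors the "blindly appended" behavior described above.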
Fair enough. I've renamed the arguments to allow for both:
I find the cookie file to be far more useful for me: I'm scraping my development Wordpress blog (requires user auth), and I do not want to mess around assembling all the cookies that are required to authenticate with Wordpress on the command line. It's far faster to simply dump the cookies out of my browser and then feed that directly into I'm not sure about |
RSully, if you use
# doesn't match a particular regex, so we always guarantee the magic_re
# will succeed
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write("# Netscape HTTP Cookie File\n")
I'm confused why webkit2png needs to fix the format of the file here - isn't that the responsibility of whatever tool created the file?
Yep, it totally is the responsibility of the tool that created the file, but that tool was a Chrome plugin I don't have any control over. Since then, I've found another plugin that does generate the cookie dump with this required line, so I'd be fine removing this hack.
The absence of this line was such a silly reason for cookielib to throw out an otherwise perfectly good cookie file that I was annoyed enough to code around it. And since I was already forced to write a named file, it was not a big reach to simply guarantee this line existed in the file.
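For readers following along, the workaround can be sketched as below. This is an illustration rather than the code under review: it uses Python 3's `http.cookiejar` (the successor to Python 2's `cookielib`), and `load_cookies_with_header` is a hypothetical helper name. The idea is the same as in the diff: always write the magic first line into a named temporary file, then load that file by name.

```python
import http.cookiejar
import os
import tempfile

NETSCAPE_HEADER = "# Netscape HTTP Cookie File\n"


def load_cookies_with_header(cookie_text):
    """Load Netscape-format cookie text, always prepending the magic header."""
    jar = http.cookiejar.MozillaCookieJar()
    # The jar loads from a file name, so write a named temporary file first.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        # An extra header is harmless: later lines starting with '#' are
        # treated as comments and skipped by the loader.
        tmp.write(NETSCAPE_HEADER)
        tmp.write(cookie_text)
        path = tmp.name
    try:
        jar.load(path, ignore_discard=True, ignore_expires=True)
    finally:
        os.unlink(path)  # clean up the temporary file
    return jar


if __name__ == "__main__":
    # Fields: domain, domain flag, path, secure, expiry, name, value (tab separated).
    text = ".example.com\tTRUE\t/\tFALSE\t4102444800\tsession\tabc123\n"
    for cookie in load_cookies_with_header(text):
        print(cookie.name, cookie.value)
```

Because the header is written unconditionally, a file that already starts with the magic line still loads; the duplicate is just read as a comment.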
@aperlscript Thanks for the updated code - having both as an option works for me. I just left one question as a code comment... Also, before merging this I'd like to understand how webkit2png interacts with Safari's cookie jar. Right now it appears to use Safari's cookies, which feels like a bad thing to me, but I haven't had time to investigate why.
(Also, don't worry about the part where this pull request will no longer automatically merge, I'll deal with that when it's ready)
I hadn't realized that this already used Safari's cookiejar - I didn't notice this during my testing. Right now to get around any auth issues I have been saving pages as webarchives and running webkit2png against that.
Any plans to pull this in?
Merged in the latest |
Similarly, I'd love to have a means to tell it not to use Safari's cookie jar.
--no-cookies
By default, NSMutableURLRequest uses Safari's cookies. This option will explicitly set the HTTP request header "Cookie" to empty.
I'm embarrassed to admit how long it took me to realize that I've added an option to suppress this default behavior in the request object by simply setting |
I tried
For anyone interested, in my fork |
This is a re-pullrequest of #31, but against the most recent master (since I'm a bit more up to speed with git now)
Use a browser extension such as:
https://chrome.google.com/webstore/detail/cookietxt-export/lopabhfecdfhgogdbojmaicoicjekelh
...to dump cookies into a file, and then pass in the filename with
"--cookies FILENAME". You can also copy the cookie file content, and
then do "--cookies <(pbpaste)" to skip the intermediate file.