When we scrape a link that is not open to the public (unlike, say, Wikipedia) but requires login (like JSTOR, or any of the databases on the UChicago Library site), the link contains something like "proxy.uchicago.edu" and scraping returns the following:
"Shibboleth Authentication Request
If your browser does not continue automatically, click ..."
How can we get around this?
Collecting data from websites that require login can be complicated. A common approach is to use the Selenium package, which drives a real browser.
For instance, Selenium can log in to GitHub automatically, which lets you access and collect content on GitHub that requires login. (Note that the Selenium package may not work well on Google Colab.)
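A minimal sketch of that idea follows; it is not a tested recipe for any particular site. It assumes Selenium 4 and a local Chrome install. The element locators (`login_field`, `password`, `commit`) are assumptions about GitHub's login form and may change, and accounts with two-factor authentication will need an extra manual step. The function names and the `GH_USER`/`GH_PASS` environment variables are hypothetical. The small helper checks for the Shibboleth stub text quoted in the question, so you can tell whether a fetched page is real content or just the authentication redirect.

```python
import os


def looks_like_login_wall(html: str) -> bool:
    """Return True if the page is the Shibboleth redirect stub, not real content."""
    return "Shibboleth Authentication Request" in html


def fetch_after_github_login(username: str, password: str, target_url: str) -> str:
    """Log in to GitHub in a real browser, then return the HTML of target_url."""
    # Imported lazily so the helper above works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    login_url = "https://github.com/login"
    driver = webdriver.Chrome()  # assumes Chrome is installed locally
    try:
        driver.get(login_url)
        # Assumed locators; inspect the login form if these stop matching.
        driver.find_element(By.ID, "login_field").send_keys(username)
        driver.find_element(By.ID, "password").send_keys(password)
        driver.find_element(By.NAME, "commit").click()
        # Wait until the browser has navigated away from the login page.
        WebDriverWait(driver, 30).until(EC.url_changes(login_url))
        # The browser session now carries the login cookies.
        driver.get(target_url)
        return driver.page_source
    finally:
        driver.quit()


# Example use (hypothetical environment variable names):
# html = fetch_after_github_login(
#     os.environ["GH_USER"], os.environ["GH_PASS"], "https://github.com/settings/profile"
# )
# if looks_like_login_wall(html):
#     print("Still behind the authentication redirect")
```

For a proxied library link, the same pattern would apply in principle: open the proxy URL in the Selenium-driven browser, complete the login there (by script or by hand), and only then read `driver.page_source`.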