Suddenly unable to bypass CloudFlare challenge (Ubuntu Server) #2842
Comments
Just tested on
    from seleniumbase import SB

    with SB(uc=True, test=True) as sb:
        url = "https://wildbet.gg/"
        sb.driver.uc_open_with_reconnect(url, 10)
If you're still getting blocked, it might be that you exceeded a rate limit for your IP Address. |
Thank you for your reply. I'm not sure I understand. Why would it be related to any kind of rate limiting if I can edit the code to disconnect permanently and then click myself so that it goes through? The only difference between error and success is that I click manually. |
You can try that if it helps. When I ran the UC Mode script for that page, the script never had to click anything to bypass the CAPTCHA. |
We have the same problem on headless systems (Ubuntu Server): two different servers, with rotated, whitelisted IPs. |
@mdmintz Same problem here on a Linux VDS (Ubuntu without a GPU): SeleniumBase became unable to bypass the CloudFlare challenge. I'm using the latest SB version. On local macOS and Windows it keeps working without any problem. |
Did you find any solution, sir? |
Not yet, waiting if @mdmintz has any suggestions |
OK, sounds like something has changed because there are a few different people showing up here. And this also sounds like this is specific to Linux because everything is working normally on macOS and Windows for me. I'm still going to need more details to assist:
This will help me figure out what's going on. Everything is working normally for me on macOS and Windows. |
Hello @mdmintz ,
At my personal laptop (Windows 11):
It looks like Cloudflare is somehow able to tell that I'm using a VM, and yet it works when clicking manually. I hope my description is helpful. |
@mdmintz
|
Looks like we've narrowed it down to an environment issue.
    from seleniumbase import SB

    with SB(uc=True, test=True) as sb:
        url = "https://visa.vfsglobal.com/fra/en/hrv/login"
        sb.driver.uc_open_with_reconnect(url, 6)
        sb.switch_to_frame("iframe")
        sb.driver.uc_click("span")
        breakpoint()
Worked for me on my Mac, but it sounds like it won't work in some Linux server environments. |
2024-06-08_19-28-57.mp4 |
@mdmintz On Ubuntu 20.04.6 LTS (server edition), the script you provided doesn't even click the span for some reason
|
OK, getting somewhere. Seeing differences on Mac vs Windows. Two videos below.
On a Mac, script worked as is. (Adjust
[Video: Screen.Recording.2024-06-08.at.1.45.15.PM.mov]
On Windows, had to add
[Video: IMG_0750.mov]
Did adding If so, add it to all your scripts. |
Driver object:
Logic:
Worth adding: I have tested on my local IP address, on proxyscrape.com dedicated and rotating proxies, and on another provider's IPv6 proxies. Same issue with every proxy type. |
I have this code (also tried with incognito):
On Windows, the CF challenge passes. |
@JimKarvo Thank you... That is helpful. I see the same thing: Working on both macOS and Windows, but not for Linux (Ubuntu server 20.04). I will look into improvements specifically on the Ubuntu side (since things appear to be working on macOS and Windows, at least with I'll be working on improvements to SeleniumBase Ubuntu configuration. In the meantime, people can keep passing on any information that I might find useful. (If anyone is still having macOS or Windows issues, I'll need an example that reproduces your issue, because so far things appear to be normal on the local desktop front, at least with |
@mdmintz I don't know, but maybe changing the user agent would help the situation? If it would help, I can give you access to one of my VMs (Ubuntu server) and send SSH details. |
@JimKarvo Can you tell me the user-agent that appears when you ran on Ubuntu? And the agent you saw when running locally? Maybe there's a clue there. |
From Linux I get:
From Windows I get:
|
If there's a human-controlled web browser with that same user-agent on that system ( |
Not really. As mentioned by me above. I still struggle with bypassing on Windows. I don't know if there is anything else I could add to it. If you need more info tell me exactly what should I be looking for. |
@OpsecGuy You seem to be the only one having Windows issues at this time. You also introduced a lot of variables into the equation in your example. Eg. 1. using |
@mdmintz I tried this one; after failing to resolve the captcha, it looks like the
The data I received:
|
@JimKarvo Your reconnect duration seems too small. Make it bigger, like 7 or 8. In my case I use 20 and it works just fine. |
The code above is just for getting the user-agent and all the data that the browser sends to a server when requesting a page. |
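One way to act on this advice is to retry with progressively longer reconnect times instead of guessing a single value. The sketch below is not SeleniumBase API: `try_with_growing_reconnect` and the `attempt` callback are hypothetical names, and `attempt` is assumed to wrap something like `sb.driver.uc_open_with_reconnect(url, seconds)` plus a check that the challenge is gone.

```python
from typing import Callable, Iterable, Optional


def try_with_growing_reconnect(
    attempt: Callable[[float], bool],
    reconnect_times: Iterable[float] = (6, 8, 12, 20),
) -> Optional[float]:
    """Return the first reconnect time for which `attempt` succeeds, else None.

    `attempt` is a hypothetical callback: it should open the page with the
    given reconnect time and return True if the challenge was passed.
    """
    for seconds in reconnect_times:
        if attempt(seconds):
            return seconds
    return None


# Stand-in for a real browser attempt: pretend the page needs at least 8 seconds.
def fake_attempt(seconds: float) -> bool:
    return seconds >= 8


result = try_with_growing_reconnect(fake_attempt)
```

The values in `reconnect_times` are only illustrative; the durations mentioned in this thread (6, 8, 20) suggest the right number depends on the site and the machine.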
@JimKarvo This site is a good one for seeing all the headers: https://browserleaks.com/client-hints So what have we learned? Cloudflare made changes. Previously, they only blocked you if they detected Selenium, but now they are blocking you for other things, such as User Agent. Three types of User Agents now (in combination with UC Mode):
You may have to change your User Agent on Linux to be "Good". For the "Not that good", you'll need to use
This should work every time as long as the machine has a GUI:
    import pyautogui
    from seleniumbase import SB

    ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
    with SB(uc=True, test=True, agent=ua, disable_features="UserAgentClientHint", incognito=True) as sb:
        url = "https://www.virtualmanager.com/da/login"
        sb.driver.uc_open_with_reconnect(url, 8)
        if sb.is_element_visible("iframe"):
            sb.switch_to_frame("iframe")
            sb.execute_script('document.querySelector("input").focus()')
            sb.disconnect()
            pyautogui.press(" ")
            sb.driver.reconnect(4)
        breakpoint()
|
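Since the user-agent now matters, it can help to sanity-check the string before launching the browser. The following is a rough sketch, not part of SeleniumBase: `describe_user_agent` is a hypothetical helper, and the traits it reports (a `HeadlessChrome` token, the platform token, the Chrome major version) are assumptions about what might make an agent look unusual, not Cloudflare's actual rules.

```python
import re


def describe_user_agent(ua: str) -> dict:
    """Extract a few coarse traits from a Chrome-style user-agent string."""
    version = re.search(r"Chrome/(\d+)", ua)
    if "Windows NT" in ua:
        platform = "Windows"
    elif "Macintosh" in ua:
        platform = "macOS"
    elif "Linux" in ua or "X11" in ua:
        platform = "Linux"
    else:
        platform = "unknown"
    return {
        "platform": platform,
        "headless_token": "HeadlessChrome" in ua,
        "chrome_major": int(version.group(1)) if version else None,
    }


ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
)
info = describe_user_agent(ua)
```

Comparing the output for the agent seen on the Linux server with the one seen on a local desktop is a quick way to spot the kind of difference discussed above.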
Thanks @mdmintz! |
Sadly, the pyautogui approach does not always seem to bypass the challenge. |
@jens4626 Make sure all your |
@mdmintz It does work, but there's only a 50% chance that it works. I currently have 3 situations:
I did use that code you provided and it works - but just not always. So not sure what to do tbh. ` sb.switch_to_frame("iframe")
` |
@jens4626 The spacebar from |
This was also not working for me after disconnecting from a remote Windows 10 machine (Azure VM). So, as @mdmintz mentioned, we have to keep the Chrome window active. Updated code:
As a note, when I disconnect from remote desktop, the Windows GUI is disabled. So, in order to keep your current session active, you can use this instruction. |
I tested https://github.com/sarperavci/CloudflareBypassForScraping; the script uses DrissionPage. |
I confirm, the DrissionPage solution bypasses the cloudflare click under Linux. |
The pyautogui.press(" ") solution works consistently as well under Linux.
|
@EnmeiRyuuDev SeleniumBase uses the built-in
    self._xvfb_display = Display(visible=0, size=(width, height))
    self._xvfb_display.start()
Will that work with the code you added?
    import Xlib.display
    pyautogui._pyautogui_x11._display = Xlib.display.Display(os.environ['DISPLAY'])
I assume you installed this: |
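On Linux, pyautogui talks to an X server, so checking the `DISPLAY` environment variable before sending keys can make failures easier to diagnose. A minimal sketch: `x_display_name` is a hypothetical helper, and it only inspects the environment; it does not confirm that the X server (real or Xvfb) is actually reachable.

```python
import os
from typing import Mapping, Optional


def x_display_name(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the X display name (e.g. ':99'), or None if DISPLAY is unset or empty."""
    display = env.get("DISPLAY", "").strip()
    return display or None


display = x_display_name()
if display is None:
    print("No DISPLAY set: start Xvfb (or a desktop session) before using pyautogui.")
else:
    print(f"Using X display {display}")
```

Passing the environment in as a mapping keeps the check easy to exercise without touching the real process environment.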
@mdmintz in my tests, this piece of code was required:
Otherwise, Selenium will not run headless. Also,
This is my complete test code:
What was interesting, is that when running multiple headless instances (+20 chrome driver instances), they all click independently without that window overlapping issue. |
Thanks for the input, @mdmintz and @ismayilibrahimov. I think the issue with not clicking was on my side. It might be caused by a bad IP score and by now using pyautogui; it was never an issue with uc.click. I was already using what you mentioned, @ismayilibrahimov, so that's not the problem either. |
More details: Now, if your User-Agent looks untrustworthy, CloudFlare makes you click the CAPTCHA (which has been improved). If they detect either Selenium in the browser or JavaScript involvement in clicking the CAPTCHA, they don't let the click through. That's why I'm working on an update that can optionally utilize the This probably means a new UC Mode Video Tutorial (Part 3) is likely to happen soon to explain the changes. |
I am using Windows 10 (without headless mode) on Azure, and CloudFlare requires a click. I think the user-agent is not the only issue
|
@ismayilibrahimov Azure has a known IP-range (just like AWS or GCP). That's why residential proxies have become so popular lately for web-scraping. |
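The idea that hosting providers occupy known address ranges can be illustrated with the standard `ipaddress` module. This is only a sketch: the CIDR blocks below are reserved documentation ranges standing in for a real provider list, and `in_known_ranges` is a hypothetical helper, not a description of how Cloudflare actually scores an IP.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT real Azure ranges.
PLACEHOLDER_DATACENTER_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def in_known_ranges(ip: str, ranges=PLACEHOLDER_DATACENTER_RANGES) -> bool:
    """Return True if `ip` falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ranges)


print(in_known_ranges("203.0.113.7"))
print(in_known_ranges("192.0.2.10"))
```

A lookup like this is cheap for the server side, which is one plausible reason why changing only the browser fingerprint may not be enough on a cloud VM.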
Were you guys able to use this alternative with a proxy? For me, CF started showing the challenge around the same time, and it only happens when I am using a proxy (on servers). Locally, without a proxy, it works fine. But when I use a proxy, even in my local environment, the captcha appears. They are somehow detecting the proxy. If I try the alternative with pyautogui, it works without a proxy, but if I use a proxy this is what happens: |
@enricodvn Which alternative are you using? The one with |
Yes, the last one from #2842 (comment). Hmm, this time zone setting is interesting. Is there any way I can set it through the driver? I will try tweaking it. |
@enricodvn The time zone can be set via |
|
This was resolved in Read #2865 for all the details. You may need to use the new UC Mode methods in |
Hello, overnight my instances of seleniumbase became unable to bypass the CloudFlare challenge (which uses CloudFlare Turnstile).
I was using an older version of SB, so I updated to the latest (4.27.4), and it is still not passing the challenge.
I am using your demo code for clicking on the CloudFlare Turnstile captcha:
If I instead use:
    sb.driver.uc_open_with_reconnect(url, reconnect_time=9999)
and click manually, it works. Does this mean they are detecting something?
I also tried adding
    reconnect_time=5
on uc_click and it did not help.
I'm a big fan of your project and I've been using it for some time :)