CDP Mode (alpha) #3215

Closed
wants to merge 6 commits into from
20 changes: 10 additions & 10 deletions README.md
@@ -19,30 +19,30 @@
<p align="center">
<a href="#python_installation">🚀 Start</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/features_list.md">🏰 Features</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/ReadMe.md">📚 Examples</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/customizing_test_runs.md">🎛️ Options</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/ReadMe.md">📚 Examples</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/seleniumbase/console_scripts/ReadMe.md">🌠 Scripts</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/mobile_testing.md">📱 Mobile</a>
<br />
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/method_summary.md">📘 APIs</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/syntax_formats.md"> 🔡 Formats</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/example_logs/ReadMe.md">📊 Dashboard</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/syntax_formats.md"> 🔠 Formats</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/recorder_mode.md">🔴 Recorder</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/example_logs/ReadMe.md">📊 Dashboard</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/locale_codes.md">🗾 Locales</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/seleniumbase/utilities/selenium_grid/ReadMe.md">🌐 Grid</a>
<a href="https://seleniumbase.io/devices/?url=seleniumbase.com">💻 Farm</a>
<br />
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/commander.md">🎖️ GUI</a> |
<a href="https://seleniumbase.io/demo_page">📰 TestPage</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/case_plans.md">🗂️ CasePlans</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/uc_mode.md">👤 UC Mode</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/master_qa/ReadMe.md">🧬 Hybrid</a> |
<a href="https://seleniumbase.io/devices/?url=seleniumbase.com">💻 Farm</a>
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/cdp_mode/ReadMe.md">🐙 CDP Mode</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/chart_maker/ReadMe.md">📶 Charts</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/seleniumbase/utilities/selenium_grid/ReadMe.md">🌐 Grid</a>
<br />
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/how_it_works.md">👁️ How</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/tree/master/examples/migration/raw_selenium">🚝 Migrate</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/tree/master/examples/boilerplates">♻️ Templates</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/tree/master/integrations/node_js">🚉 NodeGUI</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/chart_maker/ReadMe.md">📶 Charts</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/case_plans.md">🗂️ CasePlans</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/tree/master/examples/boilerplates">♻️ Template</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/master_qa/ReadMe.md">🧬 Hybrid</a> |
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/examples/tour_examples/ReadMe.md">🚎 Tours</a>
<br />
<a href="https://github.com/seleniumbase/SeleniumBase/blob/master/integrations/github/workflows/ReadMe.md">🤖 CI/CD</a> |
301 changes: 301 additions & 0 deletions examples/cdp_mode/ReadMe.md
@@ -0,0 +1,301 @@
<!-- SeleniumBase Docs -->

## [<img src="https://seleniumbase.github.io/img/logo6.png" title="SeleniumBase" width="32">](https://github.com/seleniumbase/SeleniumBase/) CDP Mode 🐙

🐙 <b translate="no">SeleniumBase</b> <b translate="no">CDP Mode</b> (Chrome DevTools Protocol Mode) is a special mode inside <b><a href="https://github.com/seleniumbase/SeleniumBase/blob/master/help_docs/uc_mode.md" translate="no"><span translate="no">SeleniumBase UC Mode</span></a></b> that lets bots appear human while controlling the browser with the <b translate="no">CDP-Driver</b>. Although regular <span translate="no">UC Mode</span> can't perform <span translate="no">WebDriver</span> actions while the <code>driver</code> is disconnected from the browser, the <span translate="no">CDP-Driver</span> can still perform actions (while maintaining its cover).

👤 <b translate="no">UC Mode</b> avoids bot-detection by first disconnecting WebDriver from the browser at strategic times, calling special <code>PyAutoGUI</code> methods to bypass CAPTCHAs (as needed), and finally reconnecting the <code>driver</code> afterwards so that WebDriver actions can be performed again. Although this approach works for bypassing simple CAPTCHAs, more flexibility is needed for bypassing bot-detection on websites with advanced protection. (That's where <b translate="no">CDP Mode</b> comes in.)

🐙 <b translate="no">CDP Mode</b> is based on <a href="https://github.com/HyperionGray/python-chrome-devtools-protocol" translate="no">python-cdp</a>, <a href="https://github.com/HyperionGray/trio-chrome-devtools-protocol" translate="no">trio-cdp</a>, and <a href="https://github.com/ultrafunkamsterdam/nodriver" translate="no">nodriver</a>. <code>trio-cdp</code> was an early implementation of <code>python-cdp</code>, whereas <code>nodriver</code> is a modern implementation of <code>python-cdp</code>. (Refactored CDP code is imported from <a href="https://github.com/mdmintz/MyCDP" translate="no">MyCDP</a>.)

🐙 <b translate="no">CDP Mode</b> includes multiple updates to the above, such as:

* Sync methods. (Using `async`/`await` is not necessary!)
* The ability to use WebDriver and CDP-Driver together.
* Backwards compatibility for existing UC Mode scripts.
* More configuration options when launching browsers.
* More methods. (And bug-fixes for existing methods.)
* Faster response time for support. (E.g. [Discord Chat](https://discord.gg/EdhQTn3EyE))
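
The "Sync methods" bullet can be pictured with plain `asyncio`: a synchronous facade holds an event loop and drives each coroutine to completion with `loop.run_until_complete()`, the same call that `raw_async.py` uses. The sketch below only illustrates that idea and is not SeleniumBase's actual code; `FakePage` and `SyncPage` are hypothetical stand-ins so the block runs without a browser.

```python
import asyncio


class FakePage:
    """Hypothetical stand-in for an async CDP page (no browser needed)."""

    def __init__(self, title):
        self._title = title

    async def evaluate(self, expression):
        # Yield to the event loop once, as a real round-trip would.
        await asyncio.sleep(0)
        if expression == "document.title":
            return self._title
        raise ValueError("unsupported expression: %s" % expression)


class SyncPage:
    """Hypothetical sync facade: each call runs one coroutine to completion."""

    def __init__(self, loop, page):
        self.loop = loop
        self.page = page

    def get_title(self):
        return self.loop.run_until_complete(
            self.page.evaluate("document.title")
        )


loop = asyncio.new_event_loop()
try:
    sync_page = SyncPage(loop, FakePage("Demo Title"))
    title = sync_page.get_title()  # No async/await at the call site
    print(title)
finally:
    loop.close()
```

The caller never writes `async` or `await`; the facade owns the loop, which is why the loop is passed in explicitly.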

--------

### 🐙 <b translate="no">CDP Mode</b> initialization:

* `sb.activate_cdp_mode(url)`

> (Call that from a **UC Mode** script)

--------

### 🐙 <b translate="no">CDP Mode</b> examples:

> [SeleniumBase/examples/cdp_mode](https://github.com/seleniumbase/SeleniumBase/tree/master/examples/cdp_mode)

### 🔖 Example 1: (Pokemon site using Incapsula/Imperva protection with invisible reCAPTCHA)

> [SeleniumBase/examples/cdp_mode/raw_pokemon.py](https://github.com/seleniumbase/SeleniumBase/tree/master/examples/cdp_mode/raw_pokemon.py)

<div></div>
<details>
<summary> ▶️ (<b>Click to expand code preview</b>)</summary>

```python
from seleniumbase import SB

with SB(uc=True, test=True, locale_code="en") as sb:
    url = "https://www.pokemon.com/us"
    sb.activate_cdp_mode(url)
    sb.sleep(1)
    sb.cdp.click_if_visible("button#onetrust-reject-all-handler")
    sb.cdp.click('a[href="https://www.pokemon.com/us/pokedex/"]')
    sb.sleep(1)
    sb.cdp.click('b:contains("Show Advanced Search")')
    sb.sleep(1)
    sb.cdp.click('span[data-type="type"][data-value="electric"]')
    sb.cdp.click("a#advSearch")
    sb.sleep(1)
    sb.cdp.click('img[src*="img/pokedex/detail/025.png"]')
    sb.cdp.assert_text("Pikachu", 'div[class*="title"]')
    sb.cdp.assert_element('img[alt="Pikachu"]')
    sb.cdp.scroll_into_view("div.pokemon-ability-info")
    sb.sleep(1)
    sb.cdp.flash('div[class*="title"]')
    sb.cdp.flash('img[alt="Pikachu"]')
    sb.cdp.flash("div.pokemon-ability-info")
    name = sb.cdp.get_text("label.styled-select")
    info = sb.cdp.get_text("div.version-descriptions p.active")
    print("*** %s: ***\n* %s" % (name, info))
    sb.sleep(2)
    sb.cdp.highlight_overlay("div.pokemon-ability-info")
    sb.sleep(2)
    sb.cdp.click('a[href="https://www.pokemon.com/us/play-pokemon/"]')
    sb.cdp.click('h3:contains("Find an Event")')
    location = "Concord, MA, USA"
    sb.cdp.type('input[data-testid="location-search"]', location)
    sb.sleep(1)
    sb.cdp.click("div.autocomplete-dropdown-container div.suggestion-item")
    sb.cdp.click('img[alt="search-icon"]')
    sb.sleep(2)
    events = sb.cdp.select_all('div[data-testid="event-name"]')
    print("*** Pokemon events near %s: ***" % location)
    for event in events:
        print("* " + event.text)
    sb.sleep(2)
```

</details>

### 🔖 Example 2: (Hyatt site using Kasada protection)

> [SeleniumBase/examples/cdp_mode/raw_hyatt.py](https://github.com/seleniumbase/SeleniumBase/tree/master/examples/cdp_mode/raw_hyatt.py)

<div></div>
<details>
<summary> ▶️ (<b>Click to expand code preview</b>)</summary>

```python
from seleniumbase import SB

with SB(uc=True, test=True, locale_code="en") as sb:
    url = "https://www.hyatt.com/"
    sb.activate_cdp_mode(url)
    sb.sleep(1)
    sb.cdp.click_if_visible('button[aria-label="Close"]')
    sb.sleep(0.5)
    sb.cdp.click('span:contains("Explore")')
    sb.sleep(1)
    sb.cdp.click('a:contains("Hotels & Resorts")')
    sb.sleep(2.5)
    location = "Anaheim, CA, USA"
    sb.cdp.press_keys("input#searchbox", location)
    sb.sleep(1)
    sb.cdp.click("div#suggestion-list ul li a")
    sb.sleep(1)
    sb.cdp.click('div.hotel-card-footer button')
    sb.sleep(1)
    sb.cdp.click('button[data-locator="find-hotels"]')
    sb.sleep(4)
    hotel_names = sb.cdp.select_all(
        'div[data-booking-status="BOOKABLE"] [class*="HotelCard_header"]'
    )
    hotel_prices = sb.cdp.select_all(
        'div[data-booking-status="BOOKABLE"] div.rate-currency'
    )
    sb.assert_true(len(hotel_names) == len(hotel_prices))
    print("Hyatt Hotels in %s:" % location)
    print("(" + sb.cdp.get_text("ul.b-color_text-white") + ")")
    if len(hotel_names) == 0:
        print("No availability over the selected dates!")
    for i, hotel in enumerate(hotel_names):
        print("* %s: %s => %s" % (i + 1, hotel.text, hotel_prices[i].text))
```

</details>
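
Example 2 pairs each hotel name with its price by index, after checking that both lists have the same length. As a browser-free sketch of that pairing, the snippet below does the same thing with `zip()`. `FakeElement`, `format_listing`, and the sample names and prices are hypothetical; they only mimic the `.text` attribute used in the example.

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in for an element that exposes `.text`."""
    text: str


def format_listing(names, prices):
    """Return one output line per hotel, numbered from 1."""
    if len(names) != len(prices):
        raise ValueError("names and prices must have the same length")
    return [
        "* %s: %s => %s" % (i, name.text, price.text)
        for i, (name, price) in enumerate(zip(names, prices), start=1)
    ]


names = [FakeElement("Hotel A"), FakeElement("Hotel B")]  # made-up data
prices = [FakeElement("$100"), FakeElement("$200")]  # made-up data
lines = format_listing(names, prices)
for line in lines:
    print(line)
```

Raising on a length mismatch mirrors the `sb.assert_true(...)` check in the example: a mismatch usually means one of the selectors matched something unexpected.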

### 🔖 Example 3: (BestWestern site using DataDome protection)

> [SeleniumBase/examples/cdp_mode/raw_bestwestern.py](https://github.com/seleniumbase/SeleniumBase/tree/master/examples/cdp_mode/raw_bestwestern.py)

<div></div>
<details>
<summary> ▶️ (<b>Click to expand code preview</b>)</summary>

```python
from seleniumbase import SB

with SB(uc=True, test=True, locale_code="en") as sb:
    url = "https://www.bestwestern.com/en_US.html"
    sb.activate_cdp_mode(url)
    sb.sleep(1.5)
    sb.cdp.click_if_visible("div.onetrust-close-btn-handler")
    sb.sleep(0.5)
    sb.cdp.click("input#destination-input")
    sb.sleep(1.5)
    location = "Palm Springs, CA, USA"
    sb.cdp.press_keys("input#destination-input", location)
    sb.sleep(0.6)
    sb.cdp.click("ul#google-suggestions li")
    sb.sleep(0.6)
    sb.cdp.click("button#btn-modify-stay-update")
    sb.sleep(1.5)
    sb.cdp.click("label#available-label")
    sb.sleep(4)
    print("Best Western Hotels in %s:" % location)
    summary_details = sb.cdp.get_text("#summary-details-column")
    dates = summary_details.split("ROOM")[0].split("DATES")[-1].strip()
    print("(Dates: %s)" % dates)
    flip_cards = sb.cdp.select_all(".flipCard")
    for i, flip_card in enumerate(flip_cards):
        hotel = flip_card.query_selector(".hotelName")
        price = flip_card.query_selector(".priceSection")
        if hotel and price:
            print("* %s: %s => %s" % (
                i + 1, hotel.text.strip(), price.text.strip())
            )
```

</details>
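
Example 3 extracts the stay dates with two `split()` calls: first it keeps everything before the first `ROOM`, then everything after the last `DATES` in that prefix. The sketch below runs the same expression on a made-up string. The sample text and the `extract_dates` helper are assumptions for illustration, not the real page content.

```python
def extract_dates(summary_details):
    """Take the text before the first "ROOM", then what follows the
    last "DATES" inside that prefix, with surrounding spaces removed."""
    before_room = summary_details.split("ROOM")[0]
    return before_room.split("DATES")[-1].strip()


# Hypothetical summary text; the real page layout may differ.
sample = "DESTINATION Palm Springs DATES Jan 1 - Jan 2 ROOM 1 Room, 2 Guests"
dates = extract_dates(sample)
print("(Dates: %s)" % dates)
```

Both splits are case-sensitive, and a string that lacks either marker comes back unchanged (apart from stripping), so this parsing depends on the page keeping those exact labels.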

(<b>Note:</b> Extra <code translate="no">sb.sleep()</code> calls have been added to prevent bot-detection because some sites will flag you as a bot if you perform actions too quickly.)

(<b>Note:</b> Some sites may IP-block you for 36 hours or more if they catch you using regular <span translate="no">Selenium WebDriver</span>. Be extra careful when creating and/or modifying automation scripts that run on them.)

--------

### 🐙 CDP Mode API / Methods

(Some method args have been left out for simplicity. E.g. <code translate="no">timeout</code>)

```python
sb.cdp.get(url)
sb.cdp.reload()
sb.cdp.refresh()
sb.cdp.add_handler(event, handler)
sb.cdp.find_element(selector)
sb.cdp.find_all(selector)
sb.cdp.find_elements_by_text(text, tag_name=None)
sb.cdp.select(selector)
sb.cdp.select_all(selector)
sb.cdp.click_link(link_text)
sb.cdp.tile_windows(windows=None, max_columns=0)
sb.cdp.get_all_cookies(*args, **kwargs)
sb.cdp.set_all_cookies(*args, **kwargs)
sb.cdp.save_cookies(*args, **kwargs)
sb.cdp.load_cookies(*args, **kwargs)
sb.cdp.clear_cookies(*args, **kwargs)
sb.cdp.sleep(seconds)
sb.cdp.bring_active_window_to_front()
sb.cdp.get_active_element()
sb.cdp.get_active_element_css()
sb.cdp.click(selector)
sb.cdp.click_active_element()
sb.cdp.click_if_visible(selector)
sb.cdp.mouse_click(selector)
sb.cdp.nested_click(parent_selector, selector)
sb.cdp.get_nested_element(parent_selector, selector)
sb.cdp.flash(selector)
sb.cdp.focus(selector)
sb.cdp.highlight_overlay(selector)
sb.cdp.remove_element(selector)
sb.cdp.remove_from_dom(selector)
sb.cdp.remove_elements(selector)
sb.cdp.scroll_into_view(selector)
sb.cdp.send_keys(selector, text)
sb.cdp.press_keys(selector, text)
sb.cdp.type(selector, text)
sb.cdp.evaluate(expression)
sb.cdp.js_dumps(obj_name)
sb.cdp.maximize()
sb.cdp.minimize()
sb.cdp.medimize()
sb.cdp.set_window_rect()
sb.cdp.reset_window_size()
sb.cdp.get_window()
sb.cdp.get_text(selector)
sb.cdp.get_title()
sb.cdp.get_current_url()
sb.cdp.get_origin()
sb.cdp.get_page_source()
sb.cdp.get_user_agent()
sb.cdp.get_cookie_string()
sb.cdp.get_locale_code()
sb.cdp.get_screen_rect()
sb.cdp.get_window_rect()
sb.cdp.get_window_size()
sb.cdp.get_window_position()
sb.cdp.get_element_rect(selector)
sb.cdp.get_element_size(selector)
sb.cdp.get_element_position(selector)
sb.cdp.get_gui_element_rect(selector)
sb.cdp.get_gui_element_center(selector)
sb.cdp.get_document()
sb.cdp.get_flattened_document()
sb.cdp.get_element_attributes(selector)
sb.cdp.get_element_html(selector)
sb.cdp.set_attributes(selector, attribute, value)
sb.cdp.internalize_links()
sb.cdp.is_element_present(selector)
sb.cdp.is_element_visible(selector)
sb.cdp.assert_element(selector)
sb.cdp.assert_element_present(selector)
sb.cdp.assert_text(text, selector="html")
sb.cdp.assert_exact_text(text, selector="html")
sb.cdp.save_screenshot(name, folder=None, selector=None)
```

--------

### 🐙 CDP Mode WebElement API / Methods

```python
element.clear_input()
element.click()
element.flash()
element.focus()
element.highlight_overlay()
element.mouse_click()
element.mouse_drag(destination)
element.mouse_move()
element.query_selector(selector)
element.querySelector(selector)
element.query_selector_all(selector)
element.querySelectorAll(selector)
element.remove_from_dom()
element.save_screenshot(*args, **kwargs)
element.save_to_dom()
element.scroll_into_view()
element.select_option()
element.send_file(*file_paths)
element.send_keys(text)
element.set_text(value)
element.type(text)
element.get_position()
element.get_html()
element.get_js_attributes()
```

--------

<img src="https://seleniumbase.github.io/cdn/img/sb_text_f.png" alt="SeleniumBase" title="SeleniumBase" align="center" width="335">

<div><a href="https://github.com/seleniumbase/SeleniumBase"><img src="https://seleniumbase.github.io/cdn/img/sb_logo_gs.png" alt="SeleniumBase" title="SeleniumBase" width="335" /></a></div>
Empty file added examples/cdp_mode/__init__.py
Empty file.
47 changes: 47 additions & 0 deletions examples/cdp_mode/raw_async.py
@@ -0,0 +1,47 @@
import asyncio
import time
from seleniumbase.core import sb_cdp
from seleniumbase.undetected import cdp_driver


async def main():
    driver = await cdp_driver.cdp_util.start()
    page = await driver.get("https://www.priceline.com/")
    time.sleep(3)
    print(await page.evaluate("document.title"))
    element = await page.select('[data-testid*="endLocation"]')
    await element.click_async()
    time.sleep(1)
    await element.send_keys_async("Boston")
    time.sleep(2)

if __name__ == "__main__":
    # Call an async function with awaited methods
    loop = asyncio.new_event_loop()
    loop.run_until_complete(main())

    # Call everything without using async / await
    driver = loop.run_until_complete(cdp_driver.cdp_util.start())
    page = loop.run_until_complete(driver.get("https://www.pokemon.com/us"))
    time.sleep(3)
    print(loop.run_until_complete(page.evaluate("document.title")))
    element = loop.run_until_complete(page.select("span.icon_pokeball"))
    loop.run_until_complete(element.click_async())
    time.sleep(1)
    print(loop.run_until_complete(page.evaluate("document.title")))
    time.sleep(1)

    # Call CDP methods via the simplified CDP API
    page = loop.run_until_complete(driver.get("https://www.priceline.com/"))
    sb = sb_cdp.CDPMethods(loop, page, driver)
    sb.sleep(3)
    sb.internalize_links()  # Don't open links in a new tab
    sb.click("#link_header_nav_experiences")
    sb.sleep(2)
    sb.remove_element("msm-cookie-banner")
    sb.sleep(1)
    sb.press_keys('input[data-test-id*="search"]', "Amsterdam")
    sb.sleep(2)
    sb.click('span[data-test-id*="autocomplete"]')
    sb.sleep(5)
    print(sb.get_title())