Allow using default browser context #471

Open: wants to merge 5 commits into base: main (changes from 3 commits)

3 changes: 3 additions & 0 deletions lib/ferrum/browser.rb
@@ -125,6 +125,9 @@ class Browser
# @option options [Hash] :env
# Environment variables you'd like to pass through to the process.
#
# @option options [Boolean] :use_default_context
Author commented:

I'm not sure what the best name for this is. It's kind of confusing, since Contexts#default_context already exists and that method currently creates a new BrowserContext. Still, "default" felt right, because the effective change is to not pass a browserContextId when creating a target.

Perhaps a less confusing name would be use_persistent_context, similar to how Playwright has launchPersistentContext.
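
The "effective change" described in this comment, not passing a browserContextId, comes down to Hash#compact dropping a key whose value is nil. A minimal sketch of that behavior, assuming the context's ID is nil for the default context as in this PR; `build_target_params` is a hypothetical helper name used only for illustration and is not part of Ferrum:

```ruby
# Hypothetical helper mirroring the params hash built for Target.createTarget.
def build_target_params(context_id)
  # Hash#compact removes keys whose value is nil, so a nil context ID means
  # the browserContextId key is omitted instead of being sent as null.
  { browserContextId: context_id, url: "about:blank" }.compact
end

isolated = build_target_params("ABC123")
default  = build_target_params(nil)

isolated.key?(:browserContextId) # => true
default.key?(:browserContextId)  # => false
```

With the key absent, the browser is left to pick the context itself, which is what the comment means by using the default context.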

# When true, allows using the default browser context that has access to the browser's persistent state.
#
def initialize(options = nil)
@options = Options.new(options)
@client = @process = @contexts = nil
3 changes: 2 additions & 1 deletion lib/ferrum/browser/options.rb
@@ -15,7 +15,7 @@ class Options
:js_errors, :base_url, :slowmo, :pending_connection_errors,
:url, :ws_url, :env, :process_timeout, :browser_name, :browser_path,
:save_path, :proxy, :port, :host, :headless, :browser_options,
- :ignore_default_browser_options, :xvfb, :flatten
+ :ignore_default_browser_options, :xvfb, :flatten, :use_default_context
attr_accessor :timeout, :default_user_agent

def initialize(options = nil)
@@ -45,6 +45,7 @@ def initialize(options = nil)
@base_url = parse_base_url(@options[:base_url]) if @options[:base_url]
@url = @options[:url].to_s if @options[:url]
@ws_url = @options[:ws_url].to_s if @options[:ws_url]
@use_default_context = @options.fetch(:use_default_context, false)

@options = @options.merge(window_size: @window_size).freeze
@browser_options = @options.fetch(:browser_options, {}).freeze
3 changes: 2 additions & 1 deletion lib/ferrum/context.rb
@@ -46,7 +46,8 @@ def create_page(**options)
end

def create_target
- @client.command("Target.createTarget", browserContextId: @id, url: "about:blank")
+ target_params = {browserContextId: @id, url: "about:blank"}.compact
+ @client.command("Target.createTarget", **target_params)
target = @pendings.take(@client.timeout)
raise NoSuchTargetError unless target.is_a?(Target)

15 changes: 15 additions & 0 deletions lib/ferrum/contexts.rb
@@ -11,6 +11,7 @@ class Contexts
def initialize(client)
@contexts = Concurrent::Map.new
@client = client
@default_context = create_default_context if @client.options.use_default_context
@ibrahima (Author) commented on Jul 4, 2024:

One side effect of setting this in the constructor is that any existing tabs/targets in the browser's default context will be available as targets (e.g. browser.default_context.targets will return a nonzero number of targets when tabs are already open). This felt right to me, because it also handles use cases like #320. It's similar to the fork mentioned in #320 (comment), but in my opinion cleaner: it gives you access to all the existing tabs to do with as you will, rather than somewhat arbitrarily picking the default (latest?) tab as the default target.

It also seems to me that this makes the discover call in the constructor actually do something, which I had been wondering about. Without any initial contexts, discover starts enumerating targets, but because no Context exists yet, nothing is done with those targets.
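
The point about discover can be modeled in a few lines. This is a simplified sketch of the behavior the comment describes, not Ferrum's actual code: `route_targets` and the plain Hash standing in for the contexts map are hypothetical, and the context and target IDs are made up.

```ruby
# Simplified model of routing discovered targets to contexts.
# `contexts` is a plain Hash of context ID => array of target IDs, a
# hypothetical stand-in for illustration rather than Ferrum's Contexts class.
def route_targets(contexts, target_infos)
  ignored = []
  target_infos.each do |info|
    targets = contexts[info["browserContextId"]]
    if targets
      targets << info["targetId"]
    else
      # No context is registered under this ID, so the target is dropped.
      ignored << info["targetId"]
    end
  end
  ignored
end

existing_tabs = [
  { "targetId" => "tab-1", "browserContextId" => "default-ctx" },
  { "targetId" => "tab-2", "browserContextId" => "default-ctx" }
]

# Without any registered context, discovery sees the tabs but keeps none.
no_contexts = {}
ignored_without = route_targets(no_contexts, existing_tabs)

# With the default context registered up front, the same tabs are attached.
with_default = { "default-ctx" => [] }
ignored_with = route_targets(with_default, existing_tabs)
```

In the first call both tabs end up in the ignored list; in the second they land in the default context's entry, which matches the nonzero `targets` count mentioned in the comment.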

subscribe
auto_attach
discover
@@ -20,6 +21,20 @@ def default_context
@default_context ||= create
end

def create_default_context
default_context_id = compute_default_context_id
# Targets created in this context will not be created with a browserContextId
@contexts[default_context_id] = ::Ferrum::Context.new(@client, self, nil)
end

# Compute the default context ID by looking for contexts not returned by Target.getBrowserContexts
Author commented:

This is kind of hacky, but I didn't find a better way looking through https://chromedevtools.github.io/devtools-protocol. Target.getBrowserContexts does not return the default context, but targets created without a browserContextId still report this context's ID. That's why we need an entry in @contexts with that ID: so the targetCreated event can be matched and the Target put in the right Context.

A possibly more reliable way to get the ID is to create a target in the default context and read the ID from it, though that requires a throwaway target. I guess it'd be fine to create a Target, get its browserContextId, and then destroy it.
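
The throwaway-target idea could look roughly like the sketch below. It assumes a client object that responds to `command(method, **params)` and returns the parsed result as a Hash; `probe_default_context_id` and `FakeClient` are hypothetical names, and the fake only imitates the three CDP calls used here (Target.createTarget, Target.getTargets, Target.closeTarget) so the example runs without a browser.

```ruby
# Sketch of the "create a target, read its context ID, destroy it" approach.
def probe_default_context_id(client)
  # Creating a target without browserContextId places it in the default context.
  target_id = client.command("Target.createTarget", url: "about:blank")["targetId"]
  infos = client.command("Target.getTargets")["targetInfos"]
  info = infos.find { |t| t["targetId"] == target_id }
  info && info["browserContextId"]
ensure
  # Always clean up the throwaway target, even if the lookup raised.
  client.command("Target.closeTarget", targetId: target_id) if target_id
end

# Hypothetical in-memory stand-in for a CDP client, for illustration only.
class FakeClient
  attr_reader :closed

  def initialize
    @closed = []
  end

  def command(method, **params)
    case method
    when "Target.createTarget"
      { "targetId" => "T1" }
    when "Target.getTargets"
      { "targetInfos" => [{ "targetId" => "T1", "browserContextId" => "DEFAULT" }] }
    when "Target.closeTarget"
      @closed << params[:targetId]
      { "success" => true }
    end
  end
end

client = FakeClient.new
context_id = probe_default_context_id(client)
```

Unlike the set-difference approach, this does not depend on a tab already being open in the default context, at the cost of briefly creating and closing one.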

@ibrahima (Author) commented on Jul 9, 2024:

Hm, this method was also suggested (independently) by one of the Chromium engineers, so it might be fine. I am slightly nervous about the case where no tabs exist in the default context, which presumably could happen, but I think people using browser automation should generally be able to ensure that the prerequisites are met.

One other approach might be to have the Contexts class (optionally) enumerate all existing contexts at startup and create entries for them; any targets that aren't part of the discovered contexts are then probably from the default context. This could potentially get out of sync, but I'm not sure there is really any problem with a Target (in Ruby) being assigned to the incorrect Context object from the Contexts map, since the Target seems to have sufficient information to operate on its own and doesn't really need info from the Context. I think the main purpose of that association is that Context#dispose closes the connections to the Targets in that Context. That's probably not an issue in practice, because if Ferrum is not aware of a Context, you're never going to call #dispose on it.

Edit: The above might be slightly confusing, but this is what I mean: ibrahima@b1b7732
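
A rough sketch of the enumerate-then-fall-back idea from this comment follows. It is not a reproduction of the linked commit: `seed_contexts`, `context_for`, and `DEFAULT_CONTEXT` are hypothetical names, and strings and a symbol stand in for real Context objects.

```ruby
# Placeholder for the object representing the default context.
DEFAULT_CONTEXT = :default_context

# Create one entry per context ID reported at startup
# (e.g. the IDs returned by Target.getBrowserContexts).
def seed_contexts(browser_context_ids)
  browser_context_ids.each_with_object({}) do |id, map|
    map[id] = "context-#{id}" # placeholder for a real Context object
  end
end

# Known IDs map to their own entry; anything else is assumed to belong to
# the default context, which is the assumption the comment calls out.
def context_for(contexts, browser_context_id)
  contexts.fetch(browser_context_id, DEFAULT_CONTEXT)
end

contexts = seed_contexts(%w[ctx-a ctx-b])
known    = context_for(contexts, "ctx-a")
fallback = context_for(contexts, "never-seen")
```

The fallback branch is where the "out of sync" risk lives: a context created after startup would also be unknown and would be treated as the default one.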

def compute_default_context_id
created_contexts = Set.new(@client.command("Target.getBrowserContexts")["browserContextIds"])
targets = @client.command("Target.getTargets")["targetInfos"]
all_contexts = Set.new(targets.map { |target| target["browserContextId"] })
(all_contexts - created_contexts).first
end

def each(&block)
return enum_for(__method__) unless block_given?

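
To make the set-difference logic in compute_default_context_id and its edge case concrete, here is a standalone sketch that takes the two protocol responses as plain arrays instead of calling a client. `default_context_id_from` is a hypothetical name and the IDs are made up.

```ruby
require "set"

# Same idea as compute_default_context_id, with the responses passed in:
# the default context is the one that targets report but that
# Target.getBrowserContexts does not list.
def default_context_id_from(browser_context_ids, target_infos)
  created = Set.new(browser_context_ids)
  seen = Set.new(target_infos.map { |target| target["browserContextId"] })
  (seen - created).first
end

targets = [
  { "targetId" => "T1", "browserContextId" => "CTX-CREATED" },
  { "targetId" => "T2", "browserContextId" => "CTX-DEFAULT" }
]

found = default_context_id_from(["CTX-CREATED"], targets)

# Edge case raised in the discussion: with no tab open in the default
# context there is nothing to subtract from, so the result is nil.
missing = default_context_id_from(["CTX-CREATED"], [targets.first])
```

A nil result would mean the default context gets registered under a nil key, so a caller would probably want to handle that case explicitly.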