Make websites accessible for AI agents 🤖.
Browser Use is the easiest way to connect your AI agents with the browser. If you have used Browser Use for your project, feel free to show it off in our Discord.
With pip:
pip install browser-use
(optional) install playwright:
playwright install
Spin up your agent:
from langchain_openai import ChatOpenAI
from browser_use import Agent
import asyncio

async def main():
    agent = Agent(
        task="Find a one-way flight from Bali to Oman on 12 January 2025 on Google Flights. Return me the cheapest option.",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    result = await agent.run()
    print(result)

if __name__ == "__main__":
    asyncio.run(main())
And don't forget to add your API keys to your .env file:
OPENAI_API_KEY=
ANTHROPIC_API_KEY=
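Before starting an agent, it can help to confirm that the keys are actually visible to the process. The following is a minimal sketch using only the standard library. The helper name `missing_keys` is hypothetical and not part of Browser Use, and the sketch assumes the .env values have already been loaded into the environment (for example by your shell or a loader such as python-dotenv):

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Only the key for the provider you actually use needs to be set.
    missing = missing_keys(["OPENAI_API_KEY"])
    if missing:
        print("Missing API keys:", ", ".join(missing))
```

Passing a plain dictionary as `env` makes the check easy to try out without touching the real environment.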
- Vision + html extraction
- Automatic multi-tab management
- Extract clicked elements XPaths and repeat exact LLM actions
- Add custom actions (e.g. save to file, push to database, notify me, get human input)
- Self-correcting
- Use any LLM supported by LangChain (e.g. gpt4o, gpt4o mini, claude 3.5 sonnet, llama 3.1 405b, etc.)
If you want to add custom actions your agent can take, you can register them like this:
You can use both sync and async functions.
from browser_use.agent.service import Agent
from browser_use.browser.service import Browser
from browser_use.controller.service import Controller

# Initialize controller first
controller = Controller()

@controller.action('Ask user for information')
def ask_human(question: str, display_question: bool) -> str:
    return input(f'\n{question}\nInput: ')
Or define your parameters using Pydantic:
from typing import Optional
from pydantic import BaseModel

class JobDetails(BaseModel):
    title: str
    company: str
    job_link: str
    salary: Optional[str] = None

@controller.action('Save job details which you found on page', param_model=JobDetails, requires_browser=True)
async def save_job(params: JobDetails, browser: Browser):
    print(params)
    # use the browser normally
    page = browser.get_current_page()
    page.go_to(params.job_link)
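Then run your agent:
from langchain_anthropic import ChatAnthropic

# `task` is your task description string; run this inside an async function
model = ChatAnthropic(model_name='claude-3-5-sonnet-20240620', timeout=25, stop=None, temperature=0.3)
agent = Agent(task=task, llm=model, controller=controller)
await agent.run()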
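The decorator-based registration used for custom actions can be illustrated with a small standalone sketch. This is not Browser Use's implementation: `MiniController` and its `call` method are hypothetical stand-ins that only show how one registry can hold both sync and async functions and invoke either kind.

```python
import asyncio
import inspect


class MiniController:
    """Hypothetical stand-in illustrating decorator-based action registration."""

    def __init__(self):
        self.actions = {}  # function name -> (description, function)

    def action(self, description):
        def decorator(func):
            self.actions[func.__name__] = (description, func)
            return func  # leave the original function usable on its own
        return decorator

    async def call(self, name, **kwargs):
        _, func = self.actions[name]
        result = func(**kwargs)
        # Async actions return an awaitable; sync actions return the value directly.
        if inspect.isawaitable(result):
            result = await result
        return result


controller = MiniController()


@controller.action("Add two numbers")
def add(a: int, b: int) -> int:
    return a + b


@controller.action("Shout a message")
async def shout(message: str) -> str:
    return message.upper()


if __name__ == "__main__":
    print(asyncio.run(controller.call("add", a=2, b=3)))
    print(asyncio.run(controller.call("shout", message="done")))
```

The description string stored next to each function is what an LLM-facing layer could present when choosing among actions; how Browser Use itself does that is not shown here.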
To get the entire history of everything the agent has done, use the return value of the run method:
history: list[AgentHistory] = await agent.run()
print(history)
You can configure the browser using the BrowserConfig class. The most important options are:
- headless: Whether to run the browser in headless mode
- keep_open: Whether to keep the browser open after the script finishes
- disable_security: Whether to disable browser security features (very useful if dealing with cross-origin requests like iFrames)
- cookies_file: Path to a cookies file for persistence
- minimum_wait_page_load_time: Minimum time to wait before getting the page state for the LLM input
- wait_for_network_idle_page_load_time: Time to wait for network requests to finish before getting the page state
- maximum_wait_page_load_time: Maximum time to wait for the page to load before proceeding anyway
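The three page-load timing options are easiest to understand together. The sketch below shows one plausible reading of how they interact; it is an assumption based on the descriptions above, not the library's actual logic, and the function name `state_capture_time` and the sample values are purely illustrative (they are not library defaults).

```python
def state_capture_time(last_request_end, minimum_wait, network_idle_wait, maximum_wait):
    """Seconds after navigation at which the page state would be read.

    Assumed reading: wait until the network has been quiet for
    `network_idle_wait` seconds, but never less than `minimum_wait`
    and never more than `maximum_wait`.
    """
    idle_reached = last_request_end + network_idle_wait
    return min(max(idle_reached, minimum_wait), maximum_wait)


if __name__ == "__main__":
    # Fast page: the minimum wait dominates.
    print(state_capture_time(0.0, 0.5, 0.25, 5.0))
    # Typical page: capture once the network has been idle long enough.
    print(state_capture_time(0.25, 0.5, 1.0, 5.0))
    # Page that never settles: the maximum wait caps the delay.
    print(state_capture_time(10.0, 0.5, 1.0, 5.0))
```

Under this reading, raising the minimum protects against reading a half-rendered page, while the maximum keeps a chatty page from stalling the agent.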
For more examples see the examples folder or join the Discord and show off your project.
We collect anonymous usage data to help us understand how the library is being used and to identify potential issues. There is no privacy risk, as no personal information is collected. We collect data with PostHog.
You can opt out of telemetry by setting the ANONYMIZED_TELEMETRY=false environment variable.
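If you prefer to set the variable from Python rather than from your shell or .env file, a minimal sketch looks like this. It assumes the library reads the variable when it is imported or first used, so the assignment is placed before any browser_use import:

```python
import os

# Set this before importing the library; the timing requirement is an
# assumption, since the point at which the variable is read is not stated.
os.environ["ANONYMIZED_TELEMETRY"] = "false"
```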
Contributions are welcome! Feel free to open issues for bugs or feature requests.
- Create a virtual environment and install dependencies:
# To install all dependencies including dev
pip install ".[dev]"
- Add your API keys to the .env file:
cp .env.example .env
or copy the following to your .env file:
OPENAI_API_KEY=
ANTHROPIC_API_KEY=
You can use any LLM supported by LangChain by adding the appropriate environment variables. See langchain models for available options.
To build the package:
hatch build
Feel free to join the Discord for discussions and support.
Made with ❤️ by the Browser-Use team