In this session, we look at how AI agents are evolving. 🤖
Discover how large language models (LLMs) are transitioning from text-only tools to dynamic agents capable of navigating browsers and desktop environments.
We’ll explore two cutting-edge projects:
- Browser Use — enabling AI agents to automate complex web interactions
- Computer Use — allowing models to control desktop applications through GUIs
You’ll also learn key lessons from real-world experimentation: what worked, what didn’t, and how these insights can guide the future of AI agent development.
Timeline
00:00 Intro & Agenda
01:02 Browser Use: Overview and Capabilities
02:28 Browser Use: Practical Use Cases and Demo
09:36 Browser Use: Practical Tips
15:50 Computer Use: Introduction & OpenAI Details
21:35 Computer Use: Challenges & Tips
24:29 Computer Use: Anthropic Demo
26:02 Benchmark Results and Performance Insights
Speakers
Konrad Czarnota
Senior Machine Learning Engineer

Piotr Falkiewicz
Senior Machine Learning Engineer
