OpenAI has launched Operator, an AI agent that performs tasks independently in its own browser. The feature, currently available only to ChatGPT Pro users in the United States, lets users delegate repetitive web-based tasks to the AI, such as booking travel, filling out forms, or ordering groceries.
Operator is one of OpenAI’s first AI agents designed to execute tasks autonomously. Unlike traditional AI tools, which often rely on API integrations, Operator works directly with the graphical user interfaces (GUIs) shown on a screen: it clicks buttons, types, scrolls, and navigates between pages. This extends AI to everyday digital workflows for which no dedicated integration exists.
Technology Behind Operator
The technology powering Operator is based on the new Computer-Using Agent (CUA) model. This model combines the visual capabilities of GPT-4o with advanced reasoning skills developed through reinforcement learning. Operator can “see” what’s displayed on a screen via screenshots and respond with actions mimicking those of a mouse and keyboard.
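The screenshot-in, action-out cycle described above can be pictured as a simple loop: observe the screen, decide on one action, apply it, and repeat. The Python sketch below is only an illustration of that idea, not OpenAI's implementation. Every name in it (Action, FakeBrowser, scripted_policy, run_agent) is hypothetical, the "screenshot" is a small dictionary rather than pixels, and a hard-coded rule stands in for the CUA model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    """One GUI action, expressed the way a mouse or keyboard would perform it."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a browser: one text field and one button."""
    field_text: str = ""
    submitted: bool = False
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this sketch returns a tiny
        # description of the visible state instead.
        return {"field_text": self.field_text, "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        self.log.append(action.kind)
        if action.kind == "type":
            self.field_text += action.text
        elif action.kind == "click":
            self.submitted = True


def scripted_policy(goal: str, screen: dict) -> Action:
    """Placeholder for the model: pick the next action from what is on screen."""
    if screen["submitted"]:
        return Action("done")
    if screen["field_text"] != goal:
        return Action("type", text=goal)
    return Action("click", x=120, y=48)  # made-up button coordinates


def run_agent(goal: str, browser: FakeBrowser,
              policy: Callable[[str, dict], Action], max_steps: int = 10) -> bool:
    """Observe, decide, act; stop when the policy reports completion."""
    for _ in range(max_steps):
        action = policy(goal, browser.screenshot())
        if action.kind == "done":
            return True
        browser.apply(action)
    return False  # step budget exhausted before the task finished


browser = FakeBrowser()
finished = run_agent("tours in Rome", browser, scripted_policy)
```

In this toy run the agent types the query, clicks the button, and then stops once the next "screenshot" shows the form as submitted. The step budget is a design choice of the sketch that keeps a confused policy from looping forever.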
Features and Applications
Operator is designed for a wide range of tasks. Users can, for instance, ask the AI to search for and book a top-rated tour in Rome on TripAdvisor, complete a form, or order specific products. Additionally, workflows can be personalized by adding custom instructions for specific websites, such as airline preferences on Booking.com.
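One way to picture site-specific custom instructions is as a lookup keyed by website. The sketch below is an assumption made for illustration and does not reflect how Operator stores preferences: the names GLOBAL_INSTRUCTIONS, SITE_INSTRUCTIONS, and instructions_for, the idea of a global default list, and the sample instruction texts are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical instructions that would apply on every site.
GLOBAL_INSTRUCTIONS = ["Prefer options with free cancellation."]

# Hypothetical per-site instructions, keyed by hostname without "www.".
SITE_INSTRUCTIONS = {
    "booking.com": ["Preferred airline: Example Air."],
}


def instructions_for(url: str) -> list[str]:
    """Return the global instructions plus any that match the URL's site."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return GLOBAL_INSTRUCTIONS + SITE_INSTRUCTIONS.get(host, [])
```

Calling instructions_for("https://www.booking.com/flights") returns both the global entry and the Booking.com entry, while an unlisted site receives only the global one.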
The AI can also handle multiple tasks simultaneously. For example, users can book a campsite and order a personalized gift at the same time.
For sensitive steps, such as entering payment information or solving CAPTCHAs, Operator pauses and asks the user to take over manually. This handoff is intended to reduce errors and keeps the user in control of the steps that matter most.
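The handoff described above amounts to a gate in front of each step: routine steps go to the agent, while flagged ones go back to the person. The following sketch assumes a fixed set of flagged step names, which is a simplification; SENSITIVE_STEPS, execute_step, and the step labels are hypothetical, and how Operator actually detects such moments is not described here.

```python
from typing import Callable

# Hypothetical labels for steps that must be handed to the user.
SENSITIVE_STEPS = {"enter_payment", "solve_captcha"}


def execute_step(step: str,
                 agent_do: Callable[[str], str],
                 ask_user: Callable[[str], str]) -> str:
    """Route a step to the user if it is flagged, otherwise to the agent."""
    if step in SENSITIVE_STEPS:
        return ask_user(step)
    return agent_do(step)


plan = ["search_product", "add_to_cart", "solve_captcha", "enter_payment"]

# The two callbacks only report who handled the step; a real system would
# drive the browser or prompt the person instead.
handled_by = [
    execute_step(step, lambda s: "agent", lambda s: "user")
    for step in plan
]
```

Here the first two steps are handled by the agent and the last two by the user, mirroring the split between routine browsing and sensitive input.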
Partnerships and Business Applications
To enhance Operator’s effectiveness, OpenAI is collaborating with companies such as DoorDash, Instacart, OpenTable, Priceline, and Uber. These partnerships help Operator better address real-world customer needs while offering businesses innovative ways to improve user experiences. Additionally, OpenAI is exploring public sector applications, such as a collaboration with the city of Stockton to simplify residents’ enrollment in government services.
Future Plans
Operator is currently being rolled out as a research preview and will be improved based on user feedback. OpenAI plans to eventually make the feature available to Plus, Team, and Enterprise users and to integrate its capabilities into ChatGPT. With Operator, OpenAI is moving beyond tools that answer questions toward agents that act on the web on users’ behalf.