Google DeepMind has unveiled the Gemini 2.5 Computer Use model, a specialised model built on Gemini 2.5 Pro that can interact with user interfaces. The model is available in preview via the Gemini API through Google AI Studio and Vertex AI.
The model allows AI agents to complete tasks by interacting directly with graphical interfaces: filling out forms, clicking buttons, scrolling, and operating behind logins.
Merits for Developers
Developers can access the model through the Computer Use tool, which operates in a loop. Each request includes the user's instruction, a screenshot of the environment, and a history of recent actions. The model responds with UI actions, which client-side code then executes. The loop repeats with an updated screenshot and context until the task is completed or terminated.
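The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the Gemini API: the names `Action`, `run_agent_loop`, `propose_action`, `take_screenshot` and `execute` are all hypothetical, and the model call is replaced by a scripted stand-in so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """A UI action proposed by the model (hypothetical shape)."""
    name: str                      # e.g. "click", "type_text", "done"
    args: dict = field(default_factory=dict)


def run_agent_loop(
    request: str,
    propose_action: Callable[[str, bytes, List[Action]], Action],
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Action], None],
    max_steps: int = 10,
    history_size: int = 5,
) -> List[Action]:
    """Repeat screenshot -> model -> execute until done or max_steps."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        # Inputs: the user request, the current screenshot, recent actions.
        action = propose_action(request, screenshot, history[-history_size:])
        if action.name == "done":      # assumed completion signal
            break
        execute(action)                # client-side code performs the action
        history.append(action)
    return history


def demo() -> List[str]:
    # Scripted stand-in for the model: two actions, then completion.
    script = iter([
        Action("click", {"x": 10, "y": 20}),
        Action("type_text", {"text": "hello"}),
        Action("done"),
    ])
    executed: List[Action] = []
    history = run_agent_loop(
        "Fill in the form",
        propose_action=lambda req, shot, hist: next(script),
        take_screenshot=lambda: b"",   # a real client would capture the screen
        execute=executed.append,
    )
    return [a.name for a in history]


if __name__ == "__main__":
    print(demo())
```

In a real client, `propose_action` would wrap the call to the model and `execute` would drive a browser or device. The `max_steps` cap is one simple way to guarantee the loop terminates.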
Gemini 2.5 Performance on Benchmarks
The model has demonstrated strong performance on web and mobile control benchmarks, including Online-Mind2Web, WebVoyager, and AndroidWorld. Google DeepMind emphasised safety, noting that AI agents controlling computers carry risks of misuse, unexpected behaviour, and web-based scams.
The company said it has integrated safety features into the model and provides developers with controls to prevent harmful actions. “Developers can further specify that the agent either refuses or asks for user confirmation before it takes specific kinds of high-stakes actions,” it added.
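The refuse-or-confirm control quoted above could look something like the following on the client side. This is only a sketch under stated assumptions: the `Decision` enum, the `POLICY` table, the action names and the `may_execute` helper are hypothetical and do not reflect how the Gemini API actually exposes this setting.

```python
from enum import Enum
from typing import Callable, Dict


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"   # ask the user before acting
    REFUSE = "refuse"     # never perform the action


# Hypothetical policy: which kinds of action count as high-stakes.
POLICY: Dict[str, Decision] = {
    "submit_payment": Decision.CONFIRM,
    "delete_account": Decision.REFUSE,
}


def may_execute(
    action_name: str,
    ask_user: Callable[[str], bool],
    policy: Dict[str, Decision] = POLICY,
) -> bool:
    """Return True if the client should go ahead with the action."""
    # Actions not listed in the policy are allowed (an assumption of this sketch).
    decision = policy.get(action_name, Decision.ALLOW)
    if decision is Decision.REFUSE:
        return False
    if decision is Decision.CONFIRM:
        return bool(ask_user(f"Allow the agent to perform '{action_name}'?"))
    return True
```

A client would call `may_execute` just before executing each proposed action, passing a function that prompts the user and returns their answer.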