
Hugging Face has launched "Open Computer Agent," a cloud-hosted AI tool that lets users carry out digital tasks inside a Linux virtual machine.
Despite occasional slowness and errors, this tool represents a significant step in the world of smart automation. It empowers AI to interact with a computer similarly to how a human user would.
The concept closely resembles OpenAI's "Operator." A user writes text instructions, and the agent carries them out inside a virtual machine as if it were a real user. For instance, it can open the Firefox browser, navigate to Google Maps, and search for a specific address. It does this by looking at the screen and interacting with what it sees.
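The instruction-to-action cycle described above can be pictured as a simple observe-then-act loop. The following is a minimal sketch only, not Hugging Face's implementation: `FakeDesktop`, `scripted_policy`, and `run_agent` are hypothetical names, the "screenshot" is a text placeholder, and a fixed script stands in for the vision model.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for the virtual machine: records actions instead of performing them."""
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; this placeholder just describes the state.
        return f"screen after {len(self.log)} actions"

    def perform(self, action: str) -> None:
        self.log.append(action)


def scripted_policy(instruction: str, observation: str, step: int):
    """Hypothetical stand-in for a vision model: returns the next action, or None when done."""
    plan = ["open Firefox", "go to Google Maps", f"search for: {instruction}"]
    return plan[step] if step < len(plan) else None


def run_agent(instruction: str, desktop: FakeDesktop, policy, max_steps: int = 10) -> list:
    """Repeat: look at the screen, ask the policy for an action, perform it."""
    for step in range(max_steps):
        observation = desktop.screenshot()
        action = policy(instruction, observation, step)
        if action is None:
            break
        desktop.perform(action)
    return desktop.log


if __name__ == "__main__":
    print(run_agent("221B Baker Street", FakeDesktop(), scripted_policy))
```

In a real system the policy would be a vision model reading an actual screenshot, and `perform` would drive the mouse and keyboard of the virtual machine.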
In a post dated May 6, 2025, m_ric (@AymericRoucher) wrote: "We're launching Computer Use in smolagents! 🥳 -> As vision models become more capable, they become able to power complex agentic workflows. Especially Qwen-VL models, that support built-in grounding, i.e. ability to locate any element in an image by its coordinates, thus to…"
However, its capabilities are still limited. In testing, the tool handled simple tasks well but struggled with complex ones and with CAPTCHA tests, which significantly hinder its performance.
Furthermore, using the tool means waiting in a queue, and the wait can last from a few seconds to several minutes depending on server load.
The goal behind the tool was not to launch a finished product but to give a practical demonstration of how quickly open-source AI models are progressing.
According to Aymeric Roucher, a member of the agents team at Hugging Face, the tool benefits from vision models capable of identifying any element on the screen using coordinates, allowing it to click and directly interact with visual elements within the virtual machine.
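To illustrate how coordinate grounding can turn into a click, here is a small sketch. It assumes, purely for illustration, that the model replies with text of the form `click(x, y)` where both values are fractions of the screen size between 0 and 1. That output format, and the names `Click` and `parse_click`, are hypothetical and not taken from smolagents or Qwen-VL.

```python
import re
from typing import NamedTuple


class Click(NamedTuple):
    x: int  # pixel column
    y: int  # pixel row


def parse_click(model_output: str, width: int, height: int) -> Click:
    """Parse a hypothetical 'click(x, y)' reply with normalized coordinates into pixel coordinates."""
    match = re.search(r"click\(\s*([0-9.]+)\s*,\s*([0-9.]+)\s*\)", model_output)
    if match is None:
        raise ValueError(f"no click action found in: {model_output!r}")
    nx, ny = float(match.group(1)), float(match.group(2))
    if not (0.0 <= nx <= 1.0 and 0.0 <= ny <= 1.0):
        raise ValueError("normalized coordinates must lie between 0 and 1")
    # Scale to pixels and clamp so that 1.0 maps to the last valid pixel.
    x = min(int(nx * width), width - 1)
    y = min(int(ny * height), height - 1)
    return Click(x, y)


if __name__ == "__main__":
    print(parse_click("I will press the search box: click(0.5, 0.25)", 1024, 768))
```

The resulting pixel position would then be handed to whatever controls the mouse inside the virtual machine.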
Despite the technical challenges, interest in "AI agent" technologies is notably increasing.
Data released by KPMG indicates that 65% of companies have already begun experimenting with this type of smart agent to improve productivity and automate repetitive tasks.
A study by MarketsandMarkets projected this sector to grow from $7.84 billion in 2025 to over $52 billion by 2030, reflecting widespread belief in its future potential.
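For a sense of scale, the annual growth rate implied by those two figures can be worked out with the standard compound annual growth rate formula. The sketch below uses $52 billion as the 2030 value; since the projection is "over $52 billion," the result is a lower bound of roughly 46% a year. The function name `cagr` is just an illustrative choice.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: the constant yearly rate that turns start into end."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the projection above, in billions of US dollars.
rate = cagr(7.84, 52.0, 2030 - 2025)

if __name__ == "__main__":
    print(f"Implied annual growth: at least {rate:.1%}")
```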
The launch of "Open Computer Agent" also reflects a clear shift in how AI interacts with the digital world, and it is a promising experiment on the path to open-source tools that can carry out real tasks in a digital environment.
As vision models continue to evolve and infrastructure becomes more efficient, these agents are expected to become more capable and more reliable.