Amazon Nova Act: New AI Agent Challenges OpenAI

Amazon has announced Nova Act, a new artificial intelligence model and digital agent capable of automatically performing tasks within web browsers.

The company positions it as its bid to compete with tools like OpenAI's Operator and Anthropic's Computer Use.

This announcement coincides with the launch of a new website allowing interested users to explore the Nova models the company unveiled last December.

What is Amazon Nova Act?

Nova Act is the first public product from Amazon's Artificial General Intelligence (AGI) lab in San Francisco, led by former OpenAI researchers David Luan and Pieter Abbeel.

The lab aims to develop AI agents that can execute complex tasks in the browser without relying on traditional APIs, giving users a smoother and more efficient experience.

Nova Act allows users to build agents capable of performing step-by-step tasks, such as booking appointments and completing recurring orders, even handling elements challenging for other systems, like dropdown menus or pop-up dialog boxes.

The model is currently available as a Python software development kit (SDK). It enables the creation of agents that execute natural language instructions and can run in the background without displaying a browser window (headless mode), which suits automated, unattended commercial workloads.
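The step-by-step pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Nova Act SDK's interface, which this article does not describe: `run_steps`, `StepResult`, and `fake_act` are hypothetical names, and `fake_act` is a stand-in for whatever call would actually drive the browser.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    instruction: str  # the natural language instruction that was attempted
    ok: bool          # whether the executor reported success


def run_steps(act: Callable[[str], bool], steps: List[str]) -> List[StepResult]:
    """Run instructions in order and stop at the first failure."""
    results: List[StepResult] = []
    for instruction in steps:
        ok = act(instruction)
        results.append(StepResult(instruction, ok))
        if not ok:
            break  # later steps depend on earlier ones, so do not continue
    return results


def fake_act(instruction: str) -> bool:
    # Hypothetical stand-in for a real browser agent: it pretends that
    # every step succeeds unless the instruction mentions "pay".
    return "pay" not in instruction.lower()


steps = [
    "Open the booking page",
    "Select the 10:00 slot from the dropdown",
    "Pay for the appointment",
    "Confirm the booking",
]

results = run_steps(fake_act, steps)
for r in results:
    print(("ok    " if r.ok else "FAILED"), r.instruction)
```

Splitting a task into short instructions and checking each result before moving on keeps a failure local to one step, which matters given the reliability concerns that long autonomous runs raise.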

Amazon describes this initial release as a "Research Preview," emphasizing that the product is at an early stage and that the release is intended to gather feedback before a wider rollout.

Comparing Nova Act with Competitors

Amazon's internal tests suggest Nova Act outperforms OpenAI and Anthropic agents in tasks like interacting with on-screen text.

Specifically, it achieved 94% accuracy on the ScreenSpot Web Text benchmark compared to 88% for OpenAI's model and 90% for Anthropic's Claude 3.7 Sonnet model.

However, Amazon has not subjected its agent to more widely recognized benchmark tests like WebVoyager. This raises questions about the results' comparability against common industry standards.

Through Nova Act, Amazon also aims to enhance the capabilities of Alexa+, the upcoming version of its generative AI-powered voice assistant.

With this move, the company is betting on integrating AI agents into its consumer products, which could give it a competitive edge considering the widespread adoption of Alexa devices.

Accessing Nova Models and Nova Act

Amazon has made its latest generative AI models accessible via its new portal: nova.amazon.com.

The company noted that access to the new model and the Software Development Kit (SDK) download is currently limited to US-based customers with an Amazon account.

However, the platform isn't limited to just Nova Act. It offers users worldwide the chance to explore and experiment with other foundational models in the Nova family directly through the website's interface.

The models available for experimentation include:

  • Nova Pro and Nova Lite: Multimodal models capable of accepting text, image, or even video inputs to produce text outputs.

Example of text generation using Amazon Nova Pro. Credit: Arab AI

  • Nova Micro: Primarily focused on processing text inputs and generating text outputs.

Furthermore, the site allows users to generate images directly using the Nova Canvas model. Users can also explore a video gallery showcasing the capabilities of the Nova Reel model (which is fully available within the Amazon Bedrock service).

Example of image generation using Amazon Nova Canvas. Credit: Arab AI

This broad access allows interested individuals to get a practical sense of the capabilities of Amazon's new generation of language and multimodal models.

Can Amazon Overcome the Challenges?

As competition intensifies in the AI agent space, Amazon faces challenges related to reliability and execution speed.

Previous experiences with systems from OpenAI, Google, and Anthropic indicate that these sophisticated models can still make errors when operating autonomously for extended periods.

Therefore, Amazon will need to demonstrate that Nova Act can effectively overcome these hurdles to become a truly practical and reliable tool in the market.

In conclusion, with this launch, Amazon positions itself as a major player in the AI agent field. It leverages its significant investments in foundational AI models and its Bedrock platform, which provides access to a diverse range of models.

As development in this rapidly evolving area continues, real-world testing and adoption will ultimately determine whether Nova Act represents a significant leap forward or simply another step in the ongoing AI race.
