
Chinese startup MiniMax, known for its realistic video generation model Hailuo, has just unveiled its latest large language model, MiniMax-M1.
The release is welcome news for enterprises and developers alike: the model is fully open source under the Apache 2.0 license, so companies can use it in commercial applications and modify it without licensing fees or copyleft obligations.
MiniMax-M1: Features and Availability
MiniMax positions M1 as an open-weight model that sets new standards in long-context reasoning, effective tool use, and computational efficiency.
The model is now available on the AI code-sharing platform Hugging Face and on GitHub, kicking off what the company has dubbed "MiniMaxWeek" on its X account, with more product announcements expected.
A standout feature of MiniMax-M1 is its context window: it accepts up to one million input tokens and can generate up to 80,000 output tokens, placing it among the most capable models for reasoning tasks that require extensive context.
For comparison, OpenAI's GPT-4o handles just 128,000 tokens, while Google's Gemini 1.5 Pro offers a 1-million-token limit, with reports of a 2-million-token version in development.
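To make those numbers concrete, here is a minimal sketch of checking whether a batch of documents fits within M1's advertised input window using the model's tokenizer. The Hugging Face repo id "MiniMaxAI/MiniMax-M1-80k" is an assumption; check the model card for the exact name.

```python
# Hedged sketch: estimate whether a document batch fits M1's advertised
# 1M-token input window. The repo id below is an assumption; verify it
# against the Hugging Face model card before use.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "MiniMaxAI/MiniMax-M1-80k", trust_remote_code=True
)

documents = ["...contract text...", "...appendix..."]  # placeholder inputs
total_tokens = sum(len(tokenizer.encode(doc)) for doc in documents)

MAX_INPUT_TOKENS = 1_000_000  # advertised input limit
MAX_OUTPUT_TOKENS = 80_000    # advertised output budget (M1-80k variant)
print(f"{total_tokens} input tokens; fits: {total_tokens <= MAX_INPUT_TOKENS}")
```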
But M1's advantages don't stop there. It was trained with an unusually efficient reinforcement learning recipe, and it employs a hybrid Mixture-of-Experts (MoE) architecture with a lightning attention mechanism designed to cut inference costs. According to the technical report, M1 consumes only 25% of the floating-point operations (FLOPs) that DeepSeek-R1 requires when generating a 100,000-token output.
Architecture, Versions, and a Surprising Cost
The model comes in two versions, "MiniMax-M1-80k" and "MiniMax-M1-40k," whose names refer to their "thinking budgets": maximum output lengths of 80,000 and 40,000 tokens, respectively.
Its architecture is based on the company's previous model, MiniMax Text-01, and features 456 billion total parameters, with 45.9 billion parameters activated per token.
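A quick back-of-envelope calculation from those reported figures shows why the MoE design keeps inference affordable: only about a tenth of the weights participate in processing any single token.

```python
# Back-of-envelope activation ratio from the figures reported above.
total_params = 456e9    # total parameters
active_params = 45.9e9  # parameters activated per token
print(f"~{active_params / total_params:.1%} of weights active per token")  # ~10.1%
```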
One of the most striking aspects of this release is the training cost. MiniMax reports that M1 was trained with large-scale reinforcement learning at rare efficiency, for a total of just $534,700. The company attributes this to a custom RL algorithm called CISPO, which clips importance-sampling weights rather than token updates, and to the hybrid attention design, which makes scaling more tractable.
Such a figure is astonishingly low for an advanced large language model. For perspective, the training cost for the DeepSeek-R1 model was reportedly between $5 million and $6 million, while training OpenAI's GPT-4—a model now over two years old—exceeded $100 million.
MiniMax-M1's Performance and Competitive Edge
The model was evaluated across a range of established benchmarks testing advanced reasoning, software engineering, and tool use. On the AIME 2024 math benchmark, the M1-80k model achieved 86.0% accuracy.
It also delivered a strong performance in coding and long-context tasks, outperforming open-weight competitors like DeepSeek-R1 and Qwen3-235B-A22B on several complex tasks.
While closed-source models like OpenAI's o3 and Gemini 1.5 Pro still lead on some benchmarks, MiniMax-M1 significantly closes the performance gap while remaining freely available under the permissive Apache 2.0 license.
Deployment Options and Impact for Decision-Makers
MiniMax recommends vLLM as the serving backend, citing its optimization for large-model workloads, memory efficiency, and batched request handling. The company also provides deployment options through the Transformers library.
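As one illustration, here is a minimal vLLM serving sketch. The repo id, GPU count, and context cap below are assumptions for illustration; the model card's recommended settings should take precedence.

```python
# Minimal sketch of offline serving with vLLM. The repo id and the resource
# settings are assumptions; a 456B-parameter MoE model needs multi-GPU
# hardware sized to your deployment.
from vllm import LLM, SamplingParams

llm = LLM(
    model="MiniMaxAI/MiniMax-M1-80k",  # assumed Hugging Face repo id
    trust_remote_code=True,            # custom architectures often ship their own code
    tensor_parallel_size=8,            # assumed GPU count; adjust to your cluster
    max_model_len=128_000,             # cap context below the 1M max to fit memory
)

params = SamplingParams(temperature=1.0, top_p=0.95, max_tokens=4_096)
outputs = llm.generate(
    ["Summarize the key risks in the following contract: ..."], params
)
print(outputs[0].outputs[0].text)
```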
MiniMax-M1 supports structured function calling, and the company also offers a chatbot API that bundles web search, video and image generation, speech synthesis, and voice cloning.
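To illustrate what structured function calling could look like in practice, here is a hedged sketch using an OpenAI-compatible client. The base URL, model identifier, and tool schema are all illustrative assumptions; consult MiniMax's API documentation for the actual values.

```python
# Hedged sketch of structured function calling via an OpenAI-compatible
# client. The base URL, model name, and tool definition are illustrative
# assumptions, not confirmed MiniMax API values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.minimax.io/v1",  # assumed endpoint
    api_key="YOUR_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_invoice_total",  # hypothetical tool for illustration
        "description": "Look up the total amount of an invoice by id.",
        "parameters": {
            "type": "object",
            "properties": {"invoice_id": {"type": "string"}},
            "required": ["invoice_id"],
        },
    },
}]

resp = client.chat.completions.create(
    model="MiniMax-M1",  # assumed model identifier
    messages=[{"role": "user", "content": "What is the total on invoice INV-1042?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```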
For engineering leaders managing the LLM lifecycle, the model offers lower operational costs while supporting advanced reasoning tasks. Its long context window can dramatically reduce preprocessing efforts for enterprise documents or log data spanning tens or hundreds of thousands of tokens.
For AI orchestration managers, the ability to fine-tune and deploy MiniMax-M1 using established tools facilitates smooth integration with existing infrastructure.
From a data platform perspective, teams responsible for maintaining efficient and scalable infrastructure can leverage M1's support for structured function calling and its compatibility with automated pipelines.
Finally, security leaders may find value in evaluating the model's potential for a secure, on-premise deployment of a highly capable model that doesn't rely on sending sensitive data to third-party endpoints.
In conclusion, MiniMax-M1 is a flexible and powerful option for organizations that want to experiment with or scale advanced AI capabilities while controlling costs and avoiding vendor lock-in.