Google Officially Launches Gemini 2.5 Flash Preview: Its First Hybrid Model with Controlled Thinking

Google has officially launched the preview version of its Gemini 2.5 Flash model within the Gemini app and developer platforms such as Google AI Studio and Vertex AI.

The company had announced the model earlier this month, before launching its initial preview version on Thursday under the name Gemini 2.5 Flash Preview.

Balancing Quality, Cost, and Speed

The model is designed to give developers precise control over the level of analysis they require, depending on the specific task and available budget.

Gemini 2.5 Flash is Google's first model to adopt a fully hybrid thinking approach, similar to what Anthropic introduced with Claude 3.7 Sonnet.

Building on this, developers can flexibly enable or disable the thinking capability, as well as adjust the "thinking budget" based on specific task requirements.
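As a rough illustration of that control, the sketch below builds a request body that carries a thinking budget. The field names (generationConfig, thinkingConfig, thinkingBudget) follow the Gemini API's REST request shape as commonly documented, and treating a budget of 0 as "thinking off" is an assumption to verify against Google's current documentation. The helper name build_request_body is hypothetical, and nothing is sent over the network.

```python
import json


def build_request_body(prompt: str, thinking_budget: int) -> dict:
    """Build a generateContent-style request body with a thinking budget.

    Assumption: a budget of 0 disables thinking, and larger values allow
    the model to spend more tokens reasoning before it answers.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # Same prompt, two settings: thinking off vs. a modest budget.
    fast = build_request_body("Summarize this paragraph.", thinking_budget=0)
    careful = build_request_body("Summarize this paragraph.", thinking_budget=1024)
    print(json.dumps(fast, indent=2))
    print(json.dumps(careful, indent=2))
```

Keeping the budget as a single integer parameter makes it easy to run the same prompt at several settings and compare quality against cost.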

Even when the thinking capability is disabled, the model remains faster than its predecessors and delivers improved results.

Google positions this release as an upgrade over the previous 2.0 Flash model, delivering significantly higher reasoning performance while maintaining fast processing speeds and low costs.

Table comparing Gemini 2.5 Flash Preview on pricing and performance across various benchmarks with competing models such as Claude 3.7 Sonnet and OpenAI o4-mini, showing it ahead on some tasks.

Comparison of Gemini 2.5 Flash performance and pricing against competing AI models. Source: Google

Google also revealed the API pricing details for the model. Input costs are set at $0.15 per million tokens, while output costs vary depending on whether the thinking capability is enabled.

With thinking disabled, the output cost is $0.60 per million tokens. However, this increases significantly to $3.50 per million tokens when thinking is activated, reflecting the higher cost associated with using the advanced reasoning features.
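To make the difference concrete, here is a minimal cost estimate using the per-million-token prices quoted above. It assumes that, with thinking enabled, every output token is billed at the $3.50 rate; the function and constant names are illustrative rather than part of any Google SDK.

```python
# Prices in USD per one million tokens, as quoted in the article.
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M_NO_THINKING = 0.60
OUTPUT_USD_PER_M_THINKING = 3.50


def estimate_cost_usd(input_tokens: int, output_tokens: int, thinking: bool) -> float:
    """Estimate the API cost of one workload in USD.

    Assumption: when thinking is on, all output tokens are billed at the
    thinking rate; input tokens cost the same either way.
    """
    output_rate = OUTPUT_USD_PER_M_THINKING if thinking else OUTPUT_USD_PER_M_NO_THINKING
    total = input_tokens * INPUT_USD_PER_M + output_tokens * output_rate
    return total / 1_000_000


if __name__ == "__main__":
    # One million tokens in, one million tokens out.
    for thinking in (False, True):
        cost = estimate_cost_usd(1_000_000, 1_000_000, thinking)
        print(f"thinking={thinking}: ${cost:.2f}")
```

For a workload of one million input tokens and one million output tokens, this works out to $0.75 with thinking disabled and $3.65 with it enabled.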

Tulsi Doshi, Google's Director of Product Management, said the tiered pricing lets developers experiment with how the thinking budget affects model accuracy. She noted a clear improvement in benchmark results as more tokens are allocated to thinking.

Additionally, the model supports "dynamic thinking," a feature allowing it to automatically adjust its analysis level based on input complexity.

Although this concept is still experimental, Google aims to use these preview releases to gather user feedback and refine control over this feature going forward.

The new model will replace the experimental Gemini 2.0 Thinking version within the app. Meanwhile, Google continues to offer the Gemini 2.5 Pro model, although it also remains in preview.

Gemini 2.5 Flash has also been added to the Google AI Studio interface, where developers can experiment with it for free.

Screenshot of the Google AI Studio interface showing the Gemini 2.5 Flash Preview model (dated 04-17) selected, with a knowledge cutoff of January 2025 noted in the model settings.

Notably, Gemini 2.5 Flash is also now available to users within the Gemini app. It supports the Canvas feature, enabling interactive text editing and coding, with broader support for advanced research expected in the future.