Crypto News

Google’s Gemini 2.5 Pro Tops Coding Charts and MENSA Tests in AI ‘IQ’ Battle

by admin
May 8, 2025
in Analysis

In brief

  • Google’s new Gemini 2.5 Pro tops the WebDev Arena leaderboard, outperforming competitors like Claude in coding tasks, making it a standout choice for developers seeking superior coding capabilities.
  • The AI model also features a 1 million token context window (expandable to 2 million), enabling it to handle large codebases and complex projects far beyond the capacity of models like ChatGPT and Claude 3.7 Sonnet.
  • It also achieved the highest scores on reasoning benchmarks, including a MENSA IQ test and Humanity’s Last Exam, demonstrating advanced problem-solving skills essential for sophisticated development tasks.

Google’s recently launched Gemini 2.5 Pro has risen to the top spot on coding leaderboards, beating Claude in WebDev Arena, an independent ranking site akin to LM Arena but focused specifically on measuring how well AI models code. The achievement comes amid Google’s push to position its flagship AI model as a leader in both coding and reasoning tasks.

Released earlier this year, Gemini 2.5 Pro ranks first across several categories, including coding, style control, and creative writing. The model’s massive context window (one million tokens, with two million promised soon) allows it to handle large codebases and complex projects that would choke even its closest competitors. For context, OpenAI’s GPT-4o, the model behind ChatGPT, handles up to 128K tokens, and Claude 3.7 Sonnet up to 200K.
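
To make the context-window comparison concrete, here is a minimal Python sketch of how a developer might check whether a body of code fits in a given window. It assumes the common rule of thumb of about four characters per token; that ratio, the function names, and the constants are illustrative assumptions, not values from Google’s tokenizer, so real counts will differ.

```python
# Rough check of whether a body of text fits in a model's context window.
# ASSUMPTION: ~4 characters per token is a rule of thumb, not the actual
# Gemini tokenizer; real token counts vary with language and content.

CHARS_PER_TOKEN = 4  # heuristic only

# Window sizes as cited in the article (in tokens).
ONE_MILLION_WINDOW = 1_000_000
SMALLER_WINDOW = 128_000


def estimate_tokens(text: str) -> int:
    """Estimate the token count of text, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, context_window: int, reserved_for_output: int = 0) -> bool:
    """Return True if the estimated prompt plus reserved output tokens fit."""
    return estimate_tokens(text) + reserved_for_output <= context_window


if __name__ == "__main__":
    # Hypothetical codebase of 2,000,000 characters (about 500,000 tokens).
    codebase = "x" * 2_000_000
    print(estimate_tokens(codebase))
    print(fits_in_context(codebase, ONE_MILLION_WINDOW))
    print(fits_in_context(codebase, SMALLER_WINDOW))
```

Under these assumptions, a two-million-character codebase fits comfortably in a one-million-token window but not in a 128K one. For real projects, a provider’s own token-counting endpoint would give a more reliable figure.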

Gemini also posts the highest “IQ” of any AI model tested. TrackingAI put it through Mensa-style tests, using verbalized questions from Mensa Norway to create a standardized way to compare AI models.

Gemini 2.5 Pro scored higher than competitors on these tests, even on bespoke questions that are not publicly available and so are unlikely to appear in training data.

With an IQ score of 115 in offline tests, the new Gemini ranks among the “bright minded”; average human intelligence falls between roughly 85 and 114 points. But the notion of an AI having an IQ needs unpacking. AI systems don’t have intelligence quotients the way humans do, so it’s better to treat the score as a metaphor for performance on reasoning tests.

On benchmarks designed specifically for AI, Gemini 2.5 Pro scored 86.7% on the AIME 2025 math test and 84.0% on the GPQA science assessment. On Humanity’s Last Exam (HLE), a newer and harder benchmark created to avoid test saturation, Gemini 2.5 Pro scored 18.8%, beating OpenAI’s o3-mini (14%) and Claude 3.7 Sonnet (8.9%) by a notable margin.

The new version of Gemini 2.5 Pro is now available for free (with rate limits) to all Gemini users. Google previously described this release as an “experimental version of 2.5 Pro,” part of its family of “thinking models” designed to reason through responses rather than simply generate text.

Despite not winning every benchmark, Gemini has caught developers’ attention with its versatility. The model can create complex applications from single prompts, building interactive web apps, endless runner games, and visual simulations without requiring detailed instructions.

We tested the model by asking it to fix broken HTML5 code. It generated almost 1,000 lines of code, and the result beat Claude 3.7 Sonnet (the previous leader) in both quality and adherence to the full set of instructions.

For working developers, Gemini 2.5 Pro costs $2.50 per million input tokens and $15.00 per million output tokens, making it cheaper than some competitors while still offering impressive capabilities.
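
As a quick illustration of what those rates mean in practice, the sketch below estimates the cost of a single request in Python. The two rates are the figures quoted above; the function name and the example token counts are hypothetical, and actual billing may differ by pricing tier or change over time.

```python
# Estimate the cost of one request at the per-million-token rates quoted
# in the article. ASSUMPTION: flat rates with no tiers, caching, or discounts.

INPUT_USD_PER_MILLION = 2.50    # input rate quoted in the article
OUTPUT_USD_PER_MILLION = 15.00  # output rate quoted in the article


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one request."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 200,000 tokens of code in, 10,000 tokens out.
    cost = estimate_cost_usd(200_000, 10_000)
    print(f"${cost:.2f}")  # prints "$0.65"
```

Because output tokens cost six times as much as input tokens at these rates, long generated responses can dominate the bill even when the prompt is large.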

The model handles up to 30,000 lines of code on the Gemini Advanced plan, making it suitable for enterprise-level projects. Its multimodal abilities (working with text, code, audio, images, and video) add flexibility that other coding-focused models can’t match.

© 2025 Btc04.com
