GritAI Newsletter: New Study Suggests AI Agents Are Moving Faster Than Moore's Law 📈


Hey 👋

This week's main story was the Nvidia GTC conference where Jensen Huang revealed the Blackwell Ultra GB300 AI chip shipping this year, Vera Rubin (coming 2026), personal AI supercomputers (DGX Spark and DGX Station), and GR00T N1, an open-source humanoid robot foundation model (and more..).

"AI is becoming the most valuable asset we have—it can be applied to solve some of the world’s most challenging problems.” - Jensen Huang

Another story that caught a lot of attention was a new study by METR that suggests AI agent capabilities are doubling every three to seven months, outpacing previous expectations. By comparing AI performance on 170 real-world tasks to human experts, the study found that while GPT-2 could handle only simple, seconds-long tasks, newer models like Claude 3.7 Sonnet can now complete hour-long tasks with 50% accuracy.

But that wasn’t the only AI news this week. Let’s dive in!

AI Pulse 🚨

A bite-sized curation of this week's most important AI news.

🚀 NVIDIA GTC 2025 - Jensen Huang revealed Blackwell Ultra GB300 chip, Vera Rubin (coming 2026), personal AI supercomputers (DGX Spark and DGX Station), and GR00T N1, an open-source humanoid robot foundation model.

🎯 OpenAI launches o1-pro API - Their most expensive model offers better reasoning with 200K context, starting at $150/1M input tokens—10x more than regular o1 🤯.

🌐 Claude 3.7 now includes web search — Anthropic adds real-time search to Claude with source citations. Unfortunately, it’s only available for US users for now.

🌐 Chinese Baidu launches ERNIE 4.5 and X1 — New models outperform competitors at much lower cost, with ERNIE Bot now free for all users.

🚀 Claude 3.7 achieves 50% success on long tasks - METR research shows that AI task length doubles every 7 months, with Claude 3.7 now completing 1-hour tasks reliably. At this rate, AI could handle week-long tasks by 2027-2028.

🔬 Google’s AI solves 10-year superbug mystery in 48 hours - AI co-scientist independently proposed the correct hypothesis for antibiotic resistance mechanisms, matching researchers’ decade-long findings.

🎯 Mistral Small 3.1 outperforms Gemma 3 and GPT-4o Mini - Mistral AI’s new 24B parameter model features improved text performance, multimodal understanding, and a 128k token context window.

🎨 Google adds Canvas and Audio Overview to Gemini - New features include an interactive coding editor with real-time previews and AI-generated podcast-style discussions from documents.

🗣️ Sesame releases CSM-1B speech model — An open-source version of its near-human speech model that shocked everyone a few weeks ago with emotional intelligence.

💪 OpenAI building massive data center - First facility for $100B Project Stargate will house 400,000 NVIDIA AI chips, becoming one of the biggest AI computing clusters when completed in 2026.

⚠️ Anthropic’s Red Team warns of AI security risks - AI models are approaching expert-level knowledge in cybersecurity and biology, requiring stronger safeguards as capabilities advance.

Adobe launches 10 AI agents for content creation - New agents help marketers scale content, find target audiences, and automate repetitive tasks.

🤔 Fei-Fei Li warns of AI risks - “Godmother of AI” co-led policy report calling for regulation that considers future risks and requires tech companies to disclose safety tests.

🤖 Google DeepMind showcases Gemini Robotics - Multimodal models enable robots to interact with the real world using advanced spatial understanding.

🧠 Google adds Mind Maps in NotebookLM - Interactive visual summaries help explore connections between ideas and provide tailored learning tools.

💻 Codeium launches “Windsurf Tab - AI code prediction feature integrates terminal and clipboard for better context-aware suggestions.

Trending on Product Hunt 📱

Here are the hottest AI tools trending on Product Hunt this week

  • 🤖 Aha - The world’s first “AI influencer marketing team”
  • 🔍 Sider 5.0 - Deep Research mimics human research
  • Twos PALs - AI-powered lists that save you time
  • 🗣️ Epiphany - Turn voice notes into actions
  • 📝 timeOS 3.0 - Ridiculously good meeting notes for macOS
  • 🎨 Gemini Canvas - Interact and collaborate with AI to create
  • 🎧 Elevenreader by ElevenLabs - Elevate your listening experience

These ChatGPT Features Save Me Hours

"ChatGPT can be your ultimate tool for work, learning, and creativity—if you know how to use it the right way. I've spent hundreds of hours testing every feature so you don't have to."

Click to watch the full tutorial 👇

Did someone forward this to you?

Click here to sign up to our newsletter​

Have questions? Hit reply to this email and we'll help out!

113 Cherry St #92768, Seattle, WA 98104-2205
Unsubscribe · Preferences

GritAI Newsletter

Weekly AI news and content on how to make AI work for you 💪

Read more from GritAI Newsletter

Hey 👋 This week, the "vibe coding" hype showed no signs of slowing down. Google rolled out an updated version of Gemini 2.5 Pro—its most capable model yet for coding apps—ranking first on the WebDev Arena Leaderboard. Rumors also surfaced that OpenAI is acquiring Windsurf, a leading AI coding tool, hinting at deeper integration of native code capabilities inside ChatGPT. At the same time, Cursor, now arguably the most popular AI coding tool, announced a massive funding round—reaching a $9B...

Hey 👋 This week Meta released a new AI assistant powered by Llama 4 that uses your Facebook, Instagram, and WhatsApp data to give more personalized responses. With nearly 1 billion people already using Meta AI each month across its apps, Meta is a strong contender in the AI consumer market competing with ChatGPT and Google Gemini — that is, if users are willing to share their social media data with AI. OpenAI had to remove their latest GPT-4o update because users found the AI was being too...

Hey 👋 This week, OpenAI made its new image model available for developers “gpt-image-1”, unlocking thousands of exciting startup use cases. Big names like Adobe, Figma, and others have already integrated it into their offerings, and we’re expecting a wave of new apps to build on top of these image generation capabilities. Elon Musk’s xAI also made moves: the Grok 3 family of models is now available via API, priced competitively compared to other reasoning models. On top of that, the Grok...