GPT-4.5 Passes the Turing Test | "When prompted to adopt a humanlike persona, GPT-4.5 was judged to be the human 73% of the time: significantly more often than interrogators selected the real human participant."
"Large Language Models Pass the Turing Test", Jones and Bergen 2025
Gemini 2.5: Our newest Gemini model with thinking
DeepSeek-V3-0324
OpenAI’s First Stargate Site to Hold Up to 400,000 Nvidia Chips
Waymo’s self-driving cars headed to San Jose and SFO
QwQ-32B: Embracing the Power of Reinforcement Learning
Waymo is now available exclusively on Uber in Austin
GPT-4.5 compared to Grok 3 base
DeepSeek rushes to launch new AI model as China goes all in
Grok 3 Benchmarks
First Grok 3 Benchmarks
"Competitive Programming with Large Reasoning Models", El-Kishky et al 2025
Trading Inference-Time Compute for Adversarial Robustness
Announcing The Stargate Project
It’s been a rough year for robotaxis — but not for Waymo
OpenAI o3 Breakthrough High Score on ARC-AGI-Pub
How Google turned Jaguars into self-driving taxis, but General Motors gave up
ARC Prize 2024
"Mastering Board Games by External and Internal Planning with Language Models", Schultz et al 2024 (Google DeepMind)
Elon Musk's xAI Memphis Supercomputer Eyes Expansion to 1 Million GPUs
Predicting Emergent Capabilities by Finetuning
Uber and Lyft drivers say Waymo's robotaxis are hurting their earnings in Phoenix and LA
OK, I can partly explain the LLM chess weirdness now
The Surprising Effectiveness of Test-Time Training for Abstract Reasoning
Waymo’s robotaxi depot is still honking its San Francisco neighbors awake