Good morning, Yahoo haters.

Google may have announced AI’s next big step: a model that starts to understand how the real world works.

Source: Google’s Annual I/O

Today we’re breaking down why this could be AI’s next chapter, where models stop just predicting words and start learning how reality works.

Google’s AI event looked like a miss at first.

There was no new frontier model, no obvious OpenAI killer, and no clear sign that Gemini had jumped ahead of Claude.

Gemini 3.5 Flash did not beat Claude or GPT.

Source: Google

So if you were watching for a better chatbot, Google I/O probably felt underwhelming.

But that was the wrong thing to watch.

The real announcement was bigger: Google is trying to make AI understand how the world actually works.

That announcement was Gemini Omni.

Omni combines a language model (LLM) with something called a world model, which is basically an AI system that tries to understand how things behave in real life.

  • An LLM learns from words (reading) → it knows how people describe the world

  • A world model learns from watching things happen (living) → it understands how the world behaves

That is the shift: AI starts moving from reading about reality to understanding how reality works.

The real world runs on movement, physics, sound, objects, and consequences, and video captures all of that in a way words never could.

Being told how something works is not the same as experiencing it yourself.

The next phase is physical AI: AI that has to understand real objects, real space, and real consequences.

A robot has to understand how hard to grab a cup, what happens when it tilts, and where it lands if it drops.

That same challenge shows up in:

  • Smart glasses

  • Cars

  • Factories

  • Drones

In short, any machine that has to operate in the real world.

World models could become training grounds where robots practice inside simulations before touching the real world.

It's kind of like how Tesla uses camera data to train FSD, its driver-assistance software.

Source: Tesla

That makes world models an early bridge between digital AI and physical AI.

And this is where Google gets interesting: it has distribution.

Gemini, YouTube, Android, Search, and Workspace give Google a way to put this technology in front of billions of people.

Source: Google

So if world models become the next major AI interface, Google may not need to win by having the smartest chatbot.

It can win by making the next version of AI impossible to avoid.

LLMs taught AI to understand language → world models may teach AI to understand reality.

Where will world models matter most first?

That’s it for today!
