What Will Be AI 2.0?
Will it be agentic AI? Or something else?
Happy Labor Day weekend and welcome to Investing in AI. I’m Rob May, CEO at Neurometric, and last week we finally announced what we are building. So you should go check it out.
When you look at the history of technology, some trends, like the web, have been divided into phases. We’ve all heard of Web 1.0, Web 2.0, and Web3. My question today is - are we at a market and technology inflection point in AI that will lead to the emergence of AI 2.0 on the other side? If so, what will that be?
We could demarcate the AI boundaries in a few ways. One would be technology, like pre- and post-Transformer. Another would be use cases, like pre- and post-agentic. And I think agentic AI is a candidate for AI 2.0, if it takes off. But I also think agents are going to end up much like Slackbots - not living up to the hype in the next year or two, and fading out as a result.
My vote for the demarcating line between AI 1.0 and AI 2.0 is new types of models that bring new types of capabilities. In particular, I am bullish on “world models”. If you don’t know what they are, I asked Claude to explain them below.
World models represent one of the most promising approaches to building more intelligent and adaptable AI systems. At their core, world models are internal representations that allow AI systems to understand and predict how their environment works. Rather than simply reacting to inputs, these systems build a mental model of the world that enables them to simulate outcomes, plan actions, and reason about cause and effect.
Traditional AI systems often operate reactively—they see an input and produce an output without truly understanding the underlying dynamics of their environment. World models change this paradigm by enabling AI to learn the rules and patterns governing their domain. For instance, a robot with a world model doesn't just learn that pushing an object makes it move; it learns the physics of how objects interact, allowing it to predict novel scenarios it hasn't directly experienced.
The architecture of world models typically involves learning a compressed representation of the environment's state, along with transition functions that predict how states evolve over time. This learned representation serves as a simulation engine that the AI can use for planning. By imagining different action sequences and their likely outcomes, the system can choose optimal behaviors without costly trial-and-error in the real world. This approach has proven particularly valuable in robotics and game-playing AI, where physical experimentation or extensive real-world training would be impractical or dangerous.
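To make that concrete, here is a toy sketch of the planning loop described above - a latent transition function used as a simulation engine, with a simple random-shooting planner that imagines action sequences before acting. All names and the hard-coded dynamics are illustrative assumptions, not any specific system; a real world model would learn `transition` as a neural network.

```python
import numpy as np

# Toy latent dynamics standing in for a learned world model:
# the agent moves along a line, and the goal state is z = 5.
def transition(z, a):
    return z + a  # in practice, a learned neural transition function

def reward(z):
    return -abs(5.0 - z)  # closer to the goal is better

def plan(z0, horizon=4, n_candidates=200, rng=None):
    """Random-shooting planner: imagine many candidate action
    sequences inside the model, score each imagined trajectory,
    and return the first action of the best one."""
    rng = rng or np.random.default_rng(0)
    best_seq, best_ret = None, -np.inf
    for _ in range(n_candidates):
        seq = rng.uniform(-1, 1, size=horizon)  # candidate actions
        z, ret = z0, 0.0
        for a in seq:  # roll out in imagination, not in the real world
            z = transition(z, a)
            ret += reward(z)
        if ret > best_ret:
            best_ret, best_seq = ret, seq
    return best_seq[0]

first_action = plan(z0=0.0)
print(first_action)
```

Because all the trial-and-error happens inside the imagined rollouts, the agent only ever executes the single best first action in the real environment - which is exactly why this approach matters where real-world experimentation is expensive or dangerous.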
Yann LeCun's Joint Embedding Predictive Architecture (JEPA) represents a significant evolution in world model design. Unlike traditional predictive models that attempt to reconstruct full sensory inputs pixel-by-pixel, JEPA learns abstract representations of the world in a latent space. The key innovation is that JEPA predicts the representation of future states rather than the raw sensory data itself. This approach sidesteps the challenge of modeling irrelevant details—like the exact position of every leaf on a tree—while capturing the essential dynamics needed for intelligent behavior. JEPA uses self-supervised learning to train these representations, comparing predicted embeddings with actual future embeddings to refine its understanding without requiring labeled data.
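A minimal numeric sketch of the JEPA-style training signal might look like the following. The toy encoders and shapes here are hypothetical stand-ins (real JEPA uses deep networks and techniques like EMA target encoders); the point is only that the loss compares predicted and actual *embeddings*, never reconstructed raw inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
W_enc = rng.normal(size=(4, 16))   # toy encoder: raw observation -> latent
W_pred = rng.normal(size=(4, 4))   # toy predictor: latent(t) -> latent(t+1)

def encode(x):
    return np.tanh(W_enc @ x)      # abstract representation, not pixels

x_now = rng.normal(size=16)        # observation at time t
x_next = rng.normal(size=16)       # observation at time t+1

z_pred = W_pred @ encode(x_now)    # predicted future representation
z_target = encode(x_next)          # actual future representation

# The loss lives entirely in latent space - irrelevant input detail
# (the "position of every leaf") never has to be reconstructed.
loss = float(np.mean((z_pred - z_target) ** 2))
print(loss)
```

Self-supervision comes from the data itself: consecutive observations provide the prediction target, so no labels are needed.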
The implications of effective world models extend far beyond current applications. They promise AI systems that can engage in counterfactual reasoning ("what would happen if..."), transfer knowledge between domains, and exhibit more human-like common sense. A robust world model could enable an AI to understand that water flows downhill, objects fall when unsupported, and broken things generally stay broken—intuitions that humans take for granted but current AI systems often lack.
However, significant challenges remain. Learning accurate world models for complex, high-dimensional environments requires enormous amounts of data and computation. The models must also handle uncertainty and partial observability—real-world environments rarely provide complete information. Furthermore, determining the right level of abstraction for the model remains an open problem: too detailed, and the model becomes computationally intractable; too abstract, and it may miss crucial details needed for effective decision-making.
As research continues, world models are likely to become increasingly central to AI development, potentially serving as the foundation for systems that can truly understand and reason about their environment rather than merely pattern-match their way to solutions.
It feels like we are close to solving these world model problems and making the technology viable. Doing so will enable the next generation of AI applications, and so it seems to me like the beginning of the next wave of AI. Now that even Sam Altman is starting to say we may not be on the cusp of AGI, it feels like the current approaches are stagnating, and world models could be the breakthrough for AI 2.0.
What do you think? And as always, thanks for reading.

Maybe 1.5 will be the reliable agent action phase of things.
I tend to agree that world models represent a sea change worthy of being 2.0, but I am not sure how close we are to solving them. Also, it is possible there won't be a 2.0 for everything. For example, agentic might be 2.0 for some use cases, and WMs will be a paradigm shift for robotics, where these dynamics are crucial for making robots better by an order of magnitude. I'm not sure WMs will have the same qualitative impact on 'regular' software contexts. I do agree that agents are at risk of being a total flop (though they could also be transformative - I just think it's not a done deal yet).