Three 9's Thesis of AI Investing: Why I'm Short AI Agents
What use cases don't require 99.9% accuracy?
Happy Sunday and welcome to Investing in AI. Since we’ve had a lot of new signups recently, I’ll reiterate that I’m Rob May, CEO at Neurometric, and Partner at HalfCourt Capital, a multi-stage tech investing firm with a thesis-driven approach to AI investing. I also run the AI Innovator’s podcast, so check it out if you have a chance.
At HalfCourt, we’ve gone deep on agents to try to form a thesis on where to invest. We’ve only made one agentic investment so far - in WayFound, which is a platform for agent governance and compliance.
While we are very very bullish on agents long term, in the short term, we think agents aren’t ready for prime time for most use cases. Why? We call it the Three 9’s problem.
The terminology comes from Service Level Agreements (SLAs) for cloud computing platforms. It describes their uptime. Three 9s is 99.9% uptime. Five 9s is 99.999% uptime. You get the picture. The less downtime your application can tolerate, the more 9s you need in your SLA.
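To make the nines concrete, here’s a quick sketch in Python converting an uptime SLA into allowable downtime per year (illustrative only, using a 365-day year):

```python
# Convert an uptime SLA ("number of nines") into allowable downtime per year.
HOURS_PER_YEAR = 24 * 365

for nines in range(2, 6):
    uptime = 1 - 10 ** -nines          # e.g. 3 nines -> 0.999
    downtime_hours = HOURS_PER_YEAR * (1 - uptime)
    print(f"{nines} nines ({uptime:.3%} uptime): "
          f"~{downtime_hours:.2f} hours of downtime per year")
```

Three 9s works out to roughly 8.8 hours of downtime a year; five 9s leaves you barely five minutes.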
Where it applies to agents is different - we are looking at the performance of the underlying AI models that run the agent processes. If you have an AI agent that is 92% accurate, that sounds great, right? But if that agent must perform 5 steps to get a task done, and the model is 92% accurate at each step, you have to raise 0.92 to the fifth power to get the overall accuracy of the workflow. That number is about 66%.
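In code, the compounding is just exponentiation. A minimal sketch in Python, using the 92% and five steps from the example:

```python
per_step_accuracy = 0.92   # model accuracy at each step of the workflow
steps = 5                  # number of steps the agent must chain together

workflow_accuracy = per_step_accuracy ** steps
print(f"End-to-end accuracy: {workflow_accuracy:.1%}")  # ~65.9%
```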
Now take an example of an agent that is going to book travel on your behalf. If you are flying from the New York area, like I usually am, and you tell the agent, “book me a flight to Denver”, then that agent has to make 5 decisions for you:
What airport to fly from.
The tradeoff between price and flight time.
The tradeoff between price and a direct flight vs. a layover.
The seat you want.
The level of ticket you want (refundable, etc.).
If the model is 92% accurate at representing your preferences, it means you will be happy with about 66% of the agent’s decisions. Which means you will be unhappy with roughly a third of your travel decisions. As much as booking travel sucks, changing travel is an even worse experience. If I were using this travel bot, I don’t think I would last very long as a customer.
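If you’d rather see it empirically, here’s a toy Monte Carlo sketch in Python of the booking example. It treats each of the five decisions as an independent 92% coin flip - a simplifying assumption, since real booking decisions are surely correlated:

```python
import random

random.seed(0)
TRIALS = 100_000
DECISIONS = 5          # airport, price vs. time, layovers, seat, ticket level
PER_DECISION = 0.92    # chance each decision matches your preference

# Count bookings where all five decisions match your preferences.
fully_happy = sum(
    all(random.random() < PER_DECISION for _ in range(DECISIONS))
    for _ in range(TRIALS)
)
print(f"Bookings with every decision right: {fully_happy / TRIALS:.1%}")  # ~66%
```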
What that means for our investing thesis in agents is two things.
The first is - agentic companies will be successful in direct proportion to how well their use case tolerates the three 9s issue. Use cases that don’t require three 9s, and aren’t frustrating at that level, will be the first to be seriously adopted. This is why something like generating images as part of a marketing workflow is a good use case. If the combined performance of a marketing campaign AI agent is 66%, that’s still fine, and productive, and not frustrating. As an investor, you want to think through these use cases. Where can a relatively poorly performing agent still be useful? Invest there first.
The second thing it means is - how do you find tactics to improve agents to three 9s for the use cases that require it? There are so many use cases where, unless the agent is really excellent, it’s less annoying to just do the task yourself. These tasks are often the most tedious for a human, so it is very tempting to adopt an agent, but the agent performance has to be very very high.
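One way to size that gap is to invert the compounding: what per-step accuracy does a three-9s workflow actually demand? A back-of-the-envelope sketch in Python (the step counts are illustrative, and it assumes steps fail independently):

```python
# Per-step accuracy required to hit a target end-to-end accuracy,
# assuming independent steps: per_step = target ** (1 / steps).
target = 0.999  # "three 9s" for the whole workflow

for steps in (1, 5, 10, 20):
    per_step = target ** (1 / steps)
    print(f"{steps:>2} steps -> each step must be {per_step:.4%} accurate")
```

Even at just five steps, each step has to be about 99.98% accurate - a long way from today’s 92%.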
As a firm, we haven’t found very many situations in either of these categories yet, and when we have, they tend to be crowded spaces where it is difficult to predict who will win and why. I expect that over the next two years many of these agent firms will go out of business, but the right user preferences will also start to emerge, making these opportunities clearer. My advice as an investor is to stay on the sidelines until that happens, but keep watching the space, because when it hits, it will hit quickly and everyone will pivot to the use cases and techniques that are working.
Thanks for reading, and as always, would love your feedback if you have a different take on the agent ecosystem.
The three 9s issue will be even more relevant in enterprise adoption of agents than in consumer adoption. It’s one reason I expect enterprise adoption of AI - beyond merely using it to generate marketing copy and the like - to be much slower than the accelerationist types expect.
I recently read somewhere that you should treat these agents like a college intern, and that seems about right to me. "This person [agent] doesn't know exactly what I need, may make some odd assumptions, and I am going to have to check their work closely - so what tasks are still net positive for me to assign to them?"