Data Traction: How To Evaluate AI Companies With Little Revenue - Part 2
Happy Sunday and welcome to Investing in AI. I’m Rob May, a Partner at PJC, where I focus on seed stage AI and robotics investments. I’m also a participant in the ScalingAI AngelList syndicate, which you can join if you want to make growth-stage AI investments.
Be sure to check out our Investing in AI podcast as well. This past week I interviewed Rob Aitken from ARM about the Cambrian explosion in AI hardware.
— Interesting Links —
How GPT-3 And AI Will Destroy the Internet. ReadWrite.
The Business Value of Clustering Algorithms. VentureBeat.
Elon Musk Has No Idea What He Is Doing With Tesla Bot. IEEE Spectrum.
OpenAI Codex Demo - Live. YouTube.
Next-Gen Insurers Are Going To Need Way More AI Horsepower. The Next Platform.
— Commentary —
If you missed part one of this post series, you can find it here.
AI companies take more time to build than the SaaS companies we have become so used to. When the usual metrics like sales, churn, and CAC aren’t there yet, or look bad, what can you look at to figure out whether a company is heading in the right direction?
When I look at pre-public-launch AI startups that only have beta customers, or sometimes no customers, I tend to focus mostly on data and model related metrics. If I can see a model getting better faster, at a reasonable cost, with some emerging defensibility, it might make sense to invest ahead of the traditional metrics and take the risk on whether those metrics materialize. Some of the metrics I evaluate are below.
Marginal Performance Value of Additional Data – The average amount that some standard chunk of data (1 new customer, 1K new images, etc.) improves an ML model. I’ve seen companies with 30% accurate models and human-in-the-loop go to 95% accuracy models over a couple of years.
Automation Rate – What percentage of a human task is automated by a model? If it’s growing, that’s a good thing. Competitors will need time to catch up.
Human-in-the-loop rate – What percentage of model outputs need to be touched by a human? If it’s declining, that means the margins will improve.
Data Drift Rate – How long (if at all) does it take for the data set to drift far enough that it becomes irrelevant? This is usually difficult to measure exactly, but you can get a feel for it.
Model/Market Fit – How well does the model perform for some core group of customers? If it fits well, they will eventually pay.
Workflow Modification Rate – How much of the workflow has to be modified for the customer to adopt an ML solution? You can think of it as internal operational friction to adoption. The lower it is, the faster adoption will happen. The higher it is, the more defensible and category-creating the company could be. So this has to be evaluated in the context of other variables.
Model Performance Rate – How well does the model perform on real-world data sets? If there are public data sets to use it on, this could be great for marketing.
Data Augmentation Expense – What cleaning, annotation, and augmentation has to be done to data to get it to a format ready to train a model? Some expense here is a good investment in defensibility if it’s hard to do. But too much expense means you may run out of cash before you build a business.
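A few of these metrics can be computed directly from a startup’s model logs if you get access during diligence. Here is a minimal sketch of three of them; all function names, field names, and figures are hypothetical illustrations, not a standard toolkit:

```python
def marginal_performance_value(accuracy_history, chunk_size=1_000):
    """Average accuracy gain per standard chunk of new data.

    accuracy_history: list of (num_examples, accuracy) checkpoints,
    e.g. logged each time the model is retrained.
    """
    (n0, a0), (n1, a1) = accuracy_history[0], accuracy_history[-1]
    if n1 == n0:
        return 0.0
    return (a1 - a0) / (n1 - n0) * chunk_size

def human_in_the_loop_rate(outputs):
    """Fraction of model outputs a human had to touch."""
    touched = sum(1 for o in outputs if o["human_reviewed"])
    return touched / len(outputs)

def automation_rate(steps_automated, steps_total):
    """Share of a human task the model handles end to end."""
    return steps_automated / steps_total

# Made-up example: accuracy went 0.30 -> 0.95 over 50K labeled
# examples gathered via human-in-the-loop.
history = [(0, 0.30), (50_000, 0.95)]
print(marginal_performance_value(history))  # accuracy points per 1K examples

# 12 of 100 recent outputs needed human review.
outputs = [{"human_reviewed": True}] * 12 + [{"human_reviewed": False}] * 88
print(human_in_the_loop_rate(outputs))  # 0.12
```

The interesting signal is the trend, not the snapshot: compute these monthly and look for marginal performance value holding up as data accumulates, and the human-in-the-loop rate declining.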
The economics of how you get data, how you build models, and how those models perform against customer needs are key factors in building AI-first companies. Evaluating them like SaaS businesses is just wrong. For many AI-focused investors, this provides a chance to invest in great businesses that others pass on because they don’t understand them. But there is risk, as many of these metrics aren’t well understood by either investors or entrepreneurs. Best practices are emerging, but will still take time to solidify into a new paradigm for how to evaluate AI companies.
Thanks for reading.
@robmay