Happy Sunday and welcome to Investing in AI! I’m Rob May, co-founder of the AI Innovator’s Community. If you want to join our investing syndicate, it’s now up and running here. Our first deal is a very cool AI agent platform called Wayfound. You can apply to invest in the deal at this link.
I’m also looking for beta testers for a new startup I’m invested in - Silvershield. It’s a tool for adults to manage their aging parents’ phone/email/etc. in a world where deepfakes and AI scams are getting more serious. Please give it a try if you are in that demographic.
My thought for today is on how slow it is for enterprise AI companies to sell their products. We’ve seen this at BrandGuard and at many other companies I’ve invested in. It isn’t a lack of interest. It’s a fear of what might happen to your data when you adopt an AI tool.
A normal enterprise procurement process is already slow. On top of that, we’ve seen an average of three months just to evaluate an AI tool. Why? There isn’t a standard evaluation checklist, and people don’t know all the questions they should ask. So what happens is that the technical group asks for demos, architecture diagrams, and so on, and fires off questions as they think of them. It reminds me of selling cloud software in 2009, when businesses didn’t know what questions to ask of cloud providers.
The cloud computing market eventually standardized on key questions and an evaluation process for buyers. Part of that was the use of third-party verifications and certifications like SSAE-16, ISO-27001, and SOC2. What we need is a similar certification process for AI companies.
There are some groups trying this, but nothing seems to have gained traction yet. While the market is probably a little too early to settle on a single certification, I think something will evolve in the next 24 months that is like a SOC2 for AI models. It’s the only way large enterprises can get certainty and offload the risk of AI adoption.
This isn’t just an issue that affects the initial buying cycle, though. As we use models to automate more and more tasks, how do you audit them to make sure they are doing what you expect? Take a tool like Waze, which helps drivers find the fastest route to their destination using AI. How do you actually know it’s providing the fastest route? I sometimes ignore Waze and turn somewhere else, and the estimated time to arrival drops by two minutes. What if a similar thing happened in an AI-powered business workflow?
If you have a fully AI-powered resume screening tool, how do you know it’s working to give you the candidates you want? If you have an AI-powered sales tool, how do you know it’s really giving you the best recommendations of which deals to focus on? Models suffer from data drift, and their performance often deteriorates over time. How will you know when that is happening unless you have a process to audit and test it against something else?
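To make the auditing point concrete, here is a minimal sketch (my own illustration, not any particular vendor’s process) of one common drift check: compare the distribution of a model’s recent scores against a baseline window using the Population Stability Index, and raise a flag when they diverge. The score data below is synthetic; in practice you would pull scores from production logs.

```python
# Minimal drift-check sketch: compare a model's recent score distribution
# against a baseline window using the Population Stability Index (PSI).
import numpy as np

def psi(baseline, recent, bins=10):
    """Population Stability Index between two score samples."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    recent_pct = np.histogram(recent, bins=edges)[0] / len(recent)
    # Clip to avoid log(0) when a bin is empty
    base_pct = np.clip(base_pct, 1e-6, None)
    recent_pct = np.clip(recent_pct, 1e-6, None)
    return np.sum((recent_pct - base_pct) * np.log(recent_pct / base_pct))

rng = np.random.default_rng(0)
baseline_scores = rng.beta(2, 5, size=5000)    # scores at deployment time (synthetic)
recent_scores = rng.beta(2.6, 4.4, size=5000)  # scores this month (synthetic, drifted)

value = psi(baseline_scores, recent_scores)
if value > 0.2:  # a common rule of thumb, not a formal standard
    print(f"ALERT: score distribution has drifted (PSI={value:.3f})")
else:
    print(f"OK: no significant drift (PSI={value:.3f})")
```

The point isn’t this specific metric; it’s that an auditor gets a number to track over time instead of an anecdote.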
For these reasons, compliance, governance, audit, and certification functions will be necessary to make AI adoption happen at scale.
So what types of issues will these certifications likely cover? I think a few key things:
How data is used to train models, both for an individual customer and across all customers
Established ownership rights for various levels of model outputs
Disclosures of third party models, APIs, and other tools that may have data access
Indemnification standards for various violations
Data governance workflows and processes - both technical and human driven
Oversight and clarification of any Human-In-The-Loop workflows
Ongoing audit and testing processes
Possible parallel workflows or constant A/B testing of algorithms to see which one performs better (a rough sketch of this follows below).
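As a rough sketch of that last item: one way a parallel workflow could be structured is to keep the incumbent model serving real decisions while a challenger runs in shadow on the same inputs, logging both outputs for later comparison. The models below are placeholders standing in for, say, two resume-screening policies.

```python
# Shadow-testing sketch: the incumbent model drives the real decision while a
# challenger runs in parallel on the same inputs; results are logged for comparison.
import random
from dataclasses import dataclass, field

@dataclass
class ShadowTest:
    log: list = field(default_factory=list)

    def score(self, request, incumbent_model, challenger_model):
        primary = incumbent_model(request)   # drives the business decision
        shadow = challenger_model(request)   # recorded only, never acted on
        self.log.append({"request": request, "incumbent": primary, "challenger": shadow})
        return primary

    def agreement_rate(self):
        if not self.log:
            return None
        agree = sum(1 for row in self.log if row["incumbent"] == row["challenger"])
        return agree / len(self.log)

# Placeholder models: e.g. two resume screeners returning "advance" / "reject"
def incumbent(req):
    return "advance" if req["score"] > 0.5 else "reject"

def challenger(req):
    return "advance" if req["score"] > 0.6 else "reject"

test = ShadowTest()
for _ in range(1000):
    test.score({"score": random.random()}, incumbent, challenger)

print(f"Incumbent/challenger agreement: {test.agreement_rate():.1%}")
```

In practice you would compare downstream outcomes (hires made, deals closed), not just agreement, but the structure is the same: run both, act on one, keep the evidence.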
This is a really interesting issue to me, having experienced it on multiple levels in my own startup and others. While everyone is watching the key tech breakthroughs in foundation models, it’s really the gaps in compliance and fit with existing workflows that are holding back AI adoption. Watching the state of those things is a better gauge of where applied AI is going and a better guide to where to invest, in my opinion.
If you are a big company executive working on this, and have ideas for how these standards may develop, let me know. I’d love to hear your ideas.
Thanks for reading.
The challenge in moving from deterministic traditional AI models to non-deterministic Gen AI models is that the output can vary even with the same prompt. Any Gen AI model assurance must account for this behavior.
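One hedged way to turn that into something auditable: run the same prompt through the model many times and report how many distinct outputs come back and how dominant the most common one is. The generate() function below is a placeholder for a real model call (API or local), and the canned responses are purely illustrative.

```python
# Repeatability sketch for a non-deterministic Gen AI model: call the model with
# the same prompt N times and summarize how much the outputs vary.
from collections import Counter
import random

def generate(prompt: str) -> str:
    """Placeholder: a real implementation would call the Gen AI model."""
    return random.choice([
        "The invoice is approved.",
        "The invoice is approved.",
        "Approve the invoice.",
        "The invoice should be approved.",
    ])

def repeatability(prompt: str, n: int = 20):
    outputs = [generate(prompt) for _ in range(n)]
    counts = Counter(outputs)
    modal_output, freq = counts.most_common(1)[0]
    return {
        "distinct_outputs": len(counts),      # how many different answers appeared
        "modal_output_share": freq / n,       # how often the most common answer appeared
        "modal_output": modal_output,
    }

print(repeatability("Should this invoice be approved? ..."))
```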