This newsletter is 100% human-written by me - no AI, no ghostwriters.
Happy Sunday and welcome to Investing in AI. I’m Rob May, CEO at Nova. I also run the AI Innovator’s Podcast. If you know someone who would be a great guest, let me know.
Today I want to tie together 3 concepts I’ve written about in the past and explain why they make me worry that AI might make things worse than anticipated. The first is an idea I wrote about 10 years ago in Venturebeat: that the internet is killing innovation. It’s based on research showing that the internet narrowed the scope of scholarship rather than expanding it.
The second is an idea I’ve covered in various newsletters: that friction can be a good thing. Tech has an obsession with lowering the friction of any action, but sometimes the pauses, challenges, and frictions involved in doing something are good. Those frictions, however small, mean we take the action more seriously.
And the third idea I’ve written about is that AI isn’t just dangerous because it could come take over the world someday (I’m a skeptic on that). It’s dangerous because it can assimilate information at scales we can’t, and (hopefully) without the cognitive biases we have, so it could reveal things about humanity that we don’t want to believe. This is dangerous because we humans like to believe we are uniquely special, despite quite a lot of evidence that we are not.
Now that I’ve set the baseline with those points, I want to begin with this notion that started circulating about 2 years ago that LLMs were just “stochastic parrots.” The idea emerged as part of the debate about whether these models are intelligent, and the stochastic parrot comment was meant to explain why they aren’t very close to intelligence.
The issue I want to address is this: are humans doing more stochastic parroting than we think, and will LLMs actually make it worse? Here is my thinking...
First of all, I think too many people don’t really understand what they are talking about. This may have always been the case, but I blame a lot of it on the internet. With information so easy (low friction) to find, it’s now trivial to go 2 minutes deep on a topic in a way that wasn’t possible before, and that can be misleading.
Take an issue that doesn’t come up in casual conversation very often, something like LIFO accounting. In 1992, if you mentioned LIFO accounting and someone knew that it stood for “Last In First Out” inventory accounting, they probably also knew some other things about LIFO. Why? Because you couldn’t Google it and understand the definition in 2 minutes. Because there was friction in finding information, it was uncommon to go 2 minutes deep on anything. If you went to any level of depth, it usually had to be worthwhile, because the time investment to find a resource to explain it was higher.
But knowing what LIFO stands for doesn’t mean you understand it. In the accounting world the choice of LIFO has many implications. Trained accountants understand these. For example, are there more tax benefits to LIFO if your raw materials are going up in price every year, or down in price every year? Really understanding a topic means you can manipulate it, understand how it connects to other ideas, and evaluate it situationally. I rarely see that in people, even in people deeply trained in an area - they are often just deeper stochastic parrots.
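To make that tax question concrete, here’s a minimal sketch with hypothetical numbers (deliberately simplified: it ignores inventory layers, tax rates, and everything else trained accountants actually deal with) showing why LIFO lowers taxable income when input prices are rising:

```python
# Hypothetical example: 3 units purchased at rising prices, 2 units sold.
purchases = [10, 12, 14]   # cost per unit, oldest first (prices rising each year)
units_sold = 2
sale_price = 20

# FIFO: cost of goods sold uses the oldest (cheapest) costs first.
fifo_cogs = sum(purchases[:units_sold])             # 10 + 12 = 22

# LIFO: cost of goods sold uses the newest (most expensive) costs first.
lifo_cogs = sum(purchases[-units_sold:])            # 12 + 14 = 26

revenue = units_sold * sale_price                   # 40
print("FIFO taxable profit:", revenue - fifo_cogs)  # 18
print("LIFO taxable profit:", revenue - lifo_cogs)  # 14 -> lower taxable income when prices rise
```

With rising prices, LIFO charges the newest, most expensive costs against revenue, so reported profit - and the tax bill - is lower. If prices were falling, the effect reverses. Being able to run that reasoning yourself is the difference between knowing the acronym and understanding the choice.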
The way this plays out in the workforce is that people who graduated before the internet was a big deal are often overly impressed by partial knowledge, because they assume that if you know anything about a topic, you know more than 2 minutes’ worth. They assume this because, at the time they graduated college, it was hard to go just 2 minutes deep on something and know “about” it without really understanding it. So I think a lot of stochastic parrots have been promoted over the years because they seem smart about things they don’t really understand; really they’ve just been gaming a system that didn’t account for the post-internet ability to appear more knowledgeable than you are.
I’ve interviewed so many people who have VP or C level in their titles who, when asked how they would run a department, give me a stochastic parrot answer. Basically, they have seen it done a certain way and they can replicate that way. Years ago when I hired my very first VP of Sales, I interviewed a lot of stochastic parrots. I’d ask how they would sell a product like ours and the answer that came back was usually something like “well we did it this way at my previous company and that worked well.” The guy I finally hired gave me a totally different answer. He said “tell me about your customer and how they buy.” Then he proceeded to point out characteristics of the product and the buyer that would dictate the sales model we should use. He said things like “if buyers don’t know much about the space and rely on advisors, a channel model with those advisors might work best.” Or “if buyers have a lot of social pressure to buy and reference each other, splitting the sales team by geography might work best so they know a customer nearby.” He wasn’t parroting back to me a sales model with a few changes. He really understood sales and how to apply different models to different use cases.
This is an important point I want to make. Maybe stochastic parroting is more human-like than we think, and thus is indeed a big step on the spectrum to intelligence, because in my experience hiring, I’d say 75% of the workforce is doing some form of stochastic parroting instead of actual thinking.
What this means for the future scares me, because LLMs could make it worse. Back to the idea of lowering the friction to things and getting homogenized outputs - I worry LLMs will accelerate that trend. Here’s an example based on how we teach calculus.
If you ever took calculus, do you remember how they teach it? They still start by showing you the original limit definition of a derivative and make you work through it.
Then you learn the power rule and you think: why the hell did I have to learn that more complicated definition first when the power rule is so easy? It’s because math teachers don’t just want you to know how to calculate a derivative; they want you to understand what a derivative is. This is an important distinction.
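For anyone who hasn’t seen them in a while, the two things being contrasted look roughly like this - the limit definition you grind through first, and the power rule shortcut you learn afterward:

```latex
% The limit definition of the derivative, taught first:
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}

% The power rule shortcut, learned afterward:
\frac{d}{dx}\, x^n = n\, x^{n-1}
```

The shortcut gets you answers fast, but only the definition tells you what those answers mean.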
Calculus is almost 400 years old, but they still teach it to you starting with the foundation of what it means. They use the slower, higher-friction way because if they don’t, you may not really understand it at the right level of depth: you could solve some easy derivatives with the power rule, but you couldn’t apply the concept in situations where you don’t have an example problem.
Now back to LLMs, and in particular ChatGPT. This is dangerous. It’s dangerous because it could make everyone seem, on the surface, a little smarter than they are.
I’ve been a big fan of the thought process first described in the book Prediction Machines: the economic complement to prediction is judgment, and when the price of prediction drops, the value of judgment goes up. What does that mean in a world where becoming a stochastic parrot is easier and easier? How can you have judgment in an area if you haven’t done the hard work to really understand it? If we all rely on AI more and more, do we lose some of our understanding, and will our judgment be worse?
If you’ve read the book Antifragile by Nassim Taleb, I’d tie all this back to one of the ideas in that book - that small failures in systems help keep them robust against larger failures. When AI takes all the small failure modes out of various decision making, because we can trust the AI with those minor things, could the outcome be that we become even more like stochastic parrots ourselves, our judgment gets worse, and the risk of a bigger blow-up of some kind builds under the hood? As we rely on AI to make more recommendations for work, politics, and life, and those recommendations save us from minor troubles, are we setting ourselves up for bigger troubles - the novel things not in the AI’s training data - because our judgment has been weakened?
In summary, I think the internet created more stochastic parrots by homogenizing thought and allowing perceived depth in a topic without real understanding. As a result, ChatGPT might be saying more about humans than we realize. We want to believe we are special and that ChatGPT’s stochastic parroting isn’t intelligence, but I’d argue it’s the way a lot of humans actually operate. And I think it could get worse if we rely too much on these models to make decisions for us and don’t still dig in and spend the time to really learn and understand. More areas of knowledge (business in particular) need to be taught the way calculus is - from the core understanding up. Hiring processes need to penalize stochastic parrots. And we need to find ways to keep building judgment in humans so that we can use AI in the right ways and not let it make us dumber.
I concur with many of the general points made, but we all build upon the work of others and we may not always need to understand underlying systems to the same depth. Case in point: computer science / coding. I’m not sure it’s useful at this point to teach assembly code. For the vast majority of developers, memory management may not be that important with modern garbage collection. Heck, even simply knowing how to apply data structures (lists, hashtables, etc) is probably sufficient.
That’s not true for EVERY software engineer role, but lots of people can build meaningful careers and not need to graduate from a top tier Computer Science program. So the levels of abstraction in knowledge probably benefit society overall - otherwise the barriers to advancement become too high.
So parroting could certainly be a symptom of an underlying societal disease, but maybe it’s also enabling sufficient, optimized knowledge transfer that society (“the market”) actually requires.
This is a great point and I should clarify that I don't expect people to know all the details of everything. But too many people don't really understand anything beyond surface depth.
Yep, I often feel that way as well! Just want to point out that “abstraction” can be useful for society. It’s probably OK for one to have surface knowledge in some areas, but they should be an expert where they want to specialize. Perhaps the real problem you identify is when people claim to be an expert/leader and ONLY have surface knowledge in that domain!
that may precisely be what 'judgment' is - knowing when the abstraction is sufficient and when it is not.
in an ongoing discussion i have with a friend about impact on [fill-in-the-blank] from ChatGPT, i have referenced the 'on the shoulders of giants' concept, and the various levels of insight one has depending on how many 'layers down' they are personally knowledgeable of.
the calculus example seems pretty relevant, but may highlight the difference between knowledge for the sake of understanding and knowledge for the sake of expediency.
great article and commentary!
We’ve been using the “inch deep” stochastic parrot technique for 500 years - in the form of books. The main difference in the internet age is that someone can take 1000 hours to research and write a book, which someone else takes 10 hours to read, and then boils it down to a 10-minute blog post, and we believe that in reading for 10 minutes, we’ve gained the original 1,000 hours of work (or at least the 10 hours of reading).
I tend to feel that we have largely become stochastic parrots in many areas - and generative AI has just read more blog posts than the average human.
I’m not convinced that the internet has made it easier for people to skim a topic without knowing its depth (aka the Imposter Problem). On the contrary, I think the internet makes it easier for non-experts to spot when someone else is an imposter.
Scammers, imposters, liars, snake oil salesmen... these are nothing new. The internet does give them a bigger platform but it also makes it easier to validate their claims independently.
The world was rife with bad business decisions for most of human history and the scale and impact was well beyond anything we could even wrap our brains around today. It was common to have misinformation affect public policy for decades or even centuries prior to the Information Age.
So while I think we all get annoyed more easily - because many of us are cognitively capable of self-educating and identifying misinformation quickly now - we shouldn’t mistake that for thinking that there’s actually more misinformation than in the past. It’s just that in the “old days” even the most well informed of us wouldn’t necessarily have enough access to knowledge to identify misinformation as it was presented to us. We’d just live our lives out never quite sure if that superstition about the mirror was true or not, or whether the Earth was really flat or round. And then die. And our children would live the same problem for their lifetime too.
Nowadays the information is there for those of us who can think. The bad news is that plenty of people still choose to believe misinformation when it suits their narrative, even when it’s clearly debunked. But before the internet, those people still existed. We just couldn’t differentiate them from everyone else.
Rob, the perfect application of this piece is the management consulting industry, where so many people who actually work on projects (as opposed to real senior management) are stochastic parrots bopping from project to project from one complex domain to another.
Rob - great read. Do you think this is in line with the trend of labeling certain parts of the economy “BS jobs”? It feels like a similar idea.