Technically Sentient: The Coming World of Synthetic Media
What Happens When the Marginal Cost of Interesting Content Creation is Zero?
Welcome to Technically Sentient! I’m Rob May, a Partner at PJC and former AI entrepreneur, and I write this newsletter to provide an overview and perspective of things happening in the AI market.
In 2015 I began writing Technically Sentient. That newsletter was rolled up into Inside.com and changed to InsideAI, and we started doing news too. Then it went daily. It’s been an awesome ride to 30,000 subscribers but the pace and format don’t match with how I’ve been thinking about AI, so while I continue to write commentary over there for now, I decided to go back here to a 2x per month newsletter that is more in-depth, analytical, and less news focused.
Also note - at PJC we are running an AI buyer’s pitch contest. Sign up here to pitch your AI company to customers, instead of VCs.
— Interesting Articles —
There is a great Twitter thread on the reflexivity between AI algorithms and choice. Twitter.
A Guide For Responsible Use of Machine Learning APIs. Medium.
Winners and Losers at the Edge. SemiEngineering.
Graphcore, maker of one of the most mature AI chip architectures, is moving to a 3nm fabrication process, which is surprising. Anandtech.
— Research Papers of Note —
Grounded Language Learning, Fast and Slow. Link.
Fairness in the Eyes of Data: Certifying ML Models. Link.
Synthetic To Real Unsupervised Domain Adaptation. Link.
— Commentary —
I stumbled upon synthetic data through a robotics investment I made in 2016. The company was trying to build a robot to clean bathrooms, and it turns out this is a very hard problem to solve. In fact, training robots to be “smart” is hard in general because there are a nearly infinite number of situations they could find themselves in, and training usually happens in the slow physical world, not the fast digital world. Having a robot learn to pick up various glasses may take weeks of training because, if the robot needs 10,000 examples to learn and can only pick up one glass every few seconds, it takes a while.
Contrast that with digital simulation, where a robot could learn from 10,000 examples in seconds. Simulation has a long history in technology. When I started my career as an FPGA and ASIC designer, we always ran simulations of the chips and how they would work before actually programming them. Simulations were great, but the real world always uncovered some behavior the simulation didn’t.
Similarly, “synthetic data” is the term for data used to train robots via simulation. For most robotic use cases, robots trained on synthetic data combined with real world training learn faster and reach a higher level of accuracy than with either alone, so they need both. The rise of synthetic data has also led to a more interesting meta trend - the rise of synthetic media.
The best known use case of synthetic media is deepfakes, and my friend Rob Toews wrote about how deepfakes may wreak havoc on society. But there are also good use cases of synthetic media, like HourOne - a company that automates the creation of synthetic video characters, or Primer.ai - a company that auto-generates topical reports.
Instead of getting into the tech behind synthetic media, I want to start with the assumption that it keeps improving, and that the marginal cost of creating high quality content (video, audio, and text) continues to approach zero. Then I want to ask: what happens?
When a bot can interview someone on a podcast, produce and post the whole thing, and market it, with almost no human intervention, how much content gets produced, and at what pace?
When anyone with a computer can create a video showing anything they want, in minutes - Joe Biden kissing Melania Trump, Putin and Trump in a fistfight - what kind of crazy content will we see? How will we know what to believe?
In 2007, some investors approached me about building a hyperlocal media company. I declined because, as I told them at the time, there are a limited number of ad dollars in the world, and if the content available to advertise across grows faster than those ad dollars then it will suck to be in the content business unless you are one of the biggest. I think that turned out to be largely true.
What does it mean, then, when we are on the cusp of a content explosion that not only delivers more content faster than ever before but, unlike the previous explosion, delivers content that may be very high quality and often very confusing?
Attention is a constrained resource. With more content competing for the same amount of attention (or perhaps slightly more, if the gross attention product of the world is growing via multi-tasking, population growth, and better internet access), I expect the value of attention harvesting systems to go up. Systems that parse content and make recommendations will become even more valuable than they already are.
The value of distribution will also go up. If you have push channels into consumers, you will be in a powerful position. The media industry has flip-flopped at times over business strategies that leverage the power of distribution or the power of content, sometimes integrating both via mergers, sometimes specializing in just one to be best of breed. But for this next wave, 2022-2030, I think distribution wins.
The value of live events will go up - events that multiple people can simultaneously verify. Live events have mattered less since the advent of TiVo and DVR, because consumers time-shift the shows they watch. For many types of content that will remain true, but for major events, or events where false content might matter, live broadcasting will become huge. I expect to see livestream services organized not around certain geographies or topics, but around important things happening right now, where credibility can be triangulated because multiple people are sending in streams that all match. I see those services being a little bit like early cable news networks, covering whatever is happening right now, but with a different mission. It’s not the “nowness” that will matter, it’s the credibility the nowness provides. This will be a very lucrative business because of the aforementioned attention scarcity.
And finally, I expect digital worlds, metaverses, whatever you call them, to explode. Eventually these will lead to a bot-economy where digital agents transact and engage with each other. At first they will act on behalf of users, but eventually they will become smart enough to do so on their own. So, if you look out to 2050 and beyond, there will probably be videos created to target advertising to purely digital agents. Sounds crazy, I know, but that’s where this is headed.
Those are my high level thoughts on synthetic media and the impact it has when marginal content creation costs approach zero. Soon I’ll do a deep dive on the tech and try to figure out how far out some of these things really are - technically. But if you are working on something in this space, please reach out. I’d love to chat.
Thanks for reading.
@robmay