OpenAI's moat
As open-source LLMs take away the advantage of ChatGPT, OpenAI is building new moats to protect its generative AI business.
In May 2023, a leaked document from Google revealed the challenges facing private large language models (LLMs) like ChatGPT and GPT-4. The main point of the memo was that neither Google nor OpenAI has a moat and that open-source models will eventually conquer the market of private LLMs.
“While our models still hold a slight edge in terms of quality, the gap is closing astonishingly quickly,” the document reads. “Open-source models are faster, more customizable, more private, and pound-for-pound more capable.”
In less than a year, most of the warnings made in the memo have turned out to be true. Open-source models are quickly catching up in quality, they are more flexible, and they are faster to train and fine-tune.
However, OpenAI is taking subtle steps to build moats and protect its LLM business as advances in the field level the playing field. But the strategy is not guaranteed to work.
How OpenAI’s moat was breached
When ChatGPT was released, the general perception was that models improved as they grew. With 175 billion parameters, GPT-3 required hundreds of gigabytes of GPU memory and huge investments to train and run. A few open-source LLMs that were released in 2022 were so big and unwieldy that few companies could run them.
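To make the "hundreds of gigabytes" figure concrete, here is a rough back-of-the-envelope sketch. It counts only the memory needed to hold the weights; the precision options (32-bit and 16-bit floats) are my assumptions for illustration, and real deployments need extra memory for activations and caches.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to store the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


GPT3_PARAMS = 175e9  # parameter count cited in the article

# Assumed storage precisions (illustrative, not a statement about OpenAI's setup).
for label, nbytes in [("32-bit floats", 4), ("16-bit floats", 2)]:
    print(f"{label}: about {weight_memory_gb(GPT3_PARAMS, nbytes):.0f} GB")
```

Even at 16-bit precision, the weights alone exceed the memory of any single GPU available at the time, which is why running such a model meant spreading it across several accelerators.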
The prohibitive costs of training and running LLMs were themselves a moat that made the models accessible only to wealthy organizations. OpenAI used its first-mover advantage to establish itself as the baseline. GPT-3 and later ChatGPT and GPT-4 became the de facto go-to models for building LLM applications.
As other big tech companies scrambled to catch up and outspend each other, smaller players could only hope to buy access to models through APIs.
However, a study by DeepMind researchers in 2022 presented the idea that you did not need a huge model to achieve state-of-the-art results. The study, which became known as the Chinchilla paper, suggested that a small model trained on a very large dataset could match the performance of larger models. With 70 billion parameters, the Chinchilla model outperformed other state-of-the-art LLMs of the time, according to the researchers.
While DeepMind did not open-source Chinchilla, the training recipe triggered a new direction of research. In February 2023, Meta released Llama, a family of LLMs ranging from 7 to 65 billion parameters. Llama was trained on up to 1.4 trillion tokens, as opposed to GPT-3’s 300 billion tokens.
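One way to see the shift in training recipe is to compare how many training tokens each model saw per parameter. The sketch below uses only the figures quoted above, and it assumes the 1.4 trillion token figure applies to the largest (65-billion-parameter) Llama model.

```python
def tokens_per_param(num_tokens: float, num_params: float) -> float:
    """Ratio of training tokens to model parameters."""
    return num_tokens / num_params


# Figures as quoted in the article; pairing 1.4T tokens with the 65B model is an assumption.
models = {
    "GPT-3 (175B params, 300B tokens)": (300e9, 175e9),
    "Llama (65B params, 1.4T tokens)": (1.4e12, 65e9),
}

for name, (tokens, params) in models.items():
    print(f"{name}: {tokens_per_param(tokens, params):.1f} tokens per parameter")
```

By this measure, the Llama model saw more than ten times as much data per parameter as GPT-3, which is the smaller-model, more-data trade that the Chinchilla study pointed to.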
Llama was resource-efficient and highly performant, and it was comparable to ChatGPT on several key benchmarks. And Llama was open source, which meant organizations could run it directly on their own servers at very low cost, even on a single GPU.
Following the release of Llama, a slew of other open-source models were released, each building on and improving the previous ones. Many came with licenses that allowed developers to create commercial products with them.
Model compression, quantization, low-rank adaptation, and other techniques developed throughout the year made it increasingly convenient for companies to adopt open-source models for their applications. New programming frameworks, low-code/no-code tools, and online platforms made it easier to customize and run LLMs on company infrastructure. 2024 already promises innovations such as high-performance LLMs running on edge devices.
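A quick sketch shows why quantization and low-rank adaptation lower the barrier so much. The 7-billion-parameter size is the smallest Llama model mentioned above; the 4096 x 4096 layer shape and the rank of 8 are illustrative assumptions of mine, not figures from any particular model card.

```python
def quantized_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage after quantization, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Low-rank adaptation trains two small matrices (d_out x rank and rank x d_in)
    instead of updating the full d_out x d_in weight matrix."""
    return rank * (d_out + d_in)


# Quantization: the same 7B-parameter model at 16 bits vs. 4 bits per weight.
print(f"16-bit: {quantized_size_gb(7e9, 16):.1f} GB")
print(f"4-bit:  {quantized_size_gb(7e9, 4):.1f} GB")

# Low-rank adaptation on one hypothetical 4096 x 4096 layer with rank 8.
full = 4096 * 4096
lora = lora_trainable_params(4096, 4096, rank=8)
print(f"LoRA trains {lora:,} of {full:,} weights ({lora / full:.2%})")
```

Under these assumptions, 4-bit quantization cuts weight storage to a quarter of the 16-bit size, and the adapter trains well under one percent of the layer's weights, which is what brings fine-tuning within reach of a single consumer GPU.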
To be fair, OpenAI still has an advantage in model performance. I still haven’t seen any model that matches GPT-4. But several open-source models already match or outperform GPT-3.5. It is only a matter of time before they reach GPT-4 and other state-of-the-art models.
The open-source models took away the advantage of big tech and commoditized the market. As the costs of switching drop, more and more companies will be incentivized to move from GPT-4 to low-cost open-source models. Even if these models don’t match GPT-4, most enterprises have specialized needs that can be met with a carefully fine-tuned model that runs at a fraction of the cost and meets other requirements such as data ownership and privacy.
GPT Store, engagement, and integration
Without its infrastructure moat, OpenAI needed to move on to other fronts to ensure the defensibility of its business. And it has already made some strategic moves to build new moats.
An important part of this strategy is creating network effects around ChatGPT, its flagship product. GPT Store, first announced in November, launched last week. It is the AI version of Apple’s App Store, allowing users and developers to share their customized versions of the LLM for others to use. Most GPTs are whimsical, but some of them will be very useful and improve productivity. OpenAI will also provide enterprise features, allowing organizations that sign up for the ChatGPT Team plan to have their own private GPT Store.
The basic idea is that, with enough critical mass, users will stick to ChatGPT and more users will sign up for the ChatGPT Plus plan to access the GPT Store. And developers will stick to the platform where their assistants will be exposed to more users. The mass usage will also create free publicity for the company as more content will be published about ChatGPT, further establishing it as the de facto place to go for LLM apps.
OpenAI is reinforcing the network effects with monetization. According to its website, in Q1 2024, “US builders will be paid based on user engagement with their GPTs.” This means OpenAI will be incentivizing maximum engagement to improve the stickiness of the product. But it could also have the adverse effect of replicating the engagement-driven problems of social media.
At the same time, the company will be reinforcing its data network effects to constantly improve its products. If you’re on the free plan, OpenAI will collect your data and use it to further train its models. If you’re on the Plus plan, your data will still be collected unless you opt out of the data collection program.
Another important effort to increase switching costs is to reduce the costs of running ChatGPT. In a recent interview with Bill Gates, OpenAI CEO Sam Altman said that the company had managed to reduce the costs of running LLMs by a factor of 40. Reduced costs will enable OpenAI to launch more features for both the free and paid tiers of users as open-source LLMs continue to catch up with ChatGPT.
OpenAI is also preparing itself for the future. There are talks about OpenAI working on its own device, probably built around its LLM. This will give it the power of vertical integration, like Apple’s iron grip on the iOS ecosystem. We are seeing what might be the beginning of a new paradigm shift in computing. As the field develops and new computing paradigms emerge, OpenAI will be ready to launch its vertical stack.