What GPT-3.5 Turbo's fine-tuning feature means for the LLM market
This week, OpenAI announced the release of fine-tuning for GPT-3.5 Turbo, the model behind the free version of ChatGPT.
The new feature will have important benefits for businesses and enterprises, enabling them to create their own version of ChatGPT trained on their own data. It can also help reduce the costs of using ChatGPT and open the way for its use in more areas.
This is a reflection of the fast-changing market for large language models (LLM).
Key findings:
LLMs like GPT-3 were meant to be able to perform many tasks with no training or few-shot learning and prompt engineering
However, in practice, vanilla LLMs often fail to perform robustly on many specialized tasks, which is why fine-tuning is crucial
OpenAI currently supports fine-tuning for GPT-3.5 Turbo 4k; similar features will be added to the 8k model and GPT-4 in the future
The benefits and cost savings of fine-tuning are not very straightforward and depend on your application and training data
But what is for sure is that the fine-tuning feature fills a big gap between GPT-3.5 and GPT-4, providing developers with more options
One good technique is to start with GPT-3.5 Turbo 16k or GPT-4 and then gather training data for your own fine-tuned model
The market for LLMs is fast changing—in a few months, we’ve gone from “one model to rule them all” to an ecosystem of specialized models
The growing number of open source LLMs that can be customized with little costs were starting to eat into OpenAI’s market
Fine-tuning will put OpenAI on par with some of its open source competitors, but some companies might still prefer running their own customized models
Read the full article on TechTalks.
Recommendations:
My go-to platform for working with ChatGPT, GPT-4, and Claude is ForeFront.ai, which has a super-flexible pricing plan and plenty of good features for writing and coding.
To learn how to create applications with OpenAI’s API, I recommend GPT-3: Building Innovative NLP Products Using Large Language Models by Sandra Kublik and Shubham Saboo
To learn about how ChatGPT works, I recommend Stephen Wolfram’s What Is Chatgpt Doing and Why Does It Work?
For more on ChatGPT: