Teaching GPT-3 to express its uncertainty
It’s fascinating how, in just a few years, large language models (LLMs) went from an intriguing new deep learning architecture (the transformer) to one of the hottest areas of AI research. Of special interest is the capacity of LLMs such as OpenAI’s GPT-3 and DeepMind’s Gopher to generate long sequences of (mostly) coherent text.