Zero-shot reasoning with large language models (with some caveats)
Large language models (LLMs), neural networks trained on huge corpora of text (or other types of data), have become a hot topic of discussion in the artificial intelligence community, especially since a Google engineer claimed that one of the company’s LLMs was sentient.
On the one hand, large language models can perform wonderful feats, generating long stretches of mostly coherent text that create the impression they have indeed mastered human language and its underlying skills.
On the other hand, numerous experiments suggest that LLMs merely parrot their training data: they produce impressive results only because they have been exposed to huge amounts of text, and they break as soon as they face tasks that require reasoning, common sense, or skills that humans learn implicitly.
But a new study by researchers at the University of Tokyo shows that if you provide LLMs with well-crafted prompts, you can steer them toward answering questions that require reasoning and step-by-step thinking. The researchers present a method called “zero-shot chain-of-thought” prompting, which uses a special trigger phrase in the prompt to push the LLM through the steps required to solve a problem. And though simple, the method frequently works well.
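The idea can be sketched in a few lines. In the paper, the trigger phrase is “Let’s think step by step,” and the method runs in two stages: first elicit the model’s reasoning, then feed that reasoning back to extract a concise final answer. The sketch below only builds the two prompts; the actual LLM completion call is left out, since any model API could fill that role.

```python
# Sketch of zero-shot chain-of-thought prompting (two-stage variant
# described in the paper). Only the prompt construction is shown; the
# LLM call itself is whatever completion API you have available.

REASONING_TRIGGER = "Let's think step by step."

def build_reasoning_prompt(question: str) -> str:
    """Stage 1: append the trigger phrase to elicit step-by-step reasoning."""
    return f"Q: {question}\nA: {REASONING_TRIGGER}"

def build_answer_prompt(question: str, reasoning: str) -> str:
    """Stage 2: feed the generated reasoning back to extract the final answer."""
    return (
        f"Q: {question}\nA: {REASONING_TRIGGER} {reasoning}\n"
        "Therefore, the answer is"
    )

if __name__ == "__main__":
    question = (
        "A juggler has 16 balls. Half are golf balls, and half of the "
        "golf balls are blue. How many blue golf balls are there?"
    )
    print(build_reasoning_prompt(question))
```

In practice, the model’s completion of the first prompt (its reasoning chain) is passed as the `reasoning` argument to the second prompt, whose completion yields the short final answer.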
While other studies contest the notion that LLMs can reason, zero-shot CoT suggests that, if you know how to query LLMs, they are better positioned to provide a reasonable answer.
Read the full article on TechTalks.
For more on large language models: