"Planning like a graph" improves LLM performance in asynchronous tasks
LLMs perform very poorly at planning asynchronous tasks. But formulating the task as a graph can help improve their performance.
One of the great challenges for LLMs is asynchronous planning: tasks that have multiple steps, some of which can be done in parallel while others must be executed sequentially.
For example, baking a cake involves many steps, such as preparing the dough, preheating the oven, baking the cake, preparing the frosting, cooling the cake, and adding the frosting. Some of these steps can be done in parallel (e.g., preparing the dough and preheating the oven) while others must happen in sequence (e.g., baking the cake before adding the frosting).
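The optimal plan for such a task is the critical path of its dependency graph. As a minimal sketch (the step names and durations below are illustrative, not taken from the paper), here is how the cake example can be modeled and solved:

```python
# Minimal sketch: the cake-baking task as a dependency graph.
# Durations (in minutes) and dependencies are illustrative assumptions.
from functools import lru_cache

# step -> (duration, prerequisite steps)
STEPS = {
    "prepare_dough":    (20, []),
    "preheat_oven":     (15, []),
    "bake_cake":        (30, ["prepare_dough", "preheat_oven"]),
    "prepare_frosting": (10, []),
    "cool_cake":        (20, ["bake_cake"]),
    "add_frosting":     (5,  ["cool_cake", "prepare_frosting"]),
}

@lru_cache(maxsize=None)
def earliest_finish(step: str) -> int:
    """Earliest time a step can finish if independent steps run in parallel."""
    duration, deps = STEPS[step]
    return duration + max((earliest_finish(d) for d in deps), default=0)

# The optimal plan length is the longest path through the graph.
total_time = max(earliest_finish(s) for s in STEPS)
print(total_time)  # 75: dough (20) -> bake (30) -> cool (20) -> frost (5)
```

A naive sequential plan would take the sum of all durations (100 minutes here); exploiting parallelism cuts that to the 75-minute critical path. Reasoning about exactly this kind of structure is what asynchronous planning demands of an LLM.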
A new study by researchers at the University of Oxford and other institutes shows that advanced prompting techniques improve the performance of LLMs on complex tasks. In particular, when tasks are formulated as graphs, LLMs become much better at optimizing their plans.
The researchers created Asynchronous WikiHow (AsyncHow), a benchmark with more than 1,600 real-life planning problems.
They tested closed and open LLMs on AsyncHow with four different prompting techniques: zero-shot, few-shot, Chain-of-Thought (CoT), and few-shot CoT. GPT-4 achieves the highest accuracy with few-shot CoT. But even that leaves much to be desired.
To further improve performance, the researchers propose Plan Like a Graph (PLaG), a technique that instructs the model to first build a graph from the subgoals of the task and then plan over that graph using the dependencies and step durations. Several studies show that graphs can improve the performance of LLMs on reasoning and planning tasks, and according to the findings, PLaG is well suited to asynchronous planning.
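As a rough illustration of the idea (the wording below is a paraphrase of the approach, not the paper's exact prompt), a PLaG-style prompt asks the model to make the dependency graph explicit before computing the plan:

```python
# Illustrative sketch of a PLaG-style prompt: have the model build an
# explicit dependency graph before reasoning about the optimal plan.
# This is a paraphrase of the technique, not the paper's actual prompt.

TASK = (
    "Bake a cake. Steps and durations: prepare dough (20 min), "
    "preheat oven (15 min), bake cake (30 min), prepare frosting (10 min), "
    "cool cake (20 min), add frosting (5 min)."
)

def plag_prompt(task: str) -> str:
    """Wrap a planning task in graph-construction instructions."""
    return (
        f"Task: {task}\n\n"
        "1. List the steps as nodes of a graph.\n"
        "2. Add a directed edge A -> B whenever step B cannot start "
        "before step A finishes.\n"
        "3. Assuming independent steps run in parallel, use the graph "
        "to compute the shortest total time to complete the task.\n"
    )

print(plag_prompt(TASK))
```

The resulting string would be sent to the model in place of the bare task description, nudging it to reason over explicit dependencies rather than a flat list of steps.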
PLaG can be applied to off-the-shelf models such as GPT-4 and improve their performance. But it still suffers when faced with out-of-distribution examples, which is in line with other findings on the limits of the reasoning capabilities of current LLMs.
Some promising directions to improve PLaG would be to integrate it with other techniques, such as reinforcement learning, and add more elements such as resource constraints and multimodality.
Read more about PLaG on TechTalks.
Read the paper on arXiv.