As these models are currently optimized to produce a statistically likely continuation of the given text (often referred to as “auto-regressive generation”), combining their ability to generate natural language with planning intentional communication, and, in particular, with performing common-sense reasoning and solving logical problems, is a major research focus in the AI community [1, 2]. If achieved, this could potentially allow for more autonomous, intelligent systems capable of performing tasks that currently require human-level planning and reasoning skills.
A large amount of this work is being performed by research labs at large tech companies and is not open to the public. Recently, a number of unverified claims about an alleged breakthrough in achieving Artificial General Intelligence by OpenAI researchers working on a project referred to as “Q*” have been circulating in the media. While there are no publicly available details of the project, it is likely that OpenAI researchers, like many other AI labs, have been working on enhancing language models with reasoning capabilities, with solving school-level mathematical problems as one of the first steps. However, given the time since their latest publication on the subject, which demonstrated important progress on this task, and the complexity of the problem, it is likely that the scale of the possible breakthrough achieved by Q*, despite its possible significance, is currently largely exaggerated. Nevertheless, given the importance of the problem and the high interest of the research community, more advances in the area are likely to be made in the future, possibly by combining language models with planning algorithms and Reinforcement Learning.
[Footnotes: “Let’s Verify Step by Step”; “Large Language Models Still Can’t Plan”]