Introduction
Artificial intelligence has transformed dramatically over the years. Initially, AI systems operated through pattern matching, leveraging rule-based logic to interpret inputs and produce structured outputs. These early systems, while effective in narrow domains, lacked adaptability and struggled with the complexity of real-world reasoning.
The advent of Large Language Models (LLMs) brought about a significant shift. These models learned from vast amounts of text, using statistical methods to generate human-like responses. Yet, despite their ability to produce fluent and contextually relevant text, LLMs primarily excel at pattern completion or next-token generation rather than deeper reasoning.
Now, we are moving into a new phase of AI: Reasoning Language Models (RLMs) and Large Reasoning Models (LRMs). These models are designed not just to predict words but to think through problems systematically. By integrating reinforcement learning, structured inference, and dynamic reasoning efforts, these models go well beyond text generation to engage in logical decision-making. OpenAI’s o-models (o1 and o3 variants), DeepSeek-R1, and Thinking Mode for other models exemplify this shift, demonstrating how AI can optimize its own reasoning process while maintaining efficiency.
The Shift from Pattern Recognition to Structured Reasoning
For decades, AI development has followed a trajectory of increasing complexity:
Pattern Matching and Symbolic AI: Early systems relied on fixed rules, enabling structured but rigid decision-making.
Statistical Machine Learning and LLMs: These models brought vast improvements in text fluency, but they lacked structured, multi-step reasoning, relying instead on patterns learned from data to predict each next token.
Reasoning AI: The latest generation combines both traditions, incorporating reinforcement learning, structured inference, and methods like Monte Carlo Tree Search (MCTS) so the model can solve problems logically, much as a human would, while still generating text with a neural network, much as an LLM does.
Understanding Reasoning AI
Building reasoning models requires a structured approach that enables AI to engage in logical problem-solving rather than mere pattern completion. A paper from ETH Zurich lays out an excellent blueprint of the significant components of a reasoning model and the data flow through it. A well-designed blueprint, therefore, consists of several essential components that define how reasoning models process and evaluate information.
Core Components for Reasoning
Representation of Knowledge – Effective reasoning AI requires structured data representation within an inference pipeline. This can leverage knowledge graphs, embeddings, and symbolic representations that allow the model to organize and retrieve relevant information efficiently.
Reasoning Scheme – To plan and proceed with reasoning, the model needs a scheme that defines how the individual steps connect: whether it builds structures such as chains, trees, or graphs as it works through an input task, generating a response at each step.
Search and Planning – Reasoning models employ algorithms like Monte Carlo Tree Search (MCTS) to evaluate multiple solution pathways, apply operators that generate, aggregate, or refine candidate steps, and use policy or value models to orchestrate the search.
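To make the search-and-planning component concrete, here is a minimal UCT-style MCTS sketch over a toy problem: pick one number per step so a three-step sequence sums to a target. This is an illustration of the algorithm's selection, expansion, evaluation, and backpropagation phases, not the actual search procedure of any production reasoning model; the reward function, tree depth, and exploration constant are all assumptions chosen for the example.

```python
import math

# Toy problem: choose DEPTH numbers from ACTIONS so their sum hits TARGET.
TARGET, DEPTH, ACTIONS = 7, 3, (1, 2, 3)

def reward(seq):
    # Higher reward the closer the finished sequence sums to TARGET.
    return 1.0 / (1.0 + abs(TARGET - sum(seq)))

class Node:
    def __init__(self, seq):
        self.seq, self.children = seq, []
        self.visits, self.value = 0, 0.0

def uct(parent, child, c=1.4):
    # Unvisited children are tried first; otherwise balance the average
    # reward (exploitation) against an exploration bonus.
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(
        math.log(parent.visits) / child.visits)

def search(iterations=2000):
    root = Node([])
    for _ in range(iterations):
        node, path = root, [root]
        while len(node.seq) < DEPTH:        # selection / expansion
            if not node.children:
                node.children = [Node(node.seq + [a]) for a in ACTIONS]
            parent = node
            node = max(node.children, key=lambda ch: uct(parent, ch))
            path.append(node)
        r = reward(node.seq)                # terminal evaluation
        for n in path:                      # backpropagation
            n.visits += 1
            n.value += r
    node = root                             # extract most-visited path
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
    return node.seq
```

In a reasoning model, the "actions" would be candidate reasoning steps proposed by the generator, and the reward would come from a value model rather than a hand-written formula, but the orchestration logic is the same.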
These components are evident in state-of-the-art reasoning models, from DeepSeek-R1’s reinforcement learning framework to OpenAI’s o3-mini, which dynamically adjusts reasoning effort based on task complexity. The integration of these elements is key to making AI act like a problem solver instead of a token generator.
Basic Reasoning Structures
The emergence of reasoning models has been fairly gradual, and several prompting methods were developed to elicit a reasoning flow from LLMs. For instance, Chain-of-Thought (CoT) remains a heavily used technique in production environments where a task needs an implicit structure leading to its conclusion. The success of CoT also inspired more complex structures within this process, such as Tree-of-Thoughts and Graph-of-Thoughts, which extend the approach from a single sequence to more dimensions and flow control. Introduced mostly in 2023, these methods forged the path for deeper inclusion of structure within the model's operation.
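The CoT pattern above can be sketched in a few lines: prompt for step-by-step reasoning, sample several chains, and take a majority vote over the final answers (the "self-consistency" variant). The `generate` function below is a stub standing in for a real LLM call, and the prompt template and "Final answer:" marker are assumptions for illustration, not any vendor's interface.

```python
import re
from collections import Counter

# A classic CoT-style prompt: ask the model to reason step by step.
COT_PROMPT = (
    "Q: A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A: Let's think step by step."
)

def generate(prompt, seed):
    # Stub: a real system would sample a reasoning chain from an LLM here.
    chains = [
        "23 - 20 = 3 apples left. 3 + 6 = 9. Final answer: 9",
        "They used 20 of 23, leaving 3; buying 6 gives 9. Final answer: 9",
        "23 + 6 = 29, minus 20 is 9. Final answer: 9",
    ]
    return chains[seed % len(chains)]

def extract_answer(chain):
    # Parse the number after the agreed-upon "Final answer:" marker.
    match = re.search(r"Final answer:\s*(-?\d+)", chain)
    return match.group(1) if match else None

def self_consistency(prompt, samples=5):
    # Sample several chains and return the majority-vote answer.
    votes = Counter(extract_answer(generate(prompt, s)) for s in range(samples))
    return votes.most_common(1)[0][0]

print(self_consistency(COT_PROMPT))  # -> 9
```

Tree-of-Thoughts and Graph-of-Thoughts generalize this: instead of independent linear chains, candidate steps are expanded, scored, and pruned as nodes in a tree or graph.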
OpenAI’s Advances in Reasoning Models
In September 2024, OpenAI introduced the o1 models as a first step toward end-to-end models with implicit thinking or reasoning. This was followed by o3, charting a clear path that OpenAI is taking toward reasoning. These models show significant improvements on objectives such as STEM reasoning and problem solving.
In addition to achieving reasoning objectives, this family of models dynamically adjusts the computational effort spent on analysis, making reasoning available for relatively low-latency tasks as well. The productization of these models is a key enabler for businesses that rely on complex AI operations: deeper analysis of financial or medical content, reasoning in processes where an AI task must decide among multiple options, and, eventually, AI agents in which flow control is combined with deeper reasoning.
DeepSeek-R1: Reinforcement Learning for Self-Improving AI
DeepSeek-R1, an open-source reasoning model, was released earlier this year, creating a major buzz in the reasoning space. The availability of open-source reasoning models could prove to be one of the most important milestones in the progress of AI models as we move toward agents and more integrated objectives.
R1 also brought a paradigm shift in the amount of compute required to train such a model, leveraging distillation and Group Relative Policy Optimization (GRPO). Another key strength of R1 is that, unlike static models that rely solely on pre-trained knowledge, DeepSeek-R1 learns from its own successes and failures during training, adapting over time. A particularly intriguing feature of DeepSeek-R1 is its "Aha Moment" capability, where the model recognizes inconsistencies in its reasoning and self-corrects. This marks a shift toward AI systems that can not only generate responses but also critique and refine their own reasoning.
Why This Matters: The Future of AI Reasoning
The move toward reasoning models is not just an incremental improvement—it’s a fundamental change in how AI systems operate and can be used for complex tasks. The implications are vast:
AI becomes more reliable
By moving beyond purely autoregressive generation to incorporate structured reasoning, AI models become less prone to hallucinations and can produce outputs that are logically grounded. This reliability is crucial in fields like medicine, finance, and legal analysis, where AI needs to evaluate facts rather than merely predict text sequences.
AI agents become more intelligent
With structured reasoning, AI assistants can plan, strategize, and solve problems in a way that mirrors human thinking. This means AI copilots for engineering, research, and financial modeling will be able to provide insights that go beyond surface-level text generation, and AI agents can decide next steps in execution with deeper reasoning. OpenAI’s Deep Research is already carving a path in this direction.
AI can help with deeper unsolved tasks
Reasoning models, with their ability to think through a process, are gaining popularity in mathematics and physics. The assistance these models can provide in proving theorems could help with unsolved problems, or at least drive significant advances in our understanding of those fields.
Conclusion
The release of ChatGPT in November 2022 marked a transformative moment in AI research, establishing a new foundation for innovation. LLaMA further accelerated this progress through its open release, fostering broader accessibility and experimentation. Since then, the development of reasoning models and R1's open-source advancements have been among the most significant breakthroughs, driving AI capabilities forward at an unprecedented pace.
Whether these models are applied to small-scale AI tasks or integrated into entirely new agentic workflows, their growing reliability and structured improvements make them invaluable. The rapid evolution of this technology excites me, and I look forward to witnessing the next major open-source contribution that will shape the future of AI.