CoT prompting is an effective way to get large models to solve complex tasks beyond the scope of simple instructions. This deep dive builds an intuition for CoT, discusses different techniques, and explores its applications.

March 24, 2024

- Chain of Thought prompting, or CoT, is a way to teach a model to solve complex problems by breaking them into smaller subproblems.
- CoT is effective at eliciting correct outputs for complex problems that were not solvable with a simple prompt or even zero-shot prompting.

Large Language Models (LLMs) have emerged as powerful tools capable of a wide range of tasks: instruction following, summarization, text generation, explanation, code generation, and so on. However, they tend to struggle with complex tasks that require multi-step reasoning, bringing together information from disparate sources, or following a logical sequence of events.

A good (though slightly inaccurate) mental model for how LLMs work is pattern recognition. Give a model enough examples of any task or topic, and it can generalize to similar tasks and accomplish what users expect. By themselves, however, these models do not generalize or build upon specific tasks. Hence the struggle with logical reasoning, multi-step problems, and similar tasks where the models do not produce good enough outputs.

You probably did not learn it explicitly in high school, but if you are asked to add two 15-digit numbers, you can get to the final answer with pen, paper, and some patience. You start by adding the digits in the units place and move leftward step by step until you arrive at a final answer. Problems like these are easy for humans to solve provided they follow the right approach: breaking the bigger problem into smaller subproblems and solving them one at a time to arrive at the larger solution.

Of course, while this is easy for humans, an LLM would struggle to add two 10- or 15-digit numbers, because it has not been given enough examples of how to do that. [1] What if we teach an LLM to follow the same intuitive logic: break these seemingly complex problems into smaller subproblems and use them to arrive at an answer?

At its core, Chain of Thought is a reasoning technique that enables LLMs to solve complex tasks by teaching them how to break those tasks into smaller subproblems. The technique, its emergent properties, and its evaluations were first discussed in a 2022 paper by Google Research.

CoT encourages LLMs to externalize their thought process by generating a sequence of intermediate steps or "thoughts" before arriving at a final answer. This approach is inspired by the way humans often reason through complex problems by breaking them down into smaller, more manageable pieces and following a logical chain of inferences. This not only improves the model's performance on complex tasks but also enables models to solve problems that initially seemed beyond their reach.

*The next couple of paragraphs might read like a CS theory class, but this is the best way to explain how to implement CoT.*

**Computational Complexity Theory** is a branch of computer science that studies the inherent complexity of computational problems and the resources (such as time and space) required to solve them. It provides a framework for classifying problems based on their computational difficulty and for analyzing the efficiency of algorithms designed to solve them. Yes, all the space and time complexity analysis some of you studied in undergrad is part of complexity theory.

Different techniques in CoT utilize core concepts from complexity theory to enable a model to break a complex problem into smaller problems. These techniques leverage those insights to help a model get to the right answer quickly and efficiently.

In Computational Complexity Theory, the Divide and Conquer paradigm is a widely used technique for solving complex problems by breaking them down into smaller subproblems, solving each subproblem independently, and then combining the solutions to obtain the final result.

For CoT reasoning, this technique can be applied by prompting the LLM to break down a complex problem into smaller, more manageable steps. Each step can then be tackled individually, and the intermediate solutions can be combined to arrive at the final answer. Our 15-digit addition problem falls under the divide and conquer technique.
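As a sketch, the digit-by-digit procedure a CoT prompt would walk the model through can be written out in code (the numbers below are arbitrary examples):

```python
def add_by_digits(a: str, b: str) -> str:
    """Add two large numbers digit by digit, right to left -- the same
    subproblem decomposition a CoT prompt would spell out."""
    # Pad to equal length so digits line up column by column.
    a, b = a.zfill(len(b)), b.zfill(len(a))
    carry, digits = 0, []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry  # one column = one subproblem
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

print(add_by_digits("987654321098765", "123456789012345"))  # → 1111111110111110
```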

An example problem would be:

`Problem: Given an unsorted array of integers, find the kth smallest element in the array.`

`Example: Input: arr = [7, 10, 4, 3, 20, 15], k = 3`
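One divide-and-conquer solution is quickselect: partition the array around a pivot, then recurse only into the part that contains the kth element. A minimal sketch:

```python
def kth_smallest(arr, k):
    """Return the kth smallest element (1-indexed) via quickselect,
    a classic divide-and-conquer algorithm."""
    pivot = arr[len(arr) // 2]
    smaller = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    larger = [x for x in arr if x > pivot]
    if k <= len(smaller):
        # The answer lies in the smaller partition.
        return kth_smallest(smaller, k)
    if k <= len(smaller) + len(equal):
        # The pivot itself is the kth element.
        return pivot
    # Otherwise recurse into the larger partition with an adjusted k.
    return kth_smallest(larger, k - len(smaller) - len(equal))

print(kth_smallest([7, 10, 4, 3, 20, 15], 3))  # → 7
```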

*you see why it sounded like a theoretical CS class*

Dynamic programming in complexity theory involves breaking down a complex problem into smaller overlapping subproblems, solving each subproblem once, and storing the solutions for future reuse. A classic use case in CS is matrix chain multiplication.

For CoT prompting this technique can be applied by prompting the LLM to identify and solve subproblems that may be encountered multiple times within a larger problem. The solutions to these subproblems can then be cached and reused, improving efficiency and reducing redundant computations.

An example problem in math would be asking an LLM to find the nth number in the Fibonacci sequence.
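A memoized Fibonacci, where each subproblem is solved once and cached, mirrors what a dynamic-programming-style CoT prompt asks the model to do:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Each subproblem fib(n) is computed once and cached,
    # just as a CoT prompt reuses intermediate results.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(10))  # → 55
```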

**A non-strictly-math example:** You are planning a road trip from New York to Los Angeles, and you want to make the most of your journey by visiting interesting cities along the way. You have a list of cities you'd like to visit, along with the distances between each pair of cities. Your goal is to find the shortest route that visits all the desired cities. (Put this as a prompt in any model and ask it to generate a CoT prompt to get the right answer.)
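For a small list of cities, the route problem can be sketched with brute force over orderings (a full dynamic-programming solution would cache subsets of cities à la Held–Karp; brute force is shown here for clarity). The distances below are made-up illustrative figures:

```python
from itertools import permutations

# Hypothetical road distances in miles (illustrative, not real figures).
dist = {
    ("New York", "Chicago"): 790, ("New York", "Denver"): 1780,
    ("New York", "Los Angeles"): 2790, ("Chicago", "Denver"): 1000,
    ("Chicago", "Los Angeles"): 2015, ("Denver", "Los Angeles"): 1015,
}

def d(a, b):
    # Distances are symmetric; look up either direction.
    return dist.get((a, b), dist.get((b, a)))

def shortest_route(start, end, stops):
    # Enumerate every ordering of the intermediate stops; keep the shortest.
    def length(path):
        return sum(d(a, b) for a, b in zip(path, path[1:]))
    return min(((start,) + p + (end,) for p in permutations(stops)), key=length)

print(shortest_route("New York", "Los Angeles", ("Chicago", "Denver")))
# → ('New York', 'Chicago', 'Denver', 'Los Angeles')
```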

Greedy algorithms in complexity theory follow the principle of making locally optimal choices at each step, in the hope of finding a global optimum. A loose analogy in LLM pretraining is gradient descent, which takes the locally best step at each iteration to drive the loss toward a minimum.

For CoT reasoning, this technique can be applied by giving examples to an LLM to make locally optimal decisions at each step of the reasoning process, based on the current state and available information.

An example problem where this could be utilized: suppose you are a forest officer tasked with clearing all the roads and pathways inside a forest. You head in one direction and start by clearing the road that needs it most within your field of view. It may not be the worst road in the forest overall, but it is the worst in your path.
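A classic toy example of the same locally-optimal strategy is greedy coin change (with standard US denominations, where the greedy choice happens to be globally optimal):

```python
def greedy_coin_change(amount, coins=(25, 10, 5, 1)):
    """At each step, take the largest coin that still fits -- a locally
    optimal choice, like clearing the worst road currently in view."""
    result = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            result.append(coin)
            amount -= coin
    return result

print(greedy_coin_change(63))  # → [25, 25, 10, 1, 1, 1]
```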

*this is the last one*

Approximation algorithms sacrifice the guarantee of finding an exact optimal solution in favor of finding a solution that is provably close to optimal but can be computed more efficiently.

In CoT, this technique can be applied by prompting the LLM to generate approximate solutions when finding the exact solution is too complex or time-consuming.

An example problem where this could be used: you are travelling from A to B with a limited budget for gas. Given the car's fuel efficiency and the distance, estimate the total cost of the trip.
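The estimate itself is simple arithmetic; the point is that a rough answer computed quickly can be good enough. A sketch (the figures below are made up):

```python
def estimate_trip_cost(distance_miles, mpg, price_per_gallon):
    # A deliberate approximation: traffic, terrain, and driving style
    # all affect real fuel use, but we ignore them for a quick estimate.
    gallons = distance_miles / mpg
    return round(gallons * price_per_gallon, 2)

print(estimate_trip_cost(2800, 30, 3.50))  # → 326.67
```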

These techniques provide structured approaches for breaking down complex tasks, which helps users guide an LLM to the right answer. If you are faced with a specific problem, trying each of these techniques, or pointing a model to think in a particular direction (and using that as another prompt), enables you to generate the output you desire.

As discussed in the previous section, prompting is the primary way to induce step-by-step thinking in models and help them solve complex problems. You can do this either by giving examples in the prompt itself or by simply stating that the model needs to approach the problem step by step. For instance, a prompt could look like: `Question: If there are 3 bookshelves, each with 5 shelves, and each shelf can hold 8 books, how many books can be placed on the bookshelves in total?`

`Chain of Thought:`

1) There are 3 bookshelves

2) Each bookshelf has 5 shelves

3) Each shelf can hold 8 books

4) To find the total, we multiply: 3 bookshelves x 5 shelves x 8 books per shelf

5) 3 x 5 x 8 = 120

`Therefore, the total number of books that can be placed on the bookshelves is 120. `

Supplement it with a similar question that you want the model to answer.

If you do not have examples, use text like this:

`For the following complex question, break down your reasoning step-by-step to arrive at the final answer. First, identify the key information and constraints in the question. Then, outline a high-level approach to solve the problem. Next, go through the solution process methodically, explaining each step. Use principles, equations, or background knowledge as needed to support your reasoning. If you get stuck, consider alternative approaches. Finally, summarize your conclusion and double check it for accuracy and completeness. Let's begin:`

`[Insert complex question here]`

Note that you can specify the technique (e.g., use Greedy) in the prompt itself. That way, a SOTA model will understand how to arrive at a solution quickly.

While prompting can effectively guide the LLM to generate chains of thought, fine-tuning the model on a dataset of examples can further improve its performance and ensure more consistent and reliable behavior. This involves collecting a dataset of question-answer pairs, along with the desired chain of thought sequences, and using them to fine-tune the model's parameters.
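The exact record format depends on the provider, but conceptually each fine-tuning example pairs a question with its chain of thought and final answer. A hypothetical JSONL record (the field names here are illustrative, not any specific API's schema) might look like:

```python
import json

# Hypothetical record layout; real fine-tuning APIs define their own schemas.
record = {
    "prompt": "Question: If there are 3 bookshelves, each with 5 shelves, "
              "and each shelf holds 8 books, how many books fit in total?",
    "completion": "Chain of Thought: 3 bookshelves x 5 shelves x 8 books "
                  "per shelf = 120. Therefore, the answer is 120.",
}
print(json.dumps(record))  # one line per record in a JSONL training file
```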

This saves you tokens in the prompt and adds a layer of predictability to the model. With enough examples, the model tends to generalize these kinds of solutions even to non-CoT problems.

CoT reasoning can significantly enhance LLMs' ability to solve complex mathematical problems by breaking them down into a series of logical steps, similar to how humans approach such tasks.

Many real-world problems require commonsense reasoning, which involves drawing inferences from general knowledge and understanding implicit relationships. CoT reasoning can help LLMs better navigate these challenges by externalizing their thought processes.

By encouraging LLMs to generate explicit chains of thought, CoT reasoning contributes to the interpretability and transparency of these models, making it easier to understand their decision-making processes and rationale.

CoT reasoning can be leveraged in educational settings to provide more insightful and human-like explanations, enhancing the learning experience and fostering a deeper understanding of complex concepts.

In domains such as biology, chemistry, and physics, where complex reasoning and hypothesis testing are crucial, CoT reasoning can assist LLMs in generating more structured and logically sound explanations and predictions.

Ensuring that LLMs consistently generate coherent and logically sound chains of thought remains a challenge, particularly for more complex tasks or domains with extensive domain-specific knowledge.

Developing robust evaluation metrics and benchmarks to assess the performance of CoT reasoning models is essential for driving progress and enabling fair comparisons between different approaches.

Extending CoT reasoning to multimodal tasks, where reasoning involves integrating information from different modalities (e.g., text, images, videos), is an exciting frontier with numerous practical applications.

Enabling LLMs to engage in interactive and iterative reasoning processes, where they can receive feedback, ask clarifying questions, and refine their chains of thought, could further enhance their reasoning capabilities and make them more human-like.

You can use CoT with your model in multiple ways: ask the model to think step by step, give an example of how a similar problem is broken down, or fine-tune it on similar problems and solutions. You can also apply various tactics from complexity theory, which can produce interesting results depending on the problem.

Pro-tip: if you are ever struggling to get the right prompt, ask a model to generate or suggest the thought process for a subproblem. Then pass that into your prompt to solve the main problem.

**One-word answer: self-attention.**

Architecture-wise, transformers generate autoregressively: each generated token is fed back into the input, i.e., the model's own outputs are recycled. With CoT, the self-attention mechanism enables the model to effectively relate and connect the intermediate "thoughts" generated during the reasoning process. This allows the model to maintain coherence and consistency throughout the chain of thought, ensuring that each step builds upon the previous ones and contributes to the overall logical flow. Furthermore, the parallelized nature of self-attention allows the model to process and reason about multiple intermediate steps simultaneously, potentially leading to more efficient and effective reasoning than in traditional sequential models.

[1] To be precise, because of how these models tokenize numbers, the task is hard for them even with examples. OpenAI tokenizes numbers from left to right, while Anthropic does it from right to left. As a result, digit-by-digit addition lines up more naturally with Anthropic's tokenization than with OpenAI's.
