The Prompt That Changed How I Think About AI Reasoning
Run the same complex brief through Claude Sonnet 4.6 twice — once with a detailed chain-of-thought prompt spelling out every reasoning step, and once with a simple direct instruction. In many cases, the simpler prompt produces a better output. Not because the reasoning matters less, but because the model is already handling it internally.
This is the central confusion facing AI practitioners in 2026. Chain-of-thought (CoT) prompting was the go-to technique for improving reasoning quality for two years. Then extended thinking arrived, followed by adaptive thinking, and suddenly the rules changed. Most practitioners still reach for CoT by default — sometimes it helps their outputs, sometimes it actively constrains them.
Understanding when to use each approach is now one of the highest-leverage prompting skills you can develop.
What Is Chain-of-Thought Prompting?
Chain-of-thought prompting is a technique where you explicitly instruct a model to reason through a problem step by step before delivering a final answer. Instead of asking "What's the best marketing angle for this product?", you ask "First identify the target audience. Then analyse the competitive landscape. Then define the core value proposition. Based on those three steps, recommend the best marketing angle."
The technique works because it transforms how the model processes your request. When forced to reason step by step, models catch logical errors in their own chain, consider more perspectives, and produce more defensible conclusions. Wei et al. (2022) at Google Research established that CoT prompting dramatically improved performance on multi-step reasoning benchmarks — on GSM8K maths word problems, the largest models gained roughly 40 percentage points over direct-answer prompting — and the effect held across multiple model families.
For practitioners using GPT-4 or early Claude models in 2023 and 2024, CoT was the most reliable lever for improving outputs on complex analytical tasks — strategic documents, competitive analyses, multi-step content planning. It required no technical setup. You changed how you wrote the prompt, and the quality improved noticeably.
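If you work through an API rather than a chat window, the difference is nothing more than prompt text. Below is a minimal sketch using the Anthropic Python SDK; the model ID is one plausible choice from that era, and the brief is a placeholder you'd fill in.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

BRIEF = "[Your product or service brief here]"

# Direct prompt: ask for the answer with no reasoning scaffold.
direct_prompt = f"What's the best marketing angle for this product?\n\n{BRIEF}"

# CoT prompt: the reasoning steps are written into the prompt itself.
cot_prompt = (
    "First identify the target audience. "
    "Then analyse the competitive landscape. "
    "Then define the core value proposition. "
    "Based on those three steps, recommend the best marketing angle.\n\n"
    + BRIEF
)

for label, prompt in [("direct", direct_prompt), ("chain-of-thought", cot_prompt)]:
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # placeholder: any non-thinking model
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {label} ---\n{response.content[0].text}\n")
```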
What Is Extended Thinking — and How Is It Different?
Extended thinking is a model-native reasoning mode where the AI performs multi-step deliberation internally before producing any visible output. The key distinction from chain-of-thought prompting: you don't write the reasoning instructions. The model generates its own reasoning strategy for each specific query, runs it internally, and then produces the response.
Anthropic introduced extended thinking with Claude 3.7 Sonnet in early 2025. OpenAI's o1 and o3 model families operate on a similar principle — trained to reason before responding rather than prompted to do so. The structural difference is significant:
--- CoT prompting: You write reasoning steps into the prompt. The model follows your template. The thinking is part of the output.
--- Extended thinking: The model reasons internally in a separate thinking phase. Your prompt doesn't need to specify the steps. The reasoning may surface as a "thinking" block depending on the interface, but it isn't driven by your instructions.
Extended thinking is architecturally distinct from CoT because the reasoning happens at inference time within the model's own logic — not by following a template you've written. The model isn't executing your reasoning plan. It's generating its own.
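This distinction is visible directly in the API. With the Anthropic Python SDK, extended thinking is switched on by a request parameter, not by prompt wording. A minimal sketch follows, assuming a Claude 3.7-era model ID; the token budget is an arbitrary example value you would tune.

```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",  # an extended-thinking-capable model
    max_tokens=16000,                    # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{
        "role": "user",
        # Note: no reasoning steps here. The model generates its own plan.
        "content": "Recommend the best marketing angle for this product: [brief]",
    }],
)

# Reasoning arrives as separate "thinking" blocks, distinct from the answer.
for block in response.content:
    if block.type == "thinking":
        print("THINKING:", block.thinking[:300], "...")
    elif block.type == "text":
        print("ANSWER:", block.text)
```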
What Changed in 2026: Adaptive Thinking in Claude 4.x
Adaptive thinking is the next evolution beyond extended thinking and the default mode in Claude Sonnet 4.6 and Opus 4.6. Rather than applying deep reasoning to every query (which adds latency and cost), adaptive thinking lets the model calibrate how much deliberation each specific question actually requires.
In practice, Claude dynamically decides whether your request needs two seconds of reasoning or thirty. A simple factual lookup gets a direct answer. A complex strategic analysis triggers deeper multi-step reasoning. According to Anthropic's internal evaluations, adaptive thinking reliably outperforms fixed extended thinking on benchmarks — it applies the right amount of reasoning effort rather than maximum effort regardless of need.
This matters for how you prompt. With adaptive thinking as the default in Claude 4.x, the model is already doing significant reasoning work on complex tasks before you see a word of output. The question shifts from "how do I make the model reason better?" to "when should I guide the reasoning structure, and when should I step back?"
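If adaptive thinking is the default, the API call itself needs nothing special; the observable difference is how much internal reasoning surfaces per query. The probe below makes two assumptions worth flagging loudly: the "claude-sonnet-4-6" model ID is taken from this article rather than from verified documentation, and it assumes adaptive thinking surfaces "thinking" blocks the same way extended thinking does.

```python
import anthropic

client = anthropic.Anthropic()

queries = [
    "What year was the first iPhone released?",           # trivial lookup
    "Evaluate three market-entry strategies for a B2B "   # genuinely complex
    "SaaS product facing two entrenched competitors.",
]

for query in queries:
    response = client.messages.create(
        model="claude-sonnet-4-6",  # ASSUMPTION: model ID taken from this article
        max_tokens=4096,
        messages=[{"role": "user", "content": query}],
    )
    # ASSUMPTION: adaptive thinking exposes "thinking" blocks like extended
    # thinking does; treat this as a probe, not guaranteed behaviour.
    thinking_chars = sum(
        len(block.thinking)
        for block in response.content
        if block.type == "thinking"
    )
    print(f"{query[:45]!r} -> {thinking_chars} chars of internal reasoning")
```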
When Chain-of-Thought Prompting Still Wins
Even with adaptive thinking handling reasoning automatically on most tasks, there are specific situations where explicit CoT instructions produce better results than leaving the model to reason freely.
When your reasoning must follow a specific framework. If you need an analysis that follows your company's decision matrix — say, a RICE scoring template for prioritising features, or a specific competitive analysis structure your team uses — explicitly laying out the steps ensures the model follows your structure rather than generating its own. Adaptive thinking produces correct reasoning, but not necessarily in the sequence or format your stakeholders expect. (A concrete sketch of a framework-locked prompt follows these four cases.)
When you need reasoning visible inline in the output. Extended thinking in Claude 4.x surfaces a separate "thinking" block, but this isn't always visible in consumer interfaces. If you need the reasoning to appear inline — so a colleague can review the logic alongside the conclusion — CoT prompting still delivers this cleanly and predictably.
When using models without native extended thinking. GPT-4o, Gemini 1.5 Pro, Mistral, and most open-source models do not have extended thinking modes. For these models, CoT remains your primary mechanism for improving reasoning quality on complex tasks. The technique hasn't become obsolete — it's just become model-specific.
When latency is a real constraint. Extended and adaptive thinking add latency. On tasks where speed matters — live content iteration, rapid idea generation, real-time analysis in a meeting — explicit CoT with a constraint ("reason through this briefly, then answer") often produces usable results faster than waiting for deep reasoning to complete.
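To make the first case concrete, here is a sketch of a CoT prompt locked to a RICE prioritisation framework. The rubric scales and feature list are illustrative placeholders; substitute your team's actual values.

```python
# A CoT prompt locked to a RICE framework (rubric values are illustrative).
features = ["inline comments", "offline mode", "SSO login"]  # hypothetical inputs

rice_prompt = (
    "Score each feature below using RICE. Reason step by step for each one:\n"
    "Step 1 - Reach: how many users per quarter does this touch?\n"
    "Step 2 - Impact: rate 0.25 / 0.5 / 1 / 2 / 3.\n"
    "Step 3 - Confidence: rate as a percentage.\n"
    "Step 4 - Effort: estimate in person-months.\n"
    "Step 5 - Compute RICE = (Reach x Impact x Confidence) / Effort.\n"
    "Show each step's value before the final score, then rank all features.\n\n"
    "Features:\n" + "\n".join(f"- {feature}" for feature in features)
)
```

Because every step and scale is spelled out, the model's reasoning lands in exactly the format your prioritisation review expects, which is the point of this use case.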
When to Let the Model Think for Itself
Extended and adaptive thinking consistently outperform user-written CoT in situations where the reasoning complexity exceeds what you can practically structure in a prompt.
Complex multi-step problems you can't fully map in advance. When a task involves chains of logic that are too long or non-linear for a prompt template — debugging a multi-layer workflow, evaluating competing strategic options with many dependencies, planning a complex project — extended thinking produces more coherent outputs. The model generates the right reasoning path for the specific problem rather than following a template that may not fit.
When your CoT prompt is constraining rather than guiding. Over-specified CoT prompts can force incorrect reasoning paths. If you instruct the model to reason A → B → C when the actual logic requires A → C → D, the model follows your instructions rather than correcting you. Extended thinking doesn't have this problem — it finds the right path independently. If you're getting rigid or slightly-off outputs from a detailed CoT prompt, try removing the reasoning instructions and letting adaptive thinking handle it.
Agentic tasks and tool-use loops. When Claude is operating as an agent — using tools, executing multi-step plans, running iterative processes — adaptive thinking manages planning logic internally. Adding explicit CoT reasoning instructions to agentic workflows often creates confusion rather than clarity. Let the model plan its own execution path.
A Practical Decision Framework for 2026
Here is a decision rule that works across models and task types.
Use explicit CoT prompting when:
--- You need reasoning that follows your specific structure or framework (not the model's)
--- You need the reasoning to appear inline in the output for review or collaboration
--- You're using a model without native extended thinking (GPT-4o, Gemini, open-source)
--- Speed matters and you need to constrain how long the reasoning phase takes
Rely on extended or adaptive thinking when:
--- The problem is genuinely complex and you can't anticipate the full reasoning path in advance
--- You're using Claude 4.x or OpenAI o-series models where thinking is native
--- You're running agentic workflows where the model needs to plan its own execution
--- Your current CoT prompt feels like it's constraining outputs rather than improving them
The core reframe: for Claude 4.x users, explicit CoT is a tool for shaping the structure and visibility of reasoning — not for triggering reasoning in the first place. The model handles that on its own.
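If it helps to see the rule as something executable, the checklist above condenses into a small function. This is a literal transcription of the bullets, not an established heuristic, and the priority order among overlapping cases is a judgment call.

```python
def choose_prompting_strategy(
    needs_fixed_framework: bool,
    needs_inline_reasoning: bool,
    model_has_native_thinking: bool,
    latency_critical: bool,
    agentic_workflow: bool,
) -> str:
    """Literal transcription of the decision rule above, not a proven heuristic."""
    if not model_has_native_thinking:
        return "explicit CoT"          # GPT-4o, Gemini, most open-source models
    if agentic_workflow:
        return "native thinking"       # let the model plan its own execution
    if needs_fixed_framework or needs_inline_reasoning:
        return "explicit CoT"          # shape structure and visibility
    if latency_critical:
        return "explicit CoT (brief)"  # constrain the reasoning phase
    return "native thinking"           # default for Claude 4.x and o-series


# Example: complex strategy task, Claude 4.x, no format or latency constraints.
print(choose_prompting_strategy(False, False, True, False, False))
# -> native thinking
```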
Try It Now: Two Prompts to Test This Week
Run both of these on the same complex task — a strategic recommendation, a content brief, a competitive analysis. Compare the outputs and notice what's different.
Prompt Version A — Explicit Chain-of-Thought (structured output):
You are a senior marketing strategist. Work through the following steps before recommending anything:
Step 1 — Identify the target customer segment and their top 3 pain points.
Step 2 — Analyse the competitive landscape: who else addresses this, and how?
Step 3 — Define the unique value proposition for this product.
Step 4 — Based on your analysis in steps 1–3, recommend the three strongest marketing angles and explain why each works.
Task: [Your product or service brief here]
Prompt Version B — Lean prompt with adaptive thinking (depth without structural constraints):
You are a senior marketing strategist. Analyse this brief and recommend the three strongest marketing angles, with your reasoning for each.
Task: [Your product or service brief here]
Version A produces a cleanly structured response following your exact logical sequence — ideal when the structure itself matters. Version B often surfaces insights that don't fit a pre-defined framework — better when you don't yet know what the right questions are. Neither is universally superior. The skill is knowing which the task actually needs.
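To run the comparison programmatically rather than by hand, a small harness like the one below works. The two prompt strings are Versions A and B from above; the model ID is a placeholder for whichever model you have access to.

```python
import anthropic

client = anthropic.Anthropic()

BRIEF = "[Your product or service brief here]"

VERSION_A = (
    "You are a senior marketing strategist. Work through the following steps "
    "before recommending anything:\n"
    "Step 1 - Identify the target customer segment and their top 3 pain points.\n"
    "Step 2 - Analyse the competitive landscape: who else addresses this, and how?\n"
    "Step 3 - Define the unique value proposition for this product.\n"
    "Step 4 - Based on your analysis in steps 1-3, recommend the three strongest "
    "marketing angles and explain why each works.\n"
    f"Task: {BRIEF}"
)

VERSION_B = (
    "You are a senior marketing strategist. Analyse this brief and recommend the "
    "three strongest marketing angles, with your reasoning for each.\n"
    f"Task: {BRIEF}"
)

for label, prompt in [("A (explicit CoT)", VERSION_A), ("B (lean)", VERSION_B)]:
    response = client.messages.create(
        model="claude-sonnet-4-6",  # placeholder: whichever model you're testing
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"=== Version {label} ===\n{response.content[0].text}\n")
```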
The Real Shift: Knowing When Not to Prompt
The highest-leverage prompting insight for 2026 is recognising that advanced models are doing significant reasoning work before you see any output. The productive question is no longer "how do I make the model think harder?" It's "when should I guide the thinking, and when should I step aside?"
Explicit chain-of-thought prompting remains a precision instrument — particularly when you need visible, structured reasoning that follows your framework. But it's no longer the default. With adaptive thinking built into Claude Sonnet 4.6 and Opus 4.6, over-specifying reasoning steps can actively constrain outputs that would otherwise be more creative, more accurate, or more relevant to your actual need.
We understand AI, and we understand you. With UD by your side, AI is never cold. Knowing when to guide the model and when to trust it is one of the most underrated skills in the AI practitioner toolkit. The better you understand how reasoning actually works inside these models, the less prompting you'll need to do to get excellent results.
Ready to Find Out How Your AI Skills Stack Up?
Understanding reasoning modes is one marker of an advanced AI practitioner. But how does your overall AI skill set compare — across prompting, workflow design, tool selection, and output quality? The UD AI IQ Test gives you a personalised benchmark in under 10 minutes, with specific recommendations for where to level up. The UD team will walk you through every step — from reading your results to building the techniques that close your gaps.