Knowledge pack — guides, templates, examples

Chapter 03 — Advanced Strategies

Overview

These strategies build on the core styles from Chapter 02 to increase reliability, factual grounding, or exploration depth. Use them selectively; each adds latency, cost, or complexity.

  • Self‑Consistency: Majority voting across sampled reasoning traces.
  • ReAct: Alternating natural language reasoning with external tool calls.
  • Tree‑of‑Thoughts: Search over branching reasoning states, pruning weak ones.
  • RAG: Inject retrieved context before generation to constrain hallucination.
  • Adversarial / Debate: Pit opposing perspectives against each other to surface blind spots.

Self‑Consistency

Run the same structured reasoning prompt K times with temperature > 0, collect the answers and rationales, then pick the final answer by majority vote, weighted scoring, or another aggregation rule.

# Pseudocode: sample K independent reasoning traces, then vote
answers = []
for i in range(K):
  r = call_model(prompt, temperature=0.7)   # temperature > 0 gives diverse traces
  answers.append(parse_answer(r))           # extract just the final answer
final = majority(answers)                   # most common answer wins
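
A minimal sketch of the two helpers the pseudocode assumes; parse_answer and majority are placeholder names, and the regex presumes the prompt ends each trace with a line of the form "Answer: <value>".

import re
from collections import Counter

def parse_answer(response: str) -> str:
  # Pull out and normalize the text after the "Answer:" marker.
  match = re.search(r"Answer:\s*(.+)", response)
  return (match.group(1) if match else response).strip().lower()

def majority(answers: list[str]) -> str:
  # Most common normalized answer; ties resolve to the first one seen.
  return Counter(answers).most_common(1)[0][0]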

Tradeoff: Linear cost; diminishing returns after ~7 samples.

ReAct (Reason + Act)

The model emits alternating Thought / Action / Observation steps, invoking external tools through the Action step.

You are an agent. Use Thought / Action / Observation.
Tools: search(query), calc(expr)
Question: How many hours between local noon in Paris and 3pm in Tokyo next Friday?
Thought: Need time difference. I'll search.
Action: search("Paris Tokyo time difference")
Observation: Paris UTC+1, Tokyo UTC+9.
Thought: Tokyo 15:00 is 15 - (9 - 1) = 07:00 Paris time; Paris noon is 12:00, so the gap is 12 - 7 = 5.
Final Answer: 5 hours.

Key: Enforce a strict action schema; the server (never the model) must supply real observations.
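
A minimal controller sketch. call_model is the same placeholder as in the Self‑Consistency pseudocode (here assumed to accept a stop-sequence argument), the search tool returns a canned string where a real API would go, and the Action regex is only an illustration of the strict schema mentioned above.

import re

def search(query: str) -> str:
  # Placeholder tool: wire a real search/API call in here.
  return "Paris UTC+1, Tokyo UTC+9."

def calc(expr: str) -> str:
  # Demo only; use a sandboxed expression evaluator in production.
  return str(eval(expr))

TOOLS = {"search": search, "calc": calc}

def react_loop(system_prompt: str, question: str, max_steps: int = 6) -> str:
  transcript = f"{system_prompt}\nQuestion: {question}\n"
  for _ in range(max_steps):
    step = call_model(transcript, stop=["Observation:"])   # stop before the model invents observations
    transcript += step
    if "Final Answer:" in step:
      return step.split("Final Answer:")[-1].strip()
    m = re.search(r'Action:\s*(\w+)\((.*)\)', step)
    if m:
      name, arg = m.group(1), m.group(2).strip().strip('"')
      obs = TOOLS[name](arg) if name in TOOLS else "unknown tool"
      transcript += f"\nObservation: {obs}\n"   # server-supplied, never model-generated
  return "No answer within the step budget."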

Tree‑of‑Thoughts (ToT)

Generalizes Chain‑of‑Thought into a search: branches are alternative partial solutions, and a controller keeps only the top B at each depth.

def tree_of_thoughts(root, D, N, B):
  # D = max depth, N = proposals per state, B = beam width
  frontier = [root]
  for depth in range(D):
    pool = []
    for state in frontier:
      children = propose_next(state, N)   # model call: N candidate continuations
      pool.extend(score(children))        # attach a heuristic score to each child
    frontier = select_top(pool, B)        # beam step: keep the best B states
  return best(frontier)

Heuristics: constraint satisfaction, numerical plausibility, partial reward models.
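
A sketch of the controller's helpers under the same placeholder convention; scoring partial states with a separate "rate this" model call is just one option alongside the heuristics above.

def score(children):
  # Pair each candidate state with a 0-10 rating from an evaluation prompt.
  return [(child, float(call_model(f"Rate 0-10, number only: {child}"))) for child in children]

def select_top(pool, B):
  # Keep the B highest-scoring (state, score) pairs, dropping the scores.
  return [state for state, _ in sorted(pool, key=lambda p: p[1], reverse=True)[:B]]

def best(frontier):
  # Final pick: re-score the surviving states and return the strongest one.
  return max(score(frontier), key=lambda p: p[1])[0]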

Retrieval‑Augmented Generation (RAG)

Retrieves external passages to ground generation and cite sources.

  1. Index documents (embeddings + metadata).
  2. Rewrite / condense user query (optional).
  3. Retrieve top k (diversify with MMR, i.e. maximal marginal relevance).
  4. Assemble constrained prompt with citations.
  5. Generate; fallback if insufficient evidence.

[SYSTEM] Cite only from provided sources as [S#]. If unknown, say you cannot answer.
S1: ...
S2: ...
Question: ...
Answer:

Common issue: Query drift → add query expansion or multi-vector retrieval.
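
A minimal sketch of steps 1–4 over a toy in-memory index of (text, vector) pairs; embed stands in for whatever embedding model the stack actually uses, and plain cosine similarity replaces a real vector store.

import numpy as np

def cosine(a, b):
  return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query: str, index: list, k: int = 3) -> list:
  # Step 3: rank indexed passages by similarity to the (optionally rewritten) query.
  q = embed(query)   # placeholder embedding call
  ranked = sorted(index, key=lambda p: cosine(q, p[1]), reverse=True)
  return [text for text, _ in ranked[:k]]

def build_prompt(question: str, passages: list) -> str:
  # Step 4: constrained prompt with numbered sources for [S#] citations.
  sources = "\n".join(f"S{i+1}: {p}" for i, p in enumerate(passages))
  return ("[SYSTEM] Cite only from provided sources as [S#]. "
          "If unknown, say you cannot answer.\n"
          f"{sources}\nQuestion: {question}\nAnswer:")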

Adversarial / Debate prompting

Opposing roles enumerate their strongest arguments, then a synthesis phase reconciles them.

Topic: Adopt Rust for backend?
Role A (Advocate): list 3 strongest benefits.
Role B (Skeptic): list 3 strongest risks.
Synthesis: Balanced recommendation citing each side.

Tip: Seed explicitly divergent priors so the roles do not simply echo each other.
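
One way to orchestrate the template as three calls, shown below with the same call_model placeholder; the seeded priors in the first two prompts are what keeps the roles from converging.

def debate(topic: str) -> str:
  advocate = call_model(
    f"Topic: {topic}\nYou strongly favor this. "
    "List the 3 strongest benefits, each backed by concrete evidence.")
  skeptic = call_model(
    f"Topic: {topic}\nYou are deeply skeptical. "
    "List the 3 strongest risks, each backed by concrete evidence.")
  # Synthesis sees both transcripts verbatim and must cite each side.
  return call_model(
    f"Topic: {topic}\nAdvocate said:\n{advocate}\n\nSkeptic said:\n{skeptic}\n\n"
    "Give a balanced recommendation that cites at least one point from each side.")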

Strategy selection table

  Need                      Strategy            Why
  Reduce stochastic error   Self‑Consistency    Majority dampens outliers
  Need tools / APIs         ReAct               Structured tool invocation
  Branching reasoning       Tree‑of‑Thoughts    Explores alternatives
  Factual grounding         RAG                 Inject external sources
  Surface bias              Debate              Contrasting priors

Failure modes & mitigations

  • Latency inflation: Too many calls → baseline first.
  • Tool hallucination: Model fabricates outputs → strict server substitution.
  • Branch explosion: Exponential ToT growth → cap depth & beam.
  • Context stuffing: Low-quality retrieval → passage re-ranking.
  • Homogeneous perspectives: Debate agents agree → seed divergence.

Mini checklist

  • Baseline core prompt measured?
  • Added cost justified by accuracy / safety gain?
  • Automated verifier or tests in loop?
  • Can sampling / branches be parallelized?
  • Logging reasoning artifacts for audit?