Knowledge pack — guides, templates, examples

Chapter 03 — Advanced Strategies

Overview

These strategies build on the core styles from Chapter 02 to increase reliability, factual grounding, or exploration depth. Use them selectively; each adds latency, cost, or complexity.

  • Self‑Consistency: Majority voting across sampled reasoning traces.
  • ReAct: Alternating natural language reasoning with external tool calls.
  • Tree‑of‑Thoughts: Search over branching reasoning states, pruning weak ones.
  • RAG: Inject retrieved context before generation to constrain hallucination.
  • Adversarial / Debate: Pit opposing perspectives against each other to surface blind spots.

Self‑Consistency

Run the same structured reasoning prompt K times with temperature > 0, collect the answers and rationales, then pick the final answer by majority vote, weighted scoring, or another aggregation rule.

# Pseudocode: sample K independent reasoning traces, then vote
answers = []
for i in range(K):
  r = call_model(prompt, temperature=0.7)   # temperature > 0 gives diverse traces
  answers.append(parse_answer(r))           # extract just the final answer
final = majority(answers)                   # most common answer wins
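
A minimal sketch of the two helpers the pseudocode assumes; parse_answer and majority are placeholder names, and the regex presumes the prompt ends each trace with a line of the form "Answer: <value>".

import re
from collections import Counter

def parse_answer(response: str) -> str:
  # Pull out and normalize the text after the "Answer:" marker.
  match = re.search(r"Answer:\s*(.+)", response)
  return (match.group(1) if match else response).strip().lower()

def majority(answers: list[str]) -> str:
  # Most common normalized answer; ties resolve to the first one seen.
  return Counter(answers).most_common(1)[0][0]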

Tradeoff: Linear cost; diminishing returns after ~7 samples.

ReAct (Reason + Act)

The model emits alternating Thought / Action / Observation steps, invoking external tools through the Action step.

You are an agent. Use Thought / Action / Observation.
Tools: search(query), calc(expr)
Question: How many hours between local noon in Paris and 3pm in Tokyo next Friday?
Thought: Need time difference. I'll search.
Action: search("Paris Tokyo time difference")
Observation: Paris UTC+1, Tokyo UTC+9.
Thought: Tokyo 15:00 is 15 - (9 - 1) = 07:00 Paris time; Paris noon is 12:00, so the gap is 12 - 7 = 5.
Final Answer: 5 hours.

Key: Enforce a strict action schema; the server (never the model) must supply real observations.
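
A minimal controller sketch. call_model is the same placeholder as in the Self‑Consistency pseudocode (here assumed to accept a stop-sequence argument), the search tool returns a canned string where a real API would go, and the Action regex is only an illustration of the strict schema mentioned above.

import re

def search(query: str) -> str:
  # Placeholder tool: wire a real search/API call in here.
  return "Paris UTC+1, Tokyo UTC+9."

def calc(expr: str) -> str:
  # Demo only; use a sandboxed expression evaluator in production.
  return str(eval(expr))

TOOLS = {"search": search, "calc": calc}

def react_loop(system_prompt: str, question: str, max_steps: int = 6) -> str:
  transcript = f"{system_prompt}\nQuestion: {question}\n"
  for _ in range(max_steps):
    step = call_model(transcript, stop=["Observation:"])   # stop before the model invents observations
    transcript += step
    if "Final Answer:" in step:
      return step.split("Final Answer:")[-1].strip()
    m = re.search(r'Action:\s*(\w+)\((.*)\)', step)
    if m:
      name, arg = m.group(1), m.group(2).strip().strip('"')
      obs = TOOLS[name](arg) if name in TOOLS else "unknown tool"
      transcript += f"\nObservation: {obs}\n"   # server-supplied, never model-generated
  return "No answer within the step budget."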

Tree‑of‑Thoughts (ToT)

Generalizes Chain‑of‑Thought into a search: branches are alternative partial solutions, and a controller keeps only the top B at each depth.

def tree_of_thoughts(root, D, N, B):
  # D = max depth, N = proposals per state, B = beam width
  frontier = [root]
  for depth in range(D):
    pool = []
    for state in frontier:
      children = propose_next(state, N)   # model call: N candidate continuations
      pool.extend(score(children))        # attach a heuristic score to each child
    frontier = select_top(pool, B)        # beam step: keep the best B states
  return best(frontier)

Heuristics: constraint satisfaction, numerical plausibility, partial reward models.
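
A sketch of the controller's helpers under the same placeholder convention; scoring partial states with a separate "rate this" model call is just one option alongside the heuristics above.

def score(children):
  # Pair each candidate state with a 0-10 rating from an evaluation prompt.
  return [(child, float(call_model(f"Rate 0-10, number only: {child}"))) for child in children]

def select_top(pool, B):
  # Keep the B highest-scoring (state, score) pairs, dropping the scores.
  return [state for state, _ in sorted(pool, key=lambda p: p[1], reverse=True)[:B]]

def best(frontier):
  # Final pick: re-score the surviving states and return the strongest one.
  return max(score(frontier), key=lambda p: p[1])[0]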

Retrieval‑Augmented Generation (RAG)

Retrieves external passages to ground generation and cite sources.

  1. Index documents (embeddings + metadata).
  2. Rewrite / condense user query (optional).
  3. Retrieve top k (diversify with MMR, i.e. maximal marginal relevance).
  4. Assemble constrained prompt with citations.
  5. Generate; fallback if insufficient evidence.

[SYSTEM] Cite only from provided sources as [S#]. If unknown, say you cannot answer.
S1: ...
S2: ...
Question: ...
Answer:

Common issue: Query drift → add query expansion or multi-vector retrieval.
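
A minimal sketch of steps 1–4 over a toy in-memory index of (text, vector) pairs; embed stands in for whatever embedding model the stack actually uses, and plain cosine similarity replaces a real vector store.

import numpy as np

def cosine(a, b):
  return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query: str, index: list, k: int = 3) -> list:
  # Step 3: rank indexed passages by similarity to the (optionally rewritten) query.
  q = embed(query)   # placeholder embedding call
  ranked = sorted(index, key=lambda p: cosine(q, p[1]), reverse=True)
  return [text for text, _ in ranked[:k]]

def build_prompt(question: str, passages: list) -> str:
  # Step 4: constrained prompt with numbered sources for [S#] citations.
  sources = "\n".join(f"S{i+1}: {p}" for i, p in enumerate(passages))
  return ("[SYSTEM] Cite only from provided sources as [S#]. "
          "If unknown, say you cannot answer.\n"
          f"{sources}\nQuestion: {question}\nAnswer:")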

Adversarial / Debate prompting

Opposing roles enumerate their strongest arguments, then a synthesis phase reconciles them.

Topic: Adopt Rust for backend?
Role A (Advocate): list 3 strongest benefits.
Role B (Skeptic): list 3 strongest risks.
Synthesis: Balanced recommendation citing each side.

Tip: Seed explicitly divergent priors so the roles do not simply echo each other.
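
One way to orchestrate the template as three calls, shown below with the same call_model placeholder; the seeded priors in the first two prompts are what keeps the roles from converging.

def debate(topic: str) -> str:
  advocate = call_model(
    f"Topic: {topic}\nYou strongly favor this. "
    "List the 3 strongest benefits, each backed by concrete evidence.")
  skeptic = call_model(
    f"Topic: {topic}\nYou are deeply skeptical. "
    "List the 3 strongest risks, each backed by concrete evidence.")
  # Synthesis sees both transcripts verbatim and must cite each side.
  return call_model(
    f"Topic: {topic}\nAdvocate said:\n{advocate}\n\nSkeptic said:\n{skeptic}\n\n"
    "Give a balanced recommendation that cites at least one point from each side.")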

Strategy selection table

  Need                      Strategy            Why
  Reduce stochastic error   Self‑Consistency    Majority dampens outliers
  Need tools / APIs         ReAct               Structured tool invocation
  Branching reasoning       Tree‑of‑Thoughts    Explores alternatives
  Factual grounding         RAG                 Inject external sources
  Surface bias              Debate              Contrasting priors

Failure modes & mitigations

  • Latency inflation: Too many calls → baseline first.
  • Tool hallucination: Model fabricates outputs → strict server substitution.
  • Branch explosion: Exponential ToT growth → cap depth & beam.
  • Context stuffing: Low-quality retrieval → passage re-ranking.
  • Homogeneous perspectives: Debate agents agree → seed divergence.

Mini checklist

  • Baseline core prompt measured?
  • Added cost justified by accuracy / safety gain?
  • Automated verifier or tests in loop?
  • Can sampling / branches be parallelized?
  • Logging reasoning artifacts for audit?