Knowledge pack — guides, templates, examples

Chapter 05 — Self-Refinement

Overview

Self-Refinement is an internal loop where the model critiques and iteratively improves its own output until quality converges or a round limit is reached.

  • Pros: Improves quality without external labels.
  • Cons: May reinforce the model's own errors; each round adds latency.

Loop pattern

  1. Generate initial draft.
  2. Critique: list categorized issues + prioritized checklist.
  3. Refine: apply checklist + provide change summary.
  4. Terminate if no material issues remain OR max rounds reached; otherwise return to step 2.
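
The four steps above can be sketched as a small driver loop. This is a minimal illustration, not a reference implementation: the `generate`, `critique`, and `refine` callables are hypothetical stand-ins for model calls, and the names `self_refine` and `RefineResult` are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RefineResult:
    draft: str
    rounds: int        # number of refine steps actually applied
    stop_reason: str   # "no_material_issues" or "max_rounds"


def self_refine(
    spec: str,
    generate: Callable[[str], str],
    critique: Callable[[str], List[str]],
    refine: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> RefineResult:
    """Run generate -> critique -> refine until no issues remain or the cap is hit."""
    draft = generate(spec)                      # step 1: initial draft
    for round_no in range(max_rounds):
        issues = critique(draft)                # step 2: list of issues
        if not issues:                          # step 4: nothing material left
            return RefineResult(draft, round_no, "no_material_issues")
        draft = refine(draft, issues)           # step 3: apply the checklist
    return RefineResult(draft, max_rounds, "max_rounds")


# Toy stand-ins for model calls (hypothetical), only to make the sketch runnable.
def toy_generate(spec: str) -> str:
    return "draft"


def toy_critique(draft: str) -> List[str]:
    return [] if draft.endswith("!!") else ["add emphasis"]


def toy_refine(draft: str, issues: List[str]) -> str:
    return draft + "!"


result = self_refine("demo", toy_generate, toy_critique, toy_refine, max_rounds=5)
print(result)
```

Returning the stop reason alongside the draft makes it easy to tell whether the loop ended because the critique came back empty or because the round cap was reached.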

Code evolution example (palindrome)

# Round 0
def is_palindrome(s):
    return s == s[::-1]

# Critique:
# - Case sensitive
# - Punctuation and spaces are not ignored
# - No docstring or examples

# Round 1
def is_palindrome(s):
    """Return True if s is a palindrome ignoring case & non-alphanumerics."""
    cleaned = ''.join(ch.lower() for ch in s if ch.isalnum())
    return cleaned == cleaned[::-1]

# Round 1 critique:
# - Add type hints
# - Add test example(s)

# Round 2
def is_palindrome(s: str) -> bool:
    """Check whether s is a palindrome, ignoring case & non-alphanumerics.

    Example: is_palindrome('Racecar!') -> True
    """
    filtered = ''.join(c.lower() for c in s if c.isalnum())
    return filtered == filtered[::-1]
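
To check that the rounds above actually improve behavior rather than only appearance, the Round 0 and Round 2 versions can be run against the same small test table. The sketch below is illustrative: the suffixed function names, the `CASES` table, and the `failures` helper are assumptions added here, not part of the original example.

```python
from typing import Callable, List


def is_palindrome_round0(s):
    return s == s[::-1]


def is_palindrome_round2(s: str) -> bool:
    filtered = ''.join(c.lower() for c in s if c.isalnum())
    return filtered == filtered[::-1]


# (input, expected) pairs covering the issues raised in the critiques.
CASES = [
    ("level", True),
    ("Racecar!", True),                        # case + punctuation
    ("A man, a plan, a canal: Panama", True),  # spaces + punctuation
    ("hello", False),
    ("", True),                                # edge case: empty string
]


def failures(fn: Callable[[str], bool]) -> List[str]:
    """Return the inputs for which fn disagrees with the expected result."""
    return [text for text, expected in CASES if fn(text) != expected]


print("round 0 failures:", failures(is_palindrome_round0))
print("round 2 failures:", failures(is_palindrome_round2))
```

Under these cases, Round 0 fails the two inputs that need normalization, while Round 2 passes all of them.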

Prompt skeleton

[TASK]
Produce initial draft for: <spec>.

[CRITIQUE]
List issues under: correctness, completeness, style, edge cases.
Provide an improvement checklist of at most 6 items (actionable, ordered).

[REFINE]
Apply checklist fully. Output revised draft + CHANGELOG.
If no improvements possible, emit: NO MATERIAL ISSUES.
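
One way to use the skeleton programmatically is to wrap each section in a small builder function. The sketch below is an assumption-laden illustration: the function names, the `MAX_CHECKLIST_ITEMS` constant, and the way the draft and checklist are appended to the prompt are choices made here, not something the skeleton prescribes.

```python
from typing import List

MAX_CHECKLIST_ITEMS = 6          # mirrors the 6-item cap in [CRITIQUE]
STOP_MARKER = "NO MATERIAL ISSUES"


def build_task_prompt(spec: str) -> str:
    return f"[TASK]\nProduce initial draft for: {spec}."


def build_critique_prompt(draft: str) -> str:
    return (
        "[CRITIQUE]\n"
        "List issues under: correctness, completeness, style, edge cases.\n"
        f"Provide an improvement checklist of at most {MAX_CHECKLIST_ITEMS} items "
        "(actionable, ordered).\n\n"
        f"Draft:\n{draft}"
    )


def build_refine_prompt(draft: str, checklist: List[str]) -> str:
    if len(checklist) > MAX_CHECKLIST_ITEMS:
        raise ValueError(f"checklist exceeds {MAX_CHECKLIST_ITEMS} items")
    numbered = "\n".join(f"{i}. {item}" for i, item in enumerate(checklist, start=1))
    return (
        "[REFINE]\n"
        "Apply checklist fully. Output revised draft + CHANGELOG.\n"
        f"If no improvements possible, emit: {STOP_MARKER}.\n\n"
        f"Checklist:\n{numbered}\n\nDraft:\n{draft}"
    )


def is_terminal(model_output: str) -> bool:
    """True when the model signalled that nothing material is left to fix."""
    return STOP_MARKER in model_output


print(build_refine_prompt("def f(): pass", ["Add type hints", "Add a docstring"]))
```

Checking for the stop marker in code, rather than relying on a human reading the output, gives the loop an unambiguous termination signal.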

Failure modes & mitigations

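  • Echoing flaws (the critique repeats the draft's own mistakes): Introduce external tests / validators.
  • Over-processing: Stop after N consecutive rounds with no material change.
  • Vague checklist items: Enforce verb + object + constraint in every item.
  • Imaginary improvements (claimed changes that were never made): Require a diff alongside the change summary.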
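
Two of these mitigations are easy to mechanize. The sketch below uses the standard-library `difflib` module to produce a diff between consecutive drafts, and a simple history check for the "N stable rounds" rule. The helper names and the exact-equality notion of "stable" are assumptions made for illustration; a real setup might compare drafts more loosely.

```python
import difflib
from typing import List


def change_diff(before: str, after: str) -> str:
    """Unified diff between two drafts; empty string when they are identical."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
    )
    return "\n".join(lines)


def is_imaginary_improvement(before: str, after: str) -> bool:
    """True when a round claimed changes but the draft did not actually change."""
    return change_diff(before, after) == ""


def should_stop(history: List[str], n_stable: int) -> bool:
    """True when the last n_stable refinements left the draft unchanged.

    history holds every draft in order; n_stable is assumed to be >= 1.
    """
    if len(history) <= n_stable:
        return False
    tail = history[-(n_stable + 1):]
    return all(draft == tail[0] for draft in tail)


print(change_diff("a\nb", "a\nc"))
print(should_stop(["a", "b", "b", "b"], n_stable=2))
```

Comparing the actual diff against the model's change summary catches rounds where the summary describes edits that never made it into the draft.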

Checklist

  • Round cap set?
  • Rubric categories defined?
  • External validator present?
  • Actionable checklist enforced?
  • Stop condition logged?
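
The checklist can also be run as a preflight check before a refinement loop starts. The following is a hedged sketch: `RefineConfig`, its field names, and `preflight` are hypothetical and exist only to show one way of mapping each question to a concrete setting.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RefineConfig:
    max_rounds: Optional[int] = None
    rubric_categories: List[str] = field(default_factory=list)
    validator: Optional[Callable[[str], bool]] = None  # external check on a draft
    enforce_actionable_items: bool = False
    log_stop_reason: bool = False


def preflight(cfg: RefineConfig) -> List[str]:
    """Return the checklist questions that the config does not yet satisfy."""
    missing = []
    if not cfg.max_rounds or cfg.max_rounds < 1:
        missing.append("Round cap set?")
    if not cfg.rubric_categories:
        missing.append("Rubric categories defined?")
    if cfg.validator is None:
        missing.append("External validator present?")
    if not cfg.enforce_actionable_items:
        missing.append("Actionable checklist enforced?")
    if not cfg.log_stop_reason:
        missing.append("Stop condition logged?")
    return missing


ready = RefineConfig(
    max_rounds=3,
    rubric_categories=["correctness", "completeness", "style", "edge cases"],
    validator=lambda draft: bool(draft.strip()),
    enforce_actionable_items=True,
    log_stop_reason=True,
)
print(preflight(RefineConfig()))  # every question is still open
print(preflight(ready))           # nothing missing
```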