Chapter 01 — Introduction
What is prompting?
Prompting is the deliberate craft of shaping the input you give a large language model (LLM) so that its output is reliable, useful, and aligned with your intent. A prompt can include an instruction, role/persona framing, style constraints, examples (few‑shot), formatting requirements, evaluation criteria, and optional external context (retrieved documents, user data, structured facts).
Good prompting reduces downstream editing, lowers cost (fewer retries), and makes later automation (post‑processing, validation, programmatic chaining) easier.
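As a minimal sketch (the helper, its arguments, and the section order are illustrative assumptions, not a fixed recipe or a specific vendor API), these components can be assembled into a single prompt string:

```python
# Minimal sketch: assembling a prompt from the components listed above.
# The structure and example values are assumptions for illustration.

def build_prompt(instruction: str, persona: str, examples: list[tuple[str, str]],
                 context: str, output_format: str) -> str:
    """Combine instruction, persona framing, few-shot examples, context, and format rules."""
    example_block = "\n".join(f"Input: {q}\nOutput: {a}" for q, a in examples)
    return (
        f"You are {persona}.\n\n"
        f"Task: {instruction}\n\n"
        f"Context (use only this information):\n{context}\n\n"
        f"Examples:\n{example_block}\n\n"
        f"Output format: {output_format}"
    )

prompt = build_prompt(
    instruction="Summarize the support ticket in two sentences.",
    persona="a concise technical support analyst",
    examples=[("Ticket: app crashes on login", "The app crashes at login; likely an auth regression.")],
    context="Ticket #4821: user reports the checkout page times out after 30 seconds.",
    output_format="Plain text, at most two sentences.",
)
print(prompt)
```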
Core principles
- Explicit over implicit: State the format, perspective, style, and success criteria.
- Constrain output surface: JSON keys, bullet limits, token caps, schema expectations (see the sketch after this list).
- Progressive disclosure: Decompose complex goals into smaller sequential calls.
- Grounding: Provide authoritative context (retrieval, domain facts) to lower hallucination risk.
- Iterate with feedback: Use critique → refine loops for higher factual + stylistic quality.
- Determinism where needed: Lower temperature for evaluation / parsing; higher for ideation.
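To make "constrain output surface" and "determinism where needed" concrete, here is a minimal sketch; the key names and decoding values are illustrative assumptions, not recommendations for any particular model:

```python
# Minimal sketch of constraining the output surface: fixed JSON keys plus a conformance check.
# EXPECTED_KEYS and the decoding settings are illustrative assumptions.
import json

EXPECTED_KEYS = {"summary", "sentiment", "action_items"}

CONSTRAINED_INSTRUCTION = (
    "Return ONLY a JSON object with exactly these keys: "
    + ", ".join(sorted(EXPECTED_KEYS))
    + ". action_items must be a list of at most 3 short strings."
)

# Deterministic settings for evaluation / parsing calls; looser settings for ideation.
EVAL_DECODING = {"temperature": 0.0, "top_p": 1.0}
IDEATION_DECODING = {"temperature": 0.9, "top_p": 0.95}

def output_conforms(raw: str) -> bool:
    """Return True only if the reply is valid JSON with exactly the expected keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and set(data) == EXPECTED_KEYS

print(output_conforms('{"summary": "ok", "sentiment": "neutral", "action_items": []}'))  # True
print(output_conforms("Sure! Here is a summary..."))  # False: format drift
```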
When prompting fails
- Ambiguity: The model guesses (often confidently) when goals or constraints are underspecified.
- Format drift: Free‑form natural language instead of machine‑readable JSON / tables.
- Reasoning shortcuts: The model jumps to an answer without unpacking intermediate steps (use Chain‑of‑Thought or Least‑to‑Most prompting).
- Context dilution: Long, unordered dumps push key facts past the attention budget.
- Hallucination: Unsupported claims when retrieval / grounding is absent (a grounding sketch follows this list).
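A minimal grounding sketch, assuming a toy keyword-overlap ranker in place of a real retrieval system: the most relevant facts go first (to counter context dilution), and the instruction restricts the answer to the delimited context (to counter hallucination).

```python
# Minimal sketch of grounding. `retrieve` is a stand-in for whatever retrieval
# system you use; the scoring here is a toy word-overlap heuristic.

def retrieve(question: str, documents: list[str], k: int = 3) -> list[str]:
    """Toy relevance ranking: score documents by word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(documents, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def grounded_prompt(question: str, documents: list[str]) -> str:
    facts = retrieve(question, documents)
    context = "\n".join(f"- {f}" for f in facts)  # most relevant facts first
    return (
        "Answer using ONLY the facts between the ### markers. "
        "If the facts are insufficient, say \"I don't know.\"\n"
        f"###\n{context}\n###\n"
        f"Question: {question}"
    )

docs = [
    "The billing service was migrated to region eu-west-1 in March.",
    "The mobile app supports dark mode since version 4.2.",
    "Invoices are generated on the first business day of each month.",
]
print(grounded_prompt("When are invoices generated?", docs))
```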
Prompt lifecycle
- Frame: Clarify task, audience, output channel, success definition.
- Draft: Produce a minimal but explicit instruction (baseline zero‑shot).
- Enrich: Add examples (few‑shot) or structure (schema) as failure modes appear.
- Instrument: Add evaluation prompts (critique / self‑refine) or automated tests.
- Scale: Externalize variable parts as placeholders and turn the prompt into a safe, reusable template (see the sketch after this list).
- Monitor: Track drift (format error rate, hallucination rate, latency, cost).
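A minimal sketch of the Scale and Monitor steps, using Python's standard string.Template for placeholders; the template wording and the error-rate example are illustrative assumptions:

```python
# Minimal sketch of the "Scale" and "Monitor" steps: variable parts are externalized
# as placeholders, and a simple format-error rate is tracked over a batch of runs.
from string import Template

SUMMARY_TEMPLATE = Template(
    "Summarize the following $doc_type for $audience in at most $max_sentences sentences:\n"
    "$document"
)

def render(doc_type: str, audience: str, max_sentences: int, document: str) -> str:
    # substitute() raises KeyError on a missing placeholder, surfacing template bugs early
    return SUMMARY_TEMPLATE.substitute(
        doc_type=doc_type, audience=audience, max_sentences=max_sentences, document=document
    )

# Monitor: fraction of outputs in a batch that failed machine validation (format drift).
validation_results = [True, True, False, True]
format_error_rate = validation_results.count(False) / len(validation_results)

print(render("incident report", "on-call engineers", 3, "The cache cluster lost quorum..."))
print(f"format error rate: {format_error_rate:.0%}")
```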
Quick taxonomy (preview)
You will encounter these categories throughout the guide:
- Foundational styles: Zero‑shot, Few‑shot, Role / Persona.
- Reasoning enhancers: Chain‑of‑Thought, Decomposition, Self‑Consistency.
- Structural / control: JSON schemas, delimiters, function / tool specs.
- Advanced orchestration: Maieutic, Self‑Refinement, Least‑to‑Most, ReAct, Tree‑of‑Thoughts, RAG.
Evaluation mindsets
- Precision vs. Coverage: Tight constraints reduce creativity; loosen for ideation phases.
- Factuality vs. Novelty: Retrieval and low temperature for accuracy; diversify sampling when exploring.
- Latency vs. Depth: Rich reasoning chains improve reliability but increase cost and time (a sketch of per-phase settings follows).
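These tradeoffs can be made explicit as per-phase settings; the phase names and values below are illustrative defaults, not tuned recommendations:

```python
# Minimal sketch of the tradeoffs above as explicit per-phase settings.
# Phase names and values are assumptions for illustration.

PHASE_SETTINGS = {
    # Factuality + precision: ground with retrieval, decode deterministically, allow deeper reasoning.
    "fact_checking": {"temperature": 0.0, "use_retrieval": True, "request_reasoning": True},
    # Novelty + coverage: loosen constraints and sampling, skip retrieval overhead.
    "ideation": {"temperature": 0.9, "use_retrieval": False, "request_reasoning": False},
    # Latency-sensitive: keep the chain short and the output tightly constrained.
    "live_chat": {"temperature": 0.3, "use_retrieval": True, "request_reasoning": False},
}

def settings_for(phase: str) -> dict:
    """Fall back to the most conservative (factual) settings for unknown phases."""
    return PHASE_SETTINGS.get(phase, PHASE_SETTINGS["fact_checking"])

print(settings_for("ideation"))
```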
A minimal quality checklist
- Does the prompt state the task, constraints, and output format explicitly?
- Is any required context included or retrieved (no hidden dependencies)?
- Can a machine validator decide pass/fail on the output? (See the checklist-as-code sketch after this list.)
- Are reasoning steps requested when correctness depends on logic?
- Is temperature / sampling appropriate for the goal?
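As a minimal sketch, the checklist can be expressed as automated checks over a prompt specification; the PromptSpec fields and thresholds are assumptions for illustration, not a standard schema:

```python
# Minimal sketch: the checklist above expressed as automated checks over a prompt spec.
# The PromptSpec fields and the 0.5 temperature threshold are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class PromptSpec:
    task: str = ""
    output_format: str = ""           # e.g. "JSON with keys summary, sentiment"
    context: str = ""                 # retrieved or inlined facts the task depends on
    needs_reasoning: bool = False     # does correctness depend on multi-step logic?
    requests_reasoning: bool = False  # does the prompt actually ask for the steps?
    temperature: float = 0.0
    is_ideation: bool = False

def checklist_warnings(spec: PromptSpec) -> list[str]:
    """Return one warning per checklist item the spec fails."""
    warnings = []
    if not spec.task:
        warnings.append("No explicit task statement.")
    if not spec.output_format:
        warnings.append("No machine-checkable output format; a validator cannot decide pass/fail.")
    if not spec.context:
        warnings.append("No context provided; check for hidden dependencies.")
    if spec.needs_reasoning and not spec.requests_reasoning:
        warnings.append("Correctness depends on logic but reasoning steps are not requested.")
    if spec.is_ideation and spec.temperature < 0.5:
        warnings.append("Ideation goal but near-deterministic sampling.")
    if not spec.is_ideation and spec.temperature > 0.5:
        warnings.append("Evaluation/parsing goal but high-variance sampling.")
    return warnings

print(checklist_warnings(PromptSpec(task="Summarize ticket", temperature=0.9)))
```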