GEO · ~7 min
An engine doesn't cite your page. It cites one chunk of it. Two structural moves decide whether that chunk is a tight, quotable answer — or a blurred average.
RAG systems chunk your content (typically 256–512 tokens), embed each chunk, and at query time rank chunks by cosine similarity to the question. The returned chunk is what gets cited — not the page.
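The chunking step can be sketched in a few lines. This is a minimal illustration, not how any particular engine does it: it treats whitespace-separated words as a rough stand-in for tokens, and the function name `chunk_words` and the default size of 300 are assumptions made for the example.

```python
def chunk_words(text: str, size: int = 300) -> list[str]:
    """Split text into consecutive chunks of at most `size` words.

    Words are only a proxy for tokens here; real systems count tokens
    with the tokenizer of their embedding model.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


# Tiny demonstration with an artificially small chunk size.
demo_chunks = chunk_words("one two three four five", size=2)
```

Each returned chunk is embedded and scored on its own, which is why the opening words of a section matter so much.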
When a section opens with a direct answer, the chunk's dominant semantic signal is that answer — strongly similar to queries about the topic. Open with context, caveats, or "in this article we'll explore…" and the embedding averages across preamble and answer, producing a weaker signal for any single query.
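The dilution effect can be illustrated with a toy model. The sketch below uses plain word counts as a stand-in for an embedding (an assumption for illustration only; real embedding models are learned and behave differently in detail) and compares the same answer sentence with and without preamble in front of it. The sample query and sentences are made up for the example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (not a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = "what chunk size do RAG systems use"
answer = "RAG systems typically use a chunk size of 256 to 512 tokens."
preamble = (
    "In this article we'll explore some background, a few caveats, "
    "and why any of this matters before getting to the point."
)

q = embed(query)
score_answer_first = cosine(q, embed(answer))
score_preamble_first = cosine(q, embed(preamble + " " + answer))
```

In this toy setup the preamble shares no words with the query, so it adds length to the chunk vector without adding overlap, and the preamble-first chunk scores lower than the answer-first one.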
The pattern: open every H2 with a 40–60 word self-contained answer, then elaborate. Long enough to be citable on its own; short enough to dominate the chunk before elaboration dilutes it.
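One way to audit a page against this pattern is to count the words in the first paragraph under each H2. The sketch below assumes the page is Markdown with `## ` headings and blank-line paragraph breaks; the function names and the treatment of "first paragraph" are illustrative choices, not a standard.

```python
import re


def opener_word_counts(markdown: str) -> dict[str, int]:
    """Map each H2 heading to the word count of its first paragraph."""
    # Splitting with a capture group yields:
    # [text_before_first_h2, heading1, body1, heading2, body2, ...]
    parts = re.split(r"^## +(.+)$", markdown, flags=re.MULTILINE)
    counts = {}
    for heading, body in zip(parts[1::2], parts[2::2]):
        paragraphs = [p for p in re.split(r"\n\s*\n", body.strip()) if p.strip()]
        first = paragraphs[0] if paragraphs else ""
        counts[heading.strip()] = len(first.split())
    return counts


def flag_openers(markdown: str, low: int = 40, high: int = 60) -> list[str]:
    """Return the H2 headings whose opener falls outside low..high words."""
    return [
        heading
        for heading, n in opener_word_counts(markdown).items()
        if not low <= n <= high
    ]


sample = (
    "# Title\n\n"
    "## Good\n\n" + " ".join(["word"] * 50) + "\n\nMore detail.\n\n"
    "## Bad\n\nToo short.\n"
)
flagged = flag_openers(sample)
```

Word count only checks length; whether the opener is actually self-contained still needs a human read.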
The same averaging penalty applies at the page level. A 2,000-word page covering five subtopics produces five blended embeddings, each weaker than a dedicated page's. One concept per page makes the page map cleanly to one chunk — the top passage is about exactly that concept, not a mix of tangents.
NVIDIA's 2024 benchmark found page-level chunking gave the highest average retrieval accuracy, with 256–512 token chunks best for factoid queries. Keep H2 sections to 200–400 words — enough context, still semantically tight.
Descriptive headings carry weight too: "How RAG Systems Score Sections" is a retrieval anchor; "Overview" carries zero semantic load. Headings also enable deep links: an engine can cite page.md#how-rag-chunks, not just the page.
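As a rough illustration of how a heading becomes a deep-link target, the sketch below derives a fragment from heading text. Slug rules differ between renderers, so this is an approximation of a common convention (lowercase, punctuation dropped, spaces turned into hyphens), and `heading_anchor` and `deep_link` are hypothetical helper names.

```python
import re


def heading_anchor(heading: str) -> str:
    """Approximate a renderer's heading slug; real rules vary by tool."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)  # drop punctuation
    slug = re.sub(r"\s+", "-", slug)      # whitespace runs become hyphens
    return slug


def deep_link(page: str, heading: str) -> str:
    """Join a page path and a heading slug into a fragment link."""
    return f"{page}#{heading_anchor(heading)}"


link = deep_link("page.md", "How RAG Systems Score Sections")
```

A descriptive heading yields a descriptive fragment, while a heading like "Overview" yields an anchor that says nothing about the section.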
Answer-first optimizes for chunk-based vector retrieval. It adds little when a tool embeds whole documents, uses keyword/BM25 search, or pastes the entire page into context (no retrieval step). And over-atomizing hurts: a multi-step workflow split across three pages may retrieve only step 2 with no setup. The rule is one meaningful concept per page — not one sentence.
Retrieval practice — recall, don't peek
Question 1: The recommended length for an answer-first section opener is…
Question 2: Opening a section with preamble instead of the answer…
Question 3: A single page covering five subtopics tends to produce…
Question 4: Answer-first structure adds little benefit when the tool…
Question 5 (spaced recall from Lesson 02): Perplexity's retrieval differs from ChatGPT's in that it…