
Chunking Strategies

Act 3 · ~4 min

Theory

RAG pipelines index documents as chunks because embedding models have token limits and dense vectors work best over focused passages.

Size trade-off: smaller chunks match queries precisely but risk losing surrounding context; larger chunks carry richer context but dilute the embedding signal. A common starting range is 256–512 tokens; tune against retrieval metrics for your corpus.

| Strategy | How it splits | When to prefer |
| --- | --- | --- |
| Fixed-size | Every N tokens | Simple documents; fast baseline |
| Sentence/paragraph | Natural language boundaries | Prose with clear meaning units |
| Semantic | Embedding-based topic shifts | Long, heterogeneous documents |
| Recursive | Paragraphs → sentences → characters | Structured docs; balances size and structure |
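
The recursive strategy in the last row is easy to sketch. Below is a minimal illustration, not a production splitter: the `max_chars` limit and separator list are assumptions, and real pipelines typically measure length in tokens rather than characters.

```python
def recursive_split(
    text: str,
    max_chars: int = 1000,
    separators: tuple[str, ...] = ("\n\n", ". ", " "),  # paragraph -> sentence -> word
) -> list[str]:
    """Split at the coarsest separator that keeps pieces under max_chars."""
    if len(text) <= max_chars:
        return [text]
    if not separators:
        # Last resort: hard cuts at the character level.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, rest = separators[0], separators[1:]
    chunks: list[str] = []
    buf = ""
    for part in text.split(sep):
        candidate = f"{buf}{sep}{part}" if buf else part
        if len(candidate) <= max_chars:
            buf = candidate  # keep accumulating pieces into the current chunk
        else:
            if buf:
                chunks.append(buf)
            buf = ""
            if len(part) > max_chars:
                # Piece is still too big: recurse with the next, finer separator.
                chunks.extend(recursive_split(part, max_chars, rest))
            else:
                buf = part
    if buf:
        chunks.append(buf)
    return chunks
```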

Overlap (10–20% of chunk size) shares tokens at boundaries so a sentence split across chunks appears in both, preserving continuity retrieval would otherwise miss.
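
A sliding window makes the overlap concrete. The sketch below splits on whitespace as a rough token proxy; that is an assumption, since a real pipeline would count tokens with the embedding model's tokenizer. The defaults (`chunk_size=256`, `overlap=32`, i.e. 12.5%) follow the ranges above.

```python
def chunk_with_overlap(text: str, chunk_size: int = 256,
                       overlap: int = 32) -> list[str]:
    """Fixed-size chunks where consecutive windows share `overlap` words."""
    words = text.split()
    step = chunk_size - overlap  # each window starts `step` words after the last
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]
```

Because each window starts `chunk_size - overlap` words after the previous one, any sentence cut at a boundary is fully contained in at least one of the two neighboring chunks.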

Metadata per chunk (source file, position, section title) enables filtered retrieval and source citation in the final answer.
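
One way to carry that metadata is to wrap each chunk in a small record before indexing. The field names here (`source`, `section`, `position`) are illustrative, not a fixed schema; most vector stores accept arbitrary per-chunk metadata.

```python
def make_records(chunks: list[str], source: str, section: str) -> list[dict]:
    """Pair each chunk with the metadata needed for filtering and citation."""
    return [
        {
            "text": chunk,
            "source": source,      # e.g. a file name, for source citation
            "section": section,    # section title, for filtered retrieval
            "position": i,         # chunk index within the document
        }
        for i, chunk in enumerate(chunks)
    ]
```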

Next: chunks become embeddings in a vector store; vector search determines which chunks surface for a given query.