Give It Your Docs (RAG)

Act 2 · ~6 min

Theory

The most useful pattern beyond plain chat is giving the AI your documents and asking it to answer from them.

The technical name is RAG — retrieval-augmented generation. The plain idea: search first, then answer.

Without RAG

"What's the cancellation window in my gym contract?"

The model guesses from what gym contracts usually say. Could be 30 days, could be 90, could be wrong for yours entirely.

With RAG

Give it the PDF. The retriever finds the cancellation clause. The model answers using that clause and quotes the line.

You check the quote against the PDF in 5 seconds.
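That 5-second check can also be done mechanically. Below is a minimal sketch, assuming you already have the document's text as a plain string. The function names and the sample contract wording are invented for illustration; they are not from any real tool or contract.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so PDF line breaks don't cause false mismatches."""
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_appears_in(quote: str, document_text: str) -> bool:
    """Return True if the quoted passage occurs verbatim (after normalization) in the document."""
    return normalize(quote) in normalize(document_text)


# Invented sample text standing in for the extracted PDF contents.
contract_text = """Section 7. Cancellation.
The member may cancel within 14 days
of signing without penalty."""

print(quote_appears_in("cancel within 14 days of signing", contract_text))  # True
print(quote_appears_in("cancel within 30 days of signing", contract_text))  # False
```

A quote that fails this check is a signal to reread the source yourself. Passing it shows the words are in the document, not that the model interpreted them correctly.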

What's happening inside:

Chunking — documents get split into pieces. Most tools aim for chunks of around half a page.
Retriever — ranks chunks by relevance and hands the top few to the model. Usually via embeddings (next lesson).
Grounding — "answer only from the provided text" keeps the model from drifting back to general memory.
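The three pieces above can be sketched in a few lines. This is a toy illustration, not how any particular product works: the retriever here ranks chunks by simple word overlap as a stand-in for embeddings, the chunk size is tiny so the example stays readable, and the function names and sample document are made up.

```python
import re


def chunk(text: str, max_words: int = 150) -> list[str]:
    """Chunking: split the document into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokens(text: str) -> set[str]:
    """Lowercased set of words and numbers, used for the overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Retriever: rank chunks by how many question words they share, keep the top few.
    Word overlap is an assumption made for this sketch; real tools usually use embeddings."""
    q = tokens(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokens(c)), reverse=True)
    return ranked[:top_k]


def grounded_prompt(question: str, passages: list[str]) -> str:
    """Grounding: tell the model to answer only from the retrieved text and to quote it."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer only from the provided text. Quote the line you relied on. "
        "If the text does not contain the answer, say so.\n\n"
        f"Text:\n{context}\n\nQuestion: {question}"
    )


# Invented sample document; each sentence is 12 words, so max_words=12 gives one chunk per sentence.
document = (
    "Membership fees are billed monthly on the first day of each month. "
    "The member may cancel within 14 days of signing without any penalty. "
    "Lockers are available for rent at the front desk during opening hours."
)

question = "How many days do I have to cancel after signing?"
chunks = chunk(document, max_words=12)
top = retrieve(question, chunks, top_k=1)
prompt = grounded_prompt(question, top)
print(prompt)
```

The resulting prompt is what would be sent to the model: the instruction, the one retrieved chunk about cancellation, and the question. Word overlap misses paraphrases (a question about the "cancellation window" would not match "cancel"), which is the gap embeddings are meant to close.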

You meet RAG every day. "Chat with your PDF." Notion AI answering from your workspace. Same pattern: fetch the relevant slice of your material, then write.

Specific document → use RAG. General knowledge → plain chat. And ask for the quote.

Understanding
