Plain-language explainer

RAG, explained interactively

What is retrieval-augmented generation (RAG)?

RAG is how an AI answers from your documents instead of relying only on what it learned in training. When you ask a question, the system searches your content for the most relevant passages, pastes them into the model's context, and asks the model to answer using them. The model never memorized your data. It reads the retrieved text at answer time. That is why RAG can cite sources and stay current, and why most RAG failures are really retrieval failures: if the right passage was not fetched, the model cannot use it.
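For readers who do want to see code, the search-then-paste flow can be sketched in a few lines of Python. This is a toy illustration, not how any particular product works: the `DOCS` list, the word-overlap similarity, and the prompt wording are all assumptions made for the example. Real systems typically use learned embeddings and a vector index, and then send the prompt to a language model.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; real systems hold chunks of your documents.
DOCS = [
    "Refunds are issued within 14 days of purchase.",
    "Our support team is available Monday to Friday.",
    "Shipping to Canada takes 5 to 7 business days.",
]

def tokenize(text):
    """Lowercase the text and split it into words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, docs, k=1):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Paste the retrieved passages into the text the model will read."""
    sources = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the sources below and cite them.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )

question = "How long do refunds take?"
passages = retrieve(question, DOCS, k=1)
prompt = build_prompt(question, passages)
print(prompt)  # this text, not your whole document set, is what the model sees
```

Notice that the model only ever receives `prompt`. If `retrieve` had returned the shipping passage instead, no model could answer the refund question correctly from it.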

Do not just read it. Operate the mechanism yourself in a short interactive lesson.

See it work: RAG: retrieval as a callback to similarity →

Free, no code, no signup.

What people get wrong

  • RAG means the model was trained on your data. It was not; the model reads retrieved text at the moment of answering.
  • A wrong RAG answer means a weak model. Far more often, retrieval missed the right passage.
  • RAG replaces fine-tuning. It does not; they solve different problems: RAG adds knowledge, fine-tuning shapes behavior.
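
One way to tell a retrieval miss from a model problem is to check, for questions where you already know which passage holds the answer, whether that passage was actually fetched. The sketch below assumes you have logged what was retrieved for each question; the `results` and `expected` dictionaries and the `hit_rate` helper are made up for illustration.

```python
def hit_rate(results, expected):
    """Fraction of questions whose known-correct passage was among those retrieved."""
    hits = sum(1 for q, passages in results.items() if expected[q] in passages)
    return hits / len(results)

# Hypothetical log: what the retriever returned for each question.
results = {
    "How long do refunds take?": ["Refunds are issued within 14 days of purchase."],
    "When is support open?": ["Shipping to Canada takes 5 to 7 business days."],
}

# Hypothetical ground truth: the passage that actually contains each answer.
expected = {
    "How long do refunds take?": "Refunds are issued within 14 days of purchase.",
    "When is support open?": "Our support team is available Monday to Friday.",
}

rate = hit_rate(results, expected)
print(rate)  # 0.5: the second question never had a chance, whatever the model
```

A low hit rate points at the search step (chunking, indexing, or the similarity measure), not at the model that writes the answer.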

Where you see it in real products

  • Support bots answer from a help center using RAG.
  • Internal 'chat with your docs' tools retrieve from a private knowledge base.
  • AI search engines fetch web pages, then write an answer that cites them.

Part of See How AI Works, a free interactive course, where you learn how modern AI works by operating it, not watching videos.