RAG Architecture Series Part 4 of 6

When Your RAG System Needs to Know Where to Look, or How Things Connect

Parts 2 and 3 covered upgrades within a single retrieval pipeline. This part covers two structural changes to what the system does before or during retrieval. The two are often confused with each other, even though they solve different problems.

The diagram shows both patterns: Modular RAG routing queries across distinct data domains, and Graph RAG traversing entity relationships that vector search cannot follow.

Modular RAG: routing across domains

A router classifies query intent first, then dispatches to whichever index, retriever, or structured data source is best suited. A query about parental leave goes to the HR policy index. A query about a specific error code goes to engineering documentation. A query asking "how many tickets did we close last month" goes to a text-to-SQL path against a structured database - not to a document index at all.

Modular & Graph RAG Architecture

This earns its complexity when you have multiple genuinely distinct data domains - HR, Legal, Engineering, Finance - each with different content types and different access rules. The router itself is usually lightweight: a small classifier or a single LLM call, either of which is cheap relative to the retrieval it directs.
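The routing step can be sketched in a few lines. This is a minimal illustration, not a production design: the keyword rules stand in for the small classifier or LLM call, and every name here (`classify`, `route`, the three handler functions, the route labels) is hypothetical.

```python
import re
from typing import Callable, Dict, List, Pattern, Tuple

# Hypothetical handlers standing in for real retrievers. Each one only
# returns a label so that the dispatch decision is visible.
def search_hr_index(query: str) -> str:
    return f"hr_policy_index: {query}"

def search_engineering_docs(query: str) -> str:
    return f"engineering_docs_index: {query}"

def run_text_to_sql(query: str) -> str:
    return f"text_to_sql: {query}"

def search_default_index(query: str) -> str:
    return f"default_index: {query}"

# Assumed keyword rules, checked in order. A real router would more likely
# be a trained classifier or a single LLM call.
RULES: List[Tuple[str, Pattern[str]]] = [
    ("sql", re.compile(r"\b(how many|count|total|average)\b", re.I)),
    ("engineering", re.compile(r"\b(error code|stack trace|exception)\b", re.I)),
    ("hr", re.compile(r"\b(leave|vacation|benefits|payroll)\b", re.I)),
]

ROUTES: Dict[str, Callable[[str], str]] = {
    "sql": run_text_to_sql,
    "engineering": search_engineering_docs,
    "hr": search_hr_index,
    "default": search_default_index,
}

def classify(query: str) -> str:
    """Return the label of the first rule that matches, else 'default'."""
    for label, pattern in RULES:
        if pattern.search(query):
            return label
    return "default"

def route(query: str) -> str:
    """Classify the query, then dispatch it to the matching handler."""
    return ROUTES[classify(query)](query)

if __name__ == "__main__":
    for q in [
        "What is our parental leave policy?",
        "What does error code E42 mean?",
        "How many tickets did we close last month?",
    ]:
        print(route(q))
```

Keeping classification separate from dispatch means the keyword rules can later be swapped for a model without touching the handlers.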

Graph RAG: reasoning over relationships

At ingestion time, an LLM extracts entities and relationships from documents and builds a knowledge graph. At query time, retrieval can traverse these relationships rather than only matching on semantic similarity. This matters for questions vector search structurally cannot answer, such as "how does Policy A affect Department Y?" Vector search finds documents that mention either entity, but it has no mechanism for recognizing that the relationship runs through Department X. Graph traversal can follow that path explicitly.
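To make the traversal concrete, here is a small sketch using a breadth-first search over hand-written triples. It assumes the extraction step has already produced (subject, relation, object) triples; the relation names (`governs`, `supplies`), the extra `Policy B` triple, and the functions `build_graph` and `find_path` are all invented for illustration. A real system would typically use a graph database instead of an in-memory dict.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical output of the ingestion-time extraction step.
TRIPLES: List[Triple] = [
    ("Policy A", "governs", "Department X"),
    ("Department X", "supplies", "Department Y"),
    ("Policy B", "governs", "Department Z"),
]

def build_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Index triples as subject -> [(relation, object), ...]."""
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, []).append((relation, obj))
    return graph

def find_path(
    graph: Dict[str, List[Tuple[str, str]]],
    start: str,
    goal: str,
    max_hops: int = 3,
) -> Optional[List[Triple]]:
    """Breadth-first search for the shortest directed chain of edges from
    start to goal, giving up beyond max_hops. Returns None if no chain exists."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue
        for relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None

if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    path = find_path(graph, "Policy A", "Department Y")
    for subject, relation, obj in path or []:
        print(f"{subject} -[{relation}]-> {obj}")
```

The returned chain of edges is what gets passed to the generator as context, so the intermediate hop through Department X is stated explicitly rather than left for the model to infer.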

Cost note: Graph RAG's ingestion cost isn't one-time - it recurs every time your corpus changes. Justify it when eval data shows that relationship-aware queries are a meaningful share of real traffic, not because relationship reasoning sounds like a natural next level.
