We created the following visual which illustrates the difference between traditional and Graph RAG.
Today let’s understand how it works.
On a side note, we’ve already covered Graph RAG in much more detail, with implementation, in Part 7 of our RAG crash course. Read it here: Graph RAG deep dive.
Imagine you have a lengthy document, such as a biography of an individual (X), where each chapter discusses one of his accomplishments, among other details.
For example:
Chapter 1: Discusses Accomplishment-1.
Chapter 2: Discusses Accomplishment-2.
...
Chapter 10: Discusses Accomplishment-10.
Now, I want you to understand the next part carefully!
Let's say you've built a traditional RAG system over this document and want to use it to summarize all these accomplishments.
This might not be possible with traditional retrieval, since the task requires the entire context...
...but you might only be fetching some top-k relevant chunks from the vector db.
Moreover, since traditional RAG systems retrieve each chunk independently, the LLM is often left to infer the connections between these chunks (provided the chunks are retrieved at all).
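To make the top-k limitation concrete, here is a minimal sketch. It assumes a toy bag-of-words similarity in place of a real embedding model and a plain Python list in place of a vector db; the chunk texts and the helper names (embed, cosine, top_k) are hypothetical and only for illustration.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model (assumption): bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# One hypothetical chunk per chapter of the biography.
chunks = [f"Chapter {i}: X achieved Accomplishment-{i}." for i in range(1, 11)]

def top_k(query, chunks, k=3):
    # Each chunk is scored against the query independently of the others.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

retrieved = top_k("Summarize all accomplishments of X", chunks, k=3)
print(f"{len(retrieved)} of {len(chunks)} chapters reach the LLM")
```

Whatever similarity function is used, only k chunks are passed on, so a summary that needs all ten chapters will be built from a fraction of them.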
Graph RAG solves this problem.
The idea is to first create a graph (entities & relationships) from the documents and then traverse that graph during the retrieval phase.
Let's see how Graph RAG solves the above problems.
First, a system (typically an LLM) will create the graph by understanding the biography (unstructured text).
This will produce a full graph of entities (nodes) and relationships (edges), and a subgraph around accomplishments will look something like this:
X → <accomplished> → Accomplishment-1.
X → <accomplished> → Accomplishment-2.
...
X → <accomplished> → Accomplishment-N.
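As a rough sketch of the graph-creation step, the snippet below turns text into (subject, relation, object) triples. In practice an LLM would do the extraction; the regex here is only a hypothetical rule-based stand-in so the example runs without any model, and the chapter sentences are made up for illustration.

```python
import re

# Hypothetical chapter texts standing in for the biography.
chapters = [f"Chapter {i}: X accomplished Accomplishment-{i}." for i in range(1, 11)]

# Rule-based stand-in (assumption) for an LLM-based extractor.
PATTERN = re.compile(r"(?P<subject>\w+) accomplished (?P<object>[\w-]+)")

def extract_triples(text):
    # Return (subject, relation, object) triples found in the text.
    return [
        (m["subject"], "accomplished", m["object"])
        for m in PATTERN.finditer(text)
    ]

triples = [t for chapter in chapters for t in extract_triples(chapter)]

for subject, relation, obj in triples[:2]:
    print(f"{subject} → <{relation}> → {obj}")
```

Each triple becomes one edge of the graph, which is what the retrieval phase traverses later.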
When summarizing these accomplishments, the retrieval phase can do a graph traversal to fetch all the relevant context related to X's accomplishments.
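The traversal can be sketched with a plain adjacency list. This is a minimal illustration under stated assumptions, not how any particular Graph RAG library works: the triples, the extra born_in edge, and the neighbors helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical triples, as an extraction step might produce them.
triples = [("X", "accomplished", f"Accomplishment-{i}") for i in range(1, 11)]
triples.append(("X", "born_in", "City-Y"))  # unrelated edge, for contrast

# Adjacency list: entity -> list of (relation, neighbor) pairs.
graph = defaultdict(list)
for subject, relation, obj in triples:
    graph[subject].append((relation, obj))

def neighbors(graph, entity, relation):
    # One-hop traversal: follow every edge of the given type from the entity.
    return [obj for rel, obj in graph.get(entity, []) if rel == relation]

context = neighbors(graph, "X", "accomplished")
print(f"{len(context)} accomplishments retrieved for the LLM")
```

Unlike top-k retrieval, the traversal returns every node connected to X by an accomplished edge, and the edge itself tells the LLM how each piece of context relates to X.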
This context, when passed to the LLM, will produce a more coherent and complete answer than traditional RAG would.
Another reason Graph RAG systems are so effective is that LLMs tend to reason well over structured data.
Graph RAG supplies that structure through its retrieval mechanism.
On a side note, even search engines now actively use Graph RAG systems due to their high utility.
Our full RAG crash course discusses RAG from basics to beyond.
Thanks for reading.
P.S. For those wanting to develop “Industry ML” expertise:
At the end of the day, all businesses care about impact. That’s it!
Can you reduce costs?
Drive revenue?
Can you scale ML models?
Predict trends before they happen?
We have discussed several other topics (with implementations) in the past that align with these goals.
Here are some of them:
Learn sophisticated graph architectures and how to train them on graph data: A Crash Course on Graph Neural Networks – Part 1.
So many real-world NLP systems rely on pairwise context scoring. Learn scalable approaches here: Bi-encoders and Cross-encoders for Sentence Pair Similarity Scoring – Part 1.
Learn techniques to run large models on small devices: Quantization: Optimize ML Models to Run Them on Tiny Hardware.
Learn how to generate prediction intervals or sets with strong statistical guarantees for increasing trust: Conformal Predictions: Build Confidence in Your ML Model’s Predictions.
Learn how to identify causal relationships and answer business questions: A Crash Course on Causality – Part 1.
Learn how to scale ML model training: A Practical Guide to Scaling ML Model Training.
Learn techniques to reliably roll out new models in production: 5 Must-Know Ways to Test ML Models in Production (Implementation Included).
Learn how to build privacy-first ML systems: Federated Learning: A Critical Step Towards Privacy-Preserving Machine Learning.
Learn how to compress ML models and reduce costs: Model Compression: A Critical Step Towards Efficient Machine Learning.
All these resources will help you cultivate key skills that businesses and companies care about the most.