[Hands-on] Build Your Own AI Avatar With Human-like Memory

(100% open-source, works in real-time)

NaiveRAG is fast but dumb.

GraphRAG is smart but slow.

This open-source solution fixes both.

Naive RAG systems have a fundamental problem: they treat documents as isolated chunks. No connections. No context. No understanding of how things relate.

GraphRAG addresses this by linking entities and facts, but traditional graph databases become painfully slow for real-time applications.

What if you could combine the speed of vector search with the intelligence of knowledge graphs?

That’s exactly what we have built and shared in the video above.

A real-time AI Avatar that uses a knowledge graph as its memory. You talk to it directly, and everything happens in real time.
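
To make the "knowledge graph as memory" idea concrete, here is a minimal sketch in plain Python. It is not the code from the video and not Zep's API: the `GraphMemory` class, the hand-written 3-dimensional vectors, and the entity and relation names are all hypothetical, chosen only for illustration. The sketch picks seed entities by vector similarity and then reads only the facts attached to those seeds, so it never walks the whole graph.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class GraphMemory:
    """Hypothetical toy memory: entities carry vectors, facts are edges."""

    def __init__(self):
        self.vectors = {}               # entity name -> toy embedding
        self.edges = defaultdict(list)  # entity name -> [(relation, other)]

    def add_entity(self, name, vector):
        self.vectors[name] = vector

    def add_fact(self, subject, relation, obj):
        # Store the edge in both directions so either end can recall it.
        self.edges[subject].append((relation, obj))
        self.edges[obj].append((f"inverse:{relation}", subject))

    def recall(self, query_vector, top_k=1):
        # Step 1: vector search chooses the seed entities.
        ranked = sorted(
            self.vectors,
            key=lambda name: cosine(query_vector, self.vectors[name]),
            reverse=True,
        )
        seeds = ranked[:top_k]
        # Step 2: expand one hop from each seed, not the full graph.
        facts = [
            (seed, relation, other)
            for seed in seeds
            for relation, other in self.edges.get(seed, [])
        ]
        return seeds, facts


memory = GraphMemory()
memory.add_entity("Alice", [1.0, 0.0, 0.0])
memory.add_entity("Berlin", [0.0, 1.0, 0.0])
memory.add_entity("Espresso", [0.0, 0.0, 1.0])
memory.add_fact("Alice", "lives_in", "Berlin")
memory.add_fact("Alice", "likes", "Espresso")

seeds, facts = memory.recall([0.9, 0.1, 0.0])
print(seeds)
print(facts)
```

A real system would use learned embeddings and an approximate nearest-neighbor index for the seed step. The point here is only the shape of the lookup: similarity first, then a small local expansion.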

Watch the video for the full demo and code walkthrough. We’ve open-sourced everything.

To power this, we used Zep’s knowledge retrieval system (open-source).

What makes it fast:

↳ Smart retrieval algorithms that avoid full graph search.

↳ Fine-tuned Qwen3 and Gemma models with <10 ms embedding latency and <50 ms reranking latency.

↳ S3 + hot caching for dense vector and BM25 search instead of traditional vector databases.
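
The list above mentions dense vector search and BM25 search side by side. One common way to merge two such result lists is reciprocal rank fusion (RRF). The sketch below is an assumption about how the pieces could fit together, not Zep's implementation: the post does not say which fusion method Zep uses, and the function names, toy documents, and hand-written vectors are hypothetical. The S3 storage, caching, and reranking model are not modeled at all.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a compact BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ranking(scores):
    """Return document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rrf(rankings, k=60):
    """Reciprocal rank fusion: each list adds 1 / (k + rank) per document."""
    fused = Counter()
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


def hybrid_search(query, query_vector, docs, doc_vectors):
    """Fuse a BM25 ranking and a dense ranking into one result list."""
    sparse = ranking(bm25_scores(query, docs))
    dense = ranking([cosine(query_vector, v) for v in doc_vectors])
    return [docs[i] for i in rrf([sparse, dense])]


docs = [
    "zep builds a knowledge graph memory",
    "bm25 ranks documents by keyword overlap",
    "avatars speak in real time",
]
# Toy embeddings, written by hand for illustration only.
doc_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

results = hybrid_search("knowledge graph memory", [0.9, 0.2, 0.1], docs, doc_vectors)
print(results[0])
```

RRF only needs ranks, so the BM25 and cosine scores never have to be put on the same scale. A reranker, like the one mentioned in the list, would typically run on the top few fused results.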

Zep’s open-source framework Graphiti is available under Apache 2.0, so you can easily self-host it.

You can find Zep’s GitHub repo here →

You can find the code here →

Thanks for reading!
