If you are building with LLMs, you absolutely need traceability.
Today, let's walk through a demo of Opik, an open-source, production-ready, end-to-end LLM evaluation platform.
It allows developers to test their LLM applications in development, before a release (CI/CD), and in production.
Here’s an example with CrewAI below:
All you need to do is this:
Put your LLM logic inside a function.
Add the @track decorator.
Done!
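Here's a minimal sketch of what that looks like with CrewAI. The agent, task, and topic below are illustrative placeholders, not the exact code from the demo; the key part is wrapping the crew invocation in a function decorated with Opik's @track:

```python
from crewai import Agent, Task, Crew
from opik import track

# Illustrative single-agent crew (the role, goal, and task are made up for this sketch).
researcher = Agent(
    role="Researcher",
    goal="Summarize a topic in two sentences",
    backstory="You are a concise research assistant.",
)

summarize = Task(
    description="Summarize the topic: {topic}",
    expected_output="A two-sentence summary.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[summarize])

@track  # Opik logs each call to this function as a trace, along with nested LLM calls
def run_crew(topic: str) -> str:
    result = crew.kickoff(inputs={"topic": topic})
    return str(result)

if __name__ == "__main__":
    print(run_crew("LLM observability"))
```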
After this, Opik will track everything within your AI application, from LLM calls (with cost) to evaluation metrics and intermediate logs.
Moreover, you can easily self-host Opik, so your data stays where you want.
It integrates with nearly all popular frameworks, including CrewAI, LlamaIndex, LangChain, and Haystack.
If you want to dive further, we also published a practical guide on Opik to help you integrate evaluation and observability into your LLM apps (with implementation).
It's openly accessible to all readers.
Start here: A Practical Guide to Integrate Evaluation and Observability into LLM Apps.
Thanks for reading!