DOXA audits an existing RAG pipeline from its traces, runs Bayesian optimization over chunking, retrieval, and prompts against a golden set, then promotes the winner through staged canary gates with automatic rollback. This simulation replays a representative tuning run.
▸ doxa connected to client pipeline via trace export (LangSmith format) ▸ golden set loaded: 120 question–answer pairs ▸ press Run auto-tune to start the loop▮
Prefer to watch?
A full run, recorded.
Screen recording of this demo completing end-to-end — same thing you can run interactively above.