AgentOps
The 'Black Box' recorder for autonomous coding agents.
Access is available on request for partners.
The Thesis
Market Context
With the rise of agentic engineering tools (Devin, Cursor), there is a vacuum in 'Agent APM' (Application Performance Monitoring). Traditional APM tools such as Datadog were not designed to trace non-deterministic reasoning chains. AgentOps fills this gap.
Hypothesis
As we deploy autonomous engineers, debugging their non-deterministic behavior becomes impractical without a replay engine. We need a way to 'rewind' an agent's thought process step by step.
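To make the 'rewind' idea concrete, here is a minimal sketch of a replay over recorded steps. It is not the AgentOps implementation: the `Step` schema, the `state_delta` field, and the `Replay` class are all hypothetical names chosen for illustration. It assumes each step records the changes it made to the agent's state.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Step:
    """One recorded agent step (hypothetical schema, not the AgentOps format)."""
    index: int
    kind: str      # e.g. "thought", "tool_call", "observation"
    payload: str
    state_delta: dict[str, Any] = field(default_factory=dict)


class Replay:
    """Rebuilds the agent's state at any step by folding recorded deltas."""

    def __init__(self, steps: list[Step]) -> None:
        # Sort defensively: steps may arrive out of order from the log store.
        self.steps = sorted(steps, key=lambda s: s.index)

    def rewind(self, index: int) -> list[Step]:
        """Return every step up to and including `index`."""
        return [s for s in self.steps if s.index <= index]

    def state_at(self, index: int) -> dict[str, Any]:
        """Apply deltas in order to reconstruct state as of `index`."""
        state: dict[str, Any] = {}
        for step in self.rewind(index):
            state.update(step.state_delta)
        return state
```

Because state is rebuilt from the recorded deltas rather than by re-running the model, the same trace yields the same state every time, even though the agent that produced it was non-deterministic.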
Technical Challenges
High-Throughput Log Ingestion
Agents generate massive, verbose logs (their internal monologue). Conventional transactional SQL databases were too slow for this write volume. We migrated to ClickHouse to handle 1 GB/s log ingestion streams from multiple concurrent agents.
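Columnar stores such as ClickHouse generally favor fewer, larger inserts over many small ones, so an ingestion layer usually buffers rows and flushes them in batches. The sketch below shows that buffering step only. It is written in Python for illustration, whereas the actual sidecar is Rust. `BatchingWriter`, its thresholds, and the `sink` callable are hypothetical, and no real ClickHouse client call is made.

```python
import json
import time
from typing import Callable


class BatchingWriter:
    """Buffers log rows and hands them to `sink` in batches.

    `sink` is a placeholder for whatever performs the bulk insert.
    """

    def __init__(
        self,
        sink: Callable[[list[str]], None],
        max_rows: int = 1000,
        max_age_s: float = 1.0,
    ) -> None:
        self.sink = sink
        self.max_rows = max_rows
        self.max_age_s = max_age_s
        self._rows: list[str] = []
        self._opened = 0.0  # monotonic time when the current batch started

    def append(self, agent_id: str, message: str) -> None:
        if not self._rows:
            self._opened = time.monotonic()
        self._rows.append(json.dumps({"agent_id": agent_id, "message": message}))
        too_big = len(self._rows) >= self.max_rows
        too_old = time.monotonic() - self._opened >= self.max_age_s
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Send the buffered rows as one batch; a no-op when empty."""
        if not self._rows:
            return
        batch, self._rows = self._rows, []
        self.sink(batch)
```

Flushing on either a row count or a maximum age keeps batches large under load while bounding how long a quiet agent's logs wait before they become visible.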
Visualizing Non-Linear Logic
Agent runs rarely follow a straight line: they often branch or loop. We built a custom DAG (Directed Acyclic Graph) visualizer with React Flow that supports real-time streaming updates over WebSockets.
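One way to reconcile looping agents with an acyclic graph is to record every iteration as a new node, so a retry becomes a fresh child rather than a back-edge. The sketch below illustrates that approach as an assumption, not as a description of how AgentOps models loops. `TraceGraph` and its methods are hypothetical, and the patch dictionaries are only loosely shaped after the `id`, `source`, and `target` fields that React Flow nodes and edges use.

```python
from collections import defaultdict
from typing import Optional


class TraceGraph:
    """Incrementally builds an acyclic trace graph from step events."""

    def __init__(self) -> None:
        self.labels: dict[str, str] = {}
        self.children: dict[str, list[str]] = defaultdict(list)

    def add_step(self, step_id: str, label: str,
                 parent_id: Optional[str] = None) -> dict:
        """Add one step and return a patch a client could apply."""
        if step_id in self.labels:
            raise ValueError(f"duplicate step id: {step_id}")
        if parent_id is not None and parent_id not in self.labels:
            raise KeyError(f"unknown parent: {parent_id}")
        # A new node can only point back to an existing one, so no cycle
        # can ever form: loops are unrolled into repeated steps.
        self.labels[step_id] = label
        patch = {"nodes": [{"id": step_id, "data": {"label": label}}],
                 "edges": []}
        if parent_id is not None:
            self.children[parent_id].append(step_id)
            patch["edges"].append({"id": f"{parent_id}->{step_id}",
                                   "source": parent_id,
                                   "target": step_id})
        return patch

    def branch_points(self) -> list[str]:
        """Steps with more than one child, i.e. where the agent diverged."""
        return [sid for sid, kids in self.children.items() if len(kids) > 1]
```

Each returned patch contains only the new node and edge, so a server could serialize it and push it over a WebSocket without resending the whole graph.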
System Design
- 01. Ingestion: Rust High-Performance Sidecar
- 02. Storage: ClickHouse (Time-Series Logs)
- 03. Analysis: Background Workers (Python/Pandas)
- 04. Frontend: Next.js + React Flow + WebSockets
Outcomes
AgentOps reduced agent regression testing time by 65%. It now monitors all Mavik Labs production workloads, processing more than 50M tokens per day.
Research Roadmap
- Single-agent tracing
- Multi-agent swarm support
- Public API & SDK release