theta labs / observability layer

Observe every agent step before it becomes a production issue.

Theta gives multimodal systems one operational layer for traces, screenshots, tool execution, evaluation drift, and incident review so operators can see what happened, why it happened, and where reliability breaks down.

1 schema for traces, screenshots, tool calls, and eval signals
Live replay for browser and multimodal runs that need visual context
Searchable from raw event streams to operator-ready incident review
Self-hosted for teams that need observability without losing data ownership
Figure: Theta observability dashboard showing a traced agent run, metadata, and review tools.

Review live runs, browser evidence, metadata, and tool activity from a single operator surface.

LLM responses, chain boundaries, and tool spans
Browser screenshots, clicks, and UI drift
Evaluations, incidents, and regression clusters
Self-hosted review for operators and platform teams

stack coverage

OpenAI, Anthropic, Playwright, Browser-use, OpenTelemetry, MCP

Observability layer

One operating model for agents that touch more than text.

Theta brings traces, screenshots, tool activity, and evaluation context into one operating model for agent teams running production systems.

01 / Capture

Instrument the runtime once

Send SDK spans, raw events, browser artifacts, and evaluation metadata through one ingest path that survives model and framework changes.

02 / Normalize

Resolve every signal into one run record

Theta aligns text completions, screenshots, tool activity, and policy events into a shared timeline instead of scattering context across logs.
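The idea of a shared timeline can be illustrated with a minimal sketch. It assumes each signal carries a run identifier and a timestamp; the `Signal` class, its field names, and `build_run_records` are hypothetical and are not Theta's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Signal:
    """One raw signal from any source (hypothetical shape, not Theta's schema)."""
    run_id: str
    ts: float      # seconds since the run started
    source: str    # e.g. "llm", "browser", "tool", "policy"
    kind: str
    payload: dict[str, Any] = field(default_factory=dict)


def build_run_records(signals: list[Signal]) -> dict[str, list[Signal]]:
    """Group signals by run, then order each run's signals by timestamp."""
    records: dict[str, list[Signal]] = {}
    for signal in signals:
        records.setdefault(signal.run_id, []).append(signal)
    for timeline in records.values():
        # list.sort is stable, so signals with equal timestamps keep arrival order.
        timeline.sort(key=lambda s: s.ts)
    return records


# Signals arrive out of order and from different sources.
signals = [
    Signal("run-1", 2.0, "browser", "screenshot", {"path": "cart.png"}),
    Signal("run-1", 0.5, "llm", "completion", {"text": "Opening the cart"}),
    Signal("run-2", 0.1, "policy", "check", {"result": "pass"}),
    Signal("run-1", 1.2, "tool", "call", {"name": "browser.click"}),
]
records = build_run_records(signals)
timeline_kinds = [s.kind for s in records["run-1"]]
```

In this sketch the completion, tool call, and screenshot for `run-1` end up in chronological order on one record, regardless of which source emitted them or when they arrived.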

03 / Operate

Move from trace inspection to incident action

Operators can search failures, compare runs, review multimodal evidence, and understand regressions without rebuilding the story by hand.

Operator surfaces

From raw run data to decisions operators can defend.

Theta is an observability layer, not a trace graveyard. The important shift is from seeing events to understanding what happened, why it happened, and what to do next.

Run replay

Inspect the exact sequence of model output, tool activity, browser state, and user-visible evidence on one surface.

Reliability views

Track regressions across experiments, model switches, and deployment changes before they harden into user-facing incidents.
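One simple way to think about a regression check across a model switch is a per-task comparison of success rates between a baseline and a candidate set of runs. The sketch below is illustrative only: the function names, the data layout, and the 0.05 threshold are assumptions, not how Theta computes its reliability views.

```python
from statistics import mean


def success_rate(outcomes: list[float]) -> float:
    """Average task_success value for a set of runs; 0.0 when there are no runs."""
    return mean(outcomes) if outcomes else 0.0


def detect_regressions(
    baseline: dict[str, list[float]],
    candidate: dict[str, list[float]],
    max_drop: float = 0.05,  # assumed tolerance, chosen for illustration
) -> dict[str, float]:
    """Return tasks whose success rate fell by more than max_drop."""
    regressions: dict[str, float] = {}
    for task, base_outcomes in baseline.items():
        if task not in candidate:
            continue  # nothing to compare against
        drop = success_rate(base_outcomes) - success_rate(candidate[task])
        if drop > max_drop:
            regressions[task] = round(drop, 3)
    return regressions


# task_success values per run, before and after a hypothetical model switch.
baseline = {"checkout": [1.0, 1.0, 1.0, 1.0], "search": [1.0, 0.0, 1.0, 1.0]}
candidate = {"checkout": [1.0, 0.0, 1.0, 0.0], "search": [1.0, 1.0, 0.0, 1.0]}
regressions = detect_regressions(baseline, candidate)
```

Here only `checkout` is flagged, because its success rate falls from 1.0 to 0.5 while `search` stays at 0.75. A production check would likely also account for sample size and variance.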

Search and routing

Start with intent-level search, pivot into exact runs, then hand the same context to evaluation or incident workflows.

Controls

Keep data ownership, exportability, and review discipline while still giving operators a clean investigative workflow.

Integration path

Attach Theta to the stack you already run.

The integration story is concrete: standard SDK entry points, browser instrumentation, and export paths for teams that need observability without surrendering control of the data plane.

Signal sources

Python and Node SDKs
OpenTelemetry-aligned event streams
Playwright and browser-use sessions
Raw HTTP ingest for custom runtimes
MCP server instrumentation
JSON and Parquet export paths
Python instrumentation
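from theta_observability import trace

# Open one traced run; everything recorded inside the block belongs to that run.
with trace(name="checkout-agent") as run:
    # Browser steps are recorded as named events with keyword attributes.
    run.event("browser.open", url="/cart")
    run.event("browser.click", target="buy-button")
    # Attach visual evidence and a task-level outcome to the run.
    run.attachment("screenshot", "./theta.png")
    run.metric("task_success", 1.0)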
Node instrumentation
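import { trace } from "@theta/observability";

// Wrap the agent in one traced run; the callback receives the run handle.
await trace("research-agent", async (run) => {
  // Record the model response and the tool call as events on the same run.
  await run.event("llm.response", { model: "gpt-5.4" });
  await run.event("tool.call", { name: "browser.snapshot" });
  // Record a run-level metric.
  await run.metric("latency.bucket", "p95");
});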
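
For custom runtimes that rely on the raw HTTP ingest path listed under signal sources, events have to be serialized before they are sent. The page does not specify the wire format or the endpoint, so the sketch below only builds a newline-delimited JSON body with the standard library. The field names (`run_id`, `kind`, `ts`, `attributes`) and both helper functions are hypothetical.

```python
import json
import time
import uuid
from typing import Any, Optional


def make_event(
    run_id: str, kind: str, ts: Optional[float] = None, **attributes: Any
) -> dict[str, Any]:
    """Build one event dict (hypothetical field names, not Theta's wire format)."""
    return {
        "run_id": run_id,
        "kind": kind,
        "ts": time.time() if ts is None else ts,
        "attributes": attributes,
    }


def encode_batch(events: list[dict[str, Any]]) -> str:
    """Serialize events as newline-delimited JSON, one event per line."""
    return "\n".join(json.dumps(event, sort_keys=True) for event in events)


run_id = str(uuid.uuid4())
batch = [
    make_event(run_id, "browser.open", ts=0.0, url="/cart"),
    make_event(run_id, "browser.click", ts=1.5, target="buy-button"),
]
body = encode_batch(batch)

# Round-trip the body to confirm each line is an independent JSON document.
decoded = [json.loads(line) for line in body.splitlines()]
```

The resulting `body` string could then be posted with any HTTP client once the real endpoint, authentication, and schema are known.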