Harness, Runtime, Platform
Where Kitaru fits — and doesn't — in an agent stack.
Agent tooling spans four layers. Confusion between them is where most "is Kitaru a competitor to X?" questions come from.

Model layer — the LLM itself. A compute unit over a context window, picked per-call or per-agent: OpenAI, Anthropic, Google, open-weights, fine-tuned in-house.
Harness layer — the loop around the model. Prompts, tools, model loop, context management, structured outputs, in-turn memory. Picked per-agent or per-team.
Runtime layer — how the agent survives and executes over time. Checkpoints, replay, resume, wait states, versioned deployments, invocation routing, artifact + state handling, execution placement.
Platform layer — how the organization governs. Auth, entitlements, interceptors, observability, product UI, policy. Usually lives in your existing stack.
Kitaru sits in the runtime layer. It is not a harness and it is not a packaged platform. It gives platform teams the durable execution primitives they attach to the harness their app teams picked and the platform their org already runs.
Where Kitaru is — and isn't
Pydantic AI / Pydantic AI Harness | Harness | Typed, ergonomic Python agent logic
Claude Agent SDK | Harness | Claude-native autonomous coding / tool loops
OpenAI Agents SDK | Harness | Hosted-tool agents on the OpenAI stack
LangGraph | Harness + runtime (in its own model) | Graph-native agents with built-in checkpointer
Deep Agents | Harness (on LangGraph) | Opinionated multi-agent pattern
LangSmith Deployment | Runtime + platform (packaged) | Adopting the LangChain-hosted stack
Temporal | Runtime (general-purpose) | Polyglot, deterministic workflow engine
DBOS | Runtime (general-purpose) | Postgres-backed durable workflows
Kitaru | Runtime (Python-agent-shaped) | Framework-agnostic durable execution primitives
The overlap
Several tools in the runtime row are real alternatives to Kitaru. Worth naming the overlap before drawing the distinction.
LangGraph has its own checkpointer, resume, and time-travel, all powerful inside its graph/state-machine model. Kitaru's difference is that @checkpoint wraps ordinary Python boundaries independent of any harness.
LangSmith Deployment delivers durable execution + sandboxes + auth proxy as a packaged platform. Kitaru ships just the runtime primitives so platform teams bring their own auth, sandbox provider, and governance.
Temporal is a battle-tested polyglot durable workflow engine. Kitaru is Python-first and agent-shaped (first-class kitaru.llm(), kitaru.wait(), artifact lineage), with a simpler single-service deployment.
DBOS is a Postgres-backed durable workflow library with deterministic workflow bodies. Kitaru flows are plain Python with no determinism requirement; state and artifacts live in your own cloud bucket, not Postgres.
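To make "ordinary Python boundaries" and "no determinism requirement" concrete, here is a minimal sketch of record-and-replay. It does not use Kitaru: `RunStore` and this `checkpoint` decorator are hypothetical stand-ins written for illustration, and Kitaru's actual `@checkpoint` signature and storage may differ. The only idea shown is that a recorded result is returned on replay, so the function body is free to be nondeterministic.

```python
import functools
import random


class RunStore:
    """Hypothetical in-memory store of checkpoint results, keyed by step name."""

    def __init__(self):
        self.results = {}


def checkpoint(store):
    """Stand-in decorator (not Kitaru's API): record a result once, replay it after."""

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Simplification: the key ignores arguments, so one result per step.
            key = fn.__name__
            if key in store.results:
                # Replay: return the recorded value without re-running the body.
                return store.results[key]
            value = fn(*args, **kwargs)
            store.results[key] = value
            return value

        return wrapper

    return decorate


store = RunStore()


@checkpoint(store)
def sample_candidates():
    # Nondeterministic on purpose: a fresh execution would return other numbers.
    return [random.random() for _ in range(3)]


first = sample_candidates()     # executes the body and records the result
replayed = sample_candidates()  # served from the store; the body is not re-run
assert first == replayed
```

In this sketch the boundary is a plain function, with no graph or state-machine definition around it. A real runtime would persist results outside the process rather than in a dictionary.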
Two worldviews
Neither is universally better. They optimize for different buyers.

What Kitaru owns vs integrates with
Platform teams rightly push back on tools that try to own everything. What Kitaru actually takes responsibility for:
Checkpoint / replay / resume | Yes | Core product
Flow versioning and invocation routing | Yes | Core product
Execution placement per checkpoint | Yes, as config | @checkpoint(runtime="isolated") today; richer policy evolving
Sandbox implementation | No | Provide adapters; don't mandate a vendor
Secrets storage | Partly | Alias-linked secret resolution for kitaru.llm(); integrate with your secret manager
Auth to invoke flows | Yes | Workspace keys / service accounts; no per-deployment tokens
Enterprise entitlements / RBAC | No | Integrate with your platform
Network egress policy | No | Determined by the execution target your stack provides; Kitaru does not enforce it
Interceptors / guardrails | No | Harness or your platform owns this
Observability | Partly | Runtime metadata, logs, artifact lineage; integrate with your tracing
Data compliance policy | No | Policy stays with your platform; Kitaru does not mandate one
The line to remember:
Durability without execution policy is not enough for production agents — but Kitaru should make policy attachable to execution boundaries, not mandate the policy itself.
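One way to read "attachable, not mandated" is sketched below. The runtime exposes each execution boundary to hooks, and the platform team decides what those hooks enforce. Everything here is an assumption for illustration: `POLICY_HOOKS`, `register_policy`, and this `checkpoint` decorator are hypothetical and are not Kitaru's API. Only the `runtime="isolated"` argument is taken from the table above.

```python
import functools

# Hypothetical registry: the platform registers hooks, the runtime only calls them.
POLICY_HOOKS = []


def register_policy(hook):
    """Platform-owned code adds a callable(step_name, runtime) that may raise."""
    POLICY_HOOKS.append(hook)
    return hook


def checkpoint(runtime="default"):
    """Stand-in decorator (not Kitaru's API) that exposes a boundary to hooks."""

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for hook in POLICY_HOOKS:
                # The runtime supplies the boundary; the decision is the hook's.
                hook(fn.__name__, runtime)
            return fn(*args, **kwargs)

        return wrapper

    return decorate


@register_policy
def generated_code_must_be_isolated(step_name, runtime):
    # Example platform rule, assumed for this sketch only.
    if step_name.startswith("run_") and runtime != "isolated":
        raise PermissionError(f"{step_name} must run with runtime='isolated'")


@checkpoint(runtime="isolated")
def run_generated_code(source):
    return f"executed {len(source)} characters in a sandbox"


@checkpoint()
def run_unreviewed_script(source):
    return "unreachable while the rule above is registered"


print(run_generated_code("print('hi')"))
try:
    run_unreviewed_script("print('hi')")
except PermissionError as err:
    print(f"blocked: {err}")
```

The decorator itself contains no rule. Removing the registered hook removes the restriction, which is the sense in which the policy stays with the platform.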
Concrete split in code
A Python research agent, with each layer doing its part:
Harness decides how plan, retrieve, and synthesize reason.
Kitaru runtime decides what is durable, what can replay, what waits, where each checkpoint runs.
Your platform decides who can invoke research_agent, which stack it runs on, and what gets logged where.
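A minimal, self-contained sketch of that three-way split follows. The names `plan`, `retrieve`, `synthesize`, and `research_agent` come from the text, but their bodies are placeholders. The `checkpoint` decorator, `ALLOWED_CALLERS`, `invoke`, and the simulated outage are hypothetical stand-ins rather than Kitaru's API. The sketch shows a run failing at the last step and resuming without repeating the earlier ones.

```python
import functools

# --- Runtime layer (stand-in, not Kitaru's API): what is durable and replayable
COMPLETED = {}   # recorded results, keyed by (run_id, step name)
EXECUTIONS = []  # every time a step body was actually started


def checkpoint(fn):
    @functools.wraps(fn)
    def wrapper(run_id, *args):
        key = (run_id, fn.__name__)
        if key not in COMPLETED:
            EXECUTIONS.append(fn.__name__)
            COMPLETED[key] = fn(*args)  # recorded only if the body succeeds
        return COMPLETED[key]

    return wrapper


# --- Harness layer: how each step reasons (placeholder logic)
OUTAGE = {"active": True}  # simulates a model endpoint that is down at first


@checkpoint
def plan(question):
    return [f"find sources on {question}", f"compare views on {question}"]


@checkpoint
def retrieve(steps):
    return [f"document for: {step}" for step in steps]


@checkpoint
def synthesize(docs):
    if OUTAGE["active"]:
        raise RuntimeError("model endpoint unavailable")
    return f"report built from {len(docs)} documents"


def research_agent(run_id, question):
    steps = plan(run_id, question)
    docs = retrieve(run_id, steps)
    return synthesize(run_id, docs)


# --- Platform layer: who may invoke (placeholder allow-list)
ALLOWED_CALLERS = {"research-team"}


def invoke(caller, run_id, question):
    if caller not in ALLOWED_CALLERS:
        raise PermissionError(f"{caller} may not invoke research_agent")
    return research_agent(run_id, question)


try:
    invoke("research-team", "run-1", "durable execution")
except RuntimeError:
    OUTAGE["active"] = False  # the outage clears before the retry

# Same run_id: plan and retrieve are replayed from COMPLETED, not re-executed.
report = invoke("research-team", "run-1", "durable execution")
print(report)
print(EXECUTIONS)
```

After the retry, `EXECUTIONS` lists `plan` and `retrieve` once each and `synthesize` twice, because only the failed step ran again.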
When Kitaru is the wrong size
If your whole org standardizes on LangGraph + LangSmith, Kitaru adds less. Use what you have.
If you are building one agent for yourself and never leave your laptop, a harness alone is enough.
If you want a hosted, all-in-one agent platform and don't need to self-host anything, a packaged platform is the better buy.
When Kitaru fits
Application teams across your org pick different harnesses (Pydantic AI, LangChain's Deep Agents, Claude Agent SDK, internal).
Infra must be self-hosted (regulated industry, on-prem requirements, sovereignty).
The platform team wants runtime primitives, not a packaged platform that replaces the one they already operate.
Deployment must plug into existing Kubernetes, secret manager, observability, and data policy — not live in someone else's control plane.
Durable execution needs to be independent of any single framework's worldview.
Shorthand
Harnesses define behavior. Kitaru defines durable execution. Platforms define governance.
Or the even shorter version:
Use a harness to build the agent. Use Kitaru when that agent becomes a durable, versioned, self-hosted production workload.