Execution Management
Inspect execution status, fetch runtime logs, resolve waits, and manage lifecycle actions
KitaruClient is the programmatic API for managing and inspecting executions outside your flow functions.
KitaruClient and the CLI use your current Kitaru connection context. If you want to inspect executions from a deployed Kitaru server, connect first with kitaru login ... or provide KITARU_* connection variables in the current environment.
Create a client
import kitaru

client = kitaru.KitaruClient()

The client uses your current Kitaru connection/project context.
Inspect a single execution
execution = client.executions.get(exec_id)
print(execution.exec_id)
print(execution.flow_name)
print(execution.status)  # running/waiting/completed/failed/cancelled
if execution.pending_wait:
    print(execution.pending_wait.name, execution.pending_wait.question)
if execution.failure:
    print(execution.failure.origin, execution.failure.message)

Execution details include:
start/end timestamps
stack name
summary metadata
checkpoint calls
pending wait details (execution.pending_wait)
execution failure details (execution.failure) when status is failed
checkpoint retry/failure attempt history (checkpoint.attempts)
artifacts
frozen execution spec (when available)
List and query executions
Execution statistics
Use execution statistics when you want counts, trends, or health checks without fetching every individual execution first. This is the difference between asking "show me the last 20 executions" and asking "how many executions failed this week?" Kitaru sends the aggregate question to the active Kitaru runtime and returns a small grouped result. Each group always includes an execution count. You can also ask for numeric metrics, such as average duration or the sum of a numeric execution metadata key.
The CLI exposes the same surface:
Text output is intentionally small:
When you request metrics, text output adds one column per metric:
Supported groupings are:
status → public Kitaru status (running, waiting, completed, failed, cancelled)
flow → flow_id
stack → stack_id
tag → tag value
time:hour, time:day, time:week, time:month
metadata:<key> → the value stored for that execution metadata key
Supported metric sources are:
duration
step_count
cached_step_count
output_artifact_count
metadata:<key> through a metric spec that sets source="metadata" and metadata_key="<key>"
Supported aggregations are avg, sum, min, and max.
CLI metric specs use this format:
<name>:<source>:<avg|sum|min|max> for built-in sources
<name>:metadata:<metadata_key>:<avg|sum|min|max> for metadata
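To make the two spec shapes concrete, here is a minimal sketch of how such a string could be split and validated. It is not Kitaru's parser: `MetricSpec` and `parse_metric_spec` are hypothetical names, and the sketch assumes that metric names and metadata keys never contain a colon.

```python
from dataclasses import dataclass
from typing import Optional

# Values taken from the lists of supported aggregations and metric sources.
AGGREGATIONS = {"avg", "sum", "min", "max"}
BUILTIN_SOURCES = {
    "duration",
    "step_count",
    "cached_step_count",
    "output_artifact_count",
}


@dataclass(frozen=True)
class MetricSpec:
    """Hypothetical container for one parsed metric spec."""

    name: str
    source: str
    aggregation: str
    metadata_key: Optional[str] = None


def parse_metric_spec(text: str) -> MetricSpec:
    """Parse '<name>:<source>:<agg>' or '<name>:metadata:<key>:<agg>'."""
    parts = text.split(":")
    if len(parts) == 3:
        name, source, aggregation = parts
        metadata_key = None
        if source not in BUILTIN_SOURCES:
            raise ValueError(f"unknown built-in source: {source!r}")
    elif len(parts) == 4 and parts[1] == "metadata":
        name, source, metadata_key, aggregation = parts
        if not metadata_key:
            raise ValueError("metadata metric needs a metadata key")
    else:
        raise ValueError(f"unrecognized metric spec: {text!r}")
    if not name:
        raise ValueError("metric name must not be empty")
    if aggregation not in AGGREGATIONS:
        raise ValueError(f"unsupported aggregation: {aggregation!r}")
    return MetricSpec(name, source, aggregation, metadata_key)


if __name__ == "__main__":
    print(parse_metric_spec("avg_duration:duration:avg"))
    print(parse_metric_spec("cost:metadata:kitaru_llm_display_cost_usd_v1:sum"))
```

The four-part form is only accepted when the second part is literally `metadata`, which mirrors the distinction between built-in and metadata sources described above.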
Grouping by metadata:<key> includes the matching metadata values in the statistics output. Only use it for metadata keys whose values are safe to show to whoever can read the CLI, SDK, or MCP response.
Metadata metrics read numeric execution metadata. If the metadata value is stored as text or as a nested object, the active Kitaru runtime cannot aggregate it as a number. Store the value as an integer or float when you want to use it in statistics.
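One way to follow this advice is to normalize values before you store them as metadata. The helper below is a sketch under stated assumptions, not part of Kitaru: `to_numeric_metadata` is a hypothetical name, and rejecting booleans and non-finite floats is a choice made here for illustration rather than a documented Kitaru rule.

```python
import math


def to_numeric_metadata(value):
    """Return value as an int or float, or raise if it is not numeric.

    Assumption: numeric strings such as "42" or "3.5" should be converted,
    while nested objects (dicts, lists) should be rejected outright.
    """
    if isinstance(value, bool) or not isinstance(value, (int, float, str)):
        raise TypeError(f"not a numeric metadata value: {value!r}")
    if isinstance(value, str):
        text = value.strip()
        try:
            number = int(text)
        except ValueError:
            number = float(text)  # raises ValueError for non-numeric text
    else:
        number = value
    if isinstance(number, float) and not math.isfinite(number):
        raise ValueError(f"non-finite value cannot be aggregated: {value!r}")
    return number


if __name__ == "__main__":
    print(to_numeric_metadata("42"))
    print(to_numeric_metadata(3.5))
```

Calling a helper like this at the point where you build the metadata dictionary keeps text such as "42" from ending up in a key you later want to sum or average.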
LLM usage and cost metadata
When an execution makes LLM calls through kitaru.llm() or the supported agent adapters, Kitaru records canonical llm_usage_v1 metadata on the checkpoint that made or reused the provider work. One usage record usually means one provider interaction or one adapter-level graph/agent invocation, depending on which adapter produced it. When FlowHandle.wait() or FlowHandle.get() observes the execution finishing, Kitaru reads those checkpoint records and writes two execution-level views:
llm_usage_summary_v1 is the inspection view. kitaru executions get and the Python client parse it into execution.llm_usage_summary. It tells you what happened in one execution. Its usage_record_count, incurred_usage_record_count, and reused_usage_record_count fields count Kitaru usage records, not raw provider API calls.
Flat numeric metadata keys such as kitaru_llm_display_cost_usd_v1 and kitaru_llm_total_tokens_v1 are the statistics view. Kitaru execution statistics can sum or average these because they are top-level numbers, not nested objects.
Cost fields are intentionally split:
actual_cost_usd means the provider reported a cost. Claude Agent SDK exposes this via total_cost_usd.
estimated_cost_usd means Kitaru used an adapter cost calculator. OpenAI Agents and LangGraph can report this when you configure their calculator hook.
display_cost_usd uses actual cost for a record when present, otherwise estimated cost. Treat it as observability, not as a billing invoice.
Direct kitaru.llm() records token counts and latency, but it does not invent a cost number. If the provider call does not return a real cost source, cost stays empty and the execution summary increments records_without_cost_count.
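The fallback rule for display cost, and the way records without any cost source are counted, can be sketched as follows. This is an illustration of the described behavior, not Kitaru's implementation: `UsageRecord`, `display_cost_usd`, and `summarize_costs` are hypothetical stand-ins, and only the field names `actual_cost_usd`, `estimated_cost_usd`, `display_cost_usd`, and `records_without_cost_count` come from the text above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class UsageRecord:
    """Hypothetical stand-in for one usage record's cost fields."""

    actual_cost_usd: Optional[float] = None
    estimated_cost_usd: Optional[float] = None


def display_cost_usd(record: UsageRecord) -> Optional[float]:
    """Prefer the provider-reported cost, else fall back to the estimate."""
    # "is not None" keeps a reported cost of 0.0 from being treated as missing.
    if record.actual_cost_usd is not None:
        return record.actual_cost_usd
    return record.estimated_cost_usd


def summarize_costs(records: Iterable[UsageRecord]) -> dict:
    """Sum display costs and count the records that have no cost source."""
    total = 0.0
    without_cost = 0
    for record in records:
        cost = display_cost_usd(record)
        if cost is None:
            without_cost += 1
        else:
            total += cost
    return {
        "display_cost_usd": total,
        "records_without_cost_count": without_cost,
    }


if __name__ == "__main__":
    records = [
        UsageRecord(actual_cost_usd=0.25, estimated_cost_usd=0.5),
        UsageRecord(estimated_cost_usd=0.5),
        UsageRecord(),  # e.g. a call with tokens and latency but no cost
    ]
    print(summarize_costs(records))
```

Because a missing cost is counted rather than replaced with zero, a summed display cost should be read together with the count of records that had no cost.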
Useful statistics queries:
In v1, terminal LLM summaries are written when the SDK observes completion via FlowHandle.wait() or FlowHandle.get(). A remote execution that finishes but is never observed through those paths can still have per-checkpoint llm_usage_v1 records, but it may not have llm_usage_summary_v1 or the flat kitaru_llm_*_v1 statistics keys yet. executions.get stays read-only and does not backfill missing summaries.
Supported filters are flow, status, stack, tags, and max_groups. Multiple tag filters mean "executions that have all of these tags". When max_groups truncates a time-grouped result, Kitaru keeps the newest time rows and still displays the rows from oldest to newest.
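The two rules in this paragraph, all-of tag matching and newest-first truncation with oldest-first display, can be illustrated with a small sketch. The function names and the `(bucket, count)` row shape are assumptions made for the example. Kitaru applies these rules for you when it computes statistics.

```python
def has_all_tags(execution_tags, required_tags):
    """True only when every required tag is present on the execution."""
    return set(required_tags).issubset(execution_tags)


def truncate_time_groups(rows, max_groups):
    """Keep the newest max_groups rows, returned from oldest to newest.

    rows is a list of (bucket, count) pairs. The bucket is assumed to be an
    ISO date string, so plain string sorting is also chronological sorting.
    """
    if max_groups <= 0:
        return []
    ordered = sorted(rows, key=lambda row: row[0])
    # The tail of the sorted list holds the newest buckets, already in
    # oldest-to-newest order.
    return ordered[-max_groups:]


if __name__ == "__main__":
    daily = [("2024-05-03", 4), ("2024-05-01", 9), ("2024-05-02", 7)]
    print(truncate_time_groups(daily, max_groups=2))
    print(has_all_tags({"prod", "nightly", "eu"}, ["prod", "nightly"]))
```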
flow and stack groupings currently return IDs (flow_id and stack_id), not display names. This avoids guessing when a flow or stack has been renamed or deleted. You can still filter by a flow or stack name.
The current statistics surface supports grouping by time and metadata, but not filtering by time range or metadata values yet. If you need "last 7 days" or "only executions where customer_tier=enterprise", fetch/list those executions separately or add a stable tag for that cohort before querying statistics.
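If you take the fetch/list route, the aggregation itself happens on your side. The sketch below assumes you have already listed the executions you care about and copied the relevant fields into plain records. `ExecutionRecord` and `count_by_status` are hypothetical and are not Kitaru types or functions.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class ExecutionRecord:
    """Hypothetical flattened view of one already-fetched execution."""

    status: str
    started_at: datetime
    metadata: dict = field(default_factory=dict)


def count_by_status(records, *, since, metadata_equals=None):
    """Count executions per status, limited to a time window and cohort."""
    wanted = metadata_equals or {}
    counts = Counter()
    for record in records:
        if record.started_at < since:
            continue  # older than the requested window
        if any(record.metadata.get(k) != v for k, v in wanted.items()):
            continue  # not part of the requested metadata cohort
        counts[record.status] += 1
    return dict(counts)


if __name__ == "__main__":
    now = datetime(2024, 5, 10, tzinfo=timezone.utc)
    records = [
        ExecutionRecord("failed", now - timedelta(days=1), {"customer_tier": "enterprise"}),
        ExecutionRecord("completed", now - timedelta(days=2), {"customer_tier": "enterprise"}),
        ExecutionRecord("failed", now - timedelta(days=30), {"customer_tier": "enterprise"}),
        ExecutionRecord("failed", now - timedelta(days=1), {"customer_tier": "free"}),
    ]
    print(count_by_status(
        records,
        since=now - timedelta(days=7),
        metadata_equals={"customer_tier": "enterprise"},
    ))
```

For cohorts you query often, a stable tag remains the cheaper option, because the tag filter lets the runtime do the grouping instead of your client code.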
Agent and operations summaries should use this same general surface. For example, an assistant can ask for daily volume first, then drill into only the unhealthy cohort:
Fetch runtime logs
Runtime log retrieval requires a server-backed connection. For CLI options, follow mode, grouped output, and retrieval caveats, see View Execution Runtime Logs.
Resolve wait input
On local interactive runs, the runtime prompts for input in the same terminal. For non-interactive or timed-out executions, resolve the pending wait externally:
If the execution does not continue automatically after input (e.g. the original runner already exited), call resume(...):
Retry, replay, and cancel
Execution convenience methods
Execution objects returned by client.executions.get(...) also expose convenience methods that call back into the same client:
These are equivalent to calling the matching client methods, such as client.executions.retry(exec_id). Each returns a new Execution snapshot rather than mutating the existing object.
Inspect or abort waits programmatically
List all pending wait conditions for an execution:
Abort a pending wait instead of continuing it:
Browse and load artifacts
You can also filter artifact lists:
Manage executions from the CLI
Query executions through MCP
If you want assistant-native tooling (Claude Code, Cursor, etc.), install and run the MCP server:
Then use tool calls like:
kitaru_executions_list(status="waiting")
kitaru_executions_statistics(group_by=["status"])
kitaru_executions_input(exec_id=..., wait=..., value=...) (MCP requires explicit wait)
get_execution_logs(exec_id=...)
kitaru_artifacts_get(artifact_id=...)
kitaru_status()
If the execution does not continue automatically after wait input is resolved (e.g. the original runner already exited), use the CLI or SDK resume(...) call. MCP does not currently expose a separate resume tool.
See the full setup guide at MCP Server.
Try the examples
For the broader catalog, see Examples.