Gemini Interactions
Make Gemini Interactions API turns replayable and observable with Kitaru checkpoints, including Antigravity managed-agent runs
The Google Gemini Interactions API gives you a hosted interaction runtime: you send a request to Gemini or a Google-managed agent, Google runs the interaction, and you receive an interaction response.
Kitaru does not replace that runtime. It adds an outer durable workflow boundary around it:
one stable Gemini interaction response = one Kitaru checkpoint
That boundary is useful when a Gemini interaction is one step in a larger flow. Imagine this workflow:
collect input → ask Gemini to analyze it → write report → wait for approval → publish
If Gemini finishes the analysis and the later write report checkpoint fails, Kitaru can replay the flow and reuse the completed Gemini interaction result instead of calling Google again. You keep the interaction ID, output text, status, usage when available, safe step summaries, and Kitaru artifacts that help you inspect what happened.
The important word is stable. The adapter only saves a successful checkpoint when Gemini reaches a durable status:
completed: Gemini produced a final response.
requires_action: Gemini reached a handoff point and needs your code, a human, or another system to do something before the interaction can continue.
In-progress background states are not saved as successful checkpoints. Poll them again by interaction ID instead of treating them as finished work.
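The stable-status rule can be pictured as a small gate. The sketch below is illustrative only: `STABLE_STATUSES` and `should_save_checkpoint` are hypothetical names rather than the adapter's internals, and the `in_progress` string is a placeholder for whatever in-progress status Google reports.

```python
# Hypothetical sketch of the stable-status gate; not the adapter's real code.
STABLE_STATUSES = frozenset({"completed", "requires_action"})


def should_save_checkpoint(status: str) -> bool:
    """Return True only for the two statuses this guide describes as stable."""
    return status in STABLE_STATUSES


# "in_progress" is a placeholder for any non-stable background status.
for status in ("completed", "requires_action", "in_progress"):
    if should_save_checkpoint(status):
        print(f"{status}: save checkpoint")
    else:
        print(f"{status}: poll again by interaction ID")
```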
The mental model
Think of Gemini Interactions as the hosted worker and Kitaru as the outer workflow recorder.
Gemini still runs the inside of the interaction:
Gemini interaction
├─ Gemini model or Google-managed agent planning
├─ interaction steps
├─ built-in tools, web, code execution, or hosted MCP work
├─ Antigravity sandbox files and environment reuse
└─ completed / requires_action response
Kitaru records the outside:
So Kitaru can honestly say:
“This Gemini interaction reached a stable response. On replay, I can return that saved boundary result without asking Google to run the same interaction again.”
Kitaru cannot honestly say:
“I can rewind to Gemini's third internal step, restore Google's hosted sandbox, and replay only the remaining tool work.”
A concrete failure story:
On replay, Kitaru can return the saved GeminiInteractionResult without starting another Antigravity job. But Kitaru did not snapshot Google's remote sandbox filesystem. If a file, report, or decision must be durable in your own workflow, return it from the interaction or write it in a later Kitaru-owned checkpoint.
What you get
The adapter gives existing Gemini Interactions users:
one durable Kitaru checkpoint around each stable Gemini interaction response
replay-skip for completed Gemini turns inside larger Kitaru flows
a typed GeminiInteractionResult with status, interaction IDs, output text, model/agent, environment ID when reported, usage, timing, and warnings
safe step summaries from interaction.steps so your flow can respond to handoff points without storing raw provider payloads by default. Step summaries keep metadata such as type, status, and call IDs; text_preview is only filled for clearly identified final assistant/model output.
optional streaming methods for watching a long Gemini interaction while it is running, without changing the final checkpoint result Kitaru saves
a clear requires_action path for function results and human approval gates
explicit cache_identity support when the same runner/request can point at different Google projects, regions, credential aliases, or client setups
redacted request manifests, output, usage, event-log, and run-summary artifacts by default
opt-in raw provider payload capture when you need deeper debugging
a convenience Antigravity request preset used by the adapter example
Google's Interactions API is still a preview/Beta-style surface. Treat schemas, agent names, and hosted-agent behavior as more likely to change than stable Gemini text-generation APIs.
Usage and cost statistics
When save_usage=True (the default), the Gemini adapter records two related things after a stable provider response:
That canonical statistics record uses Gemini token fields such as prompt_token_count / promptTokenCount, candidates_token_count / candidatesTokenCount, cached_content_token_count / cachedContentTokenCount, and thoughts_token_count / thoughtsTokenCount when Google reports them. The token counts then roll up into execution-level LLM usage summary fields and flat statistics keys after your code observes the terminal execution with FlowHandle.wait() or FlowHandle.get().
The canonical usage record's status describes the provider call, not the next thing your application must do. So a Gemini interaction that returns GeminiInteractionResult.status == "requires_action" still writes a canonical LLM usage record with status="completed": Google returned a stable response, and that response is asking your code to send a follow-up turn.
Gemini tracking is token-only in this adapter version. Google reports token usage metadata for these calls, but Kitaru does not receive a provider-reported per-call dollar cost from Gemini Interactions and does not estimate Gemini price locally here. That means Gemini records normally have:
So if an execution includes Gemini calls, records_without_cost_count increasing is expected. It means “this provider call had no dollar-cost value attached,” not “usage tracking failed.” Token counts can still be present and aggregated.
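A small sketch of token-only aggregation may help. The record shape, the `cost_usd` key, and the `summarize_usage` helper are assumptions for illustration; only the token field names and `records_without_cost_count` come from the text above. It shows token counts rolling up even though every record lacks a dollar cost.

```python
from typing import Dict, List

# Hypothetical usage records; "cost_usd" is an assumed key for illustration.
records: List[dict] = [
    {"prompt_token_count": 120, "candidates_token_count": 40, "cost_usd": None},
    {"prompt_token_count": 300, "candidates_token_count": 75, "cost_usd": None},
]


def summarize_usage(usage_records: List[dict]) -> Dict[str, int]:
    """Sum token counts and count records that carry no dollar cost."""
    return {
        "prompt_tokens": sum(r.get("prompt_token_count", 0) for r in usage_records),
        "candidate_tokens": sum(r.get("candidates_token_count", 0) for r in usage_records),
        "records_without_cost_count": sum(
            1 for r in usage_records if r.get("cost_usd") is None
        ),
    }


summary = summarize_usage(records)
print(summary)
```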
Set save_usage=False when you do not want Gemini usage persisted. That disables both the raw usage artifact and the canonical llm_usage_v1 metadata record for that interaction, even if Gemini-specific events are still enabled.
Install
Add the Gemini extra. Include local if you want the local Kitaru server and dashboard:
Initialize the project once:
For real Google calls, authenticate one of two ways.
Most users set a Gemini API key (the AI Studio / Developer API backend):
If your organization blocks raw API keys, use Application Default Credentials (ADC) through Vertex AI instead. ADC is picked up automatically after you log in with gcloud, so you set the backend, project, and region rather than a key:
On Vertex AI the Interactions API currently serves agent interactions (agent=, such as Antigravity), not raw model interactions, and only in the global location. Use model= interactions with an API key, and agent= interactions on Vertex. Antigravity defaults to background=True because some Vertex / Chiliagon managed-agent paths require it. If a preview endpoint explicitly rejects background mode, pass background=False as a foreground escape hatch.
Migrating an existing Gemini Interactions or Antigravity managed-agent project? The zenml-io/kitaru-skills package includes /kitaru:kitaru-gemini-interactions-migration for wrapping stable interaction responses and checking polling, requires_action, function-result, and Google-owned-internals boundaries. See Agent Skills.
Minimal flow
Use GeminiInteractionRequest.start(...) for a fresh interaction. Exactly one of model= or agent= must be set.
The checkpoint name is derived from the runner name. In this example, the adapter-created checkpoint is named:
On replay, if that checkpoint is already complete and cache/replay rules allow reuse, Kitaru serves the saved GeminiInteractionResult. Gemini is not called again for that completed interaction boundary.
How a run works, step by step
When runner.run_sync(GeminiInteractionRequest.start(...)) executes inside a Kitaru flow, this is the concrete sequence:
For poll requests, step 3 is client.interactions.get(...) instead. That matters for background jobs: polling an existing interaction checks the job you already started; it does not create a duplicate remote job.
Streaming
Use streaming when you want to watch a Gemini interaction while Google is still working. The replay promise stays the same:
A foreground model streamed run looks like this:
A background managed-agent streamed run uses a slightly different provider route:
So if the flow later replays from a completed checkpoint, the saved final result may come back immediately. Gemini is not called again, and fresh live stream events should not be expected on that checkpoint cache hit. Think of the stream like the window in a train: useful while the train is moving, but the ticket Kitaru saves is still the final arrival record.
Synchronous streaming
Async streaming
Use run_stream_sync(...) from normal synchronous flow code. If you are already inside an event loop, use await runner.run_stream(...); the sync method refuses to run inside an active event loop so it does not hide an asyncio deadlock.
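The "refuse inside an active event loop" behavior is a common asyncio guard. The sketch below shows one way such a guard can be written with the standard library; `run_stream_demo` and `run_stream_sync_demo` are hypothetical stand-ins, not the adapter's methods.

```python
import asyncio


async def run_stream_demo() -> str:
    """Stand-in for an async streaming call."""
    await asyncio.sleep(0)
    return "done"


def run_stream_sync_demo() -> str:
    """Sync entry point that refuses to run inside an active event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so starting one is safe.
        return asyncio.run(run_stream_demo())
    # A loop is already running; blocking here could deadlock it.
    raise RuntimeError("active event loop detected; use the async method instead")


# Called from plain synchronous code: works.
print(run_stream_sync_demo())


async def main() -> str:
    # Called from inside a running loop: the guard refuses.
    try:
        run_stream_sync_demo()
    except RuntimeError as exc:
        return f"refused: {exc}"
    return "unexpected"


print(asyncio.run(main()))
```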
Live event constants
The streaming live-event names are exported from kitaru.adapters.gemini:
Their values are:
GEMINI_STREAM_STARTED: gemini_interactions.stream.started. Kitaru started draining a Gemini stream.
GEMINI_STREAM_EVENT: gemini_interactions.stream.event. A safe normalized stream update, such as interaction status, step start/stop, text delta metadata, tool-argument metadata, thought metadata, media metadata, provider error, or done.
GEMINI_STREAM_COMPLETED: gemini_interactions.stream.completed. The adapter finalized a stable provider result and is returning it to the surrounding checkpoint. This terminal event is flushed.
GEMINI_STREAM_FAILED: gemini_interactions.stream.failed. The provider stream, finalization, artifact capture, strict event persistence, or cancellation failed. This terminal event is flushed.
GEMINI_STREAM_EVENT_KINDS contains all four names. GEMINI_STREAM_TERMINAL_EVENT_KINDS contains the completed and failed names.
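As a sketch of how a watcher might use these names, the block below redefines the constants locally with the values listed above so it runs without Kitaru installed. In real code you would import them from kitaru.adapters.gemini. Using `frozenset` for the two collections and the `is_terminal` helper are assumptions for illustration.

```python
# Values copied from the list above; in real code, import these names from
# kitaru.adapters.gemini instead of redefining them.
GEMINI_STREAM_STARTED = "gemini_interactions.stream.started"
GEMINI_STREAM_EVENT = "gemini_interactions.stream.event"
GEMINI_STREAM_COMPLETED = "gemini_interactions.stream.completed"
GEMINI_STREAM_FAILED = "gemini_interactions.stream.failed"

# Assumption: modeled as frozensets here; the exported container type may differ.
GEMINI_STREAM_EVENT_KINDS = frozenset(
    {
        GEMINI_STREAM_STARTED,
        GEMINI_STREAM_EVENT,
        GEMINI_STREAM_COMPLETED,
        GEMINI_STREAM_FAILED,
    }
)
GEMINI_STREAM_TERMINAL_EVENT_KINDS = frozenset(
    {GEMINI_STREAM_COMPLETED, GEMINI_STREAM_FAILED}
)


def is_terminal(kind: str) -> bool:
    """Hypothetical helper: True once a stream has completed or failed."""
    return kind in GEMINI_STREAM_TERMINAL_EVENT_KINDS


for kind in sorted(GEMINI_STREAM_EVENT_KINDS):
    print(kind, "terminal" if is_terminal(kind) else "in flight")
```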
Privacy defaults
Live stream events are best-effort observability. If event publishing fails, the Gemini provider result is still allowed to finish. By default, live events avoid raw content. They can include safe metadata such as:
adapter name, runner name, public surface, and scope="interaction"
event type, event ID, interaction ID, status, and category
step index, step ID, step type, tool name, and call ID
whether usage was present
By default, live events do not include raw prompts, full assistant content, tool arguments, function results, thought-summary text, raw SDK objects, image/audio/document/video payloads, sandbox files, or secrets.
If you deliberately want short streamed model-output text deltas in live events, opt in on the capture policy:
The text deltas are clipped for display. Tool arguments and thought content still stay out of the default live payloads. This setting also changes the streaming cache identity, so Kitaru does not silently mix checkpoints created with different stream-display policies.
The runnable examples/integrations/gemini_interactions_agent/ script is more visual by design: --stream enables clipped text-delta display by default so you can confirm streaming in a terminal. Pass --hide-text-deltas there if you want the example watcher to show event labels only.
Stable result semantics
The public return value is still GeminiInteractionResult. Streaming metadata is stored under result.metadata["stream"] and includes the event count, counts by provider event type, last event ID when reported, final status, truncation flags, accumulated step count, the reconstruction policy, which observation route was used, and whether same-id fallback polling was needed. These are inspection details, not new top-level result fields.
Kitaru only completes normally for:
completed
requires_action
A streamed requires_action result is the same handoff point as a non-streamed one: save the checkpoint, run your local code or wait for a human, then send a matching GeminiInteractionRequest.function_result(...) turn.
Poll streaming
GeminiInteractionRequest.poll(...) is also streamable when the installed google-genai SDK exposes interactions.get(..., stream=True):
That streamed poll still checks the existing remote interaction. It does not create a duplicate Gemini job. Background streaming also depends on streamed get(...) when timeout_s is set, because Kitaru creates the background job without stream=True and then watches the returned id. If no timeout is set, Kitaru does not open an unbounded background stream: it creates the job, checks the same id once, and raises with a GeminiInteractionRequest.poll(...) continuation instruction if the status is still not stable. If the installed SDK does not support streamed get(...), Kitaru raises a feature-availability error when you call the stream method; importing kitaru.adapters.gemini still works.
Antigravity streaming caveats
GeminiInteractionRequest.antigravity(...) can be used with run_stream_sync(...) on the same Interactions streaming surface as other agent interactions. It is still preview-shaped and still one coarse checkpoint: Google owns the managed agent loop, sandbox, tool execution, environment reuse, and internal event schema.
The adapter's Antigravity preset now keeps store=True and defaults to background=True. With timeout_s set, Kitaru creates one background interaction, observes that same id with streaming when the backend supports it, and falls back to polling that same id if the stream is unavailable or incomplete. Without timeout_s, Kitaru creates/checks the same id once and asks you to continue with GeminiInteractionRequest.poll(...) if it is still running. Google's public Antigravity surfaces have not been consistent about background behavior during preview, so pass background=False only when a specific endpoint explicitly rejects background mode. Keep tasks non-destructive unless you intentionally want the managed agent to edit or execute things.
Requests and continuation
Start a new interaction
Use GeminiInteractionRequest.start(...) when you want a fresh Gemini turn:
Continue an interaction
Google's Interactions API can continue server-side history with previous_interaction_id. When you continue an interaction, re-specify the model or agent and any tools, system instruction, generation config, or response format that should apply to the new turn.
Set store=True when you need continuation. With store=False, do not expect Google to keep enough server-side state for a later previous_interaction_id turn.
Continuation and Kitaru replay are related but different:
Gemini interaction_id: Google's handle for continuing or polling an interaction.
Kitaru checkpoint: Kitaru's durable workflow boundary for skipping completed workflow work.
Gemini hosted environment: Google's runtime/sandbox state. Kitaru records IDs and summaries, not a filesystem snapshot.
Kitaru artifact: Data Kitaru saved around the boundary for audit/debugging.
Antigravity managed-agent runs
GeminiInteractionRequest.antigravity(...) is the convenience path for the Google Antigravity managed-agent preview. It uses the adapter's centralized Antigravity agent ID, defaults environment="remote", forces store=True, defaults background=True, and adds non-sensitive preview metadata.
This is still one coarse interaction checkpoint. Kitaru records the stable Antigravity response and capture envelope. Google still owns the hosted agent loop, sandbox, web/code/tool execution, and environment reuse.
The runnable adapter example includes Antigravity support so you can exercise that environment path explicitly:
Use Antigravity mode intentionally. It may be slower, costlier, and more preview-shaped than a simple model interaction. Set timeout_s when you want Kitaru to wait for a background Antigravity job to finish or reach requires_action. Without timeout_s, Kitaru creates/checks the background job without waiting forever; if the status is still in progress, it raises with a GeminiInteractionRequest.poll(...) continuation instruction for the same interaction id. If a preview endpoint explicitly rejects background mode, pass background=False for that endpoint only.
requires_action and function results
A Gemini interaction can return status="requires_action". That is not a final answer; it is a durable handoff point.
A typical flow looks like this:
The result's steps list summarizes the interaction steps so your flow can find a function call ID or action cue without putting raw prompt/tool payloads into top-level metadata.
Keep human-in-the-loop waits at flow scope. For example, let the flow inspect the requires_action result, call kitaru.wait() if a person must decide, then send the later function_result interaction after the wait resumes. That keeps the pause visible to Kitaru instead of hiding it inside a provider-owned turn.
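The handoff can be sketched with stand-ins. Nothing below calls Kitaru or Google: `FakeRunner`, `FakeResult`, `FakeStep`, and `lookup_weather` are hypothetical, and the fake runner's `start` and `function_result` methods only mimic the two turns described above. The point is the shape of the flow: inspect the steps for a call ID, run local code at flow scope, then send the matching result.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeStep:
    type: str
    call_id: Optional[str] = None


@dataclass
class FakeResult:
    status: str
    interaction_id: str
    output_text: Optional[str] = None
    steps: List[FakeStep] = field(default_factory=list)


class FakeRunner:
    """Stand-in runner: the first turn asks for a function result."""

    def start(self, prompt: str) -> FakeResult:
        return FakeResult(
            status="requires_action",
            interaction_id="int-1",
            steps=[FakeStep(type="function_call", call_id="call-1")],
        )

    def function_result(
        self, previous_interaction_id: str, call_id: str, result: str
    ) -> FakeResult:
        return FakeResult(
            status="completed",
            interaction_id="int-2",
            output_text=f"final answer using {result}",
        )


def lookup_weather(city: str) -> str:
    """Local tool body that your own process runs."""
    return f"sunny in {city}"


def flow(runner: FakeRunner) -> FakeResult:
    first = runner.start("What is the weather in Paris?")
    if first.status != "requires_action":
        return first
    # Find the call ID from the safe step summaries.
    call_id = next(s.call_id for s in first.steps if s.call_id is not None)
    # A human wait would also belong here, at flow scope.
    local_result = lookup_weather("Paris")
    return runner.function_result(first.interaction_id, call_id, local_result)


final = flow(FakeRunner())
print(final.status, "|", final.output_text)
```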
Polling background interactions
For background interactions, avoid accidentally starting duplicate remote jobs. If an interaction is already created and you need to check it later, use GeminiInteractionRequest.poll(interaction_id=...). Poll requests call client.interactions.get(...); they do not call client.interactions.create(...).
If a background interaction has not reached completed or requires_action, the adapter raises KitaruRuntimeError instead of saving an unfinished response as a successful checkpoint. Continue polling the same interaction_id; do not retry by creating a fresh background request unless you intentionally want another remote job.
You can also poll with runner.run_stream_sync(...) or await runner.run_stream(...) when your installed Google SDK supports streamed interactions.get(..., stream=True). That lets you watch the existing background interaction progress without creating a second remote job.
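The non-streaming polling pattern can be sketched against a fake client. `FakeInteractionsClient` and `poll_until_stable` are hypothetical and do not reflect the google-genai or Kitaru signatures; the sketch only demonstrates that checking an existing interaction uses `get` repeatedly and never calls `create`.

```python
import time
from typing import Dict, List

STABLE_STATUSES = {"completed", "requires_action"}


class FakeInteractionsClient:
    """Stand-in provider client that counts create() and get() calls."""

    def __init__(self, statuses: List[str]) -> None:
        self._statuses = list(statuses)
        self.create_calls = 0
        self.get_calls = 0

    def create(self, prompt: str) -> Dict[str, str]:
        self.create_calls += 1
        return {"id": "int-1", "status": "in_progress"}

    def get(self, interaction_id: str) -> Dict[str, str]:
        self.get_calls += 1
        # Advance through the scripted statuses, then stay on the last one.
        if len(self._statuses) > 1:
            status = self._statuses.pop(0)
        else:
            status = self._statuses[0]
        return {"id": interaction_id, "status": status}


def poll_until_stable(
    client: FakeInteractionsClient,
    interaction_id: str,
    max_polls: int = 10,
    delay_s: float = 0.0,
) -> Dict[str, str]:
    """Check the same interaction ID until it reaches a stable status."""
    for _ in range(max_polls):
        interaction = client.get(interaction_id)
        if interaction["status"] in STABLE_STATUSES:
            return interaction
        time.sleep(delay_s)
    raise TimeoutError(f"{interaction_id} not stable after {max_polls} polls")


client = FakeInteractionsClient(["in_progress", "in_progress", "completed"])
result = poll_until_stable(client, "int-1")
print(result["status"], "gets:", client.get_calls, "creates:", client.create_calls)
```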
Cache identity
The cache key uses the request, runner name, Kitaru strategy, installed google-genai SDK version, and optional cache_identity. Streaming calls also include a stream-specific cache surface plus the stream-display/reconstruction policy. Kitaru does not inspect live Google client internals such as project, region, or credentials.
That means this situation is risky without cache_identity:
From Kitaru's view, those can look identical. If they should not share replay or cache behavior, give them different identities:
Use cache_identity whenever project, region, credential alias, endpoint, or other client configuration changes the meaning of the same logical request. It must be a stable, non-secret string such as "project/region"; do not pass a live client object or anything whose repr() can change between processes.
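To see why the identity matters, consider a simplified cache-key function. This is an assumed illustration, not Kitaru's actual hashing: `cache_key` and its JSON-plus-SHA-256 scheme are hypothetical, and only the list of inputs comes from the text above.

```python
import hashlib
import json
from typing import Optional


def cache_key(
    request: dict,
    runner_name: str,
    strategy: str,
    sdk_version: str,
    cache_identity: Optional[str] = None,
) -> str:
    """Hypothetical key: hash the inputs the guide says the cache key uses."""
    payload = json.dumps(
        {
            "request": request,
            "runner_name": runner_name,
            "strategy": strategy,
            "sdk_version": sdk_version,
            "cache_identity": cache_identity,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


request = {"model": "example-model", "input": "Summarize the report"}
common = dict(runner_name="analysis", strategy="interaction", sdk_version="1.0.0")

# Same logical request sent through two differently configured clients.
no_identity_a = cache_key(request, **common)
no_identity_b = cache_key(request, **common)
prod = cache_key(request, cache_identity="prod-project/global", **common)
staging = cache_key(request, cache_identity="staging-project/global", **common)

print("without identity, keys collide:", no_identity_a == no_identity_b)
print("with identity, keys differ:", prod != staging)
```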
Result shape
KitaruGeminiInteractionsRunner.run_sync(...) returns a GeminiInteractionResult with fields such as:
status
interaction_id
previous_interaction_id
output_text when Gemini exposes text
model or agent
environment_id when Google reports one
steps, a list of GeminiInteractionStepSummary records derived primarily from interaction.steps
usage when reported by the SDK
poll_count, duration_ms, sdk_version, and non-sensitive metadata
artifact names for captured request manifest, output, usage, event log, and run summary
artifact names for raw input, raw interaction, and raw steps only when those captures are explicitly enabled
warnings for best-effort capture or SDK-shape compatibility issues
stream inspection metadata under metadata["stream"] when you used run_stream(...) or run_stream_sync(...)
The adapter treats interaction.steps as the primary response shape. If an older SDK exposes outputs instead, Kitaru can summarize those for compatibility and adds a warning. If the SDK omits output_text, Kitaru only derives fallback output from a clearly identified final assistant/model step. Prompt, tool, sandbox, or ambiguous timeline text is not merged into output_text.
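The conservative fallback rule can be illustrated with a deliberately simple sketch. The step dictionaries, the `model_output` type string, and the `final` flag are invented for this example and are not the provider's or the adapter's schema. The idea shown is only that ambiguous steps never contribute to the output text.

```python
from typing import List, Optional


def fallback_output_text(steps: List[dict]) -> Optional[str]:
    """Derive output only from a clearly identified final model step.

    Assumption: steps are plain dicts with hypothetical "type", "final",
    and "text" keys. Anything else is treated as ambiguous and ignored.
    """
    if not steps:
        return None
    last = steps[-1]
    if last.get("type") == "model_output" and last.get("final") is True:
        return last.get("text")
    return None


clear = [
    {"type": "tool_call", "text": "search(query)"},
    {"type": "model_output", "final": True, "text": "Here is the summary."},
]
ambiguous = [
    {"type": "model_output", "final": True, "text": "Draft."},
    {"type": "tool_result", "text": "raw sandbox log"},
]

print(fallback_output_text(clear))
print(fallback_output_text(ambiguous))
```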
Failed Gemini interactions raise an exception instead of returning a successful GeminiInteractionResult. If the failure happens inside a Kitaru checkpoint, the adapter records best-effort failure metadata before the exception propagates.
Capture policy
By default, Kitaru saves the boundary data that is useful for replay inspection and audits:
redacted request manifest
output text
usage when reported
event log
run summary
safe step summaries on the returned result. Their text_preview field is disabled for prompt, tool, sandbox, and ambiguous timeline content by default.
It does not save raw prompts, raw interaction payloads, or raw step payloads unless you opt in. It also does not save raw stream transcripts; streaming live events are not saved as a durable transcript by this adapter version.
Turn on raw capture only when you need it for debugging or audit review:
Treat raw provider payloads as conversation data. They may contain prompts, retrieved snippets, tool arguments, model output, or other sensitive material. The redacted request manifest removes common secret-like keys such as API keys, authorization headers, tokens, credentials, cookies, and passwords, but raw payload artifacts are not a secret store.
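Key-based redaction of a manifest can be sketched as follows. This is not the adapter's redactor: `redact`, `SECRET_KEYS`, and the `[REDACTED]` placeholder are hypothetical, and the key list is a small sample of the categories named above. The sketch also shows the limit of the approach, since a secret pasted into a prompt value is untouched.

```python
import re
from typing import Any

# Hypothetical sample of normalized secret-like key names.
SECRET_KEYS = frozenset(
    {
        "apikey",
        "authorization",
        "token",
        "accesstoken",
        "credentials",
        "cookie",
        "password",
    }
)


def _normalize(key: str) -> str:
    """Lowercase and drop separators so api_key and API-Key both match."""
    return re.sub(r"[^a-z0-9]", "", key.lower())


def redact(value: Any) -> Any:
    """Recursively replace values stored under secret-like keys."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if _normalize(str(k)) in SECRET_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


manifest = {
    "model": "example-model",
    "api_key": "abc123",
    "headers": {"Authorization": "Bearer xyz"},
    "config": {"max_output_tokens": 100},
    "input": "my password is hunter2",  # free text is NOT redacted by key matching
}

safe = redact(manifest)
print(safe)
```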
Capture failures are non-fatal by default because Google may already have succeeded by the time Kitaru tries to save artifacts. Retrying after a strict capture failure can duplicate provider-side work, so only enable fail_on_artifact_capture_error=True or fail_on_event_persistence_error=True when you understand that trade-off.
Checkpoint strategy
The Gemini Interactions adapter currently supports one public strategy:
"interaction" is also the default. It means one stable Gemini interaction response becomes one Kitaru checkpoint.
Strategies such as per-step, per-tool, web-call, code-execution, hosted-MCP, or Antigravity-internal checkpointing are not exposed because Kitaru does not own those call bodies. Google runs them inside its hosted interaction runtime. The adapter would be lying if it claimed it could replay those internals independently.
A future client-tool mode may be possible for tools that your Python process executes itself: Gemini reaches requires_action, Kitaru runs the local tool body, and a later interaction sends the matching function result. That would only cover the local tool body Kitaru actually runs. It would still not make Google-owned managed-agent internals replayable.
Recommended durability pattern
Put the Gemini runner call directly in the flow body so the adapter can create its own checkpoint around the interaction. Put side effects that must be durable in separate Kitaru checkpoints after Gemini returns.
That gives you a concrete sequence:
If a later checkpoint fails, Kitaru can reuse the completed Gemini interaction result instead of calling Google again.
Constraints and gotchas
Stable statuses only. The adapter only saves successful checkpoints for completed and requires_action. Background work that is still running should be polled again by interaction ID.
Coarse durability. One Gemini interaction response is the replay unit. Google-owned internal steps, hosted tools, hosted MCP work, web/code execution, and Antigravity sandbox mutations are not separate Kitaru checkpoints.
requires_action is a handoff point. Use it to move work back into your flow: run local code, call kitaru.wait() for a human, then send a matching function_result interaction.
Use cache_identity for cross-client disambiguation. If the same runner name and request can point at different projects, regions, credential aliases, endpoints, or client setups, set an explicit identity.
Raw provider capture is off by default. Safe summaries and redacted manifests are captured by default; raw prompts/interactions/steps require an explicit opt-in. Streaming text deltas are also hidden by default and must be enabled with include_stream_text_deltas=True.
Streaming is observability, not replay state. Stream events help you watch a live provider call. On a checkpoint cache hit, Kitaru may return the saved final result without replaying the live stream.
Antigravity environment support is adapter-example level. The adapter gives you a preset for the Google-managed Antigravity environment path, but Google still owns the remote environment lifecycle and behavior.
Runnable example
Run the educational integration example:
To inspect the example without making a Gemini API call:
The example includes:
--help for smoke tests
--dry-run for a no-network preview
--stream to use run_stream_sync(...) for real calls and preview stream metadata in dry runs
--mode model using gemini-3.5-flash
--mode antigravity as an explicit slower/costlier managed-agent demo
Troubleshooting
“Why did replay not resume inside Antigravity's internal work?”
Because Kitaru's boundary is the stable Gemini interaction response. Antigravity internal planning, sandbox files, hosted tools, and environment reuse happen inside Google's runtime. Kitaru can reuse the saved result of a completed interaction; it cannot restore an arbitrary midpoint inside Google's hosted agent loop.
“Why did a background interaction raise instead of saving a checkpoint?”
It had not reached completed or requires_action yet. Poll the same interaction_id again with GeminiInteractionRequest.poll(...). Do not create a fresh background interaction unless you deliberately want another Google job.
“Why do I need cache_identity?”
Kitaru can see the request and runner name, but it does not inspect live Google client internals. If two clients use the same logical request but point at different projects, regions, credentials, or endpoints, cache_identity tells Kitaru those are different replay/cache worlds.
“Where are the raw Gemini payloads?”
They are off by default. Enable save_input=True, save_raw_interaction=True, or save_steps=True in GeminiInteractionCapturePolicy when you deliberately want raw provider artifacts.
“Can I checkpoint every Gemini internal step?”
Not in this adapter version. Kitaru only checkpoints work it can replay honestly. Gemini-hosted internal steps are observations from Kitaru's point of view, not Python call bodies Kitaru can rerun independently.