Active Graph v1.1.0: Better Runtime Resilience, Better Operator Tools
v1.1.0 tightens the surfaces that matter once an agentic system leaves the demo stage: bounded LLM retries with audit, CLI inspect tools, fork-time pack overrides, OpenTelemetry metrics, OpenAI tool-call parity, and stricter docs and release gates.
- release
- activegraph
- runtime
- observability
- announcement
Active Graph v1.1.0 is now live on PyPI.
This release is focused on readiness: making long-running, event-sourced agent systems easier to operate, inspect, replay, fork, and integrate into production telemetry stacks. It is not a rewrite. It is a tightening pass across the surfaces that matter once an agentic system leaves the demo stage: failures, forks, traces, metrics, provider parity, and release discipline.
TL;DR
- Bounded LLM retries for transient failures, surfaced as llm.responded events in the log so retry behavior stays auditable and replay-safe.
- New CLI inspection: activegraph inspect <store> --memo and --search for debugging real runs.
- Forks can override pack settings at fork time, recorded as pack.settings_overridden events in the audit trail.
- OpenTelemetryMetrics exporter for OTLP-native stacks, alongside the existing PrometheusMetrics.
- OpenAI provider now sits behind the same shared LLM/tool loop as Anthropic.
- Tighter release discipline: drift gates for CLI flags, executable doc snippets, reason-code docs, and tagged-release version correspondence.
Install or upgrade:
pip install --upgrade activegraph
Runtime Retries Are Now First-Class
LLM provider failures are messy in real systems. Networks hiccup. Providers rate-limit. A transient outage should not look the same as a legitimate empty extraction.
In v1.1.0, Runtime performs bounded retries for transient LLM failures before falling back to the terminal behavior.failed path. The retry attempts are not hidden. Failed attempts appear in the event log as error-shaped llm.responded events, and on success the provenance points at the attempt that actually produced the object.
That means retry behavior remains auditable, replay-safe, and visible in traces.
The replay cache also got stricter in the right way: failed transient llm.responded attempts are ignored when rebuilding the LLM cache, so a provider outage cannot poison future replay.
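The two rules above (log every attempt, and skip failed attempts when rebuilding the cache) can be sketched in a few lines. This is an illustration of the idea, not Active Graph's implementation: `EventLog`, `TransientLLMError`, `call_with_bounded_retries`, `rebuild_llm_cache`, the event payload fields, and the `llm_retries_exhausted` reason string are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


class TransientLLMError(Exception):
    """Hypothetical marker for retryable failures (timeouts, rate limits)."""


@dataclass
class EventLog:
    """Toy append-only log; the real store and event shape will differ."""
    events: list[dict] = field(default_factory=list)

    def append(self, type_: str, **payload) -> dict:
        event = {"id": f"evt-{len(self.events) + 1}", "type": type_, **payload}
        self.events.append(event)
        return event


def call_with_bounded_retries(log: EventLog, prompt: str,
                              call: Callable[[str], str], max_attempts: int = 3):
    """Try the call up to max_attempts times, logging every attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            text = call(prompt)
        except TransientLLMError as exc:
            # Failed attempts are recorded as error-shaped events, not hidden.
            log.append("llm.responded", prompt=prompt, attempt=attempt, error=str(exc))
            continue
        # The returned event is the one provenance should point at.
        return log.append("llm.responded", prompt=prompt, attempt=attempt, text=text)
    # Retries exhausted: fall through to the terminal failure path.
    log.append("behavior.failed", reason="llm_retries_exhausted")
    return None


def rebuild_llm_cache(log: EventLog) -> dict[str, str]:
    """Rebuild a prompt -> response cache, skipping error-shaped attempts."""
    return {
        e["prompt"]: e["text"]
        for e in log.events
        if e["type"] == "llm.responded" and "error" not in e
    }
```

Because the cache is rebuilt only from successful attempts, replaying a run that hit a rate limit still resolves the prompt to the response that actually produced the object.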
Better CLI Inspection
The CLI picked up two small but important operator tools:
activegraph inspect <store> --memo
activegraph inspect <store> --search "customer concentration"
inspect --memo renders memo objects in the same human-readable format used by activegraph quickstart.
inspect --search searches event ids, event types, actors, and payload JSON. This is the sort of command you want when you are debugging a real run and vaguely remember the thing you are looking for, but not where it happened.
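To make the search scope concrete, here is a rough sketch of matching a query against those four fields. It is not the CLI's actual code: the `search_events` function, the dictionary event shape, and the choice of case-insensitive substring matching are assumptions made for illustration.

```python
import json


def search_events(events: list[dict], query: str) -> list[dict]:
    """Return events whose id, type, actor, or JSON payload contains the query."""
    needle = query.lower()  # case-insensitive matching is an assumption
    hits = []
    for event in events:
        haystack = [
            str(event.get("id", "")),
            str(event.get("type", "")),
            str(event.get("actor", "")),
            # Serialize the payload so nested values are searchable as text.
            json.dumps(event.get("payload", {}), sort_keys=True),
        ]
        if any(needle in text.lower() for text in haystack):
            hits.append(event)
    return hits
```

Searching the serialized payload is what lets a half-remembered phrase such as "customer concentration" find the event even when you do not know which field it lives in.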
Forks Can Now Override Pack Settings
Forking is one of Active Graph's core ideas: preserve the parent run, branch from a point in the event log, and ask "what would have happened if..."
v1.1.0 adds:
activegraph fork <store> \
  --run-id <parent> \
  --at-event <event-id> \
  --set diligence.confidence_threshold_for_review=0.9
Overrides are recorded as pack.settings_overridden events. They are not side-channel configuration. They are part of the audit trail.
Pack loading applies and validates those overrides before post-fork execution resumes, which keeps the fork both reproducible and inspectable.
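As a sketch of what "apply and validate" could look like, the snippet below parses one `--set` argument, checks it against a settings schema, and builds the audit event. It assumes the argument reads as `<pack>.<setting>=<value>`; the `parse_override` and `override_event` helpers, the schema format, and the event payload layout are hypothetical and not taken from the Active Graph codebase.

```python
import json


def parse_override(arg: str) -> tuple[str, str, object]:
    """Split '<pack>.<setting>=<value>' into (pack, setting, parsed value)."""
    dotted, sep, raw = arg.partition("=")
    pack, dot, setting = dotted.partition(".")
    if not sep or not dot or not pack or not setting:
        raise ValueError(f"expected <pack>.<setting>=<value>, got {arg!r}")
    try:
        value = json.loads(raw)  # numbers, booleans, quoted strings
    except json.JSONDecodeError:
        value = raw  # anything else is kept as a plain string
    return pack, setting, value


def override_event(arg: str, schema: dict[str, dict[str, type]]) -> dict:
    """Validate one override against a settings schema and build the audit event."""
    pack, setting, value = parse_override(arg)
    expected = schema.get(pack, {}).get(setting)
    if expected is None:
        raise KeyError(f"unknown setting {pack}.{setting}")
    if not isinstance(value, expected):
        raise TypeError(f"{pack}.{setting} expects {expected.__name__}")
    # Hypothetical payload layout; only the event type comes from the release notes.
    return {
        "type": "pack.settings_overridden",
        "payload": {"pack": pack, "overrides": {setting: value}},
    }
```

Rejecting unknown or mistyped settings before the event is written means the log never contains an override that the pack could not have loaded.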
OpenTelemetry Metrics
Active Graph already shipped PrometheusMetrics. v1.1.0 adds OpenTelemetryMetrics for teams using OTLP-native stacks.
Install with:
pip install "activegraph[opentelemetry]"
Then:
from activegraph.observability import OpenTelemetryMetrics

# Runtime and graph come from your existing application setup.
rt = Runtime(graph, metrics=OpenTelemetryMetrics())
Counters and histograms map to native OpenTelemetry instruments. Synchronous gauges are represented with UpDownCounter deltas so the runtime's simple Metrics protocol stays intact.
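The gauge-as-deltas trick is easy to picture with a small sketch. The `DeltaGauge` wrapper below is a hypothetical illustration of the technique, not the exporter's code, and `RecordingCounter` is a stand-in so the example runs without the OpenTelemetry SDK. The only thing it relies on from OpenTelemetry is that an UpDownCounter exposes an `add(amount)` method.

```python
class DeltaGauge:
    """Emulate gauge.set(value) on top of an add-only up/down counter."""

    def __init__(self, counter):
        # `counter` needs only an add(amount) method, like an UpDownCounter.
        self._counter = counter
        self._last = 0.0

    def set(self, value: float) -> None:
        # Report the change since the previous value rather than the value itself.
        delta = value - self._last
        if delta:
            self._counter.add(delta)
        self._last = value


class RecordingCounter:
    """Stand-in for a real UpDownCounter that remembers what it was given."""

    def __init__(self):
        self.total = 0.0
        self.deltas = []

    def add(self, amount: float) -> None:
        self.deltas.append(amount)
        self.total += amount
```

With the real API, the counter would come from something like `opentelemetry.metrics.get_meter(__name__).create_up_down_counter(...)` with an instrument name of your choosing. This sketch also ignores attributes; a version that supports labels would need to track the last value per attribute set.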
OpenAI Tool-Call Parity
The OpenAI provider now supports the shared LLM/tool loop.
OpenAIProvider translates framework tool definitions into OpenAI Chat Completions function tools, extracts returned tool_calls into Active Graph's shared ToolCall shape, and echoes assistant/tool messages back through the conversation history correctly.
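For readers unfamiliar with the wire format, the three translation steps look roughly like this. The dictionary layouts follow the OpenAI Chat Completions tool-calling format; the `ToolCall` dataclass and the helper function names are hypothetical stand-ins and may not match Active Graph's actual types.

```python
import json
from dataclasses import dataclass


@dataclass
class ToolCall:
    """Hypothetical stand-in for a provider-neutral tool call."""
    id: str
    name: str
    arguments: dict


def to_openai_tool(name: str, description: str, parameters: dict) -> dict:
    """Wrap a JSON Schema tool definition as a Chat Completions function tool."""
    return {
        "type": "function",
        "function": {"name": name, "description": description, "parameters": parameters},
    }


def extract_tool_calls(message: dict) -> list[ToolCall]:
    """Convert an assistant message's tool_calls into ToolCall objects."""
    calls = []
    for raw in message.get("tool_calls") or []:
        fn = raw["function"]
        # OpenAI returns the arguments as a JSON-encoded string.
        calls.append(ToolCall(raw["id"], fn["name"], json.loads(fn["arguments"] or "{}")))
    return calls


def tool_result_message(call: ToolCall, result: object) -> dict:
    """Build the role='tool' message that answers one tool call."""
    return {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}
```

When echoing history back, the assistant message that contained `tool_calls` has to precede the matching `role="tool"` messages, each carrying the same `tool_call_id`.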
In practice, this means Anthropic and OpenAI now sit behind the same runtime tool contract for Active Graph behaviors.
Docs And Drift Gates
v1.1.0 also tightens the project's documentation and release machinery.
The release adds checks for:
- CLI reference flags matching implemented CLI behavior
- executable Python snippets in docs
- reason-code documentation
- tagged-release version correspondence
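As one example of what such a gate can look like, the last check in the list reduces to comparing a tag against the packaged version. This is a minimal sketch of the idea, assuming tags of the form v1.1.0; the `check_tag_matches_version` function is hypothetical and is not the project's actual CI script.

```python
import re


def check_tag_matches_version(tag: str, package_version: str) -> None:
    """Raise if a release tag like 'v1.1.0' does not match the package version."""
    match = re.fullmatch(r"v(\d+\.\d+\.\d+)", tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    if match.group(1) != package_version:
        raise ValueError(f"tag {tag} does not match package version {package_version}")
```

In CI, a gate like this would typically run on tag pushes and fail the release job before anything is published.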
Docs now consistently describe direct object patches as patch.applied, not the previously documented but non-emitted object.patched event. The failure model now explicitly explains why strict replay divergence and dispatch-time contract failures raise exceptions instead of becoming behavior.failed.
There is also a new reason-code reference page for the stable reason= vocabulary used in failure events and tool responses.
Migration Notes
No store schema migration is required.
If you treat framework event types as a closed list, allow the new event type:
pack.settings_overridden
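If your consumer validates event types against an allow-list, the change is a one-line addition. The sketch below is illustrative only: `KNOWN_EVENT_TYPES` and `check_event_type` are hypothetical names, and the set contains just the event types mentioned in this post rather than the framework's full vocabulary.

```python
# Illustrative allow-list; a real consumer would list every event type it handles.
KNOWN_EVENT_TYPES = frozenset({
    "llm.responded",
    "behavior.failed",
    "patch.applied",
    "pack.settings_overridden",  # new in v1.1.0
})


def check_event_type(event_type: str) -> str:
    """Return the event type unchanged, or raise if it is not on the allow-list."""
    if event_type not in KNOWN_EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type!r}")
    return event_type
```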
The new OpenTelemetry backend is optional. Existing Prometheus and custom Metrics implementations continue to work unchanged.
Why This Release Matters
The theme of v1.1.0 is operational confidence.
Agent systems need more than clever behavior dispatch. They need durable traces, explainable failures, safe forks, observable runtime health, and provider abstractions that do not collapse as soon as tools enter the loop.
Active Graph v1.1.0 moves that surface forward: auditable retries, better inspection, auditable configuration experiments, OpenTelemetry support, OpenAI tool parity, and tighter release gates.
PyPI: https://pypi.org/project/activegraph/1.1.0/
GitHub release: https://github.com/yoheinakajima/activegraph/releases/tag/v1.1.0/