← back to blog
Yohei Nakajima

Active Graph v1.1.0: Better Runtime Resilience, Better Operator Tools

v1.1.0 tightens the surfaces that matter once an agentic system leaves the demo stage: bounded LLM retries with audit, CLI inspect tools, fork-time pack overrides, OpenTelemetry metrics, OpenAI tool-call parity, and stricter docs and release gates.


Active Graph v1.1.0 is now live on PyPI.

This release is focused on readiness: making long-running, event-sourced agent systems easier to operate, inspect, replay, fork, and integrate into production telemetry stacks. It is not a rewrite. It is a tightening pass across the surfaces that matter once an agentic system leaves the demo stage: failures, forks, traces, metrics, provider parity, and release discipline.

TL;DR

  • Bounded LLM retries for transient failures, surfaced as llm.responded events in the log so retry behavior stays auditable and replay-safe.
  • New CLI inspection: activegraph inspect <store> --memo and --search for debugging real runs.
  • Forks can override pack settings at fork time, recorded as pack.settings_overridden events in the audit trail.
  • OpenTelemetryMetrics exporter for OTLP-native stacks, alongside the existing PrometheusMetrics.
  • OpenAI provider now sits behind the same shared LLM/tool loop as Anthropic.
  • Tighter release discipline: drift gates for CLI flags, executable doc snippets, reason-code docs, and tagged-release version correspondence.

Install or upgrade:

code
pip install --upgrade activegraph

Runtime Retries Are Now First-Class

LLM provider failures are messy in real systems. Networks hiccup. Providers rate-limit. A transient outage should not look the same as a legitimate empty extraction.

In v1.1.0, Runtime now performs bounded retries for transient LLM failures before emitting the terminal behavior.failed path. The retry attempts are not hidden. Failed attempts appear in the event log as error-shaped llm.responded events, and successful provenance points at the attempt that actually produced the object.

That means retry behavior remains auditable, replay-safe, and visible in traces.

The replay cache also got stricter in the right way: failed transient llm.responded attempts are ignored when rebuilding the LLM cache, so a provider outage cannot poison future replay.

Better CLI Inspection

The CLI picked up two small but important operator tools:

code
activegraph inspect <store> --memo
activegraph inspect <store> --search "customer concentration"

inspect --memo renders memo objects in the same human-readable format used by activegraph quickstart.

inspect --search searches event ids, event types, actors, and payload JSON. This is the sort of command you want when you are debugging a real run and vaguely remember the thing you are looking for, but not where it happened.

Forks Can Now Override Pack Settings

Forking is one of Active Graph's core ideas: preserve the parent run, branch from a point in the event log, and ask "what would have happened if..."

v1.1.0 adds:

code
activegraph fork <store> \
  --run-id <parent> \
  --at-event <event-id> \
  --set diligence.confidence_threshold_for_review=0.9

Overrides are recorded as pack.settings_overridden events. They are not side-channel configuration. They are part of the audit trail.

Pack loading applies and validates those overrides before post-fork execution resumes, which keeps the fork both reproducible and inspectable.

OpenTelemetry Metrics

Active Graph already shipped PrometheusMetrics. v1.1.0 adds OpenTelemetryMetrics for teams using OTLP-native stacks.

Install with:

code
pip install "activegraph[opentelemetry]"

Then:

code
from activegraph.observability import OpenTelemetryMetrics

rt = Runtime(graph, metrics=OpenTelemetryMetrics())

Counters and histograms map to native OpenTelemetry instruments. Synchronous gauges are represented with UpDownCounter deltas so the runtime's simple Metrics protocol stays intact.

OpenAI Tool-Call Parity

The OpenAI provider now supports the shared LLM/tool loop.

OpenAIProvider translates framework tool definitions into OpenAI Chat Completions function tools, extracts returned tool_calls into Active Graph's shared ToolCall shape, and echoes assistant/tool messages back through the conversation history correctly.

In practice, this means Anthropic and OpenAI now sit behind the same runtime tool contract for Active Graph behaviors.

Docs And Drift Gates

v1.1.0 also tightens the project's documentation and release machinery.

The release adds checks for:

  • CLI reference flags matching implemented CLI behavior
  • executable Python snippets in docs
  • reason-code documentation
  • tagged-release version correspondence

Docs now consistently describe direct object patches as patch.applied, not the previously documented but non-emitted object.patched event. The failure model now explicitly explains why strict replay divergence and dispatch-time contract failures raise exceptions instead of becoming behavior.failed.

There is also a new reason-code reference page for the stable reason= vocabulary used in failure events and tool responses.

Migration Notes

No store schema migration is required.

If you treat framework event types as a closed list, allow the new event type:

code
pack.settings_overridden

The new OpenTelemetry backend is optional. Existing Prometheus and custom Metrics implementations continue to work unchanged.

Why This Release Matters

The theme of v1.1.0 is operational confidence.

Agent systems need more than clever behavior dispatch. They need durable traces, explainable failures, safe forks, observable runtime health, and provider abstractions that do not collapse as soon as tools enter the loop.

Active Graph v1.1.0 moves that surface forward: fewer hidden retries, better inspection, auditable configuration experiments, OpenTelemetry support, OpenAI tool parity, and tighter release gates.

PyPI: https://pypi.org/project/activegraph/1.1.0/

GitHub release: https://github.com/yoheinakajima/activegraph/releases/tag/v1.1.0/


← back to blog