Evidence Compilation Before Semantic Memory: ActiveGraph on LongMemEval-S
Technical benchmark note: 85.6% QA accuracy and 86.2% turn answer-in-context at 2,462 mean context tokens, with deterministic non-generative ingestion.
- research
- benchmark
Announcements, deep-dives, and research from the team building activegraph. RSS at /blog/rss.xml.
An append-only event log is the source of truth, and the working graph is a deterministic projection of it. Behaviors react to events and emit new ones, which yields deterministic replay, cheap forking, and end-to-end lineage. Discussion: https://x.com/yoheinakajima/status/2057812713045377055
v1.0 is out: a Python runtime where long-running agents share a reactive, event-sourced graph as their world — fork, replay, and audit included.
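To make the event-sourced design concrete, here is a minimal Python sketch of the general pattern: an append-only log, a graph rebuilt purely from that log, one behavior that reacts and emits, and a fork taken from a log prefix. It is not the activegraph API. `Event`, `EventLog`, `project`, `dispatch`, and `link_to_root` are hypothetical names invented for illustration, and the two event kinds are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Event:
    kind: str                    # hypothetical kinds: "add_node", "add_edge"
    payload: tuple
    cause: Optional[int] = None  # log index of the triggering event (lineage)


class EventLog:
    """Append-only list of events; the only source of truth in this sketch."""

    def __init__(self, events: Iterable[Event] = ()):
        self._events = list(events)

    def append(self, event: Event) -> int:
        self._events.append(event)
        return len(self._events) - 1  # index of the event just written

    def fork(self, upto: Optional[int] = None) -> "EventLog":
        # A fork is a copy of a log prefix; the original log is untouched.
        return EventLog(self._events[:upto])

    def __iter__(self):
        return iter(self._events)

    def __len__(self):
        return len(self._events)


def project(log: EventLog) -> dict:
    """Rebuild the graph from the log alone, so replay gives the same result."""
    nodes, edges = {}, set()
    for ev in log:
        if ev.kind == "add_node":
            node_id, label = ev.payload
            nodes[node_id] = label
        elif ev.kind == "add_edge":
            edges.add(ev.payload)
    return {"nodes": nodes, "edges": sorted(edges)}


Behavior = Callable[[Event], list]


def link_to_root(event: Event) -> list:
    # Hypothetical behavior: connect every new non-root node to "root".
    if event.kind == "add_node" and event.payload[0] != "root":
        return [Event("add_edge", ("root", event.payload[0]))]
    return []


def dispatch(log: EventLog, event: Event, behaviors: list) -> None:
    """Append an event, then append whatever the behaviors emit in response."""
    index = log.append(event)
    for behavior in behaviors:
        for emitted in behavior(event):
            caused = Event(emitted.kind, emitted.payload, cause=index)
            dispatch(log, caused, behaviors)


log = EventLog()
dispatch(log, Event("add_node", ("root", "Root")), [link_to_root])
dispatch(log, Event("add_node", ("a", "A")), [link_to_root])

branch = log.fork(upto=1)  # keep only the first event, then diverge
dispatch(branch, Event("add_node", ("b", "B")), [link_to_root])
```

Because `project` reads nothing but the log, replaying the same events always produces the same graph. The `cause` field on emitted events is one simple way to trace an edge back to the event that triggered it.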