StreamRAG v2: Native Parser Daemons and Three-Tier Extraction for Real-Time Code Graphs
DOI:
https://doi.org/10.65138/ijtrp.2026.v2i2.15
Abstract
We present StreamRAG v2, a major evolution of the StreamRAG real-time incremental code graph system for AI-driven code editors. Building on the v1 foundations (LiquidGraph, DeltaGraphBridge, and ShadowAST), v2 introduces five architectural advances: (1) Native Parser Daemons, persistent language-specific processes (Node.js, Rust, JVM) that communicate via length-prefixed msgpack over Unix domain sockets and provide full AST parsing for 7 non-Python languages; (2) a Three-Tier Extraction Hierarchy in which each language is served by the best available parser (native daemon, tree-sitter AST, or regex) with automatic fallback; (3) a V2 Subsystem comprising adaptive debouncing, context stabilization, fine-grained invertible graph operations, semantic path addressing, time-based file watching, and multi-session daemon support; (4) three Intelligence Subsystems: native parser caching via a SHA256-keyed LRU, semantic similarity scoring with five weighted signals, and LLM-guided resolution for ambiguous edges; and (5) Accuracy Subsystems including cross-language convention resolvers, confidence-filtered cycle detection, structured direct/transitive impact separation, and workspace-aware dead code analysis. StreamRAG v2 processes each incremental change in under 0.05 ms, passes all 1,020 test scenarios across 62 test files, and spans more than 20,000 lines of production code across Python, TypeScript, Rust, and Java. In real-world benchmarks, v2 reduces token consumption by 47% and turns by 19% compared to unaugmented Claude, with a 96% token reduction on complex architecture traces. Compared to v1, v2 improves accuracy by 20 percentage points (92% vs. 72%), runs 2.3× faster, and uses 53% fewer tokens.
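The tier fallback and the SHA256-keyed LRU cache named in the abstract can be illustrated with a minimal Python sketch. It is not StreamRAG's implementation: every name here (`ExtractionCache`, `extract`, `TIERS`, and the three extractor functions) is hypothetical, the first two tiers are stubs that always decline, and caching the result of whichever tier answers (rather than only native parser output) is an assumption made to keep the example short.

```python
import hashlib
import re
from collections import OrderedDict
from typing import Callable, List, Optional, Tuple

# Hypothetical extractor signature: source text -> list of symbol names,
# or None when this tier cannot handle the input.
Extractor = Callable[[str], Optional[List[str]]]
Result = Tuple[str, List[str]]


def native_daemon_extract(source: str) -> Optional[List[str]]:
    """Stand-in for tier 1; this stub behaves as if no daemon is running."""
    return None


def tree_sitter_extract(source: str) -> Optional[List[str]]:
    """Stand-in for tier 2; this stub behaves as if no grammar is installed."""
    return None


def regex_extract(source: str) -> Optional[List[str]]:
    """Tier 3: crude regex over `function name(` declarations."""
    return re.findall(r"\bfunction\s+([A-Za-z_]\w*)\s*\(", source)


# Tiers are tried in order, best parser first.
TIERS: List[Tuple[str, Extractor]] = [
    ("native", native_daemon_extract),
    ("tree-sitter", tree_sitter_extract),
    ("regex", regex_extract),
]


class ExtractionCache:
    """LRU cache keyed by the SHA-256 digest of the source text."""

    def __init__(self, capacity: int = 128) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[str, Result]" = OrderedDict()

    @staticmethod
    def key(source: str) -> str:
        return hashlib.sha256(source.encode("utf-8")).hexdigest()

    def get(self, source: str) -> Optional[Result]:
        k = self.key(source)
        if k not in self._entries:
            return None
        self._entries.move_to_end(k)  # mark as most recently used
        return self._entries[k]

    def put(self, source: str, value: Result) -> None:
        k = self.key(source)
        self._entries[k] = value
        self._entries.move_to_end(k)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


def extract(
    source: str,
    cache: ExtractionCache,
    tiers: List[Tuple[str, Extractor]] = TIERS,
) -> Result:
    """Return (tier_name, symbols) from the first tier that answers."""
    cached = cache.get(source)
    if cached is not None:
        return cached
    for name, extractor in tiers:
        try:
            symbols = extractor(source)
        except Exception:
            symbols = None  # treat a failing tier as unavailable
        if symbols is not None:
            result = (name, symbols)
            cache.put(source, result)
            return result
    return ("none", [])
```

With the stubs above, `extract("function foo() {}", ExtractionCache())` falls through to the regex tier. Keying on the content hash rather than the file path means an edit that is later reverted would hit the cache again, which is one plausible motivation for the design the abstract describes.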
License
Copyright (c) 2026 Krrish Choudhary (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.