Unifying 46 Repositories Into One Knowledge Graph
- Ryosuke Tsuji built 'code-graph' using static analysis to map dependencies across 46 repositories.
- The system combines tree-sitter, the TypeScript Compiler API, and Gemini to achieve over 99% accuracy in mapping boundary connections.
- A daily cron job verifies API, Event, and DB boundary connections to prevent hallucinations during AI-driven impact analysis.
Ryosuke Tsuji, CTO at airCloset, developed 'code-graph', a knowledge graph constructed via static analysis to manage dependencies across 46 disparate repositories. Built between January and March 2026, the tool serves as a verified data source for AI models, enabling accurate blast-radius assessment and impact analysis for production systems that mix multiple frameworks and libraries, including jQuery, AngularJS, Express, NestJS, TypeORM, Redux, and Axios.
The project addresses two common limitations of using LLMs directly on large-scale codebases: context window constraints and the tendency for models to hallucinate when inferring cross-repository connections. By converting code into a graph, the system provides AI with verified facts about dependencies such as API calls, database reads and writes, and event subscriptions. Tsuji emphasizes that while tree-sitter (a library for parsing source code into syntax trees) handles standard structural connections, it lacks the type resolution necessary for accurate analysis. To compensate, the stack integrates the TypeScript Compiler API for variable resolution and Gemini for dynamic field-access inference.
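To make the idea of impact analysis over verified edges concrete, here is a minimal Python sketch, not the author's implementation. It assumes dependencies are stored as (source, edge type, target) triples and walks them in reverse to find everything that depends on a changed node. The node names, the `Edge` class, and the `blast_radius` function are hypothetical; only the edge type names CALLS_API, EMITS_TO, and WRITES_TO come from the article.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str  # node that depends on the target
    kind: str    # edge type, e.g. CALLS_API, EMITS_TO, WRITES_TO
    target: str  # node being called, emitted to, or written to


def blast_radius(edges, changed, max_hops=3):
    """Map each node that transitively depends on `changed` to its hop distance."""
    # Reverse adjacency: for each target, list the sources that point at it.
    dependents = {}
    for edge in edges:
        dependents.setdefault(edge.target, []).append(edge.source)

    distance = {changed: 0}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for source in dependents.get(node, []):
            if source not in distance:
                distance[source] = distance[node] + 1
                queue.append(source)

    del distance[changed]  # the changed node itself is not part of the result
    return distance


# Hypothetical edges; real ones would come from static analysis.
EDGES = [
    Edge("web/checkout.ts", "CALLS_API", "POST /orders"),
    Edge("batch/reorder.ts", "CALLS_API", "POST /orders"),
    Edge("POST /orders", "WRITES_TO", "db.orders"),
    Edge("POST /orders", "EMITS_TO", "order.created"),
]

print(blast_radius(EDGES, "db.orders"))
```

In this toy graph, a change to the hypothetical `db.orders` table reaches the `POST /orders` handler at one hop and both of its callers at two hops. Because the traversal only follows recorded edges, an AI consumer would be reading facts out of the graph rather than guessing which repositories are connected.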
Extraction of boundary nodes proved the most complex phase due to the variety of implementation patterns across legacy and modern services. The system uses 21 edge types to categorize connections, such as CALLS_API, EMITS_TO, and WRITES_TO. To ensure reliability, the development process focused on keeping connection rates above 99%, because a missed connection at any hop breaks the path, so errors compound during multi-hop graph traversal. The resulting graph is maintained by a daily boundary-analysis cron job that runs at 7:00 JST, reconciling callers with handlers and detecting connection drift. This structure ensures that when AI performs impact analysis, it operates on verified nodes rather than speculative inference, preventing the accidental omissions that frequently cause production incidents.
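The two ideas in this paragraph, reconciling callers with handlers and the compounding of errors over multiple hops, can be illustrated with a short Python sketch. It is a simplification under stated assumptions: boundary connections are reduced to string keys such as an HTTP method and path, and per-hop misses are treated as independent. The function names `connection_rate` and `path_reliability` and the sample endpoints are hypothetical and do not come from code-graph.

```python
def connection_rate(callers, handlers):
    """Return (rate, unmatched): the share of caller keys with a known handler."""
    if not callers:
        return 1.0, []
    # A caller with no matching handler is a candidate for connection drift.
    unmatched = sorted(key for key in callers if key not in handlers)
    rate = 1 - len(unmatched) / len(callers)
    return rate, unmatched


def path_reliability(per_hop_rate, hops):
    """Chance that every hop on a path is connected, assuming independent hops."""
    return per_hop_rate ** hops


# Hypothetical boundary keys extracted from callers and handlers.
callers = {"GET /users", "POST /orders", "GET /items", "DELETE /carts"}
handlers = {"GET /users", "POST /orders", "GET /items"}

rate, unmatched = connection_rate(callers, handlers)
print(f"connection rate: {rate:.2%}, unmatched: {unmatched}")

for per_hop in (0.99, 0.90):
    print(per_hop, round(path_reliability(per_hop, 5), 3))
```

Under the independence assumption, a 99% per-hop rate leaves roughly 95% of five-hop paths intact (0.99^5 ≈ 0.951), while a 90% rate leaves only about 59% (0.90^5 ≈ 0.590). That gap is one way to read the article's emphasis on staying above 99%.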