AxonAtlas Architecture & Intelligence
A comprehensive technical guide to the AxonAtlas Code Intelligence Platform — how it works, what it builds, and why it represents a fundamental shift in how engineering teams understand their software systems.
What is AxonAtlas
AxonAtlas is a semantic code intelligence engine that transforms raw source code repositories into a queryable, semantically-rich knowledge graph. It goes far beyond surface-level linting or traditional static analysis — AxonAtlas constructs a deep structural and semantic model of an entire codebase, resolving function calls, propagating types, detecting architectural communities, mapping execution flows, and identifying dead code — all within a unified graph that AI agents and developers can query in real time.
Think of AxonAtlas as a Software Knowledge Engine: an always-on intelligence substrate that understands not just what code exists, but how it connects, why it changes together, and where risk concentrates. Traditional tools show you a list of warnings. AxonAtlas shows you the living architecture of your system.
What AxonAtlas Does
- Parses entire codebases using Tree-sitter grammars, extracting functions, classes, methods, interfaces, types, enums, and variables
- Resolves inter-symbol dependencies using multi-tier reasoning: local scope, import resolution, receiver-method de-aliasing, and global fuzzy matching
- Enriches the graph with worklist-based type propagation, Leiden community detection, execution process flow tracing, and temporal coupling from git history
- Persists the knowledge graph into KuzuDB, a high-performance embedded graph database, with support for full-text search, vector similarity, and raw Cypher queries
- Exposes intelligence through the Model Context Protocol (MCP), enabling AI agents, IDEs, and CI systems to query code structure in real time
Who AxonAtlas Serves
AI Coding Agents
Structured code context for accurate modifications — callers, callees, types, communities, blast radius. Agents make fewer mistakes when they can see the full picture.
Engineering Leaders
Architectural visibility — community maps, coupling hotspots, dead code quantification. Make data-driven decisions about technical debt and system health.
Developer Experience Teams
Onboarding acceleration — narrative symbol explanations, process flow visualization. New engineers understand the system in hours, not months.
Platform & DevOps
PR risk scoring, test impact analysis, incremental CI intelligence. Know exactly which tests to run and which reviews need extra attention.
Supported Languages & Frameworks
AxonAtlas provides first-class support for Python and TypeScript/JavaScript, with framework-aware entry point detection that understands the conventions of modern web and CLI frameworks:
| Ecosystem | Frameworks | Entry Point Detection |
|---|---|---|
| Python | Flask, FastAPI, Django, Click, Celery, APScheduler | Decorators, main(), __main__.py, argparse |
| TypeScript | Express, Next.js, Node.js | Route handlers, pages, layouts, event listeners |
Why Code Intelligence Matters
The software industry is undergoing a structural transformation. AI coding agents can now generate thousands of lines of code per hour — but this unprecedented velocity has created a new category of engineering risk: architectural drift at machine speed.
Traditional static analysis tools were designed for a world where humans wrote code slowly and reviewed it carefully. They answer narrow questions: “Is this variable unused?” “Does this type match?” But they cannot answer the questions that matter most in an agent-assisted world: “If I change this function, what breaks downstream?” “Which files always change together but have no import relationship?” “Is this code actually reachable from any entry point?”
Code Intelligence is the answer to this gap. It goes beyond syntax-level checks to build a living model of your system's architecture — one that understands relationships, dependencies, coupling patterns, and execution flows. It is the difference between having a spell-checker and having an editor who understands your entire narrative.
Static Analysis vs. Code Intelligence
The distinction between traditional static analysis and a code intelligence platform is fundamental. Static analysis operates on individual files or narrow scopes. Code intelligence operates on the entire system as a connected graph.
| Capability | Traditional Static Analysis | AxonAtlas Code Intelligence |
|---|---|---|
| Representation | Flat warning lists, AST dumps | Unified knowledge graph (11 node types, 17 edge types) |
| Semantic Depth | Single-pass name matching | Multi-tier call resolution with confidence scoring |
| Type Intelligence | No type propagation | Compiler-grade engines (Pyright/TSC) + heuristic fallback |
| Architectural Insight | None | Leiden community detection, process flow tracing |
| Change Intelligence | None | Temporal coupling (git history) + structural coupling |
| Incremental Updates | Full re-scan or dirty file only | Dependency-aware cascade — transitive importers re-analyzed |
| Agent Integration | CLI-only or JSON output | Native MCP protocol with 17 tools + 3 resources |
| Multi-Repository | Single-repo scope | Cross-repo symbol resolution, workspace-level intelligence |
The Developer Productivity Crisis
Studies consistently show that developers spend more time understanding existing code than writing new code. In large organizations, this ratio can reach 70:30 — seventy percent of engineering time spent reading, navigating, and comprehending code that already exists. This is the invisible tax on every feature, every refactor, every bug fix.
AxonAtlas directly addresses this productivity crisis by providing a continuously updated, queryable model of the entire codebase. Instead of manually tracing call chains through dozens of files, an engineer (or an AI agent) can ask: “Show me every caller of this function, what community it belongs to, and what processes flow through it” and receive an answer in milliseconds.
This is not incremental improvement. This is a category-level shift in how engineering teams interact with their own systems.
System Architecture Overview
AxonAtlas is organized into four clearly separated layers, each with distinct responsibilities. This layered architecture ensures that the intelligence pipeline, the storage engine, and the protocol interface can evolve independently without breaking contracts between them.
At the highest level, the MCP Protocol Layer handles all external communication — exposing 17 tools and 3 resources over both stdio and Streamable HTTP transports. Below it, the Intelligence Services layer provides the reasoning engine: query planning, impact analysis, narrative explanations, and risk scoring. The Semantic Core manages symbol indexes, type intelligence, and workspace-level coordination. And at the foundation, the Ingestion Pipeline and Storage Layer convert raw source code into a persistent, queryable knowledge graph.
┌──────────────────────────────────────────────────────────────┐
│ MCP Protocol Layer │
│ ┌──────────┐ ┌────────────┐ ┌────────┐ ┌──────────────┐ │
│ │ 17 Tools │ │ 3 Resources│ │Security│ │ HTTP + stdio │ │
│ └────┬─────┘ └─────┬──────┘ └───┬────┘ └──────┬───────┘ │
└───────┼──────────────┼─────────────┼───────────────┼─────────┘
│ │ │ │
┌───────┼──────────────┼─────────────┼───────────────┼─────────┐
│ ▼ ▼ ▼ ▼ │
│ Intelligence Services │
│ ┌───────────┐ ┌────────┐ ┌─────────┐ ┌─────────────┐ │
│ │ Query │ │ Impact │ │ Explain │ │ Review Risk │ │
│ │ Planning │ │Analysis│ │Narrative│ │ Scoring │ │
│ └─────┬─────┘ └───┬────┘ └────┬────┘ └──────┬──────┘ │
└────────┼────────────┼────────────┼───────────────┼───────────┘
│ │ │ │
┌────────┼────────────┼────────────┼───────────────┼───────────┐
│ ▼ ▼ ▼ ▼ │
│ Semantic Core │
│ ┌────────────┐ ┌──────────┐ ┌────────────┐ │
│ │ Symbol │ │ Type │ │ Workspace │ │
│ │ Index │ │ Intel │ │ Manager │ │
│ └─────┬──────┘ └────┬─────┘ └─────┬──────┘ │
└────────┼──────────────┼──────────────┼───────────────────────┘
│ │ │
┌────────┼──────────────┼──────────────┼───────────────────────┐
│ ▼ ▼ ▼ │
│ Ingestion Pipeline (14 Phases) │
│ ┌────────┐ ┌──────┐ ┌───────┐ ┌────────┐ ┌──────────────┐ │
│ │Structure│→│Parse │→│Import │→│TypeFlow│→│ Resolution │ │
│ └────────┘ └──────┘ └───────┘ └────────┘ └──────┬───────┘ │
│ ┌──────────┐ ┌──────────┐ ┌─────────┐ ┌────────┴──┐ │
│ │Community │ │ Coupling │ │Process │ │ Dead Code │ │
│ └──────────┘ └──────────┘ └─────────┘ └───────────┘ │
└──────────────────────────────────────────────────────────────┘
│
┌────────┼─────────────────────────────────────────────────────┐
│ ▼ Storage Layer │
│ ┌────────────────┐ ┌─────────────┐ ┌──────────────────┐ │
│ │ KuzuDB Graph │ │ Connection │ │ Repo Router │ │
│ │ Cypher Engine │ │ Pool + R/W │ │ (LRU Pool) │ │
│ └────────────────┘ └─────────────┘ └──────────────────┘ │
└──────────────────────────────────────────────────────────────┘
Core Module Map
The system comprises 16 core modules, each with a single, well-defined responsibility. This modularity ensures that individual components can be tested, optimized, and replaced without cascading changes across the system.
| Module | Responsibility |
|---|---|
| Data Model | 11 NodeLabels, 17 RelTypes, deterministic moniker IDs |
| In-Memory Graph | Thread-safe KnowledgeGraph with 6 secondary indexes |
| Persistent Storage | KuzuDB integration, FTS, vector search, batch operations |
| Multi-Repo Router | LRU connection pool, registry, path normalization |
| Ingestion Pipeline | 14-phase DAG orchestrator with topological validation |
| Phase Model | Declarative dependency DAG with Kahn's cycle detection |
| Call Resolution | 4-tier hierarchical resolver with confidence scoring |
| Type Propagation | Worklist-based Type Flow Graph |
| Community Detection | Adaptive Leiden algorithm (full/chunked/sampled) |
| Dead Code Analysis | Reachability analysis with false-positive reduction |
| Process Flows | Entry point detection + BFS flow tracing |
| Temporal Coupling | Git co-change + structural coupling hybrid |
| Incremental Engine | Dependency-aware cascade reparse |
| Intelligence Service | Query planning, impact analysis, probabilistic ranking |
| MCP Server | stdio + HTTP transport, 17 tools, 3 resources |
| Security | Supabase-backed secret validation, repo-level ACL |
The Intelligence Pipeline
At the heart of AxonAtlas is a deterministic, multi-phase directed acyclic graph (DAG) with 14 declared phases. This is not a simple sequential script. It is a formally validated pipeline where each phase declares its dependencies, and the execution order is verified at startup using Kahn's algorithm for cycle detection.
This design ensures that phase ordering bugs are caught before any analysis runs. Phases within the same parallel group are verified to have no transitive dependencies among themselves, enabling safe concurrent execution. Optional phases — such as coupling analysis, duplicate detection, and embedding generation — can be skipped without breaking downstream phases, providing graceful degradation when certain data sources (like git history) are unavailable.
                    structure
                        │
                     parsing
                        │
                     imports
                        │
                    type_flow
                   ╱    │    ╲
             calls  heritage  types         ← parallel_group: "resolution"
                   ╲    │    ╱
                    contracts
               ╱     │     │     ╲       ╲
   communities       │     │   coupling  duplicates   ← parallel_group: "semantic"
                 processes │
                       dead_code
                           │
                       embeddings
Phase-by-Phase Breakdown
Each phase builds on the outputs of its predecessors. The early phases establish the structural foundation — file nodes, parsed symbols, import relationships. The middle phases add semantic depth — type propagation, call resolution, class hierarchies. The later phases perform higher-order analysis — community clustering, process flow detection, dead code identification. Finally, the embedding phase generates vector representations for semantic search.
| # | Phase | Description | Key Output |
|---|---|---|---|
| 1 | Structure | Walk filesystem, create FILE and FOLDER nodes with CONTAINS edges | Repository tree |
| 2 | Parsing | Tree-sitter parsing: extract FUNCTION, CLASS, METHOD, INTERFACE, ENUM, VARIABLE, TYPE_ALIAS nodes | Symbol table |
| 3 | Imports | Resolve IMPORTS (file→file) and IMPORTS_SYMBOL (file→symbol) edges across modules | Import graph |
| 4 | Type Flow | Worklist-based type propagation across assignments, parameters, return types | Type annotations |
| 5 | Calls | Multi-tier call resolution (local → import → receiver → global fuzzy) | Call graph |
| 6 | Heritage | Resolve EXTENDS and IMPLEMENTS edges for class hierarchies | Inheritance tree |
| 7 | Types | Resolve USES_TYPE edges from symbols to referenced types | Type dependency graph |
| 8 | Contracts | Stitch CONSUMES_API edges from combined import + call evidence | API surface map |
| 9 | Communities | Leiden clustering with adaptive strategy (full/chunked/sampled) | Community nodes |
| 10 | Processes | Entry point detection + BFS flow tracing | Process flow nodes |
| 11 | Dead Code | Reachability analysis from entry points with false-positive reduction | is_dead flags |
| 12 | Coupling | Git co-change matrix + structural coupling hybrid | COUPLED_WITH edges |
| 13 | Duplicates | Semantic clone detection via structural similarity | SEMANTIC_CLONE edges |
| 14 | Embeddings | Vector embedding generation for symbol nodes | Embedding table |
Pipeline Guarantees
Determinism
Given identical input, the pipeline produces bit-identical graph output. Moniker IDs are content-addressed, not order-dependent. This makes it possible to verify index correctness by comparing fingerprints across machines.
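One way to picture the fingerprint comparison is an order-independent hash over node IDs and edges. The sketch below is an illustration only: the function name `graph_fingerprint`, the choice of SHA-256, and the byte layout are assumptions, not the AxonAtlas implementation.

```python
import hashlib


def graph_fingerprint(node_ids, edges):
    """Hash a graph so that insertion order does not affect the result.

    node_ids: iterable of moniker strings (hypothetical input shape)
    edges:    iterable of (source_id, rel_type, target_id) tuples
    """
    digest = hashlib.sha256()
    # Sorting makes the hash depend on content only, not on traversal order.
    for node in sorted(set(node_ids)):
        digest.update(b"N\0" + node.encode("utf-8") + b"\n")
    for src, rel, dst in sorted(set(edges)):
        digest.update(b"E\0" + "\0".join((src, rel, dst)).encode("utf-8") + b"\n")
    return digest.hexdigest()
```

Two machines that index the same commit could then compare a single hex string instead of diffing whole graphs.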
Topological Validation
Execution order is validated against the declared DAG at startup. If a developer adds a new phase with an incorrect dependency, the system catches it immediately — before any analysis runs.
Parallel Safety
Phases within the same parallel_group are verified to have no transitive dependencies among themselves. The resolution group (calls, heritage, types) runs concurrently because each operates on independent edge types.
Graceful Degradation
Optional phases can be skipped without breaking the pipeline. If git history is unavailable, coupling analysis is skipped. If no embedding model is configured, the embedding phase is bypassed.
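The topological validation guarantee can be sketched with a small version of Kahn's algorithm. The phase names follow the pipeline diagram, but the exact dependency sets in `PHASES` and the function name `topological_order` are assumptions made for illustration.

```python
from collections import deque

# Hypothetical declarations: phase name -> set of phases it depends on.
PHASES = {
    "structure": set(),
    "parsing": {"structure"},
    "imports": {"parsing"},
    "type_flow": {"imports"},
    "calls": {"type_flow"},
    "heritage": {"type_flow"},
    "types": {"type_flow"},
    "contracts": {"calls", "heritage", "types"},
}


def topological_order(phases):
    """Return a valid execution order; raise ValueError on a cycle or unknown dependency."""
    for name, deps in phases.items():
        unknown = deps - set(phases)
        if unknown:
            raise ValueError(f"{name} depends on undeclared phases: {sorted(unknown)}")

    indegree = {name: len(deps) for name, deps in phases.items()}
    dependents = {name: [] for name in phases}
    for name, deps in phases.items():
        for dep in deps:
            dependents[dep].append(name)

    # Kahn's algorithm: repeatedly emit phases whose dependencies are all satisfied.
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in sorted(dependents[current]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Any phase never emitted is part of (or blocked by) a cycle.
    if len(order) != len(phases):
        stuck = sorted(set(phases) - set(order))
        raise ValueError(f"dependency cycle among phases: {stuck}")
    return order
```

Running this check once at startup is what lets a misdeclared dependency fail fast, before any file is parsed.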
The Software Knowledge Graph
The AxonAtlas knowledge graph is a labeled property graph with a precisely defined schema of 11 node types and 17 relationship types. Every entity in the codebase — from a single variable to an entire architectural community — is represented as a typed node with properties. Every connection between entities — a function call, an import, a type reference, a temporal coupling — is represented as a typed, directed edge.
This graph is not an afterthought bolted onto existing analysis. It is the analysis. Every intelligence capability AxonAtlas provides — from impact analysis to community detection to dead code identification — is implemented as a graph traversal or graph algorithm over this unified model.
Node Types
The graph schema distinguishes between structural nodes (representing physical entities in the codebase) and synthetic nodes (representing discovered patterns or computed groupings). Structural nodes are created during the parsing phase. Synthetic nodes emerge from later analysis phases.
| Label | Category | Semantics | Example |
|---|---|---|---|
| FILE | Structural | Source file on disk | src/auth/service.py |
| FOLDER | Structural | Directory container | src/auth/ |
| FUNCTION | Structural | Standalone function | validate_token() |
| CLASS | Structural | Class definition | AuthService |
| METHOD | Structural | Class-bound method | AuthService.verify() |
| INTERFACE | Structural | TypeScript interface / Python Protocol | IAuthProvider |
| TYPE_ALIAS | Structural | Type alias or typedef | UserId = string |
| ENUM | Structural | Enumeration type | UserRole |
| VARIABLE | Structural | Module-level variable or constant | MAX_RETRIES |
| COMMUNITY | Synthetic | Detected architectural cluster | auth-core |
| PROCESS | Synthetic | Discovered execution flow | login → verify → issue_token |
Relationship Types
Relationships carry the semantic intelligence. Each edge type models a specific kind of connection with distinct semantics. Some relationships are directional (a function calls another function); others are symmetric (two files are coupled in git history). The diversity of edge types is what makes AxonAtlas queries so expressive: you can ask “who calls this function?” as easily as “which files are temporally coupled with this one?”
| Relationship | Direction | Semantics |
|---|---|---|
| CONTAINS | Folder → File | Structural containment |
| DEFINES | File → Symbol | Symbol provenance — which file owns this symbol |
| CALLS | Symbol → Symbol | Function or method invocation |
| IMPORTS | File → File | Module-level import dependency |
| IMPORTS_SYMBOL | File → Symbol | Named import binding (from X import Y) |
| EXTENDS | Class → Class | Class inheritance |
| IMPLEMENTS | Class → Interface | Interface conformance or protocol implementation |
| USES_TYPE | Symbol → Type | Type annotation reference |
| EXPORTS | File → Symbol | Public API surface exposure |
| MEMBER_OF | Symbol → Community | Cluster membership assignment |
| STEP_IN_PROCESS | Symbol → Process | Participation in an execution flow |
| COUPLED_WITH | File ↔ File | Temporal and structural co-change coupling |
| CONSUMES_API | Symbol → Symbol | Cross-boundary API contract (e.g., frontend → backend) |
| SEMANTIC_CLONE | Symbol ↔ Symbol | Near-duplicate code detection |
| CONVERGENT_FLOW | Process → Process | Overlapping execution paths (>50% step overlap) |
| DYNAMICALLY_REFERENCES | Symbol → Symbol | Runtime-resolved reference (decorators, __getattr__) |
| CROSS_REPO_REFERENCES | Symbol → Symbol | Cross-repository dependency across workspace |
Deterministic Identity — The Moniker System
Every node in the AxonAtlas graph receives a deterministic, content-addressable ID called a moniker. Monikers are stable across re-indexes, portable across machines, and globally unique within a workspace. This is the foundational invariant that makes incremental indexing, cross-repository resolution, and graph fingerprinting possible.
# Symbol moniker: encodes label, language, repo, module, and name
function:python:local:src/auth/service:validate_token
class:typescript:local:src/components/App:Dashboard

# Structural moniker: stable file identity
axon://file/src/auth/service.py
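As a rough illustration of the symbol moniker layout shown above, the following sketch builds and parses the five colon-separated fields. The `Moniker` class is hypothetical, and how the real system escapes colons inside names is not stated in the source, so this version simply assumes the first four fields contain none.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Moniker:
    """Hypothetical value object for a symbol moniker."""
    label: str
    language: str
    repo: str
    module: str
    name: str

    def __str__(self):
        return ":".join((self.label, self.language, self.repo, self.module, self.name))

    @classmethod
    def parse(cls, text):
        # Split on the first four colons only, so the name keeps any later colons.
        parts = text.split(":", 4)
        if len(parts) != 5:
            raise ValueError(f"expected 5 colon-separated fields, got {len(parts)}")
        return cls(*parts)
```

Because the ID is derived only from these fields, re-indexing the same symbol produces the same string, which is the property incremental indexing relies on.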
KuzuDB — The Storage Engine
The knowledge graph is persisted to KuzuDB, a high-performance embedded columnar graph database optimized for analytical queries. AxonAtlas wraps KuzuDB with a sophisticated concurrency layer: a read-many / write-exclusive lock enables concurrent read queries while serializing mutations, and a bounded connection pool prevents resource exhaustion under high query load.
The storage backend supports four distinct search modalities, each optimized for a different access pattern:
| Search Type | Algorithm | Use Case |
|---|---|---|
| Full-Text Search | KuzuDB FTS extension with BM25 scoring | Symbol name search, content search, documentation queries |
| Fuzzy Search | Levenshtein + Jaro-Winkler similarity | Typo-tolerant symbol resolution for agent queries |
| Vector Search | array_cosine_similarity() native in Cypher | Semantic code search via embeddings — find similar functions |
| Cypher Queries | Raw Cypher passthrough | Custom agent queries, complex graph traversals, ad-hoc analysis |
Semantic Intelligence Engine
Semantic intelligence is the capability that separates AxonAtlas from every other code analysis tool on the market. Where traditional tools stop at syntactic parsing — extracting function names and detecting import statements — AxonAtlas reconstructs the actual runtime behavior of the code through multi-tier reasoning about calls, types, and dependencies.
This is the layer that transforms a raw syntax tree into a semantic graph — where every function knows its callers and callees, every variable knows its type, and every class knows its place in the inheritance hierarchy. Without semantic intelligence, you would have a directory listing. With it, you have a navigable map of the entire system.
Multi-Tier Call Resolution
Call resolution is the most critical semantic operation in AxonAtlas. When the parser encounters a call expression like service.validate_token(), the resolver must determine which function is actually being invoked. Unlike a compiler, AxonAtlas operates without a fully resolved symbol table, so it must reconstruct call targets from incomplete information, context clues, and heuristic reasoning.
The resolver uses a hierarchical cascade with four tiers, each progressively broader in scope but lower in confidence. This design ensures that the highest-confidence resolution is always preferred, while still providing useful connections even when exact resolution is impossible.
┌─────────────────────────────────────────────────────┐
│ Tier 1: Local Scope Match (confidence: 1.0) │
│ ─── Look for the symbol in the same file │
│ ─── Direct name match in local definitions │
└──────────────────────┬──────────────────────────────┘
miss ↓
┌──────────────────────┴──────────────────────────────┐
│ Tier 2: Import Resolution (confidence: 0.9) │
│ ─── Follow import chain to target module │
│ ─── Resolve aliased imports (from X import Y as Z) │
└──────────────────────┬──────────────────────────────┘
miss ↓
┌──────────────────────┴──────────────────────────────┐
│ Tier 3: Receiver-Method Binding (confidence: 0.7) │
│ ─── Trace variable type → class → method lookup │
│ ─── Uses type flow results for receiver typing │
└──────────────────────┬──────────────────────────────┘
miss ↓
┌──────────────────────┴──────────────────────────────┐
│ Tier 4: Global Fuzzy Match (confidence: 0.5) │
│ ─── Search all known symbols for best-match name │
│ ─── Weighted by scope proximity and name distance │
└─────────────────────────────────────────────────────┘
Each resolved call edge carries its confidence score as a property. This is critical for downstream consumers: an AI agent querying callers can distinguish between high-confidence direct calls and lower-confidence fuzzy matches, adjusting its behavior accordingly. Impact analysis uses these scores to weight blast-radius calculations, ensuring that uncertain connections do not produce false alarms.
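The four-tier cascade can be sketched as a chain of lookups that returns on the first hit. The confidence values come from the diagram; everything else is assumed for illustration, including the function signature, the input shapes, and the use of `difflib.get_close_matches` as a stand-in for the fuzzy tier (the real tier is described as weighting scope proximity and name distance).

```python
from dataclasses import dataclass
from difflib import get_close_matches


@dataclass(frozen=True)
class Resolution:
    target: str
    tier: str
    confidence: float


def resolve_call(name, local_defs, imports, receiver_methods, all_symbols):
    """Try each tier in order and return the first match, or None.

    local_defs:       set of names defined in the calling file
    imports:          dict of local alias -> fully qualified target
    receiver_methods: dict of method name -> qualified method on the receiver's inferred class
    all_symbols:      list of every known symbol name (fuzzy fallback)
    """
    if name in local_defs:                      # Tier 1: local scope
        return Resolution(name, "local", 1.0)
    if name in imports:                         # Tier 2: import resolution
        return Resolution(imports[name], "import", 0.9)
    if name in receiver_methods:                # Tier 3: receiver-method binding
        return Resolution(receiver_methods[name], "receiver", 0.7)
    close = get_close_matches(name, all_symbols, n=1, cutoff=0.8)
    if close:                                   # Tier 4: global fuzzy match
        return Resolution(close[0], "fuzzy", 0.5)
    return None
```

Returning the tier alongside the target keeps the provenance of each edge visible, so a consumer can filter out fuzzy matches when it needs certainty.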
Worklist-Based Type Propagation
Many symbols in dynamic languages like Python lack explicit type annotations. AxonAtlas addresses this through a worklist-based type propagation algorithm modeled after compiler dataflow analysis. The algorithm constructs a Type Flow Graph (TFG) — a directed graph where nodes represent type-bearing positions (variables, parameters, return values) and edges represent type-flow relationships (assignments, function returns, parameter passing).
The worklist iterates until a fixed point is reached, that is, until processing an edge no longer adds type information anywhere in the graph. Because propagation is monotonic (the set of types known for a position only grows) and the number of types is finite, the algorithm always terminates, and it yields the most precise result its propagation rules can derive from the available information. The propagated types are then used as input to the call resolution engine, significantly improving receiver-method binding accuracy.
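A minimal sketch of worklist propagation over a Type Flow Graph follows. It assumes types are plain strings and each position holds a set of possible types; the function name `propagate_types` and the position naming are hypothetical.

```python
from collections import deque


def propagate_types(seed_types, flows_to):
    """Propagate types along flow edges until nothing changes.

    seed_types: dict of position -> set of known type names (annotations, literals)
    flows_to:   dict of position -> list of positions that receive its value
    """
    types = {pos: set(ts) for pos, ts in seed_types.items()}
    worklist = deque(types)
    while worklist:
        src = worklist.popleft()
        for dst in flows_to.get(src, ()):
            known = types.setdefault(dst, set())
            before = len(known)
            known |= types[src]        # monotone step: sets only ever grow
            if len(known) > before:
                worklist.append(dst)   # dst changed, so its successors must be revisited
    return types
```

A position is only re-queued when its type set grows, and each set is bounded by the finite number of seeded types, so the loop terminates even when the flow graph contains cycles.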
Compiler-Grade Type Intelligence
For codebases that use TypeScript or typed Python, AxonAtlas can optionally invoke external type-checking engines — Pyright for Python and the TypeScript Compiler (tsc) for JavaScript/TypeScript. These engines provide ground-truth type information that supplements the heuristic type flow system.
| Engine | Language | Data Provided | Integration Method |
|---|---|---|---|
| Pyright | Python 3.x | Resolved types, overloads, generics, Protocol conformance | JSON-RPC output parsing |
| tsc | TypeScript / JavaScript | Full type inference, module resolution, declaration merging | CLI output parsing |
| Heuristic TFG | All supported | Assignment-based propagation, return type inference | Internal worklist algorithm |
Advanced Analysis Capabilities
Beyond core call resolution and type inference, AxonAtlas provides four categories of higher-order analysis that transform the knowledge graph from a dependency map into a strategic intelligence layer for engineering decision-making.
Architectural Community Detection
Most codebases have an implicit architecture that diverges significantly from their folder structure. Functions in separate directories may be tightly coupled, while adjacent files may have no interaction at all. AxonAtlas uses the Leiden algorithm — the state-of-the-art in community detection — to discover the actual functional clusters in your code.
The implementation employs an adaptive strategy selector that automatically chooses the optimal algorithm configuration based on graph size. For small codebases (under 500 symbols), it runs the full Leiden algorithm. For medium graphs (500–5,000 symbols), it uses a chunked approach that processes subgraphs independently and merges results. For large graphs (5,000+ symbols), it uses a sampled approach that identifies communities from representative subsets and propagates membership to neighbors.
| Graph Size | Strategy | Resolution | Approach |
|---|---|---|---|
| < 500 nodes | Full | Adaptive (0.5–2.0) | Complete Leiden on full graph |
| 500–5,000 nodes | Chunked | Adaptive per-chunk | Partition → Leiden → Merge overlaps |
| 5,000+ nodes | Sampled | Fixed 1.0 | Random sample → Leiden → Propagate to neighbors |
Detected communities are materialized as COMMUNITY nodes in the graph, with MEMBER_OF edges connecting each symbol to its cluster. Community labels are automatically generated from the most prominent symbols within each group. This enables queries like: “Show me all symbols in the auth-core community” or “Which community does validate_token belong to?”
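The size-based strategy selection from the table can be sketched as a small function. The thresholds come from the text; the names `LeidenPlan` and `plan_community_detection` are hypothetical, and treating exactly 5,000 symbols as "sampled" is an assumption, since the table leaves that boundary ambiguous.

```python
from typing import NamedTuple, Optional


class LeidenPlan(NamedTuple):
    strategy: str
    fixed_resolution: Optional[float]  # None means the resolution is chosen adaptively


def plan_community_detection(symbol_count: int) -> LeidenPlan:
    """Pick a clustering strategy from the graph size."""
    if symbol_count < 0:
        raise ValueError("symbol_count must be non-negative")
    if symbol_count < 500:
        return LeidenPlan("full", None)
    if symbol_count < 5000:
        return LeidenPlan("chunked", None)
    return LeidenPlan("sampled", 1.0)
```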
Dead Code Analysis
AxonAtlas identifies dead code through reachability analysis from detected entry points. The system traces all possible execution paths from known entry points (main functions, route handlers, CLI commands, test functions) through the call graph. Any symbol that is not reachable from at least one entry point is flagged as potentially dead.
Crucially, the analysis applies extensive false-positive reduction heuristics to avoid flagging symbols that are genuinely used but difficult to trace statically. Symbols referenced through decorators, dynamic dispatch, serialization frameworks, or reflection patterns are automatically excluded. Plugin registration patterns, event handler conventions, and protocol conformance are all recognized and respected.
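The reachability pass and the false-positive filter can be sketched as a breadth-first search followed by a set difference. The function name and input shapes are hypothetical. The text does not say whether exempted symbols also act as traversal roots, so this sketch only filters them out afterwards.

```python
from collections import deque


def find_dead_symbols(all_symbols, calls, entry_points, known_alive=frozenset()):
    """Return symbols unreachable from any entry point, minus exempted ones.

    calls:       dict of caller -> iterable of callees (the CALLS edges)
    known_alive: symbols exempted by heuristics (decorated functions, __all__ exports, ...)
    """
    reachable = set()
    queue = deque(entry_points)
    while queue:
        sym = queue.popleft()
        if sym in reachable:
            continue
        reachable.add(sym)
        queue.extend(calls.get(sym, ()))
    # Unreachable symbols are candidates; the exemption list removes likely false positives.
    return set(all_symbols) - reachable - set(known_alive)
```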
Entry Points (detected)      Known Alive
──────────────────────       ────────────
├─ main()                    ├─ Decorated functions
├─ @app.route(...)           ├─ __init__ methods
├─ @click.command(...)       ├─ Protocol implementations
├─ test_*()                  ├─ __all__ exports
├─ layouts / pages           └─ Event handlers
│
└─→ BFS Reachability through CALLS edges
     │
     ├─ Reachable symbols   → marked alive
     └─ Unreachable symbols → flagged is_dead=true
          │
          └─→ False-positive filter applied
               │
               └─→ Final dead code set returned
Execution Process Flow Detection
AxonAtlas automatically discovers end-to-end execution flows by tracing call chains from detected entry points. Each discovered flow is materialized as a PROCESS node in the graph, with STEP_IN_PROCESS edges connecting the symbols that participate in that flow.
This capability is particularly valuable for understanding the critical paths through a system. Rather than manually tracing from a route handler through middleware, service layers, data access, and back — the entire chain is pre-computed and queryable. When two processes share more than 50% of their steps, AxonAtlas automatically creates CONVERGENT_FLOW edges, highlighting architectural hotspots where multiple critical paths converge on shared infrastructure.
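The 50% overlap rule can be sketched as a pairwise comparison of step sets. The text does not define what the overlap is measured against, so this sketch assumes shared steps divided by the size of the smaller process; the function names are hypothetical.

```python
from itertools import combinations


def step_overlap(steps_a, steps_b):
    """Fraction of the smaller process's steps that also appear in the other (assumed definition)."""
    a, b = set(steps_a), set(steps_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def convergent_pairs(processes, threshold=0.5):
    """Return (name_a, name_b, overlap) for every pair whose overlap exceeds the threshold.

    processes: dict of process name -> list of symbol names in that flow
    """
    pairs = []
    for (name_a, steps_a), (name_b, steps_b) in combinations(sorted(processes.items()), 2):
        overlap = step_overlap(steps_a, steps_b)
        if overlap > threshold:
            pairs.append((name_a, name_b, overlap))
    return pairs
```

Each returned pair would correspond to one CONVERGENT_FLOW edge in the graph.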
Temporal Coupling Analysis
Temporal coupling reveals hidden dependencies that are invisible in the import graph. Two files may have no syntactic relationship — no imports, no shared types — yet they always change together in git history. This pattern is a powerful signal: it typically indicates shared business logic, implicit contracts, or coordinated data structures.
AxonAtlas mines the git commit history to build a co-change matrix, computing the Jaccard similarity between file change sets. Files that appear together in commits above a configurable threshold (default: 30%) receive COUPLED_WITH edges. This is hybridized with structural coupling — shared call targets and type references — to produce a composite coupling score that captures both behavioral and structural dependency.
The practical impact is significant. When an engineer modifies a file, AxonAtlas can immediately surface: “Based on the last 500 commits, this file has a 78% coupling with validators.py, so you may want to review it as well.” This kind of insight is impossible from code analysis alone; it requires historical behavioral data.
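The co-change computation can be sketched as Jaccard similarity over the sets of commits that touch each file. The function name and input shape are hypothetical, the threshold is treated as inclusive here, and the structural half of the hybrid score is left out.

```python
from collections import defaultdict
from itertools import combinations


def co_change_coupling(commits, threshold=0.30):
    """Return {(file_a, file_b): jaccard} for file pairs at or above the threshold.

    commits: iterable of iterables, each listing the file paths changed in one commit
    """
    commits_touching = defaultdict(set)
    for idx, files in enumerate(commits):
        for path in files:
            commits_touching[path].add(idx)

    coupled = {}
    for a, b in combinations(sorted(commits_touching), 2):
        shared = commits_touching[a] & commits_touching[b]
        if not shared:
            continue
        union = commits_touching[a] | commits_touching[b]
        score = len(shared) / len(union)   # Jaccard: |A ∩ B| / |A ∪ B|
        if score >= threshold:
            coupled[(a, b)] = score
    return coupled
```

This pairwise loop is quadratic in the number of files, so a production version would likely only compare files that share at least one commit.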
Multi-Repository Intelligence
Real-world software systems are rarely contained within a single repository. A typical engineering organization operates with a frontend application, a backend API, shared libraries, infrastructure-as-code, and potentially dozens of microservices — each in its own repository with its own dependency graph and release cycle.
AxonAtlas is architected from the ground up for multi-repository, workspace-level intelligence. The WorkspaceManager coordinates multiple repository indexes, the RepositoryRouter manages per-repo database connections, and the GlobalSymbolIndex provides a unified view across all indexed repositories — enabling queries that span repository boundaries.
┌──────────────────────────────────────────────────────────┐
│ Workspace Manager │
│ ┌────────────┐ ┌────────────┐ ┌────────────────────┐ │
│ │ Repo │ │ Topological│ │ Global Symbol │ │
│ │ Registry │ │ Sort Order │ │ Index (cross-repo) │ │
│ └──────┬─────┘ └──────┬─────┘ └─────────┬──────────┘ │
└─────────┼───────────────┼───────────────────┼─────────────┘
│ │ │
┌─────────┼───────────────┼───────────────────┼─────────────┐
│ ▼ ▼ ▼ │
│ Repository Router │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ LRU Connection Pool (bounded, per-repo backends) │ │
│ └──────┬────────────┬────────────┬────────────────────┘ │
│ │ │ │ │
│ ┌────▼────┐ ┌────▼────┐ ┌────▼────┐ │
│ │ Repo A │ │ Repo B │ │ Repo C │ ← KuzuDB each │
│ │(backend)│ │(backend)│ │(backend)│ │
│ └─────────┘ └─────────┘ └─────────┘ │
└───────────────────────────────────────────────────────────┘
The Global Symbol Index
The GlobalSymbolIndex aggregates per-repository symbol indexes into a unified lookup layer. When a query references a symbol by name, the index searches across all registered repositories and returns candidates ranked by relevance. This enables cross-repository operations that would be impossible with single-repo tools.
Consider a common scenario: your frontend application imports types from a shared library, and your backend API consumes the same library. With AxonAtlas, you can ask: “Who consumes the User type across all repositories?” — and the GlobalSymbolIndex will return callers from both the frontend and backend, even though they live in separate repos with separate databases.
Repository Router & Connection Pooling
Each repository gets its own isolated KuzuDB database file, ensuring clean separation of data. The RepositoryRouter manages an LRU-bounded connection pool that keeps frequently accessed repositories warm while automatically evicting stale connections. This design scales efficiently from a single-repo developer setup to a workspace with dozens of indexed repositories.
The router also handles path normalization — resolving symlinks, canonicalizing case-insensitive paths on Windows, and mapping relative paths to absolute repository roots. This seemingly mundane responsibility is critical for ensuring that the same repository is never indexed twice under different path representations.
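The LRU-bounded pool and the path normalization described above can be sketched together. `LruBackendPool` and `open_backend` are hypothetical names; the callback stands in for whatever opens a per-repository database, and no KuzuDB API is assumed.

```python
import os
from collections import OrderedDict


def normalize_repo_path(path):
    # realpath resolves symlinks and makes the path absolute;
    # normcase folds case on Windows and is a no-op on POSIX systems.
    return os.path.normcase(os.path.realpath(path))


class LruBackendPool:
    """Keep at most `capacity` open backends, evicting the least recently used."""

    def __init__(self, open_backend, capacity=8):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._open = open_backend
        self._capacity = capacity
        self._backends = OrderedDict()

    def get(self, repo_path):
        key = normalize_repo_path(repo_path)
        if key in self._backends:
            self._backends.move_to_end(key)      # mark as most recently used
            return self._backends[key]
        backend = self._open(key)
        self._backends[key] = backend
        if len(self._backends) > self._capacity:
            _, evicted = self._backends.popitem(last=False)  # drop the oldest entry
            close = getattr(evicted, "close", None)
            if callable(close):
                close()
        return backend
```

Keying the pool on the normalized path is what prevents the same repository from being opened twice under different spellings.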
Cross-Repository Impact Analysis
When the WorkspaceManager detects CROSS_REPO_REFERENCES edges, impact analysis automatically spans repository boundaries. Modifying a utility function in a shared library triggers blast-radius calculation that includes consumers in every repository that depends on it. The topological sort order ensures that upstream repositories are always indexed before their downstream dependents, maintaining consistency across the entire workspace.
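The upstream-before-downstream ordering can be illustrated with the standard library's graphlib module. This is a sketch under the assumption that repository dependencies are available as a mapping from each repo to the repos it references; the function name indexing_order is hypothetical.

```python
from graphlib import TopologicalSorter


def indexing_order(depends_on: dict) -> list:
    """Order repositories so every upstream repo precedes its dependents.

    depends_on maps a repo name to the set of upstream repos it references.
    graphlib raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())
```

For a workspace where both a frontend and a backend depend on a shared library, the shared library is always first in the resulting order.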
MCP Protocol & Agent Integration
AxonAtlas is a native Model Context Protocol (MCP) server — the emerging open standard for connecting AI agents to external tools and data sources. This means any MCP-compatible AI system — Claude, GPT, Cursor, Windsurf, or custom agents — can query the AxonAtlas knowledge graph in real time, receiving structured, typed responses that are immediately actionable.
The MCP integration is not a bolt-on API wrapper. The server is built directly into the AxonAtlas runtime, sharing the same graph instances, symbol indexes, and intelligence services as the core pipeline. This avoids an extra serialization hop between the server and the engine and ensures that agent queries operate on the most current graph state. The server supports stdio transport (for local IDE integrations); Streamable HTTP transport (for remote and team-level deployments) is planned but not yet available.
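For orientation, MCP clients invoke tools with a JSON-RPC 2.0 request whose method is tools/call. The sketch below builds such a message in Python. The tool name is taken from the catalog in this guide, but the argument names (symbol, depth) are assumptions made for illustration and may not match the actual input schema.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport sends one JSON message per line,
    # so the payload must not contain embedded newlines.
    return json.dumps(message, separators=(",", ":"))


# Hypothetical arguments: the real schema should be read from tools/list.
request_line = build_tool_call(1, "impact", {"symbol": "parse_config", "depth": 2})
```

In practice an agent would discover the real argument schema through the protocol's tools/list request rather than hard-coding it.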
Tool Catalog — 17 Intelligence Tools
Every tool is designed with agent ergonomics in mind. Inputs accept natural identifiers (symbol names, file paths) rather than internal IDs. Outputs are structured with consistent schemas that agents can parse without ambiguity. Error responses include remediation guidance, helping agents self-correct.
| Tool | Category | Description |
|---|---|---|
| query | Search | Hybrid keyword + vector semantic search across the knowledge graph |
| context | Inspection | 360-degree view of a symbol: callers, callees, types, community membership |
| explain | Narrative | AI-friendly narrative explanation of a symbol's role and relationships |
| impact | Analysis | Blast radius discovery: all affected symbols from a change, with depth control |
| call_path | Analysis | Shortest path between two symbols via BFS over CALLS edges |
| communities | Architecture | List or drill into detected communities with member symbols |
| coupling | Analysis | Show temporally and structurally coupled files for a given file |
| dead_code | Quality | List all unreachable symbols flagged as dead code |
| cycles | Quality | Detect circular dependencies using strongly connected components |
| file_context | Inspection | Comprehensive file summary: symbols, imports, coupling, community |
| detect_changes | CI | Map git diff hunks to affected symbols in the graph |
| review_risk | CI | PR risk scoring: blast radius, missing co-changes, boundary crossings |
| test_impact | CI | Find tests affected by code changes — trace callers to test files |
| cypher | Advanced | Raw Cypher query passthrough for custom graph exploration |
| graph_integrity | Operations | Structural health check: invariant verification, trust metrics |
| help | Meta | Contextual operational guidance based on task type |
| list_repos | Meta | List indexed repositories with stats and graph reachability |
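To make the call_path row concrete, here is a minimal breadth-first search over a call graph. It assumes CALLS edges are available as an adjacency mapping from caller to callees; the function signature is hypothetical and says nothing about how AxonAtlas stores or traverses its graph internally.

```python
from collections import deque


def call_path(calls: dict, source: str, target: str):
    """Return the shortest caller-to-callee path, or None if unreachable."""
    if source == target:
        return [source]
    parent = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for callee in calls.get(current, ()):
            if callee in parent:
                continue
            parent[callee] = current
            if callee == target:
                # Walk parent links back to the source, then reverse
                path = [callee]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(callee)
    return None
```

Because BFS explores the graph level by level, the first time the target is reached the path found has the fewest possible CALLS hops.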
Resource Endpoints
In addition to tools, AxonAtlas exposes three MCP resource endpoints that provide contextual data to agents during conversation initialization:
| Resource | URI Pattern | Data Provided |
|---|---|---|
| Repository Summary | axon://repo/{name}/summary | Node counts, edge counts, graph metrics, indexing status |
| Ingestion Status | axon://repo/{name}/ingestion | Pipeline state, phase completion, last run timestamp |
| Operational Manual | axon://help/playbook | Agent-facing operational guidance and workflow recommendations |
Security Architecture
The MCP server implements a multi-layer security architecture designed for team deployments where repository access must be controlled per-user and per-agent. Authentication relies on JWT-based secret validation, combined with per-repository access control lists (ACLs).
Secret Validation
Every MCP connection is authenticated via a validated secret. Unauthenticated connections receive a structured error with remediation instructions — not a silent failure.
Repository ACLs
Each authenticated user has a defined set of accessible repositories. Queries against unauthorized repositories return clear permission errors, preventing data leakage.
Structured Errors
All security failures produce typed, parseable error responses that agents can handle programmatically. No stack traces, no internal paths — clean error contracts.
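The three layers above can be sketched as a single authorization check that returns a typed error instead of raising. Everything named here is an assumption for illustration: the ToolError fields, the error codes UNAUTHENTICATED and REPO_FORBIDDEN, and the shape of the ACL mapping. Secret validation itself is out of scope and is represented by user being None when it fails.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolError:
    code: str         # stable, machine-readable identifier
    message: str      # human-readable summary, no stack traces or paths
    remediation: str  # what the caller can do next


def authorize(acl: dict, user: Optional[str], repo: str) -> Optional[ToolError]:
    """Return None when access is allowed, otherwise a structured error."""
    if user is None:
        return ToolError(
            "UNAUTHENTICATED",
            "No valid credential was presented.",
            "Supply a valid secret and reconnect.",
        )
    if repo not in acl.get(user, set()):
        return ToolError(
            "REPO_FORBIDDEN",
            f"Access to repository '{repo}' is not permitted.",
            "Request access or query a repository on your access list.",
        )
    return None
```

Returning a value rather than raising keeps the error contract explicit, which suits agents that branch on an error code.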
Scalability & Performance Architecture
AxonAtlas is designed to scale from a single developer working on a personal project to an engineering organization with dozens of repositories containing millions of lines of code. Every layer of the system is engineered for predictable performance at scale.
The scalability architecture addresses three distinct dimensions: storage scalability (how the graph grows), query scalability (how fast intelligence is served), and operational scalability (how the system recovers and self-heals).
Concurrency & Thread Safety
The storage layer implements a read-many / write-exclusive lock (ReadWriteLock) that allows unlimited concurrent read queries while serializing write operations. This is the optimal concurrency model for a read-heavy workload — which is exactly the usage pattern for a code intelligence system, where graph mutations happen during indexing (infrequent) and queries happen continuously (frequent).
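The read-many / write-exclusive model can be sketched with a condition variable. This is an illustrative Python version, not the AxonAtlas ReadWriteLock; the method names read and write are assumptions. It favors simplicity over fairness, so a steady stream of readers could delay a waiting writer.

```python
import threading
from contextlib import contextmanager


class ReadWriteLock:
    """Many concurrent readers, or exactly one writer."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    @contextmanager
    def read(self):
        with self._cond:
            while self._writer:  # wait until no writer is active
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()  # wake a waiting writer

    @contextmanager
    def write(self):
        with self._cond:
            while self._writer or self._readers:  # need exclusive access
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()  # wake readers and writers
```

Queries would wrap graph reads in `with lock.read():`, while the indexing pipeline would wrap mutations in `with lock.write():`.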
Connection Pooling
Each KuzuDB backend maintains a bounded connection pool. When all connections are in use, new queries queue rather than fail. Pool sizes are configurable per deployment — from 4 connections for a laptop to 32 for a team server.
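The queue-rather-than-fail behavior maps naturally onto a blocking queue. The sketch below is an assumption-laden illustration: BoundedPool and its factory parameter are hypothetical, and the connection objects are opaque placeholders rather than real KuzuDB connections.

```python
import queue
from contextlib import contextmanager


class BoundedPool:
    """Fixed-size pool; callers block until a connection is free."""

    def __init__(self, factory, size: int = 4) -> None:
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while the pool is exhausted; raises queue.Empty
        # only if a timeout is given and it expires.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always return the connection
```

With the default timeout of None, a caller simply waits its turn, which matches the queuing behavior described above.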
LRU Repository Eviction
The RepositoryRouter uses an LRU cache to manage per-repo backends. When the pool reaches capacity, the least-recently-used repository's backend is gracefully closed and its resources released. Hot repositories stay warm automatically.
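A minimal sketch of LRU eviction for per-repo backends, using an ordered dictionary. The class LruBackendCache and the open_backend and close_backend callbacks are hypothetical stand-ins for whatever the RepositoryRouter does internally.

```python
from collections import OrderedDict


class LruBackendCache:
    """Keep at most `capacity` backends open, evicting the least recently used."""

    def __init__(self, capacity: int, open_backend, close_backend) -> None:
        self._capacity = capacity
        self._open_backend = open_backend
        self._close_backend = close_backend
        self._open = OrderedDict()  # oldest entry first

    def get(self, repo: str):
        if repo in self._open:
            self._open.move_to_end(repo)  # mark as most recently used
            return self._open[repo]
        backend = self._open_backend(repo)
        self._open[repo] = backend
        if len(self._open) > self._capacity:
            _, evicted = self._open.popitem(last=False)  # drop the oldest
            self._close_backend(evicted)
        return backend

    def open_repos(self) -> list:
        return list(self._open)
```

Each access moves the repository to the "recent" end, so a repository that is queried regularly never reaches the eviction end of the ordering.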
Batch Operations
Graph mutations (node creation, edge insertion) are batched into transactions. Instead of issuing thousands of individual Cypher statements, the backend constructs parameterized batch queries that insert hundreds of nodes in a single round-trip.
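The batching idea can be sketched as chunking rows and pairing each chunk with one parameterized statement. The Symbol label, the property names, and the UNWIND-based statement are assumptions chosen for illustration; the exact Cypher that AxonAtlas issues, and what KuzuDB accepts for list parameters, should be checked against their documentation.

```python
from itertools import islice

# Hypothetical statement shape: one round-trip inserts a whole chunk of rows.
BATCH_INSERT = "UNWIND $rows AS row CREATE (:Symbol {id: row.id, name: row.name})"


def batched(rows, size: int):
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while chunk := list(islice(iterator, size)):
        yield chunk


def plan_batches(rows, size: int = 500) -> list:
    """Pair the batch statement with one parameter set per chunk."""
    return [(BATCH_INSERT, {"rows": chunk}) for chunk in batched(rows, size)]
```

Inserting 1,200 nodes with a batch size of 500 yields three statements instead of 1,200.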
Secondary Indexes
The in-memory KnowledgeGraph maintains 6 secondary indexes: by-file, by-label, by-name, by-community, by-caller, and by-callee. These indexes provide O(1) access for the most common query patterns, eliminating the need for full graph scans.
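The six indexes can be pictured as hash maps that are updated alongside the canonical node set. The sketch below is illustrative only; the Node fields and the IndexedGraph class are assumptions and do not reflect the real KnowledgeGraph data model. Lookups are constant time on average because each index is a dictionary of sets.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    name: str
    label: str
    file_path: str
    community: str


class IndexedGraph:
    def __init__(self) -> None:
        self.nodes = {}
        self.by_file = defaultdict(set)
        self.by_label = defaultdict(set)
        self.by_name = defaultdict(set)
        self.by_community = defaultdict(set)
        self.by_caller = defaultdict(set)  # caller id -> callee ids
        self.by_callee = defaultdict(set)  # callee id -> caller ids

    def add_node(self, node: Node) -> None:
        # Every index is updated at insertion time so reads never scan
        self.nodes[node.node_id] = node
        self.by_file[node.file_path].add(node.node_id)
        self.by_label[node.label].add(node.node_id)
        self.by_name[node.name].add(node.node_id)
        self.by_community[node.community].add(node.node_id)

    def add_call(self, caller_id: str, callee_id: str) -> None:
        self.by_caller[caller_id].add(callee_id)
        self.by_callee[callee_id].add(caller_id)
```

The trade-off is extra memory and slightly slower writes in exchange for reads that avoid scanning the whole graph.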
Incremental Intelligence
Full re-indexing is expensive. For a large repository with thousands of files, a complete pipeline run can take minutes. AxonAtlas solves this with a dependency-aware incremental engine that re-analyzes only the files affected by a change — and their transitive importers.
When a file changes, the incremental engine identifies which symbols are impacted, traces the IMPORTS graph to find all transitive importers, schedules them for re-parsing, and runs only the affected pipeline phases. This cascade strategy ensures that downstream consumers of a modified symbol always receive updated information, while files unrelated to the change are left untouched.
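The cascade over transitive importers can be sketched as a breadth-first walk over reversed IMPORTS edges. The input shape (a mapping from each file to the files it imports) and the function name files_to_reanalyze are assumptions; the real engine also works at symbol granularity and selects pipeline phases, which this sketch omits.

```python
from collections import defaultdict, deque


def files_to_reanalyze(imports: dict, changed) -> set:
    """Return the changed files plus every file that transitively imports them."""
    # Invert the IMPORTS edges: imported file -> files that import it
    importers = defaultdict(set)
    for source, targets in imports.items():
        for target in targets:
            importers[target].add(source)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for importer in importers.get(current, ()):
            if importer not in affected:
                affected.add(importer)
                queue.append(importer)
    return affected
```

Files with no import path to the changed file never enter the queue, which is what keeps unrelated parts of the repository untouched.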
Graph Integrity & Self-Healing
The axon_graph_integrity tool (listed as graph_integrity in the tool catalog) provides a comprehensive structural health check that verifies graph invariants, ingestion consistency, and trust metrics. It plays a role comparable to a database consistency check such as SQL Server's DBCC CHECKDB: it confirms that the knowledge graph is internally consistent and trustworthy.
| Check | What It Verifies | Action on Failure |
|---|---|---|
| Orphan nodes | Every non-FILE node has at least one incoming edge | Flag for re-indexing |
| Dangling edges | Every edge connects two existing nodes | Remove stale edges |
| Schema conformance | All nodes have required properties (label, moniker, file_path) | Report violations |
| Community coverage | Every non-structural node belongs to at least one community | Re-run community detection |
| Index consistency | Secondary indexes match the canonical node set | Rebuild indexes |
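Two of the checks in the table, dangling edges and orphan nodes, are simple enough to sketch directly. The input shapes (a node-id-to-label mapping and a list of edge pairs) and the function name check_integrity are assumptions for illustration; the remaining checks depend on schema and index details not described here.

```python
def check_integrity(nodes: dict, edges: list) -> dict:
    """Report dangling edges and orphan nodes.

    nodes maps node id -> label; edges is a list of (source, target) pairs.
    """
    # Dangling: at least one endpoint does not exist in the node set
    dangling = [(s, t) for s, t in edges if s not in nodes or t not in nodes]

    # Orphan: a non-FILE node with no incoming edge from an existing node
    has_incoming = {t for s, t in edges if s in nodes and t in nodes}
    orphans = sorted(
        node_id
        for node_id, label in nodes.items()
        if label != "FILE" and node_id not in has_incoming
    )
    return {"dangling_edges": dangling, "orphan_nodes": orphans}
```

A report like this only detects problems; the actions in the right-hand column (re-indexing, removing stale edges) would be applied in a separate step.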
Real-World Use Cases
AxonAtlas is designed to serve multiple personas across an engineering organization. Each use case leverages the same underlying knowledge graph — the intelligence is universal, but the value is specific to each role.
AI Agent Infrastructure
AI Agent Developers & Platform Teams
AI coding agents make better decisions when they can see the full picture. By integrating AxonAtlas as an MCP server, agents gain real-time access to call graphs, community boundaries, blast radius data, and dead code maps, enabling them to propose changes that respect the existing architecture rather than blindly modifying files.
Pull Request Risk Assessment
Engineering Leads & Code Reviewers
Every pull request is automatically assessed for architectural risk. AxonAtlas analyzes the diff, maps changed lines to affected symbols, calculates the downstream blast radius, identifies missing co-change files (files historically coupled but not included in the PR), and flags community boundary crossings that may indicate cross-team coordination needs.
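The missing co-change signal reduces to a set difference once coupling data is available. The sketch below assumes a precomputed mapping from each file to its historically coupled files; the function name missing_co_changes and the input shape are hypothetical, and real risk scoring would weigh coupling strength rather than treating it as binary.

```python
def missing_co_changes(coupled: dict, changed: set) -> dict:
    """For each changed file, list coupled files that the PR does not touch."""
    missing = {}
    for path in sorted(changed):
        absent = coupled.get(path, set()) - changed
        if absent:
            missing[path] = absent
    return missing
```

If api.py has historically changed together with schema.py and client.py, a PR touching only api.py and schema.py would flag client.py for the reviewer.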
Developer Onboarding
New Team Members & DevEx Teams
Understanding a large codebase is the primary bottleneck for new engineers. AxonAtlas provides narrative symbol explanations, community maps, and process flow visualizations that transform the onboarding experience. Instead of reading through thousands of files, a new engineer can query: 'Explain the authentication flow' and receive a structured response tracing every function in the chain.
Technical Debt Quantification
Engineering Managers & CTOs
AxonAtlas transforms vague 'technical debt' discussions into data-driven conversations. Dead code analysis quantifies exactly how much unreachable code exists. Coupling analysis reveals which files are implicitly bound together. Community detection shows where the actual architecture diverges from the intended module boundaries.
CI/CD Intelligence
Platform Engineers & DevOps
Integrate AxonAtlas into your CI pipeline to add intelligence to every build. Test impact analysis identifies which test suites need to run based on the changed symbols, eliminating unnecessary test execution. Change detection maps diff hunks to graph symbols, providing precise change metadata for downstream tooling.
Security & Compliance Auditing
Security Engineers & Compliance Teams
The knowledge graph enables precise security auditing at the architectural level. Trace data flows from user input to database queries. Identify all callers of sensitive functions (authentication, authorization, encryption). Map which communities handle PII data and ensure they follow compliance patterns.
Strategic Positioning
AxonAtlas occupies a unique position in the developer tooling ecosystem. It is not a linter. It is not a code search engine. It is not a project management tool. AxonAtlas is a Code Intelligence Platform — a category that is emerging as the critical infrastructure layer for AI-assisted software development.
The platforms that will define the next decade of engineering productivity are those that serve as the intelligence backbone for both human engineers and AI agents. AxonAtlas is built to be that backbone — a continuously updated, semantically rich model of your entire software system that every tool, every agent, and every engineer can query.
Competitive Differentiation
The following comparison maps AxonAtlas against adjacent tool categories. AxonAtlas is not a replacement for any single tool — it is the intelligence layer that makes all other tools more effective.
| Capability | Linters (ESLint, Pylint) | Code Search | AI Assistants | AxonAtlas |
|---|---|---|---|---|
| Representation | AST-level warnings | Text index + regex | Prompt context window | 11-type knowledge graph |
| Semantic Depth | Single-file rules | Cross-file text search | None (stateless) | Multi-tier resolution + type flow |
| Architectural View | None | Repository navigation | None | Leiden communities + process flows |
| Change Intelligence | None | Blame + history | None | Temporal coupling + blast radius |
| Agent Integration | CLI output parsing | API access | Built-in (limited scope) | Native MCP with 17 tools |
| Multi-Repository | Config per-repo | Cross-repo search | Single-file context | Workspace-level graph |
| Incremental Updates | File-level re-lint | Index rebuild | N/A | Dependency-aware cascade |
The Intelligence Stack
AxonAtlas fits into the modern developer toolchain as the intelligence layer between source code and the tools that operate on it. Consider the following layered view:
“AxonAtlas represents a new category of developer infrastructure — not a tool that replaces existing workflows, but an intelligence layer that makes every existing tool smarter. When your CI system knows the blast radius, when your agent knows the community structure, when your reviewer knows the coupling score — that is the value of a Code Intelligence Platform.”
— AxonAtlas Architecture Team
