Persistent Structural Memory for AI: The Architecture Behind Infigraph

In my previous article, I wrote about what I called Code Blindness – the hidden operational cost of forcing AI assistants to repeatedly rediscover the structure and architectural relationships that already exist inside our codebases.

Today’s coding assistants can inspect local files, trace explicit imports and painstakingly piece together relationships to answer familiar engineering questions:

– Who calls this function?
– What breaks if I alter this API route?
– Which services depend on this component?
– What is the true blast radius of this change?

These aren’t difficult questions because the code is hard to read. They’re difficult because the relationships that answer them aren’t explicitly available. Every new AI session reconstructs them from source code, only to discard that understanding when the conversation ends.

That observation eventually led us to build Infigraph – an attempt to turn software structure into reusable, local infrastructure.

When we recently open-sourced the project, one question came up repeatedly:

“What makes Infigraph different from the other code intelligence and code graph projects already out there?”

It’s a fair question.

The ecosystem already has tools for code search, static analysis, architecture visualization and AI-assisted development. Some focus on helping engineers navigate code. Others generate knowledge graphs for LLMs, visualize architecture or build richer retrieval pipelines.

We weren’t trying to build another code intelligence tool.

We were trying to build a local-first, persistent structural memory layer that AI assistants could query directly instead of repeatedly reconstructing software relationships from source code.

That objective influenced nearly every architectural decision we made – from how code is parsed, to how relationships are extracted and stored, to how AI agents retrieve information.

Looking back, those decisions weren’t independent optimizations. They were consequences of a single design principle:

If software structure changes far more slowly than AI conversations, then structural knowledge should be treated as infrastructure and not something rebuilt from scratch for every prompt.

This article walks through the engineering decisions that followed from that principle, the tradeoffs we accepted and the lessons we learned while building Infigraph.

The System Blueprint

Before discussing the individual architectural decisions, it’s worth understanding how the pieces fit together. At a high level, Infigraph continuously transforms a codebase into a persistent structural representation that AI assistants can query directly. Instead of rediscovering relationships for every conversation, those relationships become shared infrastructure.

The graph doesn’t replace the language model. It changes the question the language model has to answer. Rather than treating every prompt as an isolated reasoning exercise, Infigraph treats structural understanding as persistent infrastructure.

The important observation isn’t the individual technologies. It’s where the computational effort moves.

Traditional AI workflows spend most of their effort reconstructing architecture from source code every time a question is asked. Infigraph moves that work to indexing time. Parsing source code, resolving symbols, understanding imports and discovering relationships happen once – when the repository is indexed. Every subsequent question becomes a retrieval problem instead of a reconstruction problem.
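
To make the shift from reconstruction to retrieval concrete, here is a minimal sketch of the "index once, query many times" pattern. It is an illustration under stated assumptions, not Infigraph's implementation: SQLite stands in for the real storage engine, the parsing step is assumed to have already produced the call edges, and the names index_repository, find_callers and structure.db are hypothetical.

```python
import os
import sqlite3
import tempfile


def index_repository(db_path, call_edges):
    """Indexing time: persist already-extracted call edges to a local file."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits the transaction on success
            conn.execute("CREATE TABLE IF NOT EXISTS calls (caller TEXT, callee TEXT)")
            conn.execute("DELETE FROM calls")  # a re-index replaces the old edges
            conn.executemany("INSERT INTO calls VALUES (?, ?)", call_edges)
    finally:
        conn.close()


def find_callers(db_path, function_name):
    """Query time: a lookup against the stored edges; no source files are read."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT caller FROM calls WHERE callee = ? ORDER BY caller",
            (function_name,),
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


# Hypothetical edges that a parser would have produced during indexing.
db_path = os.path.join(tempfile.mkdtemp(), "structure.db")
index_repository(db_path, [
    ("checkout", "charge_card"),
    ("retry_job", "charge_card"),
    ("charge_card", "log_event"),
])

# A later "session" opens the same file and asks its question directly.
callers = find_callers(db_path, "charge_card")
print(callers)
```

The expensive step runs once in index_repository. Every later question, in any number of sessions, touches only the persisted edges.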

That architectural shift immediately imposed a new set of engineering requirements. We needed:

– A storage engine optimized for relationship traversal rather than document retrieval
– A retrieval layer that could combine graph queries with traditional search
– A parsing architecture capable of understanding modern polyglot codebases without becoming language-specific

The next three sections explain how those requirements shaped Infigraph’s architecture.

Decision #1: Represent Code as a Persistent Graph

The first architectural decision was how software itself should be represented.

A software system isn’t just a collection of source files. It’s a network of explicit relationships. A function call is an explicit relationship. An import statement expresses a dependency. A class hierarchy defines inheritance. Module boundaries already exist whether an AI model discovers them or not. We needed a statically discoverable representation where relationships were first-class citizens.

That naturally led us to a graph.

Instead of storing source files as isolated text, Infigraph persists a connected topology of software entities and the relationships between them.

Once relationships become explicit, architectural questions stop being text-search problems. Familiar engineering questions become graph traversals. The system isn’t reading raw source code inside an LLM reasoning loop to find callers. It is traversing an index that already knows they exist.
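
A small sketch can show what "questions become graph traversals" means in practice. This is an in-memory toy, not Infigraph's data model: the CodeGraph class, the edge kinds CALLS and IMPORTS, and every symbol name below are assumptions made for illustration.

```python
from collections import defaultdict, deque


class CodeGraph:
    """Toy structural graph: entities are strings, relationships are typed edges."""

    def __init__(self):
        # target -> set of (kind, source) pairs, so "who depends on X?" is one lookup
        self._incoming = defaultdict(set)

    def add_edge(self, source, kind, target):
        self._incoming[target].add((kind, source))

    def callers(self, symbol):
        """Direct callers: follow incoming CALLS edges one hop."""
        edges = self._incoming.get(symbol, ())
        return sorted(src for kind, src in edges if kind == "CALLS")

    def blast_radius(self, symbol):
        """Everything that transitively depends on `symbol`, via any edge kind."""
        seen = set()
        queue = deque([symbol])
        while queue:
            current = queue.popleft()
            for _kind, src in self._incoming.get(current, ()):
                if src != symbol and src not in seen:
                    seen.add(src)
                    queue.append(src)
        return sorted(seen)


graph = CodeGraph()
graph.add_edge("OrderController.create", "CALLS", "PaymentService.charge")
graph.add_edge("RetryWorker.run", "CALLS", "PaymentService.charge")
graph.add_edge("PaymentService.charge", "CALLS", "AuditLog.write")
graph.add_edge("routes.orders", "IMPORTS", "OrderController.create")

direct_callers = graph.callers("PaymentService.charge")
impacted = graph.blast_radius("AuditLog.write")
print(direct_callers)
print(impacted)
```

"Who calls this function?" is a one-hop lookup and "what is the blast radius?" is a breadth-first walk over the same edges. Neither needs to read source text.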

Once we committed to representing software as a graph, the next question became much more practical:

What kind of graph engine could support interactive AI workflows without becoming another server-side dependency?

That question shaped our next architectural decision.

Decision #2: Persist Structural Memory Locally

We could have stored the graph in a traditional client-server database. We could have relied on a managed graph service. Or we could have generated structural context on demand through cloud-hosted retrieval pipelines.

All of those approaches work.

But they conflicted with one of our architectural constraints from the very beginning:

Structural knowledge should live alongside the repository, not behind another network boundary.

That single constraint influenced far more than our storage engine. It shaped the entire architecture.

If AI assistants increasingly become part of the developer’s inner loop, structural knowledge should be available with the same characteristics developers already expect from their source code:

– local
– immediately accessible
– private
– independent of cloud connectivity

That immediately narrowed our design space. We needed a graph engine that was:

– embedded rather than server-based
– lightweight enough to ship with the developer environment
– optimized for large relationship traversals
– capable of answering structural queries within an interactive AI workflow

That led us to KuzuDB, an embedded, columnar graph database designed around analytical graph workloads rather than transactional business operations. Our workload wasn’t updating records; it was traversing relationships. A columnar storage engine aligns well with that access pattern because it can efficiently scan relationship data without repeatedly loading complete records.

The storage layout is central to performance here. The difference isn’t the graph model – it’s how the data is laid out.

Performance Benchmarking

When traversing deep, multi-hop dependency chains across half a million nodes, we rarely need to unpack complete node and relationship records. We ran benchmarks on representative repositories and consistently observed substantially lower traversal latency for deep dependency walks.

The important result wasn’t the absolute latency. It was that the storage layout aligned far better with the traversal-heavy workload of AI-assisted development.
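
Why a columnar layout suits traversal can be sketched with a compressed sparse row (CSR) style edge store. This is a generic illustration of the idea, not a description of KuzuDB's internals: the functions build_csr and reachable, the integer node ids and the sample edges are all assumptions.

```python
from array import array


def build_csr(num_nodes, edges):
    """Pack (src, dst) edges into two flat integer columns.

    offsets[n]..offsets[n + 1] delimits node n's slice of `targets`.
    """
    counts = [0] * num_nodes
    for src, _dst in edges:
        counts[src] += 1
    offsets = array("l", [0] * (num_nodes + 1))
    for node in range(num_nodes):
        offsets[node + 1] = offsets[node] + counts[node]
    targets = array("l", [0] * len(edges))
    cursor = list(offsets[:num_nodes])  # next free slot per source node
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def reachable(offsets, targets, start, max_hops):
    """Multi-hop walk that reads only the two edge columns."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        next_frontier = set()
        for node in frontier:
            for i in range(offsets[node], offsets[node + 1]):
                neighbor = targets[i]
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.add(neighbor)
        if not next_frontier:
            break
        frontier = next_frontier
    seen.discard(start)
    return sorted(seen)


# Node properties live in their own columns and are never touched while walking.
names = ["api", "orders", "billing", "ledger", "audit"]
edges = [(0, 1), (0, 2), (1, 3), (3, 4)]  # hypothetical "depends on" edges
offsets, targets = build_csr(len(names), edges)

two_hops = reachable(offsets, targets, start=0, max_hops=2)
three_hops = reachable(offsets, targets, start=0, max_hops=3)
print([names[i] for i in two_hops])
print([names[i] for i in three_hops])
```

The walk touches two integer arrays and nothing else. In a row-oriented layout, each hop would typically pull in a whole record, properties included, just to read one neighbor id.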

Like every architectural decision, it came with tradeoffs. KuzuDB is a younger ecosystem than some of the established graph platforms. We consciously traded ecosystem maturity for an embedded architecture that better matched the interaction model we were trying to enable. Looking back, that tradeoff shaped much more than storage. Once structural memory became local and inexpensive to traverse, the next challenge was no longer storage. It was retrieval.

Decision #3: Retrieve Structural Context Before Reasoning

Persisting structural knowledge solved only half the problem. The remaining challenge was retrieving the right structural context quickly enough that an AI assistant never needed to fall back to reading large portions of the repository.

At first, it seemed tempting to rely on a single retrieval strategy. Keyword search is excellent when an engineer already knows the exact symbol they’re looking for. Semantic search is better when they describe an idea rather than an identifier. Graph traversal is indispensable when the question is fundamentally about relationships. But none of these approaches is sufficient on its own.

Different questions require different retrieval strategies.

Instead of trying to force every question through a single search engine, Infigraph combines multiple retrieval mechanisms, each optimized for a different type of query.

We built a local-first, parallel hybrid retrieval pipeline where each engine contributes a different signal:

– BM25 (Exact Retrieval): Fast, deterministic lookup for symbols, filenames, identifiers and keywords
– Semantic Retrieval (Model2Vec): A bundled 29 MB embedding model retrieves conceptually similar code without relying on external embedding APIs
– Regex Retrieval: Captures explicit syntactic conventions, decorators, annotations and language-specific patterns that keyword and semantic search may overlook

Once these candidate starting points are identified, Graph Traversal takes over. The retrieval layer expands those candidate matches into architectural context.
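
The flow described above (several retrievers in parallel, then graph expansion) can be sketched as follows. Several parts are assumptions rather than Infigraph's actual design: a token-overlap scorer stands in for BM25, the semantic retriever is omitted to keep the example dependency-free, reciprocal rank fusion is one common way to merge ranked lists and is not confirmed by the article, and all symbol names are hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical indexed symbols and call edges.
SYMBOLS = {
    "charge_card": "def charge_card(order): submit payment to the gateway",
    "refund_payment": "def refund_payment(order): return money to the customer",
    "route_checkout": "@app.route('/checkout') def route_checkout(): handle checkout request",
    "send_receipt": "def send_receipt(order): email the customer a receipt",
}
CALLS = {"route_checkout": ["charge_card", "send_receipt"]}
PATTERNS = {"route": r"@app\.route\("}  # query term -> syntactic convention


def keyword_retriever(query):
    """Stand-in for BM25: rank symbols by number of shared tokens."""
    terms = set(re.findall(r"\w+", query.lower()))
    scored = []
    for name, text in SYMBOLS.items():
        overlap = len(terms & set(re.findall(r"\w+", text.lower())))
        if overlap:
            scored.append((overlap, name))
    return [name for _, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


def regex_retriever(query):
    """Return symbols whose source matches a pattern tied to a query term."""
    hits = []
    for term in re.findall(r"\w+", query.lower()):
        pattern = PATTERNS.get(term)
        if pattern:
            hits += [n for n, text in SYMBOLS.items()
                     if re.search(pattern, text) and n not in hits]
    return hits


def fuse(ranked_lists, k=60):
    """Reciprocal rank fusion: score = sum of 1 / (k + rank) across lists."""
    scores = {}
    for ranking in ranked_lists:
        for rank, name in enumerate(ranking, start=1):
            scores[name] = scores.get(name, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda n: (-scores[n], n))


def expand(seeds, calls):
    """Graph step: add each seed's direct callees and callers as context."""
    context = list(seeds)
    for seed in seeds:
        callers = [c for c, targets in calls.items() if seed in targets]
        for neighbor in calls.get(seed, []) + callers:
            if neighbor not in context:
                context.append(neighbor)
    return context


query = "checkout route payment"
with ThreadPoolExecutor() as pool:  # retrievers run side by side
    ranked_lists = list(pool.map(lambda r: r(query), [keyword_retriever, regex_retriever]))

fused = fuse(ranked_lists)
context = expand(fused[:2], CALLS)
print(fused)
print(context)
```

Neither retriever surfaces send_receipt on its own. It enters the context only because the graph knows route_checkout calls it, which is the role the traversal step plays after the initial matches.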

If retrieval is part of the developer’s inner loop, it should remain just as local and self-contained as the graph itself. That led us to build the entire retrieval pipeline, including keyword indexes and semantic embeddings, to execute locally without depending on external services.

The goal wasn’t simply lower latency. It was to ensure that structural understanding remained available regardless of network connectivity, while keeping source code inside the developer’s environment. The retrieval layer shouldn’t decide what the model thinks. It should decide what the model needs to think about.

Making Structural Memory Consumable

Building a retrieval pipeline solves only part of the problem. The other half is exposing that structural knowledge in a way AI assistants can consume naturally. Rather than embedding graph traversal logic into individual coding assistants, Infigraph exposes focused capabilities – symbol lookup, dependency traversal, call graph exploration and structural search – through the Model Context Protocol (MCP).

That separation was intentional.

The graph remains the system of record. MCP becomes the interface through which AI assistants access that knowledge. Whether the client is Claude Code, Cursor, GitHub Copilot, Windsurf or another MCP-compatible tool, they all interact with the same persistent structural memory instead of rebuilding it independently.

This reinforces the same architectural principle that shaped the rest of Infigraph:

Structural knowledge should be shared infrastructure. MCP simply makes that infrastructure accessible.
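
To show roughly what an assistant's tool call looks like on the wire, here is a simplified sketch of handling an MCP tools/call request, which is a JSON-RPC 2.0 message. The tool name find_callers, its argument and the CALLERS data are hypothetical, and this is not Infigraph's server: a real server would normally use an MCP SDK and also handle initialization, tool listing and transport.

```python
import json

# Hypothetical stand-in for the persistent graph.
CALLERS = {"charge_card": ["route_checkout", "retry_payment_job"]}


def find_callers(symbol):
    """Hypothetical tool: return the direct callers of a symbol."""
    return CALLERS.get(symbol, [])


TOOLS = {"find_callers": find_callers}


def error_response(request_id, code, message):
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": code, "message": message},
    })


def handle_request(raw):
    """Dispatch one JSON-RPC request and return the serialized response."""
    request = json.loads(raw)
    request_id = request.get("id")
    if request.get("method") != "tools/call":
        return error_response(request_id, -32601, "Method not found")
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return error_response(request_id, -32602, "Unknown tool")
    result = tool(**params.get("arguments", {}))
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "content": [{"type": "text", "text": json.dumps(result)}],
            "isError": False,
        },
    })


# What an MCP client might send when the assistant asks "who calls charge_card?".
request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "find_callers", "arguments": {"symbol": "charge_card"}},
})
response = json.loads(handle_request(request))
print(response["result"]["content"][0]["text"])
```

Because the graph sits behind a protocol boundary like this, each client only needs to know how to call tools, not how the relationships are stored or traversed.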

The final challenge was making structural extraction scale across the reality of modern polyglot systems.

Decision #4: Decouple Structural Extraction from Language

Very few systems live entirely within a single language. A typical request may begin in a TypeScript frontend, flow through a Java service, invoke a Python-based machine learning component and finally interact with SQL or infrastructure configuration. Supporting that reality required more than adding parsers. It required separating the extraction engine from language-specific syntax.

That became our final architectural decision:

The extraction pipeline should remain stable as language support grows.

Instead of writing language-specific logic inside the core engine, Infigraph separates parsing from extraction.

To support both mainstream languages and enterprise-specific grammars, we built a dual-extraction architecture:

– For mainstream languages, we rely on Tree-sitter grammars and declarative queries to identify structural entities such as symbols, imports, calls and inheritance.
– For proprietary languages, internal DSLs or environments where Tree-sitter isn’t the right fit, Infigraph provides an ANTLR-based extension path. New grammars can be added without modifying the extraction engine itself.

That separation turned out to be more valuable than we initially expected.

Once parsing produces a common structural representation, everything else in the architecture remains unchanged. Every additional language increases the capability of the platform without increasing the complexity of its core.
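
The separation between parsing and extraction can be illustrated with a small plug-in sketch. It is not Infigraph's code: Python's built-in ast module stands in for the Tree-sitter path, a regular expression over a made-up ".flow" DSL stands in for the ANTLR path, and the Relation record, the EXTRACTORS registry and index_files are hypothetical names.

```python
import ast
import os
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """The common structural representation every extractor must produce."""
    source: str
    kind: str
    target: str


def extract_python(source):
    """Language plug-in: walk a Python syntax tree for imports and direct calls."""
    relations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                relations.append(Relation("<module>", "IMPORTS", alias.name))
        elif isinstance(node, ast.FunctionDef):
            for inner in ast.walk(node):
                # Only plain-name calls such as f(x); attribute calls are skipped.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    relations.append(Relation(node.name, "CALLS", inner.func.id))
    return relations


def extract_flow_dsl(source):
    """Plug-in for a made-up DSL whose lines look like 'step a -> b'."""
    pattern = re.compile(r"^step\s+(\w+)\s*->\s*(\w+)\s*$", re.MULTILINE)
    return [Relation(m.group(1), "CALLS", m.group(2)) for m in pattern.finditer(source)]


# Adding a language means adding an entry here; the engine below never changes.
EXTRACTORS = {".py": extract_python, ".flow": extract_flow_dsl}


def index_files(files):
    """Core engine: language-agnostic, it only ever sees Relation records."""
    graph = set()
    for path, source in files.items():
        extractor = EXTRACTORS.get(os.path.splitext(path)[1])
        if extractor is not None:
            graph.update(extractor(source))
    return graph


files = {
    "billing.py": "import json\n\ndef charge(order):\n    validate(order)\n    return json.dumps(order)\n",
    "nightly.flow": "step export_orders -> charge\nstep charge -> notify\n",
}
graph = index_files(files)
for relation in sorted(graph, key=lambda r: (r.source, r.kind, r.target)):
    print(relation)
```

Both files end up as the same kind of record, so anything downstream of index_files (storage, retrieval, tool calls) is unaffected by which parser produced a given edge.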

Today, that approach allows Infigraph to support 62 languages out of the box while remaining extensible for environments that need more. Persistent structural memory shouldn’t become more complicated every time your software ecosystem grows. By separating extraction from language, we made language diversity an extension point instead of an architectural constraint.

The Landscape: Where Infigraph Fits

Most code intelligence platforms are ultimately designed around one of two consumers:

– Humans, who need to search, visualize, analyze or understand software systems.
– Analysis engines, which evaluate code for correctness, security, compliance or quality.

Our primary consumer is different. It’s an AI assistant operating inside a developer’s editing loop.

Projects such as SciTools Understand, Sourcegraph, Joern and newer AI-native graph initiatives have each pushed the ecosystem forward in different ways. Many engineers already rely on them successfully.

Our goal wasn’t to replace those tools. It was to optimize for a different execution model.

The architectural differences become clearer when viewed through the problems each category was designed to solve. The differences aren’t primarily about features. They’re about architectural optimization. Each category solves a different problem and therefore makes different tradeoffs.

Dimension	Human-Centric Platforms	AI Knowledge Builders	Infigraph
Typical Examples	SciTools Understand, Sourcegraph, Joern	Understand-Anything, Graphiti, Nomik and similar projects	Infigraph
Primary Consumer	Engineers & Architects	AI knowledge generation workflows	AI coding assistants
Structural Extraction	Parser / index-based	Often combines parsing with LLM summarization	Deterministic parser-based extraction
Deployment Model	Desktop or centralized infrastructure	Frequently cloud-assisted	Local-first embedded infrastructure
Primary Interaction	Search, navigation, visualization	Repository understanding and documentation	Real-time MCP tool calls
Optimization Target	Human understanding	AI-generated repository knowledge	Persistent structural memory for AI

These categories aren’t mutually exclusive. In many organizations they complement one another. The difference lies in which problem each one is optimized to solve. This distinction matters because our optimization target was fundamentally different.

We weren’t building another interface for engineers to explore repositories, nor another cloud pipeline that asks an external LLM to understand a repository before a developer can ask a question.

We were trying to answer a much narrower architectural question:

How do we make structural knowledge continuously available to AI assistants without paying to rediscover it every conversation?

That single question explains almost every architectural decision described in this article.

– Represent software as a persistent graph
– Persist structural memory locally
– Retrieve structural context instead of raw files
– Expose that knowledge through MCP
– Keep extraction extensible across languages

Everything else follows from that design center.

Choose Infigraph when…

– Your primary development workflow revolves around AI coding assistants such as Claude Code, Cursor or GitHub Copilot
– Your agents repeatedly ask structural questions about callers, dependencies, ownership or impact analysis
– You want local-first execution without repeatedly sending repository context to external services
– You want persistent structural context that survives beyond individual AI conversations

Continue using existing tools when…

– Your primary need is enterprise-scale code search
– You’re performing security or compliance analysis
– You need architecture visualization or reverse engineering for human exploration

Instead of asking the AI to reconstruct relationships every session, Infigraph provides them as persistent structural memory that can be queried locally in milliseconds.

Our goal isn’t to replace the existing code intelligence ecosystem. It’s to become the lightweight, local-first structural memory layer that complements it for AI-native software development.

Looking Ahead

I don’t think Infigraph is the final answer to AI-native software development. In fact, I suspect we’re only beginning to define what this architecture layer should become.

Today, persistent structural memory captures relationships between software entities. Tomorrow, it may also incorporate architectural evolution, ownership boundaries, runtime behavior, operational telemetry, organizational knowledge and historical change patterns.

The better AI becomes at generating code, the more important these structural layers become. Generated code is only valuable if it fits coherently inside the system around it. I believe our responsibility is gradually shifting toward building better representations of the systems AI increasingly helps us evolve.

That’s ultimately why we open-sourced Infigraph.

Not because we think we’ve solved the problem, but because we believe persistent structural memory is an architectural direction worth exploring together.

If this way of thinking resonates with you, I’d encourage you to try Infigraph against your own repositories, challenge the assumptions we’ve made and contribute where you think the architecture can be improved.

We’re still learning.

Hopefully, we’ll learn together.

Sandeep Mewara

GitHub: https://github.com/intuit/infigraph
Documentation: Detailed design specs and contribution guidelines are included in the repo.
