I Built a Background Memory Daemon for AI Coding Agents

Claude Code is extremely useful, but like most coding agents, it starts every session from zero. No memory of what was decided, what was corrected, or what changed since last time.
So I built mnemonic— a background daemon that watches your project and builds persistent memory automatically. It's written in Rust, runs locally, and integrates directly into the Claude Code workflow.
How It Works
mnemonic runs in the background and watches three things: your Git commits, file changes, and Claude Code conversations. When something meaningful happens — a new commit, an architectural decision, a user correction — it classifies the event, scores its importance, deduplicates against existing memories, and stores everything in a local SQLite database with FTS5 full-text search.
Classification is rule-based: decisions, feedback, notes, session summaries, and security events each get different signal weights. Importance scoring is dynamic — frequency, recency, and signal type combined with exponential decay. Deduplication uses SimHash embeddings (256-dim cosine similarity), with an optional neural mode (MiniLM-L6-v2, 384-dim) for higher precision.
No cloud. No API keys. Just a local daemon that quietly accumulates context.
Knowledge Graph
On top of raw memories, mnemonic builds a knowledge graph. Every memory gets processed through a rule-based entity extractor that identifies projects, technologies, modules, and concepts. Entities are linked through co-mention edges with typed relations: added_to, fixed_in, refactored_in, documented_in, and so on.
This means you can query "What do I know about SQLite?" and get not just text search results, but a graph of related entities, edges, and the memories that connect them.
Integration with Claude Code
Session-start context injection.mnemonic registers a SessionStart hook in Claude Code's settings. When a new session begins, it generates a CONTEXT.md with key decisions, recent activity, user feedback, knowledge graph entities, and current project state. The agent starts informed instead of blank.
MCP server. mnemonic exposes 7 tools through Model Context Protocol: memory_search, memory_save, memory_recent, memory_similar, memory_context, memory_status, and memory_graph. The agent can search, save, query the knowledge graph, and generate context — all mid-session.
Conversation watcher.mnemonic monitors Claude Code's session files in real-time. It detects user corrections and decisions automatically. Feedback events get the highest signal weight and are never cleaned up — the agent should never repeat a mistake you already corrected.
What I Shipped
The current version includes:
- 3 watchers: Git commits, file changes, Claude Code conversations
- 5 memory types: decision, feedback, note, session_summary, security
- Knowledge graph with rule-based entity extraction, typed edges, and neighbor queries
- 2 embedders: SimHash (256-dim, default) and MiniLM-L6-v2 (384-dim, optional)
- HNSW vector index for fast similarity search, scales to 50K+ memories
- 7 MCP tools for in-session memory access
- 22 CLI commands: search, save, similar, graph, entities, stats, context, export/import, cleanup, doctor, upgrade
- 4 output sinks: Claude Code memory files, Obsidian vault, shared Memory API, context generation
- SessionStart hook for automatic context injection
- macOS menu bar widget (SwiftUI) with live stats, search, quick save, and daemon control
Single Rust binary, local-first, zero external dependencies.
What Changed After I Started Using It
The shift is concrete: I stopped repeating myself.
Before mnemonic, every new Claude Code session meant re-explaining recent decisions, re-describing the architecture, re-stating preferences. With mnemonic running, the agent picks up where we left off.
A real example: I had two Claude Code sessions running in parallel on the same project. One session added a knowledge graph, conversation watcher, and neural embeddings — over 1500 lines of new code. When I switched to the second session, mnemonic had already captured all of it. The agent saw the new modules, the architectural decisions, the updated feature set — without me explaining anything. It just knew.
That's the whole point. Memory that works across sessions without manual effort.
Cross-Agent Memory
I run a team of AI agents — social media, marketing, voice, coordinator — on a remote server. Before mnemonic, none of them knew what my coding agent had been doing.
Recently I added a Memory API output sink, so mnemonic automatically syncs every captured memory to a shared store. Now when my social media agent needs to write about a project, it searches the shared memory and finds architecture decisions, progress updates, and technical details — all captured automatically.
The first time I asked my social agent to write about mnemonic, it pulled context straight from shared memory: the tech stack, the feature set, the recent changes. No manual briefing needed.
Not memory for one assistant. Memory infrastructure for multi-agent systems.
Why I Think This Matters
The future of AI coding isn't just better models. It's better models + better tooling + better memory + better continuity.
The missing layer in most agent workflows isn't generation — it's accumulated context. If an agent remembers what was decided, what was corrected, and what changed, it becomes significantly more useful over time. The interaction shifts from isolated sessions to something closer to a real working relationship.
AI coding agents are already useful. But without memory, they stay stateless — every session is day one.
I built mnemonic because persistent memory should be a default layer in agentic development, not an afterthought.
Want a system like this for your business?
I build custom AI agent systems deployed in 2 weeks. Discovery starts at $500.
Book a Discovery Call