Your Claude Code Repository Forgets What It Knows. Mine Doesn't.



I have a personal repository with thousands of documents, 50+ skills, and over 800 knowledge graph edges.

The system is robust. It's very capable. I drive it hard.

But it doesn't know what it knows because Claude Code doesn't remember its work across sessions.

Instead, every session starts from zero. I spend time anchoring which workstream we're in. I watch tokens get eaten reading and reassembling context across large markdown files. Strong CLAUDE.md files, hierarchical and organized folders, wiki links with good frontmatter, and codified patterns all help. The final output is great. But overall it feels inefficient AND like there's untapped potential for a compound effect.

So I asked a question: What if the repository could be aware of itself?

Background on the Problem: Amnesia at Scale

My repository is a unified personal and professional knowledge operating system. Eight workstreams (advisory, investments, writing, day job, side projects, and more) with various agents running 12+ hours a day 7 days a week touching hundreds of files in complex workflows across 20-30 step PRDs.

The question I was grappling with: how do I build memory into my local Claude Code in a way that is referential, self-improving, and creates better outputs across all workflows over time?

Goals:

  • Get better plans and execution with me driving
  • Create a better agentic copilot working alongside me
  • Organize myself and my work "in the real world" more concretely and effectively, and make to-do lists more agentic

What I Built in 10 Days

I executed 9 large-scale, interrelated projects over the course of 10 days to experiment here. Each one built a layer that enabled the next.

The layers stack.

You can't enforce skill reuse (Layer 7) if skills don't have shared context (Layer 4). You can't auto-load context by file path if the taxonomy is inconsistent (Layer 3). You can't normalize the taxonomy if you don't have a schema to normalize against (Layer 1).

It's not like I had a crystal clear vision at the beginning. This ordering emerged as I built (and time will tell what Frankenstein appendages I'll need to clean up).

The Kanban Layer

After the 9 PRDs and the cascade, I noticed a gap. GSD and Ralph execution loops handle the series of micro-tasks in complex PRDs with minimal context loss. The repository-wide intelligence layer handles macro patterns. Nothing covered the middle: tracking work in progress across 8 workstreams with visibility into what's blocked, what's ready, and what I should work on next.

So I built a personal kanban system with hybrid storage (with tons of inspiration from what my friend and previous cofounder Parker Ferguson has built for himself).

In this system, every task mutation writes to both a JSONL append log (git audit trail) and a materialized JSON state file (fast reads). Supabase mirrors both for SQL queries. A Slack channel serves as the one place where I tag work: I add tasks, assign them lanes, and put them in a queue for our deep planning and project manager agents to launch planning. The front-end is basically a Trello clone that tags agents and references related PRDs.
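The dual-write pattern is simple enough to sketch. This is a minimal, hypothetical version, not the actual implementation: file names, the `apply_mutation` helper, and the event schema are all my assumptions.

```python
import json
import time
from pathlib import Path

LOG = Path("kanban/events.jsonl")    # append-only audit trail (git-friendly)
STATE = Path("kanban/state.json")    # materialized view for fast reads

def apply_mutation(task_id: str, fields: dict) -> dict:
    """Record one task mutation in both stores and return the new state."""
    LOG.parent.mkdir(parents=True, exist_ok=True)
    event = {"ts": time.time(), "task_id": task_id, "fields": fields}
    with LOG.open("a") as f:                      # 1. append to the JSONL log
        f.write(json.dumps(event) + "\n")
    state = json.loads(STATE.read_text()) if STATE.exists() else {}
    state.setdefault(task_id, {}).update(fields)  # 2. rebuild materialized state
    STATE.write_text(json.dumps(state, indent=2))
    return state
```

The append log is what git diffs cleanly; the materialized file is what agents read without replaying history. A mirror to Supabase would hang off the same mutation path.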

Concretely, what I'm saying in Slack is basically:

I aspire to do [X], for [Y] project, assign it to the [deep planning agent] to build a plan to do ____. Use [knowledge synthesis] and [prompt creator] skills to help create the plan and mirror all constraints required to run autonomously with the [ralph-execute] skill.

In some ways this "Aspire" agent, as I'm calling it, fills the meso-level gap between "what's my next action" and "what is the repository learning."

How the Awareness System Actually Works

Five hooks. Four Python scripts. That's the nervous system.

Session Starts

The SessionStart hook fires and runs session_start_intelligence.py. It loads learning rules, checks which of 8 workstreams are hot or cooling, counts active PRDs and available skills, and checks for pending insights from the last weekly audit.

The output lands in context before I type anything.

Five seconds. Zero effort. Every session starts with orientation instead of amnesia.

Work Happens

Two hooks fire during work.

The SubagentStart hook fires every time an agent spawns. It injects a prompt listing 8 key skills and an instruction to glob for the full list. Every subagent starts with skill awareness before its first tool call.

The PostToolUse hook fires after every Edit or Write. It logs the file path, action type, domain, and workstream to changelog.jsonl. Over time, this builds a record of what actually gets touched, not what I think gets touched.
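A minimal sketch of that logger, assuming the hook hands the script a JSON payload with the tool name and file path (the payload shape, the `classify` heuristic, and the changelog fields are my assumptions, not the real hook contract):

```python
import json
import time
from pathlib import Path

def classify(file_path: str) -> dict:
    # Naive assumption: the top-level folder names the workstream.
    parts = Path(file_path).parts
    return {"workstream": parts[0] if len(parts) > 1 else "unknown"}

def log_tool_use(payload: dict, log_path: str = "intelligence/changelog.jsonl") -> dict:
    """Append one changelog entry for an Edit/Write tool call."""
    entry = {
        "ts": time.time(),
        "action": payload.get("tool_name"),   # e.g. "Edit" or "Write"
        "file": payload.get("file_path"),
        **classify(payload.get("file_path", "")),
    }
    p = Path(log_path)
    p.parent.mkdir(parents=True, exist_ok=True)
    with p.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```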

Oh, and if it's about context retrieval across the repo, the knowledge graph and ontology provide hyper-efficient seeds before we fall back to the folder hierarchy, CLAUDE.md rules, or anything else.

Session Ends

The Stop hook fires and runs capture_skill_usage.py. It records which skills were invoked, which workstreams were active, and appends to skill_usage_ledger.jsonl. This feeds the weekly audit.

The PreCompact hook fires before context compaction. It saves a session summary: files modified, skills used, tool counts. A checkpoint before memory loss. I'm still not sure how faithfully this part is really working in the main orchestrator agents across plans.
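The checkpoint itself might look something like this sketch, where the record fields mirror what the post describes (files modified, skills used, tool counts) but the function name and file path are hypothetical:

```python
import json
import time
from pathlib import Path

def save_checkpoint(files_modified, skills_used, tool_counts,
                    path: str = "intelligence/checkpoints.jsonl") -> dict:
    """Persist a pre-compaction snapshot so the next session can reorient."""
    record = {
        "ts": time.time(),
        "files": sorted(files_modified),
        "skills": sorted(skills_used),
        "tool_counts": tool_counts,
    }
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    with p.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```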

Weekly Audit

The weekly audit skill collects changelog entries, skill usage patterns, and file co-access data. It runs 6 pattern detectors looking for skill chain gaps, stale references, cooling workstreams, and repeated file clusters.

Detected patterns become insights.jsonl entries. Each insight starts as "pending," gets reviewed, and either becomes an active learning rule or gets dismissed. Active rules get loaded at the next SessionStart.

The loop closes. Sessions generate data. Weekly audits extract patterns. Patterns become rules. Rules load at session start. The repository gets smarter each cycle.
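To make one detector concrete, here's a toy version of the repeated-file-cluster check: files edited together often enough become a "pending" insight. The threshold and insight schema are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def detect_file_clusters(sessions, min_count: int = 3) -> list:
    """sessions: iterable of sets of file paths touched in one session.
    Returns pending insights for file pairs co-edited at least min_count times."""
    pairs = Counter()
    for files in sessions:
        for pair in combinations(sorted(files), 2):
            pairs[pair] += 1
    return [
        {"type": "file_cluster", "files": list(pair), "count": n, "status": "pending"}
        for pair, n in pairs.items() if n >= min_count
    ]
```

Each returned dict is the shape that would land in insights.jsonl, waiting for review before it becomes an active rule.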

How Memory Translates to Better Output

The awareness system should change what happens next. I'm trying this with four mechanisms that connect memory to output.

1. SessionStart hook eliminates re-explanation. Before hooks, every session started with me typing context. Which workstream. What PRD. Where I left off. Five minutes of setup, every time. Now the SessionStart hook loads repo state, active PRDs, learning rules, hot workstreams, and pending insights before I type anything. The agent knows which workstream has been active, what the last session touched, and what rules apply. Does this help and if so how much? TBD.

2. Mandatory skill audit prevents reinvention. I have 54 skills. Without enforcement, agents like to ignore them. One solid improvement has been a skill audit gate in /deep-planning Phase 2, which requires identifying 3+ relevant skills before planning proceeds. In addition, the SubagentStart hook injects skill awareness into every spawned agent. And I manually review my long PRDs, many times over.

3. Typed ontology edges reduce unnecessary reading. Wiki-links tell you that two documents connect. The ontology tells you how. A synthesizes edge means skip the sources and read the synthesis. A supersedes edge means skip the old version entirely. Before the ontology, agents read linked documents to understand the relationship. Now the relationship type tells agents whether to read the linked document at all. Early testing suggests 30-50% token reduction on graph traversal tasks. I haven't evaluated whether there's context degradation from less deeply ingesting the markdown files. But for now, tokens not spent reading irrelevant documents go toward actual work.
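The skip logic reduces to a few lines. The edge names come from the post; the tuple representation and the decision to treat everything else as "read it" are my assumptions.

```python
def plan_reads(edges):
    """edges: list of (edge_type, target_doc) pairs from the current document.
    Targets of 'synthesizes' edges (already summarized here) and 'supersedes'
    edges (obsolete versions) are skipped; everything else gets read."""
    skip = {target for etype, target in edges
            if etype in ("synthesizes", "supersedes")}
    return [target for _, target in edges if target not in skip]
```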

The goal is to have these mechanisms compound. At some point I can go deeper on the specific libraries and tools that made this possible.

The Ralph Persistence Pattern

One thing is for sure though. I couldn't have executed all of these patterns in 10 days without Ralph.

Ralph is a stateless execution pattern. The PRD file is the memory. Each iteration reads state from the file, executes one action, updates the file. No context window reliance. The agent doesn't need to remember what happened three iterations ago because the file tracks it.

This matters because context compaction and session breaks are inevitable on complex work. You re-explain context, lose track of where you were, accidentally redo completed work. The stateless pattern treats every iteration as a fresh start with full state awareness.

Each PRD has phases sized to complete in a single context window. Success criteria are checkboxes. Running Learnings accumulate across iterations, capturing gotchas that inform later phases. I started with the base Ralph pattern out there, had it evaluate several other popular frameworks and concepts from memory-based MCPs, and ran lots and lots of interrogations of my skills to create the best PRDs with the most guardrails possible.
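The core loop above can be sketched in a few lines. This is a deliberately naive toy, assuming markdown checkboxes as the state format; the real Ralph pattern carries far more guardrails.

```python
import re
from pathlib import Path

def next_unchecked(prd_text: str):
    """Find the first unchecked markdown checkbox, or None if all done."""
    m = re.search(r"^- \[ \] (.+)$", prd_text, flags=re.M)
    return m.group(1) if m else None

def iterate(prd_path: str, execute) -> bool:
    """One stateless iteration: read state, do one step, write state back.
    Returns False once every checkbox is done."""
    prd = Path(prd_path)
    text = prd.read_text()
    step = next_unchecked(text)
    if step is None:
        return False
    learning = execute(step)                      # do exactly one action
    text = text.replace(f"- [ ] {step}", f"- [x] {step}", 1)
    if learning:
        text += f"\n- Learning: {learning}"       # running learnings accumulate
    prd.write_text(text)                          # the file, not the context, is memory
    return True
```

Because each call re-reads the file, a compaction or session break between iterations loses nothing: the next iteration reorients from the PRD alone.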

I still see my orchestrator agent compact mid-run, but the PRDs keep it on track and the hooks do a good job re-loading important information as well.

Experimenting with a Repository-wide Meta Agent

The layers above give the repository a nervous system. It observes, records, and loads context. But observation alone isn't intelligence. The question I'm working toward: can it learn from what it observes and act on what it learns?

I think of this as three capabilities building on each other.

Observation works now. The repository sees what's happening in real time and remembers it across sessions.

Learning is building. The weekly audit runs 6 pattern detectors across changelog data, skill usage, and file co-access patterns. It looks for skill chain gaps, stale references, cooling workstreams, and repeated file clusters. It also reads all 8 workstream briefs looking for cross-workstream patterns: content feeding consulting, consulting insights feeding back to product positioning, project learnings applicable elsewhere. And it ties this all back to my project list / kanban / Aspire project management patterns to apply NOW to current and upcoming work.

Action is the vision. I built a /proactive-review skill that reads workstream briefs, active PRDs, learning rules, and cross-workstream insights. It proposes 3-5 next actions ranked by staleness risk, opportunity cost, and cross-workstream leverage. The proposals land in an agent backlog. I approve, reject, or defer.
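The ranking could be as simple as a weighted score over the three signals named above. The weights, field names, and normalization are invented for illustration; the real skill presumably reasons rather than multiplies.

```python
def rank_proposals(proposals, top_n: int = 5, weights=(0.5, 0.3, 0.2)):
    """Score each proposal on staleness risk, opportunity cost, and
    cross-workstream leverage (assumed normalized to [0, 1]), return the top slice."""
    w_stale, w_opp, w_lev = weights
    def score(p):
        return (w_stale * p["staleness_risk"]
                + w_opp * p["opportunity_cost"]
                + w_lev * p["leverage"])
    return sorted(proposals, key=score, reverse=True)[:top_n]
```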

That runs in a more user-driven way on top of the kanban system, which fires hooks to audit open-but-incomplete PRDs in the plans folder, so there's a mechanism to clean up what's in progress across all workstreams. Good hygiene helps the context, which helps the execution.

The eventual goal is perhaps more of a master "repo" agent to replicate "me." And I'm working towards a system that wakes up autonomously, checks for stale PRDs, cooling workstreams, and approved backlog items, and produces a morning briefing before I open a terminal. I'm not there yet. And frankly, there are so many workstreams and so much context that I'm not sure I'm ready for that level of autonomous action quite yet.