How to Build a Karpathy-Style LLM Wiki for AI Agents Using Markdown, Git, and BM25

Build a Karpathy-style LLM wiki for AI agents using Markdown, Git, and BM25. Step-by-step guide to durable agent memory without vector databases.

You will build a durable, git-backed knowledge substrate where AI agents maintain persistent memory using plain Markdown files, BM25 text search via Bleve, and SQLite for structured metadata. This Karpathy-style LLM wiki runs entirely locally in ~/.wuphf/wiki/, requires no vector database or graph DB, and achieves 85% recall@20 on retrieval benchmarks. Your agents get private notebooks for scratchpad thoughts, a shared team wiki for canonical knowledge, and append-only fact logs for entity tracking. Everything lives in Git, so you own your data completely, can clone it anywhere, and audit every change via commit history. This setup compounds context across sessions instead of re-pasting system prompts every morning, giving your OpenClaw agents true long-term memory without cloud dependencies or expensive embedding pipelines.

Why Markdown and Git Beat Vector Databases for Agent Memory

Vector databases like pgvector or Pinecone add operational weight you do not need for most agent memory workloads. You pay embedding costs, manage dimensionality, and lock yourself into schemas that quickly become obsolete. Markdown and Git give you durability that outlives any specific runtime or framework: if the agent framework breaks, you still have readable, human-understandable files. Git provides audit trails, branches for experimental knowledge, and instant rollback when an agent hallucinates a bad fact into the wiki.

BM25 text search handles keyword and phrase matching with zero preprocessing. SQLite manages structured metadata such as entity relationships and redirect stubs. The entire stack runs offline, clones to any machine, and opens in any text editor. You trade the semantic fuzziness of vector embeddings for precision and complete data ownership, which is the right trade for deterministic agent tools that must cite verifiable sources.

Prerequisites and System Requirements for Your LLM Wiki

This Karpathy-style LLM wiki needs a Unix-like environment. You need Node.js 18+ for the wuphf CLI tools and Git 2.30+ for version control, plus roughly 500MB of free disk space for the initial index and database. Bleve handles BM25 indexing and relies on native bindings, so a C compiler is required: install build-essential on Ubuntu or the Xcode Command Line Tools on macOS.

SQLite comes bundled with most systems, but verify you have version 3.35+ to ensure support for JSON features. You also need an LLM CLI configured and accessible in your system’s PATH for the synthesis worker to function correctly. Examples include claude (if using the Anthropic API) or llm (Simon Willison’s excellent tool). The wiki layer operates locally and does not bind to any network ports by default, though the MCP server mode can optionally expose a stdio interface for inter-process communication. Crucially, ensure your ~/.wuphf/ directory is not located on a network filesystem if you prioritize fast SQLite writes, as NFS can introduce significant latency and potential locking issues for database operations.
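
Before running the installer, it helps to verify the toolchain. A quick sanity check, assuming the standard CLI names:

```bash
# Verify toolchain versions before installing the wiki layer.
node --version     # expect v18 or newer
git --version      # expect 2.30 or newer
sqlite3 --version  # expect 3.35 or newer for JSON support
cc --version       # confirms a C compiler for Bleve's native bindings
command -v claude || command -v llm  # at least one LLM CLI on PATH
```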

Installing the Core Wiki Layer for Persistent Memory

Installing the wiki subsystem is straightforward using npx, which avoids cluttering your global package namespace. Execute npx wuphf@latest init --wiki-only to scaffold the necessary directory structure within ~/.wuphf/wiki/. This command performs several critical actions: it creates the agents/ and team/ subdirectories, which will house your agent’s private notebooks and the shared team knowledge, respectively. It also initializes a Git repository within ~/.wuphf/wiki/, ensuring all changes are version-controlled from the outset. Furthermore, it sets up the SQLite schema for facts and entities, providing the structured backbone for your wiki’s metadata.

The installation process also generates a config.toml file, where you can customize essential parameters such as your preferred LLM command and the synthesis threshold (which defaults to 10 facts before triggering a brief rebuild). If you already have an OpenClaw installation, the installer intelligently detects your existing configuration and offers to symlink the wiki into your agent workspace, streamlining integration. The entire installation typically completes in under 30 seconds and consumes roughly 50MB for the Bleve index headers and the initial empty database.
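
Putting the install step together, a minimal bootstrap looks like the following; the config.toml keys in the comments are illustrative assumptions, since exact key names may differ across wuphf versions:

```bash
# Scaffold the wiki layer: agents/, team/, a Git repo, wiki.db, and config.toml.
npx wuphf@latest init --wiki-only
cd ~/.wuphf/wiki

# Illustrative config.toml entries (key names are assumptions):
#   llm_command = "claude"       # CLI the synthesis worker shells out to
#   synthesis_threshold = 10     # facts accumulated before a brief rebuild
```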

Initializing and Managing the Git Repository Structure

After installation, navigate to ~/.wuphf/wiki/ and verify that Git is properly initialized by running git status. The directory structure adheres to a strict convention to maintain order and clarity. agents/{slug}/notebook/ is designated for private agent scratchpads, acting as temporary working memory. The team/ directory is reserved for canonical wiki articles, representing validated and shared knowledge. Within team/, team/entities/ stores JSONL fact logs, which are append-only records of observations and assertions related to specific entities.

Create a .gitignore file that excludes *.index/ directories, because Bleve indexes are rebuildable and do not need version control. However, include wiki.db (the SQLite database file) if you intend to version the metadata alongside your Markdown content. Set the repo-local Git identity to "Pam the Archivist" so automated writes are attributed correctly: run git config user.name "Pam the Archivist" and git config user.email "pam@local.wiki" before any automated commits. For your occasional manual edits, override the identity per commit (for example with git -c user.name=...) rather than switching the repo config back and forth. This separation makes the provenance of changes immediately visible in git log and lets you filter automated synthesis from human curation.
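
A sketch of that setup, with placeholder personal details for the manual-edit override:

```bash
cd ~/.wuphf/wiki

# Bleve indexes are rebuildable; keep them out of version control.
printf '*.index/\n' > .gitignore

# Repo-local identity for automated writes.
git config user.name  "Pam the Archivist"
git config user.email "pam@local.wiki"

# Manual edits override the identity per commit (placeholder name and email):
git -c user.name="Your Name" -c user.email="you@example.com" \
    commit --allow-empty -m "Manual curation note"
```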

Configuring BM25 Search with Bleve for Efficient Retrieval

Bleve provides the robust BM25 index that powers the /lookup command, enabling efficient text search across your wiki. To initialize this index, run wuphf index build from the wiki root directory. This command creates a search.index/ directory, which will contain tokenized terms extracted from all your Markdown files and JSONL fact logs. The default mapping utilizes English text analysis, incorporating standard features like stop word removal and stemming. This configuration generally works well for most code documentation, general agent notes, and common knowledge bases.

For specialized technical domains with unique terminology, you might need to adjust the configuration. Edit config.toml to disable stemming or to add a custom analyzer that better suits your specific content. The indexer is designed to be efficient; it watches the filesystem for changes and updates incrementally, ensuring your search results are always current. To test the recall and relevance of your index before connecting agents, you can query it directly using wuphf search "deployment strategy". The system will return file paths, specific line numbers, and relevance scores, providing your agents with precise, cited retrieval capabilities.
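
A minimal smoke test of the index; the result line shown in the comment is hypothetical output, not a documented format:

```bash
cd ~/.wuphf/wiki

# Build the BM25 index over Markdown files and JSONL fact logs.
wuphf index build

# Query it directly before wiring up agents.
wuphf search "deployment strategy"
#   team/deployment-patterns.md#L42   score=8.7   (hypothetical output format)
```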

Setting Up SQLite for Structured Metadata Management

While Bleve excels at full-text search, SQLite is indispensable for managing the structured graph of entities, redirects, and supersession chains within your wiki. The schema ships with three tables: entities stores canonical IDs and the current brief for each conceptual entity, facts links to entity files with specific sentence offsets via foreign keys, and redirects maps old slugs to new ones after articles are merged or renamed.

To initialize this database, execute wuphf db migrate, which creates the wiki.db file. You can then insert your first entity manually to test the functionality: wuphf entity create --kind=project --slug=api-gateway --name="API Gateway Refactor". This command not only creates a JSONL fact log file at team/entities/project-api-gateway.facts.jsonl but also generates a Markdown stub at team/project-api-gateway.md. The synthesis worker, which operates in the background, reads the JSONL, generates a concise summary using your configured LLM, and then writes this summary to the corresponding Markdown file. Critically, these automated updates are committed under “Pam the Archivist,” maintaining a clear audit trail.
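
The whole flow end to end; the sqlite3 query assumes only the table name given above, since the column layout is not specified:

```bash
cd ~/.wuphf/wiki

# Create the schema, then a first entity to exercise the pipeline.
wuphf db migrate
wuphf entity create --kind=project --slug=api-gateway --name="API Gateway Refactor"

# Confirm the fact log and Markdown stub exist, and peek at the entities table.
ls team/entities/project-api-gateway.facts.jsonl team/project-api-gateway.md
sqlite3 wiki.db 'SELECT * FROM entities LIMIT 5;'
```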

Creating Private Agent Notebooks for Scratchpad Thoughts

Each AI agent operating within this wiki framework is allocated a private workspace located at agents/{slug}/notebook/. This directory serves as a dedicated area where the agent can dump stream-of-consciousness thoughts, record failed attempts, and store raw observations without immediate scrutiny. To create such a notebook for your OpenClaw agent, for example, you would use mkdir -p agents/claude-code/notebook and then add an initial file like scratchpad.md.

Agents are designed to write freely into these notebooks, treating them as a dynamic working-memory buffer. The BM25 index covers private notebooks too, so agents can efficiently search their own historical thoughts. Crucially, unvetted notebook entries never flow into the shared team wiki on their own: a controlled promotion flow ensures that only validated knowledge reaches the canonical shared space. You can also set a retention policy in config.toml to automatically archive notebook entries older than a specified duration, for instance 30 days, moving them to agents/{slug}/archive/. This keeps the active index lean and prevents private noise from degrading shared retrieval results.
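
Setting one up by hand looks like this; the retention key in the comment is an assumption about config.toml naming:

```bash
# Give an agent a private scratchpad.
mkdir -p ~/.wuphf/wiki/agents/claude-code/notebook
cat > ~/.wuphf/wiki/agents/claude-code/notebook/scratchpad.md <<'EOF'
# Scratchpad
Raw observations and failed attempts land here before any promotion.
EOF

# Illustrative retention setting in config.toml (key name is an assumption):
#   notebook_retention_days = 30   # older entries move to agents/{slug}/archive/
```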

Building the Shared Team Wiki for Canonical Knowledge

The team/ directory is the central repository for canonical knowledge within your LLM wiki. Unlike the ephemeral nature of agent notebooks, entries in the team wiki are treated as established truths and undergo a more rigorous review process before publication. To create a new article, you can use the command wuphf wiki draft --title="Authentication Patterns". This action generates a new file within team/drafts/ with a timestamped slug, indicating its draft status.

Once created, you can edit the Markdown content of this draft, incorporating [[Wikilinks]] to related entities and concepts within the wiki. After the content is refined and ready for broader consumption, you promote it to the canonical wiki using a command such as wuphf wiki promote team/drafts/auth-patterns-20260425.md. The promotion command moves the file from the drafts directory to team/auth-patterns.md, making it part of the shared knowledge base. It also updates the Bleve index to reflect the new location and, if the slug was changed during drafting, creates redirect stubs to ensure old links remain valid. Crucially, canonical IDs are designed to be immutable; once assigned, they persist for the lifetime of the concept, guaranteeing stable citations across all agent sessions and human references.
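
The full draft-to-promote loop in the shell, using the timestamped slug from the example above:

```bash
cd ~/.wuphf/wiki

# Draft, edit, then promote to the canonical wiki.
wuphf wiki draft --title="Authentication Patterns"
"${EDITOR:-vi}" team/drafts/auth-patterns-20260425.md   # add prose and [[Wikilinks]]
wuphf wiki promote team/drafts/auth-patterns-20260425.md
# -> moves to team/auth-patterns.md, reindexes, and creates redirect stubs if needed
```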

Implementing the Draft-to-Wiki Promotion Workflow

The draft-to-wiki promotion flow is a critical mechanism designed to prevent the introduction of unverified or inaccurate information into the canonical knowledge base. When an agent or a human user marks a draft as ready for publication, the system initiates a comprehensive lint pass. This pass checks for various potential issues, including contradictions against existing facts, broken wikilinks (references to non-existent slugs), and stale temporal references that might indicate outdated information.

If all checks pass successfully, the state machine transitions the draft to a published status and moves the file from team/drafts/ to team/. The promotion command also automatically appends a back-link to the original notebook entry that spawned the draft, meticulously maintaining the provenance of the information. This ensures a clear lineage from initial thought to canonical knowledge. You can configure auto-expiry settings in config.toml to automatically archive wiki entries that have not been accessed or read for a specified period, for example, 90 days. These entries are moved to team/archive/, preserving their IDs for historical citations while keeping the active corpus focused and relevant. This entire process benefits from the audit trail provided by Git, allowing for full transparency and version control.
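
You can trigger the same lint pass by hand before promoting a draft; the auto-expiry key in the comment below is an assumption about config.toml naming:

```bash
cd ~/.wuphf/wiki

# Promotion runs these checks automatically; run them manually first if you like.
wuphf wiki lint

# Illustrative auto-expiry setting in config.toml (key name is an assumption):
#   archive_after_days = 90   # unread entries move to team/archive/, IDs preserved
```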

Managing Per-Entity Fact Logs with JSONL

Structured data within the wiki is meticulously managed through per-entity fact logs, stored as append-only JSONL files in team/entities/{kind}-{slug}.facts.jsonl. Each line in these files represents a distinct fact, complete with a deterministic ID that includes the sentence offset, timestamp, and the source agent responsible for recording it. This format ensures both traceability and immutability.

To append a new fact, you would use a command like wuphf fact add --entity=project-api-gateway --content="Migration completed 2026-04-20". The synthesis worker, a background process, continuously monitors these fact log files. When a specific entity accumulates a predefined number of new facts (defaulting to 10, configurable in config.toml), it triggers an LLM call. This LLM’s task is to rebuild and update the entity’s brief in the corresponding Markdown file. The worker then commits the updated brief under the “Pam the Archivist” identity, with a descriptive message such as “Synthesize project-api-gateway brief from 12 facts.” This append-only approach is crucial for preventing update conflicts and preserving the complete historical evolution of an agent’s understanding of an entity.
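
Appending a fact, plus what one JSONL line might look like; the field names are assumptions, since the format only guarantees a deterministic ID, a timestamp, and the source agent:

```bash
cd ~/.wuphf/wiki

# Append a fact; the synthesis worker rebuilds the brief once the threshold is hit.
wuphf fact add --entity=project-api-gateway --content="Migration completed 2026-04-20"

# One resulting JSONL line might look like this (field names are assumptions):
#   {"id":"project-api-gateway-0-9f3c2a","ts":"2026-04-20T14:02:11Z","agent":"claude-code","content":"Migration completed 2026-04-20"}
```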

Handling Wikilinks, Redirects, and Canonical IDs

Wikilinks make the knowledge base navigable and interconnected. You use the [[Entity Name]] syntax to link between wiki articles, and the indexer validates every link. Any link that points to a non-existent slug is flagged in red when you run wuphf wiki lint, making broken links easy to spot and correct.

Canonical IDs for slugs follow strict rules to prevent link rot. Once assigned, a slug is immutable; it is never renamed or deleted. To merge two concepts, such as "Auth Service" into "API Gateway," you create a redirect: a file team/auth-service.md containing only # Redirect\n[[API Gateway]], plus a matching entry in the SQLite redirects table. The lookup resolver follows redirect chains automatically, so agents and users always land on the current article. Fact IDs are also deterministic, formatted as {entity-slug}-{sentence-index}-{hash}, which keeps references stable even when files move or are reorganized. This stability is essential for agents that cite specific facts.
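
A redirect merge in practice; the redirects column names below are assumptions about the schema:

```bash
cd ~/.wuphf/wiki

# Merge "Auth Service" into "API Gateway" with a redirect stub.
printf '# Redirect\n[[API Gateway]]\n' > team/auth-service.md

# Register the redirect in SQLite (column names are assumptions).
sqlite3 wiki.db \
  "INSERT INTO redirects (old_slug, new_slug) VALUES ('auth-service', 'api-gateway');"
```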

Setting Up the Daily Lint Cron for Wiki Maintenance

To maintain the integrity and relevance of your LLM wiki, it is essential to schedule regular maintenance tasks. The wuphf wiki lint --fix-suggestions command is designed for this purpose and should be configured to run daily via a cron job or systemd timer. The linter performs checks for three primary failure modes: contradictions (where facts assert opposite truths), stale entries (articles that have not been read or accessed for a specified period, e.g., 90 days), and broken wikilinks (references to non-existent slugs).

Contradictions are flagged in a lint-report.json file but are not automatically resolved; human intervention is required to adjudicate and correct these. Stale entries, if configured, are automatically moved to the archive directory, keeping the active wiki focused. Broken links are not only highlighted in red but the linter also provides suggestions for similar slugs, leveraging BM25 fuzzy matching to aid in corrections. Additionally, the lint pass verifies that the Git identity is correctly set to “Pam the Archivist” for automated writes, preventing accidental commits under your personal name. It is recommended to schedule this cron job during off-peak hours, such as 3 AM, to avoid interrupting active agent sessions or human users.
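
A crontab entry for that schedule might look like this; the log path is arbitrary:

```bash
# Add via `crontab -e`: daily lint at 3 AM, contradictions land in lint-report.json.
#   0 3 * * * cd ~/.wuphf/wiki && wuphf wiki lint --fix-suggestions >> lint.log 2>&1

# Equivalent one-off invocation:
cd ~/.wuphf/wiki && wuphf wiki lint --fix-suggestions
```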

Building the Lookup Slash Command for Agent Interaction

The /lookup command serves as the primary interface for your AI agents to query and retrieve information from the wiki. You can implement this command as a shell function or integrate it directly via the MCP server. For short, keyword-based queries, the system routes them directly to the Bleve BM25 search engine. This returns the top 5 most relevant matches, complete with citation paths. For example, a query like /lookup deployment rollback would search the index and return results such as team/deployment-patterns.md#L42 along with a relevance score.

A heuristic classifier within the system is designed to detect narrative queries, such as questions like “why did the migration fail?”. These more complex queries are routed to a specialized cited-answer loop. This loop first retrieves relevant BM25 results, then feeds these results as context to your configured LLM. The LLM then synthesizes an answer, which is returned to the agent along with footnotes citing the original wiki sources. This hybrid approach ensures that simple lookups are fast and efficient, while complex research questions receive comprehensive, context-rich answers.
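
A minimal shell sketch of the slash command; the question-detection heuristic is a guess, and wuphf ask is a hypothetical stand-in for however your build exposes the cited-answer loop:

```bash
# Minimal /lookup wrapper. The routing heuristic and `wuphf ask` are
# assumptions, not documented interfaces.
lookup() {
  local q="$*"
  case "$q" in
    *\?*|why\ *|how\ *)
      wuphf ask "$q"                 # hypothetical: cited-answer loop via the LLM
      ;;
    *)
      wuphf search "$q" | head -n 5  # top-5 BM25 matches with citation paths
      ;;
  esac
}

lookup deployment rollback
```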

Integrating MCP Tools for Cited Retrieval in OpenClaw

To fully leverage the wiki with your OpenClaw agents, you need to expose its capabilities via the Model Context Protocol (MCP). This involves defining a new tool, for example, wiki_lookup, in your claude_config.json file. This tool should point to wuphf mcp serve, which acts as the entry point for agent queries. The wiki_lookup tool accepts a query string as input and returns structured JSON output containing results (an array of matching wiki entries) and citations (the file paths of the sources).
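
A sketch of the registration; the exact claude_config.json schema varies by MCP client, so treat the shape below (which mirrors the common mcpServers convention) as illustrative rather than canonical:

```bash
# Illustrative claude_config.json fragment (schema shape is an assumption):
cat <<'EOF'
{
  "mcpServers": {
    "wiki": {
      "command": "wuphf",
      "args": ["mcp", "serve"]
    }
  }
}
EOF
```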

When an OpenClaw agent invokes this tool, the wiki performs the necessary BM25 search. For more complex queries, it optionally executes the cited-answer loop, feeding information to the LLM to generate a synthesized response. The formatted context is then returned to the agent. Furthermore, you can configure the MCP server to automatically commit new facts. When an agent uses a wiki_add_fact command, the system writes the new fact to the appropriate JSONL file, and if the fact accumulation threshold is met, it triggers the synthesis worker. This seamless integration provides OpenClaw agents with persistent, auditable memory without requiring modifications to the core agent framework itself.

Tuning BM25 Recall and When to Consider Vector Databases

The default BM25 configuration provided with the wiki is designed to achieve a high recall rate, typically around 85% recall@20, based on internal benchmarks involving 500 artifacts and 50 diverse queries. However, if your specific domain or content type results in recall dropping below this threshold, you have the option to enable sqlite-vec as a fallback mechanism. This can be configured in your config.toml file by setting vector_fallback = true and specifying your preferred embedding model.

The heuristic classifier within the system intelligently routes queries that yield low BM25 scores to this vector index, ensuring that even complex semantic queries can be resolved. Before resorting to vectors, it is highly recommended to first tune the BM25 parameters. You can adjust k1 (term saturation) and b (length normalization) in bleve.json. For instance, increasing k1 to 1.5 might be beneficial for technical documents with a high frequency of key terms, while decreasing b to 0.3 could improve performance if document lengths vary significantly. Only after exhausting these text-based optimization strategies should you consider adding vectors, as they typically introduce higher latency and significantly increase storage requirements.
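
A tuning pass might look like the following; the bleve.json key placement and the embedding_model key name are assumptions:

```bash
cd ~/.wuphf/wiki

# Illustrative bleve.json parameters (exact nesting is an assumption):
#   "k1": 1.5,   # raise term-frequency saturation for term-dense technical docs
#   "b":  0.3,   # relax length normalization when document lengths vary widely
"${EDITOR:-vi}" bleve.json

# Only after tuning, consider the vector fallback in config.toml:
#   vector_fallback = true
#   embedding_model = "..."   # key name is an assumption

# Rebuild so the new parameters take effect, then re-check recall.
wuphf index rebuild
wuphf search "deployment strategy"
```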

Troubleshooting Common Setup and Operational Issues

Encountering issues during setup or operation is common, but most have straightforward solutions. If Bleve indexing fails with a “mapping build error,” check that your Markdown files contain valid UTF-8 encoding and are free of binary blobs. “Database is locked” errors from SQLite usually indicate that multiple processes are attempting to write to wiki.db simultaneously; ensure that only one synthesis worker is active at a time, possibly by implementing file locks. If automated commits are not appearing under “Pam the Archivist,” verify your Git identity configuration by running git config user.name within the wiki directory to confirm it returns “Pam the Archivist.”

If wikilinks appear green when they should be red (indicating a broken link), it often means the index is stale; a wuphf index rebuild command should resolve this. Should the MCP server time out, increase the timeout setting in your claude_config.json to 30 seconds or more, as the cited-answer loop, especially when interacting with an LLM, can require significant processing time. Finally, for “permission denied” errors on ~/.wuphf/, ensure that your user account owns the directory and its contents, rather than root, which can happen after a sudo installation.
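
A short diagnostic pass that covers these failure modes:

```bash
cd ~/.wuphf/wiki

git config user.name   # should print "Pam the Archivist" for automated writers
wuphf index rebuild    # clears stale wikilink coloring after bulk edits
pgrep -fl wuphf        # find competing writers behind "database is locked"
ls -ld ~/.wuphf        # owner should be you, not root
```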

Next Steps and Production Hardening for Your LLM Wiki

Once your LLM wiki is operational, harden it for production use. Begin with a backup strategy: add a remote with git remote add backup ssh://user@host/wiki.git and push to it daily. Monitor the size of your search.index/ directory; if it exceeds 10GB, archive older notebooks to cold storage. Rotate or archive old fact logs so the JSONL files and wiki.db do not grow without bound, ensuring long-term stability.
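
The backup setup in the shell; the branch name and cron time are placeholders:

```bash
cd ~/.wuphf/wiki

# Register a backup remote and push to it daily, e.g. from cron at 4 AM:
git remote add backup ssh://user@host/wiki.git
#   0 4 * * * cd ~/.wuphf/wiki && git push backup main >> backup.log 2>&1

# Watch index growth; archive old notebooks if this exceeds ~10GB.
du -sh search.index/
```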

The current implementation is designed for a single-office scope; avoid attempting cross-office federation at this stage, as conflict-resolution logic for distributed environments is not yet implemented. For better write concurrency, enable SQLite WAL (Write-Ahead Logging) mode on wiki.db, but keep the database on a local disk: WAL does not work reliably over network filesystems, which echoes the earlier warning about NFS. Finally, to keep all agents and human contributors consistent, document your entity kinds and fact schemas in a dedicated file, such as team/meta/standards.md. This standardization is crucial for maintaining a coherent and usable knowledge base.

Comparison Table: LLM Wiki vs. Vector Database for Agent Memory

| Feature | Karpathy-Style LLM Wiki (Markdown, Git, BM25) | Vector Database (e.g., pgvector, Pinecone) |
| --- | --- | --- |
| Data Format | Plain Markdown, JSONL, SQLite | Vector embeddings (dense numerical arrays) |
| Data Ownership | Complete, local, Git-versioned | Often cloud-hosted, vendor-locked |
| Operational Overhead | Low (local files, Git, SQLite) | Moderate to High (indexing, scaling, lifecycle) |
| Cost | Near zero (local compute) | Embedding costs, storage, compute, egress |
| Search Mechanism | BM25 (keyword, phrase, text relevance) | Semantic similarity (vector distance) |
| Recall (Typical) | 85% recall@20 (for most agent tasks) | High (for semantic similarity) |
| Precision (Typical) | High (direct text matches, cited sources) | Moderate (can return semantically similar but irrelevant docs) |
| Auditability | Full Git history, diffs, blame | Limited, depends on vendor features |
| Offline Capability | Full | Limited (requires API access) |
| Schema Flexibility | Highly flexible (Markdown, JSONL schemas) | Fixed schema, difficult to change |
| Human Readability | Excellent (plain text) | Poor (raw vectors are unreadable) |
| Complexity | Low to Moderate | Moderate to High |
| Scalability | Scales well for document count, less for query concurrency | Scales well for query concurrency, less for schema changes |
| Use Case | Durable agent memory, precise citation, auditable knowledge | Semantic search, context retrieval for fuzzier queries |

Frequently Asked Questions

Why not just use a vector database like pgvector?

Vector databases add operational complexity and lock-in. Markdown and Git provide durability, portability, and complete data ownership. BM25 delivers 85% recall@20 for most agent queries without the embedding overhead, and you can always add sqlite-vec later if specific query classes need semantic search.

How does the synthesis worker handle conflicting facts?

The synthesis worker rebuilds entity briefs every N facts using your configured LLM. It appends new facts to the JSONL log but surfaces contradictions during the daily lint pass. Garbage facts get flagged, not auto-deleted, preserving provenance while alerting you to review conflicts manually.

Can I use this with existing OpenClaw agents?

Yes. The wiki layer attaches to any agent setup via the MCP tool interface. Point your OpenClaw configuration at the wiki directory, and agents gain persistent memory through the /lookup command and automatic notebook creation without modifying your existing agent logic.

What happens when BM25 recall drops below 85%?

The system logs query failures and routes those specific query classes to a cited-answer loop with full context. You can enable sqlite-vec as a fallback for those patterns. The benchmark is a ship gate, not a runtime constraint, so the system degrades gracefully rather than failing.

How do I migrate existing agent memory to this wiki?

Export your current memory as Markdown files or structured JSON, then import private memories into agents/{slug}/notebook/ and shared knowledge into team/. Use the fact log format for structured data. Run the lint pass to catch broken wikilinks, then commit everything under your own identity before switching to Pam the Archivist for automated writes.

Conclusion

You now have a durable, git-backed memory substrate for your agents: private notebooks for scratchpad thoughts, a shared team wiki for canonical knowledge, append-only fact logs per entity, and BM25 retrieval with citations, all in plain Markdown under Git. There is no vector database to operate and no cloud dependency to trust, and every change is auditable in commit history. Schedule the daily lint, point your OpenClaw agents at /lookup, and let their context compound across sessions.