A developer shipped WUPHF, an open source wiki layer that gives AI agents persistent memory using nothing but Markdown files, Git version control, and BM25 full-text search. The project landed on Hacker News as a Show HN post, positioned as a Karpathy-style LLM knowledge substrate that agents both read from and write into, backed by a local-first repository. Where typical implementations reach for Postgres with pgvector, Neo4j, or Kafka streams, this system runs entirely in ~/.wuphf/wiki/ and pairs Bleve for BM25 indexing with SQLite for structured metadata. It reports 85% recall at 20 on internal benchmarks without touching a vector database, giving builders a way to compound agent context across sessions instead of re-pasting conversation history every morning.
What Exactly Shipped on Hacker News for AI Agents?
The Show HN post introduced a wiki layer built for AI agents that treats Markdown and Git as the source of truth. The system creates a local directory at ~/.wuphf/wiki/ containing private notebooks for individual agents at agents/{slug}/notebook/.md and a shared team wiki at team/. Each agent writes observations to its private notebook; entries can then be reviewed and promoted to the canonical wiki. Bleve provides BM25 full-text search, while SQLite stores facts, entities, edges, redirects, and other metadata. You install it via npx wuphf@latest, and because everything lives in a Git repository you can git clone the directory to export your data at any time. The project ships under an MIT license and integrates with Claude Code, Codex, OpenClaw, and local LLMs through OpenCode, either as part of the broader WUPHF collaborative office suite or as a standalone memory layer.
Why Did the Builder Choose Markdown and Git for Agent Memory?
The builder wanted to exhaust boring, widely adopted tools before reaching for heavier infrastructure. Markdown is durable in a way that outlives any specific runtime: if an agent process crashes, or you switch AI frameworks, you still have human-readable files with full history. Git adds provenance tracking, branches for experimental knowledge, and atomic commits. When the synthesis worker updates an entity brief, for example, it commits under a distinct identity called “Pam the Archivist,” making the automation’s contributions auditable in git log. The approach also sidesteps the operational overhead of running Postgres clusters, Neo4j instances, or Kafka streams. The files stay portable and standards-based: you can open them in any text editor, sync them via GitHub, or import them into knowledge tools like Obsidian without conversion scripts.
How Does the BM25 and SQLite Architecture Power AI Agent Retrieval?
The retrieval layer combines Bleve for BM25 full-text search with SQLite for structured queries. Bleve indexes the Markdown content and provides ranked keyword search; SQLite stores the append-only fact logs, entity relationships, and metadata tables. When an agent needs to recall something, short factual lookups are routed directly to the BM25 index, while longer narrative queries trigger a cited-answer loop that may invoke the LLM to synthesize a response. The current benchmark of 500 artifacts and 50 queries shows 85% recall at 20 using BM25 alone. If specific query classes fall below that threshold, the system can fall back to sqlite-vec for vector search, though this is not enabled by default. The dual-layer approach keeps retrieval fast and deterministic while avoiding the embedding-generation cost of every write.
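To make the dual-layer idea concrete, here is a minimal pure-Python sketch: a toy BM25 scorer standing in for Bleve (WUPHF's actual index is Bleve, written in Go) alongside a SQLite table for structured edges. The corpus, paths, and the edges schema are illustrative assumptions, not WUPHF's real data.

```python
import math
import re
import sqlite3
from collections import Counter

# Toy corpus standing in for Markdown pages under ~/.wuphf/wiki/.
DOCS = {
    "team/auth.md": "we chose oauth for authentication after evaluating sessions",
    "team/billing.md": "stripe handles billing and invoices for the team",
    "team/search.md": "bm25 ranks pages by term frequency and document length",
}

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

# Minimal BM25 with the usual k1/b defaults; returns docs ranked by score.
def bm25_scores(query, docs, k1=1.2, b=0.75):
    tokenized = {path: tokenize(text) for path, text in docs.items()}
    n = len(docs)
    avgdl = sum(len(toks) for toks in tokenized.values()) / n
    df = Counter()
    for toks in tokenized.values():
        df.update(set(toks))  # document frequency per term
    scores = {}
    for path, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(toks) / avgdl))
        scores[path] = score
    return sorted(scores.items(), key=lambda kv: -kv[1])

# SQLite holds structured metadata alongside the text index.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
con.execute("INSERT INTO edges VALUES ('team/auth.md', 'team/search.md')")

ranked = bm25_scores("bm25 document ranking", DOCS)
print(ranked[0][0])  # the search page ranks first for this query
print(con.execute("SELECT dst FROM edges WHERE src = ?",
                  ("team/auth.md",)).fetchone()[0])
```

The point of the split is that keyword ranking and structured lookups each use the store that is cheap for them, with no embedding step on the write path.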
What Is the Karpathy-Style Knowledge Substrate for LLMs?
Andrej Karpathy has sketched a vision of LLM-native knowledge systems in which agents read from and write into a persistent substrate, so that context accumulates and compounds across sessions. Traditional chat interfaces lose state every time a new session begins; a knowledge substrate remembers what happened yesterday, last week, or months ago. The WUPHF wiki implements this by granting agents write access to a Git-backed repository, so each session builds on previous entries rather than starting from a blank slate. The design favors append-only fact logs and immutable canonical IDs, letting knowledge grow without accidental overwrites or data loss. The result is cumulative memory: agent behavior improves as the wiki accrues context about entities, user preferences, and past decisions.
How Do Private Notebooks and Team Wikis Coexist Within WUPHF?
The directory structure separates private agent scratchpads from shared, canonical knowledge, mirroring how human teams work. Each agent gets a private notebook at agents/{slug}/notebook/.md for rough observations, experiments, and temporary notes, while the team/ directory holds the canonical wiki readable by multiple agents. The separation keeps one agent’s hallucinations or half-finished experiments from polluting shared knowledge. Agents write to their notebooks first; after review, entries can be promoted to the team wiki with back-links to the original source, much as a person keeps private notes before contributing to shared documentation. The permissions model is just filesystem access controls and Git ownership, which keeps it simple, transparent, and auditable.
What Is the Draft-to-Wiki Promotion Flow for Agent-Generated Content?
A state machine governs how content moves from private notebooks to the canonical wiki. Entries start as drafts in an agent’s notebook. Drafts are reviewed, either by a human or by a second designated agent, for accuracy and relevance. On approval, the entry is promoted to the team wiki and receives a back-link to its original location in the notebook. The state machine also handles expiry dates for temporary information and auto-archival of stale content. This promotion flow is the quality-control layer: it keeps outdated, incorrect, or unverified information out of the canonical wiki while leaving notebooks free for exploration and temporary storage.
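A minimal sketch of such a lifecycle might look like the following. The state names, transition table, and example path are assumptions modeled on the description above, not WUPHF's actual implementation.

```python
from enum import Enum

# Hypothetical draft -> review -> promoted -> archived lifecycle.
class State(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    PROMOTED = "promoted"
    ARCHIVED = "archived"

# Legal transitions: review can approve or send back; promoted
# pages can only be auto-archived when stale.
ALLOWED = {
    State.DRAFT: {State.IN_REVIEW},
    State.IN_REVIEW: {State.PROMOTED, State.DRAFT},
    State.PROMOTED: {State.ARCHIVED},
    State.ARCHIVED: set(),
}

class Entry:
    def __init__(self, path):
        self.path = path          # location in the agent's private notebook
        self.state = State.DRAFT
        self.backlink = None      # set when promoted into team/

    def transition(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        if new_state is State.PROMOTED:
            self.backlink = self.path  # keep a pointer to the original draft
        self.state = new_state

entry = Entry("agents/example-agent/notebook/observation.md")  # path is illustrative
entry.transition(State.IN_REVIEW)
entry.transition(State.PROMOTED)
print(entry.state.value, entry.backlink)
```

Encoding the transitions as data makes illegal moves (for example, editing an archived page back into draft) fail loudly instead of silently corrupting the canonical wiki.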
Who Is Pam the Archivist and Why Does Provenance Matter in an Agent Wiki?
Pam the Archivist is the Git identity used by automated processes when they commit to the wiki. When the synthesis worker rebuilds an entity brief or the daily lint cron fixes a broken link, the commit appears under Pam’s name and email in git log, making it immediately clear which changes came from humans and which from automation. That provenance matters for debugging agent behavior over time: if an entity brief contains wrong information, you can trace back through the commits to see exactly which facts were synthesized, by which process, and when. The audit trail also supports compliance requirements and helps explain how agent knowledge evolved.
How Does the Lookup Command Route Queries Between BM25 and LLM?
The /lookup slash command and the Model Context Protocol (MCP) tool implement a heuristic classifier that routes queries by complexity. Short factual lookups, such as “What is the API key for Stripe?”, go straight to the BM25 index, returning results without involving an LLM. Longer narrative queries, like “Explain the history of our authentication decisions,” trigger a cited-answer loop: the system retrieves relevant documents with BM25, presents them to the LLM with citations, and returns a synthesized, fully referenced answer. The hybrid keeps token costs near zero for simple lookups while still allowing deeper reasoning when needed. The routing decision itself happens locally, with no external API call for the classification step.
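The article does not publish the classifier's actual rules, but a local heuristic of this shape can be sketched in a few lines. The word-count threshold and cue words below are illustrative assumptions, not WUPHF's real logic.

```python
# Hypothetical router: short factual lookups hit BM25 directly;
# longer or narrative questions go through an LLM cited-answer loop.
NARRATIVE_CUES = {"explain", "why", "history", "summarize", "compare"}

def route(query: str) -> str:
    words = query.lower().rstrip("?").split()
    if len(words) > 8 or NARRATIVE_CUES.intersection(words):
        return "llm_cited_answer"   # retrieve with BM25, then synthesize
    return "bm25_direct"            # cheap ranked keyword lookup

print(route("What is the API key for Stripe?"))            # bm25_direct
print(route("Explain the history of our auth decisions"))  # llm_cited_answer
```

Because the decision is pure string inspection, it costs microseconds and never leaves the machine; the LLM is only paid for when the router chooses the cited-answer path.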
How Do Wikilinks Enable Graph-Like Relationships Without a Graph Database?
The WUPHF wiki uses [[Wikilink]] syntax to declare explicit relationships between pages. When an agent or human writes [[Authentication Strategy]] in a Markdown file, the system records a link to that entity; the daily lint cron checks for broken links and renders them red when the target does not exist. This builds a graph through explicit, human-readable linking rather than implicit vector similarity. SQLite tracks the edges in a table, enabling graph traversal queries without a dedicated graph database like Neo4j. Querying the edge table for everything that links to a concept yields backlink indexes similar to those in Roam Research or Obsidian, except populated and maintained by agent-generated content.
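Extracting wikilinks and inverting them into a backlink index is a small amount of code. This sketch uses an in-memory dict where WUPHF keeps a SQLite edge table; the page contents are invented for illustration.

```python
import re
from collections import defaultdict

# Pull [[Wikilink]] targets out of Markdown and invert them into
# a backlink index (target -> set of pages linking to it).
LINK = re.compile(r"\[\[([^\]]+)\]\]")

pages = {
    "team/login.md": "See [[Authentication Strategy]] and [[Session Store]].",
    "team/audit.md": "Decisions recorded in [[Authentication Strategy]].",
}

backlinks = defaultdict(set)
for src, body in pages.items():
    for target in LINK.findall(body):
        backlinks[target].add(src)

print(sorted(backlinks["Authentication Strategy"]))
```

The same inverted mapping, stored as rows in an edge table, is what lets a plain relational database answer "what links here" without any graph engine.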
What Role Does the Daily Lint Cron Play in Maintaining Wiki Quality?
A scheduled cron job performs daily maintenance on the wiki repository. It scans for contradictions between entity briefs, flags entries that have gone stale past a configurable threshold, and validates every wikilink. Broken links are either flagged for human review or removed automatically, depending on configuration. The lint pass also checks formatting in the append-only fact logs and detects canonical ID collisions. This maintenance matters for long-running agent deployments: without regular cleaning and validation, agent-generated wikis accumulate obsolete information and broken references, and retrieval quality degrades over time.
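The broken-link and staleness checks can be illustrated with a small lint function. The page data, staleness window, and return shape are assumptions for the sketch; the real cron's rules are not specified in detail.

```python
import re
from datetime import date, timedelta

# Daily lint sketch: flag wikilinks whose target page does not exist,
# plus pages not updated within a configurable staleness window.
LINK = re.compile(r"\[\[([^\]]+)\]\]")

pages = {
    "Auth": ("Uses [[Tokens]] and [[Ghost Page]].", date(2024, 1, 2)),
    "Tokens": ("Rotated monthly.", date(2023, 1, 1)),
}

def lint(pages, today, stale_after=timedelta(days=180)):
    broken, stale = [], []
    for name, (body, updated) in pages.items():
        for target in LINK.findall(body):
            if target not in pages:
                broken.append((name, target))   # link to a missing page
        if today - updated > stale_after:
            stale.append(name)                  # candidate for auto-archival
    return broken, stale

broken, stale = lint(pages, today=date(2024, 3, 1))
print(broken)  # [('Auth', 'Ghost Page')]
print(stale)   # ['Tokens']
```

Run daily, a pass like this keeps the red-link count and the stale-page backlog visible instead of letting them silently grow.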
How Do Entity Fact Logs and Synthesis Workers Function for Knowledge Generation?
Entities maintain append-only fact logs at team/entities/{kind}-{slug}.facts.jsonl. Each fact gets a deterministic ID that includes its sentence offset, so references stay canonical and stable. When an entity accumulates more than N facts, the synthesis worker rebuilds the entity brief, a Markdown summary of the entity’s current state, from the full fact history. The synthesis runs as a background job and commits under “Pam the Archivist,” preserving the audit trail. When two facts appear to contradict, the synthesis logic typically prefers newer information or flags the conflict for human review. Because the log is append-only, no history is lost even when synthesis produces a tighter summary.
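A sketch of the append-only log with deterministic IDs follows. The exact ID scheme (here, a hash of entity, source, and sentence offset) and the threshold value are assumptions modeled on the description, not WUPHF's actual rule.

```python
import hashlib
import json

SYNTHESIS_THRESHOLD = 3  # rebuild the brief after N facts (illustrative N)

# Deterministic ID: same entity, source, and sentence offset always
# hash to the same value, so references stay stable across rebuilds.
def fact_id(entity, source, sentence_offset):
    raw = f"{entity}:{source}:{sentence_offset}".encode()
    return hashlib.sha256(raw).hexdigest()[:12]

def append_fact(log, entity, source, offset, text):
    log.append(json.dumps({
        "id": fact_id(entity, source, offset),
        "entity": entity,
        "text": text,
    }))
    # True means the synthesis worker should rebuild the entity brief.
    return len(log) >= SYNTHESIS_THRESHOLD

log = []  # stands in for one {kind}-{slug}.facts.jsonl file
append_fact(log, "person-jim", "notebook/2024-01-01.md", 0, "Jim joined sales.")
append_fact(log, "person-jim", "notebook/2024-01-02.md", 3, "Jim moved to Stamford.")
needs_synthesis = append_fact(log, "person-jim", "notebook/2024-01-05.md", 1,
                              "Jim returned to Scranton.")
print(needs_synthesis, len(log))
```

Because writes only ever append, a later, tighter brief never destroys the facts it was synthesized from; the full lineage stays in the JSONL file and in Git history.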
Why Does 85% Recall at 20 Matter for AI Agent Retrieval Performance?
The builder set an internal ship gate of 85% recall at 20 on a benchmark of 500 artifacts and 50 queries: for a given query, the relevant answer appears within the top 20 BM25 results 85% of the time. The threshold matters because agents cannot afford to miss context; when retrieval fails to surface relevant facts, the LLM answers from incomplete information. At 85%, BM25 alone is judged good enough, avoiding the latency and cost of embedding generation and similarity search. For query classes that fall below the gate, the design calls for activating sqlite-vec as a fallback, keeping vector dependencies out of the primary retrieval path.
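The metric itself is simple to compute: the fraction of queries whose relevant artifact appears in the top k results. The data below is made up to show the arithmetic, not the project's real benchmark.

```python
# recall@k: fraction of queries whose relevant doc is in the top k.
def recall_at_k(results, relevant, k=20):
    hits = sum(1 for q, ranked in results.items()
               if relevant[q] in ranked[:k])
    return hits / len(results)

results = {                       # query -> ranked doc ids (toy data)
    "q1": ["d3", "d1", "d7"],
    "q2": ["d2", "d9"],
    "q3": ["d5", "d4"],
    "q4": ["d8"],
}
relevant = {"q1": "d1", "q2": "d9", "q3": "d6", "q4": "d8"}

print(recall_at_k(results, relevant, k=2))  # 0.75: 3 of 4 queries hit
```

At the project's scale this would run over 50 queries with k=20; a per-query-class breakdown of the same computation is what would tell you where the sqlite-vec fallback earns its keep.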
How Does WUPHF Integrate with OpenClaw and Other Agent Frameworks?
WUPHF is built as a collaborative office suite that accepts connections from a range of agent frameworks. For OpenClaw, integration is straightforward: you point WUPHF at your existing setup, and the wiki layer attaches via a Model Context Protocol (MCP) tool, so you do not have to rewrite your agents. The system also works with Claude Code, Codex, and local LLMs running through OpenCode. The /lookup command and entity synthesis scripts shell out to your configured LLM CLI, so OpenClaw agents can read from and write to the wiki with their current model configurations and API keys. That fits OpenClaw’s philosophy of local-first, self-hosted agent infrastructure, where users keep full control over their knowledge repository and compute.
What Are the Current Limitations and Tradeoffs of This Approach?
The system comes with explicit limitations and tradeoffs. Recall tuning is ongoing, and the 85% benchmark does not guarantee performance across all query types or domains. Synthesis quality depends directly on what agents write: garbage facts produce garbage briefs, and the daily lint pass catches structural issues, not factual errors. The current implementation is single-office in scope, with no federation between wiki instances yet. Vector search remains a fallback rather than a primary retrieval mechanism. These constraints make the wiki best suited to teams running localized agent deployments with human review in the loop, not fully autonomous, unsupervised agent swarms operating at internet scale.
How Does WUPHF Compare to Obsidian Vaults for Knowledge Management?
The Hacker News discussion repeatedly raised the Obsidian comparison: why not just use Obsidian with a plugin? Obsidian excels at Markdown editing, linking, and human-centric knowledge organization, but it lacks the programmatic write paths agents need. WUPHF adds structured fact logs in JSONL, synthesis workers that rebuild entity briefs, and Git commits under distinct agent identities. Obsidian plugins typically sync human-written notes or add interfaces rather than ingest continuous streams of agent observations from JSONL append logs. The formats remain compatible, though: you can open a WUPHF wiki directory in Obsidian and the wikilinks render correctly, so humans can browse it. The choice comes down to the primary user. Obsidian is built for human knowledge management; WUPHF is built for agent-native memory with automated synthesis, version control, and promotion workflows.
What Should Builders Watch For Next in the WUPHF Ecosystem?
Builders should watch the sqlite-vec integration timeline, since the builder has publicly committed to vector fallback for query classes where BM25 falls short. Cross-office federation, which would let multiple wiki instances synchronize via Git remotes or conflict-free replicated data types (CRDTs), is also anticipated, along with further recall improvements as BM25 tuning matures. Community adoption will be a key signal: if OpenClaw developers start shipping standard integration patterns for the wiki layer, it could set a baseline for agent memory architectures that skip cloud vector databases entirely. The project is open source at github.com/nex-crm/wuphf and welcomes contributions, particularly on the promotion state machine, retrieval heuristics, and the stability of canonical ID rules.
Frequently Asked Questions
Can I use this Karpathy-style wiki with my existing OpenClaw setup?
Yes. WUPHF attaches to existing agent setups including OpenClaw, Claude Code, and Codex. You point WUPHF at your current configuration and the wiki layer connects via an MCP tool. You do not need to migrate your entire stack or use the full WUPHF office suite to benefit from the Markdown-based memory layer.
How is this different from using a vector database like Pinecone or pgvector?
The wiki uses BM25 full-text search via Bleve and SQLite for structured metadata instead of embeddings. This provides deterministic retrieval based on exact term matching and provenance tracking through Git history. The builder chose this to avoid the overhead of Postgres, Neo4j, and Kafka stacks until absolutely necessary, achieving 85% recall at 20 on benchmarks without vectors.
What happens if the synthesis worker produces incorrect information?
The system relies on a draft-to-wiki promotion flow where entries undergo review before becoming canonical. Facts live in append-only JSONL logs with deterministic IDs, enabling audit trails. A daily lint cron catches contradictions and stale entries, but the system does not judge truth. Garbage facts produce garbage briefs, so human oversight remains essential during the promotion stage.
Is the wiki searchable without an LLM running?
Yes. The Bleve BM25 index and SQLite database operate independently of any LLM. You can query the wiki using the /lookup slash command or direct SQL against the SQLite files. The heuristic classifier only invokes the LLM for narrative queries requiring cited answers, while short lookups hit the local BM25 index directly.
How do I migrate data if I want to stop using WUPHF?
The wiki stores everything as plain Markdown and JSONL files in ~/.wuphf/wiki/, tracked by Git. You can git clone the directory to export your complete knowledge base. Because the format uses standard Markdown with wikilinks and canonical entity IDs, you can import the content into Obsidian, GitHub, or any Markdown-compatible system without vendor lock-in.