Claude Code is powerful, but it forgets everything between sessions. Every new conversation starts from zero. No memory of what you decided yesterday, what’s running on your server, or where that config file lives.

I built vault-mcp-server to fix this. It turns any folder of Markdown files into a persistent, searchable knowledge base that Claude Code can read and write to, across sessions, across machines.

The problem

I keep a personal knowledge vault in Markdown: project docs, server configs, learning notes, career plans. Around 100+ files organized by topic. Every time I started a Claude Code session, I had to re-explain context that was already written down somewhere in those files.

The MCP protocol lets Claude Code connect to external tools. So I built a server that exposes my vault through 7 tools: list, read, write, edit, summary, semantic search, and reindex.

Now Claude Code can just look things up.

What it looks like in practice

Me: "What do I have documented about Docker networking?"

Claude: *calls vault_search("Docker networking")*
→ Returns 3 relevant chunks from different files

Me: "Update the status in homeserver-backlog.md to Done"

Claude: *calls vault_edit("homeserver/homeserver-backlog.md",
         "**Status:** Active", "**Status:** Done")*
→ Done. One line changed.

No copy-pasting context. No “let me read that file for you.” It just knows where things are.

Architecture

The server is a Python app built with FastMCP using Streamable HTTP transport. It runs in Docker and mounts your Markdown folder as a volume.

graph TD A["Claude Code
any machine"] <-->|"MCP over HTTP"| B["vault-mcp-server
FastMCP + Streamable HTTP"] B --> C["Markdown Vault
your files"] B --> D["ChromaDB
semantic index"]

Semantic search is the key feature. The server chunks your documents by H2 headers, embeds them with paraphrase-multilingual-MiniLM-L12-v2, and stores them in ChromaDB. When Claude searches, it gets back the relevant chunks, not entire files. This saves tokens and gives more precise answers.

Indexing is incremental: on startup it only re-embeds files that changed since last run.

Setup in 3 minutes

git clone https://github.com/thebackpackdevorg/vault-mcp-server.git
cd vault-mcp-server

Edit docker-compose.yml to point to your Markdown folder:

volumes:
  - /path/to/your/notes:/vault
docker compose up -d
claude mcp add vault http://localhost:8091/mcp -t http

That’s it. Claude Code now has access to your vault.

Going remote: the Cloudflare Tunnel setup

Running locally is fine for one machine. But I have multiple devices: a work laptop, a home PC, a homeserver. I wanted all of them to access the same vault.

The solution: run the server on the homeserver, expose it through a Cloudflare Tunnel, and protect it with Cloudflare Access using a Service Token.

On remote machines, the Claude Code config looks like:

{
  "mcpServers": {
    "vault": {
      "type": "http",
      "url": "https://vault.yourdomain.com/mcp",
      "headers": {
        "CF-Access-Client-Id": "${VAULT_CF_CLIENT_ID}",
        "CF-Access-Client-Secret": "${VAULT_CF_CLIENT_SECRET}"
      }
    }
  }
}

The debugging story: SSE vs Streamable HTTP

This is the part that cost me hours and isn’t documented anywhere else.

When I first deployed behind Cloudflare, Claude Code would connect to the server but never get responses back. No errors, just silence. The MCP handshake completed, but tool calls hung forever.

The issue: Cloudflare buffers Server-Sent Events (SSE) by default. The original MCP transport uses SSE, and Cloudflare was holding the response stream open, waiting for it to “finish” before forwarding. That never happens with SSE.

The fix was switching to Streamable HTTP transport. Instead of a persistent SSE stream, each tool call is a regular HTTP request/response. Cloudflare handles these normally.

But there was a second gotcha: Claude Code sends requests without the Accept: application/json, text/event-stream header that the MCP spec requires. The server rejects these silently.

So I added a middleware that injects the correct Accept header if it’s missing:

class AcceptHeaderMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        accept = request.headers.get("accept", "")
        if "text/event-stream" not in accept or "application/json" not in accept:
            # Inject the required header
            request._headers = request.headers.mutablecopy()
            request._headers["accept"] = "application/json, text/event-stream"
        return await call_next(request)

If you’re deploying any MCP server behind a reverse proxy, save yourself the debugging: use Streamable HTTP, not SSE.

Metadata parsing

The server understands structured metadata in your Markdown files. If you use **Key:** Value patterns in the first 15 lines:

# My Project

**Status:** Active
**Created:** 2026-01-15
**Last Updated:** 2026-03-01

These become filterable fields. vault_list(status="Active") returns only active documents. vault_summary() gives you a dashboard with counts by domain and status.

It handles English and Spanish metadata out of the box. **Estado:** Activo works the same as **Status:** Active.

Why Markdown, not a database

I considered SQLite, JSON files, even a proper CMS. But Markdown won for three reasons:

  1. Git-friendly. Every change is a diff you can review.
  2. Editor-friendly. Edit in VS Code, Obsidian, vim, or any text editor.
  3. LLM-friendly. No serialization layer, no ORM, no schema migrations.

The vault is just a folder. Back it up with git. Sync it with Nextcloud. The MCP server is a read/write layer on top. If it goes down, your files are still there.

Try it

The server is open source and ready to use:

vault-mcp-server on GitHub

Clone, point it at your Markdown folder, and Claude Code gets persistent memory. If you deploy it remotely, watch out for the SSE gotcha. Streamable HTTP is your friend.