Introduction
Jeremy is a self-hosted documentation RAG (Retrieval-Augmented Generation) system built on Cloudflare Workers. It ingests library documentation and makes it queryable through semantic search, so AI coding assistants can look up accurate, up-to-date docs on demand.
Why Jeremy?
AI assistants have knowledge cutoffs and can't access the latest documentation for every library. Jeremy solves this by:
- Ingesting docs from any source -- point it at an `llms.txt` file or a documentation URL, and Jeremy fetches, chunks, and embeds the content.
- Semantic search via vector embeddings -- queries return the most relevant documentation chunks, not just keyword matches.
- Serving context to AI assistants -- the MCP server integrates directly with Claude Code and other MCP-compatible tools, giving them access to your ingested docs.
Key Features
- MCP Server -- a Model Context Protocol server that AI assistants like Claude Code can use to query documentation in real time.
- CLI -- a command-line tool for ingesting, listing, refreshing, and deleting libraries.
- Dashboard -- a web UI for managing your account, API keys, and libraries.
- REST API -- programmatic access to search, context retrieval, ingestion, and library management.
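As a sketch of what programmatic access might look like, here is a small helper that builds an authenticated search request. The base URL, `/api/search` path, query parameter names, and bearer-token header are assumptions for illustration -- check your deployment's actual API reference for the real endpoints.

```typescript
// Hypothetical client helper for Jeremy's search endpoint.
// Endpoint path and parameter names are assumptions, not the documented API.
interface SearchParams {
  library: string; // which ingested library to search
  query: string;   // natural-language query
  topK?: number;   // how many chunks to return
}

function buildSearchRequest(
  baseUrl: string,
  apiKey: string,
  params: SearchParams
): Request {
  const url = new URL("/api/search", baseUrl);
  url.searchParams.set("library", params.library);
  url.searchParams.set("q", params.query);
  url.searchParams.set("topK", String(params.topK ?? 5));
  // API keys (managed in the dashboard) authenticate the request.
  return new Request(url.toString(), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
}
```

A caller would then pass the result to `fetch` and read the returned chunks from the JSON body.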
How It Works
- Ingest -- you provide a documentation source (an `llms.txt` URL or a web page URL). Jeremy fetches the content, splits it into chunks, and sends it to the server.
- Embed -- the server generates vector embeddings for each chunk using Cloudflare Workers AI and stores them in Cloudflare Vectorize.
- Query -- when an AI assistant needs documentation, it calls Jeremy's API (or MCP tools). Jeremy performs a semantic vector search and returns the most relevant chunks.
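The "chunk" step above can be sketched as a simple sliding window over the document text. The chunk size and overlap below are illustrative values, not Jeremy's actual defaults.

```typescript
// Minimal sketch of fixed-size chunking with overlap, so that a sentence
// falling on a chunk boundary still appears intact in one chunk.
// Sizes are assumptions for illustration.
function chunkText(text: string, chunkSize = 512, overlap = 64): string[] {
  if (chunkSize <= overlap) {
    throw new Error("chunkSize must exceed overlap");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

Each resulting chunk is what gets embedded and stored, so chunk size trades off retrieval precision (smaller chunks) against context completeness (larger chunks).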
Architecture
Jeremy runs entirely on Cloudflare's platform:
- Workers -- handles API requests, authentication, and orchestration.
- D1 -- stores library metadata, user accounts, and API keys.
- Vectorize -- stores and searches vector embeddings for documentation chunks.
- Workers AI -- generates embeddings from documentation text.
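To make the Vectorize piece concrete, here is a toy version of what a vector search does under the hood: rank stored chunk embeddings by cosine similarity to a query embedding. In Jeremy the index lookup is handled by Cloudflare Vectorize; this sketch only illustrates the underlying idea.

```typescript
// Toy in-memory semantic search: cosine similarity + top-k ranking.
// Real deployments delegate this to Cloudflare Vectorize.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface StoredChunk {
  id: string;
  embedding: number[];
}

// Return the k chunks whose embeddings are closest to the query embedding.
function topK(query: number[], chunks: StoredChunk[], k = 3) {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

This is why semantically related text is retrieved even when no keywords overlap: similarity is computed between embedding vectors, not strings.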
Next Steps
- Get started in 5 minutes -- create an account, install the CLI, and ingest your first library.
- Install the CLI and MCP server -- detailed setup instructions.
- Use the Dashboard -- manage libraries and API keys through the web UI.