Ingest Your First Library

Jeremy supports two methods for ingesting documentation: llms.txt files and URL crawling.

Method 1: llms.txt (Recommended)

The llms.txt standard is a convention in which a website publishes a machine-readable index of its documentation, and many popular libraries publish one. It is the preferred method because the index gives Jeremy a structured list of pages to ingest, so coverage of the library's docs is more complete than with a single URL.

jeremy add --name react --id react --llms-txt https://react.dev/llms.txt

The --name flag sets a human-readable label for the library, and --id sets the unique slug used to reference the library in queries.

How it works

Jeremy fetches the llms.txt file, which contains links to individual documentation pages.
Each linked page is fetched and its content is extracted.
The content is split into chunks sized for embedding.
The chunks are sent to the Jeremy server, which generates vector embeddings and stores them in Cloudflare Vectorize.
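
The first step above, reading the page links out of an llms.txt file, can be sketched in Python. This is a minimal illustration and not Jeremy's actual code: it assumes the file lists its pages as Markdown links of the form [Title](url), and the function name, regular expression, and sample content are all hypothetical.

```python
import re

# Matches Markdown links of the form [Title](https://example.com/page).
# Assumption: the llms.txt file lists its pages as Markdown links.
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")


def extract_links(llms_txt: str) -> list[tuple[str, str]]:
    """Return (title, url) pairs in document order, skipping duplicate URLs."""
    seen: set[str] = set()
    links: list[tuple[str, str]] = []
    for title, url in LINK_PATTERN.findall(llms_txt):
        if url not in seen:
            seen.add(url)
            links.append((title.strip(), url))
    return links


# Hypothetical llms.txt content, used only for illustration.
SAMPLE = """\
# Example Library

> Short summary of the library.

## Docs

- [Quick Start](https://example.com/docs/quick-start.md): Install and first steps
- [API Reference](https://example.com/docs/api.md)
- [Quick Start again](https://example.com/docs/quick-start.md)
"""

if __name__ == "__main__":
    for title, url in extract_links(SAMPLE):
        print(f"{title} -> {url}")
```

Each returned URL would then be fetched and passed through the extraction and chunking steps. Duplicate URLs are dropped so that a page linked twice in the index is only ingested once.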

Method 2: URL Crawl

For libraries that don't publish an llms.txt file, you can ingest a single page by URL:

jeremy add --name mylib --id mylib --url https://mylib.dev/docs

This fetches the page at the given URL, extracts its content, chunks it, and ingests it the same way as the llms.txt method.
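
To make the "extracts its content" step concrete, here is a rough Python sketch that pulls the title and visible text out of an HTML page using only the standard library. How Jeremy actually extracts content is not described here, so the class name, the set of skipped tags, and the set of block-level tags are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the <title> and visible text, skipping script/style content."""

    SKIPPED = {"script", "style", "noscript"}
    # Closing one of these tags ends a line of extracted text.
    BLOCK = {"p", "div", "li", "tr", "pre", "section", "article",
             "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self._parts: list[str] = []
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "br":
            self._parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == "title":
            self._in_title = False
        elif tag in self.BLOCK:
            self._parts.append("\n")

    def handle_data(self, data):
        if self._skip_depth:
            return  # inside <script>, <style>, or <noscript>
        if self._in_title:
            self.title += data
        else:
            self._parts.append(data)

    def text(self) -> str:
        # Collapse runs of whitespace within each line and drop empty lines.
        lines = (" ".join(line.split()) for line in "".join(self._parts).splitlines())
        return "\n".join(line for line in lines if line)


def extract_page(html: str) -> tuple[str, str]:
    """Return (title, visible_text) for an HTML document."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.text()
```

Calling extract_page on a downloaded page yields a title and plain text that can be handed to the chunking step. A production extractor would likely also strip navigation, footers, and other page chrome, which this sketch does not attempt.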

When to use URL crawl

The library doesn't have an llms.txt file.
You want to ingest a specific page rather than an entire doc site.
You're ingesting internal or private documentation.

What Happens During Ingestion

Regardless of the method, the ingestion process follows these steps:

Fetch -- the CLI downloads the documentation content from the provided URL(s).
Chunk -- the content is split into smaller pieces (chunks) that fit within embedding model token limits. Each chunk preserves its source URL and title as metadata.
Upload -- the chunks are sent to the Jeremy API's /api/ingest endpoint.
Embed -- the server generates vector embeddings for each chunk using Cloudflare Workers AI.
Store -- the embeddings are indexed in Cloudflare Vectorize, and the library metadata is saved to D1.
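
The Chunk and Upload steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not Jeremy's implementation: it uses a character limit as a stand-in for the embedding model's token limit, and the JSON field names in build_payload are hypothetical, since the request format of /api/ingest is not documented here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Chunk:
    text: str
    url: str    # source URL, kept as metadata
    title: str  # page title, kept as metadata


def chunk_page(text: str, url: str, title: str, max_chars: int = 1000) -> list[Chunk]:
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    Assumption: character count approximates the token limit. A paragraph
    longer than max_chars is hard-split into max_chars-sized pieces.
    """
    chunks: list[Chunk] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        pieces = [para[i:i + max_chars] for i in range(0, len(para), max_chars)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                # Current chunk is full: emit it and start a new one.
                chunks.append(Chunk(current, url, title))
                current = piece
    if current:
        chunks.append(Chunk(current, url, title))
    return chunks


def build_payload(library_id: str, name: str, chunks: list[Chunk]) -> str:
    """Serialize chunks to JSON. The field names are hypothetical."""
    return json.dumps(
        {"id": library_id, "name": name, "chunks": [asdict(c) for c in chunks]}
    )


if __name__ == "__main__":
    page = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    chunks = chunk_page(page, "https://mylib.dev/docs", "Docs", max_chars=40)
    print(build_payload("mylib", "mylib", chunks))
```

Packing whole paragraphs keeps related sentences in the same chunk, which tends to give more coherent search results than cutting at fixed offsets. Every chunk carries the same url and title, matching the metadata described in the Chunk step.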

Once ingestion completes, the library is immediately queryable via the API, MCP server, or dashboard.

Managing Libraries

After ingestion, you can manage your libraries with the CLI:

# List all ingested libraries
jeremy list

# Re-ingest a library from its original source
jeremy refresh --id react

# Remove a library
jeremy delete --id react
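
If you maintain several libraries, you may want to refresh them in one go. The Python sketch below builds one "jeremy refresh --id ..." command per library, using only the command and flag shown above. The helper names are hypothetical, and the list of IDs is supplied by hand because the output format of "jeremy list" is not specified here.

```python
import shlex
import subprocess


def refresh_command(library_id: str) -> list[str]:
    """Build the argv for refreshing one library."""
    return ["jeremy", "refresh", "--id", library_id]


def refresh_all(library_ids: list[str], dry_run: bool = True) -> list[list[str]]:
    """Refresh each library in turn; with dry_run, only print the commands."""
    commands = [refresh_command(library_id) for library_id in library_ids]
    for argv in commands:
        if dry_run:
            print(shlex.join(argv))
        else:
            # check=True stops at the first failing refresh.
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    refresh_all(["react", "mylib"])  # dry run: prints the commands only
```

The dry-run default lets you review the commands before anything is re-ingested. Pass dry_run=False to execute them, which requires the jeremy CLI to be installed and on your PATH.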

Next Steps

CLI reference -- full documentation for all CLI commands.
Use the MCP server -- query your ingested libraries from AI assistants.
API reference -- access ingestion and search programmatically.