
RAG & embedding cost calculator

Rough monthly cost for a typical retrieval setup: maintaining a text embedding index, embedding user queries, and running your chat model on retrieved context plus prompts. Adjust the numbers to match your pipeline. The estimate excludes vector DB hosting and re-rankers.

Vector index & embeddings

Sum of embedded chunk tokens across your corpus.

Tokens re-embedded each month for new docs, edits, or full re-embeds. This depends on your own operations, not on user traffic.

Queries & generation

Estimated monthly API cost

Total

$211.64

Embedding and LLM API costs only. Embedding prices: April 2026. LLM price tables: April 2026.

Breakdown

  • Index re-embedding (maintenance): $0.016
  • Query embeddings: $0.120
  • Chat completions (RAG answers): $211.50

Embeddings subtotal: $0.136
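
The three line items above follow the usual per-token pricing arithmetic: tokens divided by one million, times the price per million tokens. A minimal Python sketch of that arithmetic is below. The field names, the split of chat cost into input and output tokens, and every number in the example are assumptions for illustration. They are not the inputs or the price tables behind the figures shown on this page.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass
class RagInputs:
    # All fields are hypothetical placeholders, not this calculator's real inputs.
    reembedded_tokens: float   # tokens re-embedded per month (index maintenance)
    queries: float             # user queries per month
    query_tokens: float        # tokens embedded per query
    context_tokens: float      # retrieved context + prompt tokens per chat request
    output_tokens: float       # tokens generated per answer
    embed_price: float         # USD per 1M embedding tokens
    llm_input_price: float     # USD per 1M chat input tokens
    llm_output_price: float    # USD per 1M chat output tokens


def monthly_cost(x: RagInputs) -> dict[str, float]:
    """Return a monthly cost breakdown in USD (embedding + LLM only)."""
    index = x.reembedded_tokens / TOKENS_PER_MILLION * x.embed_price
    query = x.queries * x.query_tokens / TOKENS_PER_MILLION * x.embed_price
    chat = x.queries * (
        x.context_tokens * x.llm_input_price
        + x.output_tokens * x.llm_output_price
    ) / TOKENS_PER_MILLION
    return {
        "index_reembedding": index,
        "query_embeddings": query,
        "chat_completions": chat,
        "embeddings_subtotal": index + query,
        "total": index + query + chat,
    }


# Illustrative values only; prices are made up, not taken from any vendor.
example = RagInputs(
    reembedded_tokens=1_000_000,
    queries=10_000,
    query_tokens=50,
    context_tokens=2_000,
    output_tokens=300,
    embed_price=0.02,
    llm_input_price=1.00,
    llm_output_price=4.00,
)

for name, usd in monthly_cost(example).items():
    print(f"{name}: ${usd:.3f}")
```

With these made-up inputs the total comes to about $32.03, almost all of it chat completions. The same pattern shows in the breakdown above: embedding costs are usually a rounding error next to generation, so context length and output length are the numbers worth tuning first.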

Not included

Vector DB hosting (Pinecone, pgvector on RDS, etc.), re-ranking models, OCR, or multimodal embedding. Add those separately.
