RAG & embedding cost calculator
Rough monthly cost for a typical retrieval setup: maintaining a text embedding index, embedding user queries, and running your chat model on retrieved context plus prompts. Tune the numbers to match your pipeline. The estimate excludes vector DB hosting and re-rankers.
Vector index & embeddings
Sum of embedded chunk tokens across your corpus.
New docs, edits, or full re-embeds. This volume is driven by your operations, not by user traffic.
Queries & generation
Estimated monthly API cost
Total: $211.64
Embedding + LLM only. Embeddings: April 2026. LLM tables: April 2026.
Breakdown
- Index re-embedding (maintenance): $0.016
- Query embeddings: $0.120
- Chat completions (RAG answers): $211.50
Embeddings subtotal: $0.136
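The three breakdown lines can be reproduced with simple per-token arithmetic. The sketch below is one plausible way to compute them, not the calculator's actual code: every field name, the per-million-token pricing convention, and the choice to bill all chat input (retrieved context plus prompts) at a single input rate are assumptions made for illustration.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass
class RagInputs:
    # All field names are hypothetical stand-ins for the calculator's inputs.
    reembedded_tokens_per_month: float  # index tokens (re-)embedded each month
    queries_per_month: float            # user queries per month
    query_tokens: float                 # tokens embedded per user query
    chat_input_tokens: float            # retrieved context + prompts per query
    chat_output_tokens: float           # generated answer tokens per query
    embed_price: float                  # USD per 1M embedding tokens
    chat_input_price: float             # USD per 1M chat input tokens
    chat_output_price: float            # USD per 1M chat output tokens


def monthly_cost(x: RagInputs) -> dict:
    """Return the monthly breakdown in USD (embedding + LLM API costs only)."""
    index = x.reembedded_tokens_per_month / TOKENS_PER_MILLION * x.embed_price
    query = (x.queries_per_month * x.query_tokens
             / TOKENS_PER_MILLION * x.embed_price)
    chat = x.queries_per_month * (
        x.chat_input_tokens / TOKENS_PER_MILLION * x.chat_input_price
        + x.chat_output_tokens / TOKENS_PER_MILLION * x.chat_output_price
    )
    return {
        "index_reembedding": index,
        "query_embeddings": query,
        "chat_completions": chat,
        "embeddings_subtotal": index + query,
        "total": index + query + chat,
    }


if __name__ == "__main__":
    # Illustrative inputs only; these are not the values behind the page's figures.
    example = RagInputs(
        reembedded_tokens_per_month=1_000_000,
        queries_per_month=10_000,
        query_tokens=50,
        chat_input_tokens=2_000,
        chat_output_tokens=500,
        embed_price=0.02,
        chat_input_price=1.0,
        chat_output_price=2.0,
    )
    for name, usd in monthly_cost(example).items():
        print(f"{name}: ${usd:.3f}")
```

The figures shown on this page are internally consistent with that structure: 0.016 + 0.120 gives the 0.136 embeddings subtotal, and adding 211.50 gives 211.636, displayed as $211.64. Chat completions dominate because each query sends far more tokens to the chat model than to the embedding model, usually at a much higher per-token price.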
Not included
Vector DB hosting (Pinecone, pgvector on RDS, etc.), re-ranking models, OCR, or multimodal embedding. Add those separately.