RAG Document Chatbot
Upload any document and ask questions — the system retrieves the relevant passages and answers with Claude Haiku, grounded in the document's actual content rather than guesswork.
The Problem
Businesses sit on massive amounts of internal documents — manuals, contracts, reports, SOPs — but can't query them intelligently. Keyword search misses context. Generic AI tools hallucinate answers. Teams waste time hunting for information that already exists in a file somewhere.
The Solution
A full Retrieval-Augmented Generation (RAG) pipeline. Upload a document, ask a question, get an answer grounded in the actual content — not invented by the model. The system retrieves only the relevant passages before generating a response, so answers are accurate and traceable.
- Document upload: Drop in a PDF or text file — it's chunked and indexed automatically.
- Semantic search: Questions are matched by meaning, not just keywords — finds relevant content even when phrasing differs.
- Grounded answers: Claude Haiku answers only from the retrieved chunks, so responses stay tied to the document instead of being invented.
- Persistent storage: Embeddings are stored in Supabase — the document stays queryable across sessions.
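The chunking step above can be sketched in a few lines. This is illustrative only — the chunk size and overlap used by the actual pipeline are not stated in this README, so the values below are placeholders:

```python
# Sketch of the upload-time chunking step. The chunk_size and overlap
# values are assumptions, not the project's actual configuration.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows so a sentence cut
    at a chunk boundary still appears intact in the neighboring chunk."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
    return chunks
```

In practice, production chunkers often split on sentence or paragraph boundaries rather than raw character counts; the overlap exists so that context spanning a boundary is retrievable from either side.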
How It's Built
The pipeline runs in two phases. At upload time: the document is split into chunks, each chunk is embedded using Voyage AI's voyage-3-lite model, and the vectors are stored in Supabase with pgvector. At query time: the question is embedded the same way, a cosine similarity search retrieves the top matching chunks, and those chunks are passed to Claude Haiku as context to generate the answer.
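The query-time ranking described above can be shown in plain Python. In the deployed system this search runs inside Supabase via pgvector; the snippet below reimplements the same cosine-similarity top-k ranking locally purely for clarity:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors: dot product over norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Rank stored chunks by similarity to the query embedding,
    returning the k best matches to pass to the model as context."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]
```

Pushing this search into pgvector instead of application code means the database scans and ranks vectors where they live, so only the top matches cross the wire.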
- Backend: Python + FastAPI, deployed on Render
- Embeddings: Voyage AI (voyage-3-lite, 512 dimensions) — fast, lightweight, no local model
- Vector DB: Supabase pgvector — stores and searches embeddings via cosine similarity
- AI: Claude Haiku via Anthropic API — answers grounded in retrieved document chunks
- Frontend: Astro + Vanilla JS — file upload, chat UI, live on Vercel
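The grounding step works by packing the retrieved chunks into the prompt and instructing the model to stay inside them. A minimal sketch — the exact prompt wording the project uses is an assumption; only the pattern (context first, strict instruction, then the question) is illustrated:

```python
# Hypothetical prompt builder for the grounded-answer step. The wording
# is an assumption; the project's actual prompt is not shown in this README.
def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    context = "\n\n".join(f"[Chunk {i + 1}]\n{c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using ONLY the document excerpts below. "
        "If the excerpts do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The resulting string would be sent as the user message to Claude Haiku via the Anthropic Messages API; labeling each chunk also makes it easy to cite which passage an answer came from.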