RAG Document Chatbot
Upload any document and ask questions — the system retrieves the relevant passages and answers with Claude Haiku, grounded in the document's actual content rather than guesswork.
The Problem
Businesses sit on massive amounts of internal documents — manuals, contracts, reports, SOPs — but can't query them intelligently. Keyword search misses context. Generic AI tools hallucinate answers. Teams waste time hunting for information that already exists in a file somewhere.
The Solution
A full Retrieval-Augmented Generation (RAG) pipeline. Upload a document, ask a question, get an answer grounded in the actual content — not invented by the model. The system retrieves only the relevant passages before generating a response, so answers are accurate and traceable.
- Document upload: Drop in a PDF or text file — it's chunked and indexed automatically.
- Semantic search: Questions are matched by meaning, not just keywords — finds relevant content even when phrasing differs.
- Grounded answers: Claude Haiku answers only from the retrieved chunks, so responses stay tied to the document instead of being invented.
- Persistent storage: Embeddings are stored in Supabase — the document stays queryable across sessions.
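The chunking step above can be sketched in a few lines. This is illustrative only — the chunk size and overlap used by the actual pipeline are not stated in this README, so the values below are placeholders:

```python
# Sketch of the upload-time chunking step. The chunk_size and overlap
# values are assumptions, not the project's actual configuration.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows so a sentence cut
    at a chunk boundary still appears intact in the neighboring chunk."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
    return chunks
```

In practice, production chunkers often split on sentence or paragraph boundaries rather than raw character counts; the overlap exists so that context spanning a boundary is retrievable from either side.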
How It's Built
The pipeline runs in two phases. At upload time: the document is split into chunks, each chunk is embedded using Voyage AI's voyage-3-lite model, and the vectors are stored in Supabase with pgvector. At query time: the question is embedded the same way, a cosine similarity search retrieves the top matching chunks, and those chunks are passed to Claude Haiku as context to generate the answer.
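The query-time ranking described above can be shown in plain Python. In the deployed system this search runs inside Supabase via pgvector; the snippet below reimplements the same cosine-similarity top-k ranking locally purely for clarity:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors: dot product over norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Rank stored chunks by similarity to the query embedding,
    returning the k best matches to pass to the model as context."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]
```

Pushing this search into pgvector instead of application code means the database scans and ranks vectors where they live, so only the top matches cross the wire.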
- Backend: Python + FastAPI, deployed on Render
- Embeddings: Voyage AI (voyage-3-lite, 512 dimensions) — fast, lightweight, no local model
- Vector DB: Supabase pgvector — stores and searches embeddings via cosine similarity
- AI: Claude Haiku via Anthropic API — answers grounded in retrieved document chunks
- Frontend: Astro + Vanilla JS — file upload, chat UI, live on Vercel
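The grounding step works by packing the retrieved chunks into the prompt and instructing the model to stay inside them. A minimal sketch — the exact prompt wording the project uses is an assumption; only the pattern (context first, strict instruction, then the question) is illustrated:

```python
# Hypothetical prompt builder for the grounded-answer step. The wording
# is an assumption; the project's actual prompt is not shown in this README.
def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    context = "\n\n".join(f"[Chunk {i + 1}]\n{c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using ONLY the document excerpts below. "
        "If the excerpts do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The resulting string would be sent as the user message to Claude Haiku via the Anthropic Messages API; labeling each chunk also makes it easy to cite which passage an answer came from.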