This project is a full-stack web application for chatting with PDF documents. Users upload or pre-load PDFs, which are converted into vector embeddings and stored in a Pinecone index. A chat interface then lets users ask questions about the content of those PDFs. For each query, the system retrieves the most relevant text chunks from the vector store and passes them to one of OpenAI's GPT models, which generates an answer grounded in that context.
Features
Chat with multiple large PDF documents using natural language
Uses GPT-4 (or GPT-3.5-turbo) for high-quality answer generation
Vector storage and retrieval via Pinecone for scalable document handling
Streaming responses for a real-time chat experience
Customizable ingestion pipeline for PDF parsing and chunking
Quickstart
git clone https://github.com/davidbmar/gpt4-pdf-chatbot-langchain.git
cd gpt4-pdf-chatbot-langchain
pnpm install
cp .env.example .env
# Edit .env with your OPENAI_API_KEY, PINECONE_API_KEY, PINECONE_ENVIRONMENT, and PINECONE_INDEX_NAME
mkdir -p docs
# Add your PDF files to the docs folder
pnpm run ingest
pnpm run dev
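A missing or empty key in .env tends to surface only as a runtime error during ingest or chat. The sketch below shows one way to fail fast instead. It assumes the four variable names listed in the Quickstart; `loadConfig` and `AppConfig` are hypothetical names for illustration, not part of this repository.

```typescript
// The four variables the Quickstart asks you to set in .env.
const REQUIRED_ENV_VARS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
  "PINECONE_INDEX_NAME",
] as const;

type RequiredEnvVar = (typeof REQUIRED_ENV_VARS)[number];
type AppConfig = Record<RequiredEnvVar, string>;

// Hypothetical helper: returns a typed config, or throws an error that
// lists every variable that is missing or blank.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_ENV_VARS.filter((key) => !env[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as AppConfig;
  for (const key of REQUIRED_ENV_VARS) {
    config[key] = env[key] as string;
  }
  return config;
}

// Usage in Node.js (assumption): const config = loadConfig(process.env);
```

Taking the environment as a parameter, rather than reading a global directly, keeps the check easy to unit test.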
The backend consists of Next.js API routes. LangChain orchestrates document loading, text splitting, and the LLM calls. During ingestion, PDFs are parsed with a custom loader, split into chunks, embedded with OpenAI's embedding model, and stored in Pinecone. During chat, the system runs a similarity search in Pinecone, builds a prompt from the retrieved context, and streams the GPT response back to the client.
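To make the split, search, and prompt steps concrete, here is a minimal in-memory sketch. It is not the project's code: the real pipeline delegates these steps to LangChain, OpenAI, and Pinecone. Every name here (`splitIntoChunks`, `topK`, `buildPrompt`, `EmbeddedChunk`) is hypothetical, the vectors are assumed to be precomputed, and cosine similarity is assumed as the metric (the actual metric depends on how the Pinecone index is configured).

```typescript
interface EmbeddedChunk {
  text: string;
  vector: number[]; // assumed to come from an embedding model
}

// Split text into fixed-size character windows that overlap, so content cut
// at one boundary still appears intact in a neighbouring chunk.
function splitIntoChunks(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}

// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Return the k stored chunks most similar to the query vector, best first.
function topK(query: number[], store: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return store
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Assemble a prompt that places the retrieved chunks ahead of the question.
function buildPrompt(question: string, context: string[]): string {
  return [
    "Answer the question using only the context below.",
    "",
    "Context:",
    ...context.map((c, i) => `[${i + 1}] ${c}`),
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

The overlap trades some storage for recall: a larger overlap makes it less likely that an answer is split across two chunks, at the cost of more vectors in the index.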
How it runs
sequenceDiagram
participant U as User
participant N as Next.js Frontend
participant A as API Route (/api/chat)
participant P as Pinecone
participant O as OpenAI API
U->>N: Types question
N->>A: POST /api/chat {question, history}
A->>O: Embed Question
O-->>A: Query Vector
A->>P: Similarity Search (Query Vector)
P-->>A: Relevant Document Chunks
A->>O: Send Prompt with Context & Question
O-->>A: Stream Token Response
A-->>N: Stream SSE Data
N-->>U: Display Answer Incrementally
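The last two arrows in the diagram rely on Server-Sent Events (SSE), where each message is sent as a `data:` line followed by a blank line. The sketch below illustrates that framing and how a client could reassemble the answer. The JSON payload shape `{ data: token }` and the `[DONE]` end marker are assumptions made for illustration, as are the function names; the actual route may use a different payload.

```typescript
const SSE_PREFIX = "data: ";
const DONE_MARKER = "[DONE]"; // assumed end-of-stream sentinel

// Encode one payload as an SSE frame. JSON.stringify escapes newlines,
// so a token can never break the frame boundary.
function toSseFrame(payload: unknown): string {
  return `${SSE_PREFIX}${JSON.stringify(payload)}\n\n`;
}

// Server side: turn a sequence of tokens into frames, ending with the marker.
function encodeTokens(tokens: string[]): string[] {
  const encoded = tokens.map((token) => toSseFrame({ data: token }));
  encoded.push(`${SSE_PREFIX}${DONE_MARKER}\n\n`);
  return encoded;
}

// Client side: rebuild the answer text from the raw stream received so far.
function collectAnswer(rawStream: string): string {
  let answer = "";
  for (const frame of rawStream.split("\n\n")) {
    if (frame.slice(0, SSE_PREFIX.length) !== SSE_PREFIX) continue; // skip blanks
    const body = frame.slice(SSE_PREFIX.length);
    if (body === DONE_MARKER) break;
    answer += (JSON.parse(body) as { data: string }).data;
  }
  return answer;
}
```

Because `collectAnswer` works on whatever text has arrived so far, the frontend can call it repeatedly as the stream grows to display the answer incrementally. A real client would also need to buffer a partially received frame until its terminating blank line arrives.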
How to apply & reuse
Developers can use this as a starter template for building document-specific Q&A systems. It demonstrates how to integrate vector databases with LLMs for retrieval-augmented generation (RAG). It can be extended to support other document types, different vector stores, or customized prompting strategies for specific domains.
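One way to keep the vector store swappable is to have the chat logic depend on a small interface rather than on a specific client. The sketch below is an illustration under assumptions, not this repository's design: `VectorStore`, `VectorRecord`, `InMemoryVectorStore`, and `retrieveContext` are hypothetical names, scoring uses a plain dot product, and the methods are synchronous for brevity, whereas a real client such as Pinecone's would be asynchronous.

```typescript
interface VectorRecord {
  id: string;
  vector: number[];
  text: string;
}

// Hypothetical minimal contract that the chat logic would depend on.
interface VectorStore {
  upsert(records: VectorRecord[]): void;
  query(vector: number[], topK: number): VectorRecord[];
}

// Toy implementation for tests and local experiments.
// Assumes all vectors share the same dimension.
class InMemoryVectorStore implements VectorStore {
  private records: VectorRecord[] = [];

  upsert(records: VectorRecord[]): void {
    for (const record of records) {
      // Replace any existing record with the same id.
      this.records = this.records.filter((r) => r.id !== record.id);
      this.records.push(record);
    }
  }

  query(vector: number[], topK: number): VectorRecord[] {
    const dot = (a: number[], b: number[]): number =>
      a.reduce((sum, x, i) => sum + x * b[i], 0);
    return [...this.records]
      .sort((x, y) => dot(y.vector, vector) - dot(x.vector, vector))
      .slice(0, topK);
  }
}

// Retrieval written against the interface, so any store can be plugged in.
function retrieveContext(store: VectorStore, queryVector: number[], topK: number): string[] {
  return store.query(queryVector, topK).map((record) => record.text);
}
```

With this shape, supporting a different vector store would mean writing one new class that implements `VectorStore`, while the retrieval and prompting code stays unchanged.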