Indexing Vectors and Querying VectorStore Retrievers
Once text documents are split into chunks, each chunk must be converted into an embedding vector and saved in a database. LangChain integrates with vector stores to hold these vectors and provides retrievers to query them.
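To make the retrieval step concrete, here is a minimal sketch in plain TypeScript of what a similarity search does conceptually: score every stored vector against the query vector with cosine similarity and keep the best `k`. The names `IndexedChunk`, `cosineSimilarity`, and `topK` are illustrative and not LangChain APIs, and the 3-dimensional vectors are made up; real embeddings are far larger and a real vector store may use a different index or metric.

```typescript
// Illustrative only: toy vectors stand in for real embeddings.
type IndexedChunk = { text: string; vector: number[] };

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force search: score every chunk, sort descending, keep the first k.
function topK(index: IndexedChunk[], queryVector: number[], k: number): IndexedChunk[] {
  return index
    .map((chunk) => ({ chunk, score: cosineSimilarity(chunk.vector, queryVector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

const toyIndex: IndexedChunk[] = [
  { text: "cats purr", vector: [1, 0, 0] },
  { text: "dogs bark", vector: [0, 1, 0] },
  { text: "kittens meow", vector: [0.9, 0.1, 0] },
];

// The query vector points almost entirely along the first axis,
// so the two cat-related chunks rank ahead of "dogs bark".
const nearest = topK(toyIndex, [1, 0.05, 0], 2);
console.log(nearest.map((chunk) => chunk.text));
```

This brute-force scan is fine for small collections; larger indexes typically rely on approximate nearest-neighbor structures instead.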
1. Vector Database Workflow
graph TD
A[Text chunks] -->|OpenAIEmbeddings| B[Compute 1536-dim vector values]
B -->|Index| C[Vector Database: MemoryVectorStore]
D[User search query] -->|Embeddings| E[Query vector]
E -->|Cosine similarity| C
C -->|Return top K docs| F[Retrieved Context Docs]
2. Setting Up MemoryVectorStore with OpenAI Embeddings
For fast local testing, use LangChain's built-in in-memory vector store:
// src/services/vectorService.ts
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Document } from "@langchain/core/documents";

// Initialize the embeddings client (reads OPENAI_API_KEY from the environment by default)
const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small", // `modelName` is the older, deprecated name for this option
});

export async function indexAndRetrieveDocs(chunks: Document[], query: string) {
  // 1. Create the vector store and index all chunks
  const vectorStore = await MemoryVectorStore.fromDocuments(chunks, embeddings);

  // 2. Wrap the vector store in a retriever
  const retriever = vectorStore.asRetriever({
    k: 2, // Return only the top 2 matching chunks
  });

  // 3. Run the query and return the matching documents
  const relevantDocs = await retriever.invoke(query);
  console.log("Top matching chunk content:", relevantDocs[0]?.pageContent);
  return relevantDocs;
}
3. Production Vector Stores
An in-memory vector store loses its contents whenever the server process restarts, including on every redeployment. In production, replace MemoryVectorStore with a persistent vector store integration (such as Pinecone, Supabase PGVector, or Chroma) so the index survives restarts and does not need to be rebuilt.
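One way to keep that swap cheap is to have the retrieval code depend on a small interface and receive the store through a factory. The sketch below is an assumption-heavy illustration, not LangChain's API: `DocLike`, `RetrieverLike`, `StoreLike`, `StoreFactory`, `retrieveWith`, and `createKeywordStore` are hypothetical names, and the stand-in backend does substring matching rather than vector search so that the example runs without any external service.

```typescript
// Hypothetical minimal shapes for illustration; LangChain's real classes expose a richer API.
interface DocLike {
  pageContent: string;
}
interface RetrieverLike {
  invoke(query: string): Promise<DocLike[]>;
}
interface StoreLike {
  asRetriever(options: { k: number }): RetrieverLike;
}

// The factory is the only place that knows which backend is in use.
type StoreFactory = (chunks: DocLike[]) => Promise<StoreLike>;

// Retrieval logic stays the same regardless of the backend behind the factory.
async function retrieveWith(
  createStore: StoreFactory,
  chunks: DocLike[],
  query: string,
  k = 2,
): Promise<DocLike[]> {
  const store = await createStore(chunks);
  return store.asRetriever({ k }).invoke(query);
}

// Stand-in backend for the demo: case-insensitive substring match, NOT vector search.
const createKeywordStore: StoreFactory = async (chunks) => ({
  asRetriever: ({ k }) => ({
    invoke: async (query) =>
      chunks
        .filter((chunk) => chunk.pageContent.toLowerCase().includes(query.toLowerCase()))
        .slice(0, k),
  }),
});

const demoChunks: DocLike[] = [
  { pageContent: "Vector stores index embeddings" },
  { pageContent: "Retrievers query a vector store" },
  { pageContent: "Unrelated text" },
  { pageContent: "A third vector mention" },
];

const demoResult = retrieveWith(createKeywordStore, demoChunks, "vector");
demoResult.then((docs) => console.log(docs.map((doc) => doc.pageContent)));
```

In a real application, the development factory would wrap the MemoryVectorStore setup shown earlier and the production factory would wrap a persistent integration. Whether LangChain's store classes satisfy these simplified interfaces without adaptation is an assumption to verify against the library's types.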