Splitting Text with RecursiveCharacterTextSplitter

Why split files into smaller chunks? Large books or manuals exceed an LLM's context window, and processing them in full is expensive. Text splitters slice long documents into small, coherent chunks before you generate the vector representations that are stored in a database.

1. Why Is Recursive Splitting Preferred?

Simple text splitters cut strings after a fixed character count, frequently splitting sentences in half and separating key subjects.
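To make the problem concrete, here is a minimal sketch of a fixed-size splitter. The helper name `splitFixed` and the sample sentence are illustrative assumptions, not part of LangChain.

```typescript
// Hypothetical helper for illustration only (not a LangChain API).
// Cuts the text every `size` characters, ignoring word and sentence boundaries.
function splitFixed(text: string, size: number): string[] {
  if (size <= 0) throw new Error("size must be positive");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}

const sample =
  "The product price is $250 per user. Monthly tiers are billed in advance.";
const fixedChunks = splitFixed(sample, 30);

// Each chunk is at most 30 characters, but the cuts land mid-sentence.
console.log(fixedChunks);
```

With a 30-character limit the first cut falls inside the first sentence, so the price and the words "per user" end up in different chunks.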

RecursiveCharacterTextSplitter is smarter. It tries a prioritized list of separators in order, falling back to the next one only when a piece is still larger than the chunk size:

Paragraph boundaries (\n\n)
Line boundaries (\n)
Space characters ( )
Empty string (individual characters)

This hierarchy keeps related sentences together in the same chunk wherever possible.
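The fallback order can be sketched in plain TypeScript. This is a simplified illustration under stated assumptions, not LangChain's actual implementation: the function name `recursiveSplit` is hypothetical, overlap is omitted, and empty pieces are discarded.

```typescript
// Simplified sketch of recursive splitting (hypothetical, not LangChain's code).
// Separators are tried in priority order: paragraph, line, space, then hard cuts.
const SEPARATORS: string[] = ["\n\n", "\n", " ", ""];

function recursiveSplit(
  text: string,
  chunkSize: number,
  separators: string[] = SEPARATORS
): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (text.length <= chunkSize) return text.length > 0 ? [text] : [];

  // An empty separator (or an exhausted list) means cutting at fixed offsets.
  const [sep = "", ...rest] = separators;
  if (sep === "") {
    const hardCuts: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      hardCuts.push(text.slice(i, i + chunkSize));
    }
    return hardCuts;
  }

  const chunks: string[] = [];
  let current = "";
  for (const piece of text.split(sep)) {
    if (piece.length === 0) continue;
    // Merge neighbouring pieces while the result still fits in one chunk.
    const candidate = current ? current + sep + piece : piece;
    if (candidate.length <= chunkSize) {
      current = candidate;
      continue;
    }
    if (current) chunks.push(current);
    current = "";
    if (piece.length <= chunkSize) {
      current = piece;
    } else {
      // The piece alone is too large: retry it with the finer separators.
      chunks.push(...recursiveSplit(piece, chunkSize, rest));
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

const doc =
  "Intro paragraph about pricing.\n\n" +
  "The product price is $250 per user.\nMonthly tiers are billed in advance.";
const pieces = recursiveSplit(doc, 40);
console.log(pieces);
```

With a 40-character limit, the paragraph break is tried first, and only the oversized second paragraph is split again at the line break, so each sentence stays whole.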

2. Implementing the Splitter in Node.js

Configure the chunk size and overlap when you create the splitter:

// src/services/textSplitting.ts
// Note: recent LangChain.js releases also publish the splitters as "@langchain/textsplitters".
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { Document } from "@langchain/core/documents";

export async function splitDocumentsIntoChunks(rawDocs: Document[]) {
  // 1. Instantiate the splitter with configuration parameters
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,     // Maximum chunk length (measured in characters by default)
    chunkOverlap: 200,   // Maximum number of characters shared by adjacent chunks
  });

  // 2. Process documents array
  const splitChunks = await splitter.splitDocuments(rawDocs);

  console.log("Original docs count:", rawDocs.length);
  console.log("Generated chunks count:", splitChunks.length);
  
  return splitChunks;
}

3. Understanding Overlap

Setting a chunkOverlap value (e.g. 200 characters) makes the end of Chunk 1 reappear at the start of Chunk 2, up to that many characters. A sentence that straddles a boundary then appears whole in at least one chunk, so its meaning is not lost at the seam.

Chunk 1 (Chars 0 - 1000)

... the product price is $250 per user.

Chunk 2 (Chars 800 - 1800)

the product price is $250 per user. Monthly tiers are billed ...
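The character ranges in the illustration above follow from simple arithmetic: each chunk starts chunkSize minus chunkOverlap characters after the previous one. The sketch below computes those idealized ranges. The helper `chunkRanges` is hypothetical, and the real splitter cuts at separator boundaries, so its actual offsets will usually differ somewhat.

```typescript
// Hypothetical helper: idealized [start, end) character ranges for a sliding window.
// Assumes cuts at exact offsets; the real splitter prefers separator boundaries.
function chunkRanges(
  textLength: number,
  chunkSize: number,
  chunkOverlap: number
): Array<[number, number]> {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("require chunkSize > chunkOverlap >= 0");
  }
  const step = chunkSize - chunkOverlap; // distance between chunk starts
  const result: Array<[number, number]> = [];
  for (let start = 0; start < textLength; start += step) {
    const end = Math.min(start + chunkSize, textLength);
    result.push([start, end]);
    if (end === textLength) break; // the last chunk reached the end of the text
  }
  return result;
}

// A 2500-character document with the settings used in this section.
const exampleRanges = chunkRanges(2500, 1000, 200);
console.log(exampleRanges);
```

A larger overlap shrinks the step between chunk starts, so the same document yields more chunks and more duplicated text to embed and store.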

Published on Jun 16, 2026 Last updated: Jun 16, 2026
