Lesson 3.3 — Chunking & Segmentation Strategies

Introduction: The Art of the Bite-Sized Piece

In the world of AI, size matters. Large language models have a limited context window, which means they can only process a certain amount of text at a time. If you try to feed a massive document into an embedding model, you will either get an error or the input will be truncated, producing a poor-quality embedding.

This is where chunking and segmentation come in. These are the processes of breaking down large documents into smaller, more manageable pieces, or "chunks." This is a critical step in preparing your data for the Vector Store, and it has a major impact on the performance of your AI agent.

This lesson will explore the art and science of chunking and segmentation, and provide you with a range of practical strategies for finding the optimal chunk size for your data.

Example YAML front-matter header to use in a file

---
title: "Framework-HR"
source: "Framework-HR.md"
tags: ["M&A", "Framework", "Framework-HR"]
version: "1.0"
last_updated: "2025-09-20"
---
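A pipeline needs to separate this front matter from the document body before chunking. The following is a minimal illustrative sketch using only the standard library; a production pipeline would typically use a YAML parser instead, and the `parse_front_matter` helper is a hypothetical name, not part of any library:

```python
def parse_front_matter(text: str) -> dict:
    """Extract simple key/value pairs between the opening and closing '---' lines."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # No front-matter block present.
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # End of the front-matter block.
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip().strip('"')
    return meta

doc = """---
title: "Framework-HR"
source: "Framework-HR.md"
version: "1.0"
---
Body text starts here."""

print(parse_front_matter(doc)["title"])  # Framework-HR
```

Note that this sketch only handles flat string values; nested structures or lists (like the `tags` field above) would need a real YAML library.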

Why is Chunking So Important?

Chunking is important for two main reasons:

  1. Context Window Limitations: As mentioned above, LLMs have a limited context window. By breaking down a large document into smaller chunks, you can ensure that each chunk fits within the model's context window.

  2. Retrieval Accuracy: When an agent retrieves information from the Vector Store, it is retrieving individual chunks, not entire documents. If your chunks are too large, they may contain a lot of irrelevant information, which dilutes the meaning of the chunk's embedding and makes it harder for the agent to find the specific information it is looking for.

The goal of chunking is to create meaningful units of text that can be embedded and retrieved effectively. The optimal chunking strategy will depend on the specific use case and the nature of the data [1].

Common Chunking Strategies

There are many different ways to chunk a document, and the best approach will depend on the specific characteristics of your data. Here are some of the most common strategies:

  • Fixed-Size Chunking: The simplest approach, where you split the text into chunks of a fixed number of characters or tokens. This is easy to implement, but it can be problematic, as it can split sentences and paragraphs in the middle, destroying their meaning.

  • Content-Aware Chunking: A more sophisticated approach that takes into account the structure of the content. For example, you could split a document into chunks based on its headings, paragraphs, or other structural elements. This is more likely to produce meaningful chunks, but it requires more effort to implement.

  • Recursive Chunking: A hybrid approach that combines fixed-size and content-aware chunking. It starts by splitting the text into large, content-aware chunks, and then recursively splits those chunks into smaller, fixed-size chunks until they reach the desired size. This is often the most effective approach, as it balances the need for meaningful chunks with the need to stay within the context window limitations.
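The recursive strategy can be sketched in a few lines of Python. This is a simplified illustration, not any particular library's implementation; the separator hierarchy and the 500-character limit are assumptions you would tune for your own data:

```python
def recursive_chunk(text, max_size=500, separators=("\n\n", "\n", ". ", " ")):
    """Split on the coarsest separator first (paragraphs, then lines, then
    sentences, then words), recursing into any piece that is still too large.
    Falls back to a hard character split if no separator is found."""
    if len(text) <= max_size:
        return [text]
    for sep in separators:
        if sep in text:
            chunks = []
            for piece in text.split(sep):
                chunks.extend(recursive_chunk(piece, max_size, separators))
            return [c for c in chunks if c.strip()]
    # No separator left to try: hard fixed-size split as a last resort.
    return [text[i:i + max_size] for i in range(0, len(text), max_size)]
```

For example, `recursive_chunk(document, max_size=100)` keeps short paragraphs intact while long, unbroken runs of text are force-split to respect the size limit.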

Finding the Optimal Chunk Size

There is no magic number for the optimal chunk size. It will depend on a variety of factors, including:

  • The embedding model you are using: Different models have different context window sizes.

  • The nature of your data: Some types of data, such as legal documents, may require larger chunks to preserve their meaning, while other types of data, such as social media posts, may be better suited to smaller chunks.

  • The specific use case: The optimal chunk size for a question-answering agent may be different from the optimal chunk size for a document summarization agent.

The best way to find the optimal chunk size is to experiment with different sizes and see what works best for your specific use case. You can use a tool like the Chunking & Segmentation Worksheet below to help you with this process.

Chunking & Segmentation Worksheet

  • Chunking Strategy: [Fixed-Size, Content-Aware, Recursive]

  • Chunk Size: [Number of characters or tokens]

  • Overlap: [Number of characters or tokens to overlap between chunks]
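The worksheet parameters map directly onto a simple fixed-size chunker with overlap. A minimal sketch, assuming character-based sizes (a token-based version would substitute a tokenizer for the string slicing):

```python
def chunk_with_overlap(text: str, chunk_size: int = 200, overlap: int = 50) -> list:
    """Slide a window of `chunk_size` characters across the text, stepping
    forward by `chunk_size - overlap` so neighbouring chunks share context."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_with_overlap("a" * 500, chunk_size=200, overlap=50)
print(len(chunks))  # 4 windows: starts at 0, 150, 300, 450
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated storage in the Vector Store.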

Conclusion: The Foundation of Retrieval

Chunking and segmentation are the unsung heroes of Retrieval-Augmented Generation. They may not be the most glamorous part of the process, but they are absolutely essential for building a high-performing AI agent.

By carefully considering the nature of your data and experimenting with different chunking strategies, you can create a knowledge base that is optimized for retrieval and that will provide your agent with the information it needs to answer questions and generate high-quality responses.

In our next lesson, we will explore the role of metadata and learn how it can be used to give your agent additional context about each chunk.
