MarkdownTextSplitter

If your content is in Markdown format, use MarkdownTextSplitter. This class splits your content into documents based on the Markdown headers. For example, if you have the following Markdown content:

# Header 1

This is some content.

## Header 2

This is some more content.

# Header 3

This is even more content.

Then the MarkdownTextSplitter will split the content into three documents:

import { MarkdownTextSplitter } from "langchain/text_splitter";

const text = `# Header 1

This is some content.

## Header 2

This is some more content.

# Header 3

This is even more content.`;

const splitter = new MarkdownTextSplitter();

const output = await splitter.createDocuments([text], {
  metadata: "something",
});
/*
[
  {
    "pageContent": "# Header 1\n\nThis is some content.",
    "metadata": "something"
  },
  {
    "pageContent": "## Header 2\n\nThis is some more content.",
    "metadata": "something"
  },
  {
    "pageContent": "# Header 3\n\nThis is even more content.",
    "metadata": "something"
  }
]
*/
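
To make the idea of header-based splitting concrete, here is a minimal, dependency-free sketch. It is not how MarkdownTextSplitter is implemented: `splitOnHeaders`, the `Chunk` interface, and the `source` metadata key are hypothetical names introduced only for illustration. The sketch assumes every line starting with one to six `#` characters followed by a space opens a new chunk, and it ignores chunk-size limits and fenced code blocks.

```typescript
// Hypothetical helper, not part of LangChain: starts a new chunk at every
// line that looks like an ATX header ("#" through "######" plus a space).
interface Chunk {
  pageContent: string;
  metadata: Record<string, unknown>;
}

function splitOnHeaders(
  text: string,
  metadata: Record<string, unknown> = {}
): Chunk[] {
  const groups: string[][] = [];
  for (const line of text.split("\n")) {
    const isHeader = /^#{1,6} /.test(line);
    // Open a new group on each header, and for any text before the first header.
    if (isHeader || groups.length === 0) {
      groups.push([]);
    }
    groups[groups.length - 1].push(line);
  }
  return groups
    .map((lines) => lines.join("\n").trim())
    .filter((content) => content.length > 0)
    // Copy the metadata so chunks do not share one mutable object.
    .map((pageContent) => ({ pageContent, metadata: { ...metadata } }));
}

const sample = [
  "# Header 1",
  "",
  "This is some content.",
  "",
  "## Header 2",
  "",
  "This is some more content.",
  "",
  "# Header 3",
  "",
  "This is even more content.",
].join("\n");

const docs = splitOnHeaders(sample, { source: "example" });
console.log(docs.map((d) => d.pageContent));
```

For the sample text this yields one chunk per header. A naive splitter like this would also break on a `# comment` line inside a fenced code block, which is one reason to prefer the library class for real content.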