Text Splitters: Examples
📄️ CharacterTextSplitter
Besides the RecursiveCharacterTextSplitter, there is also the more standard CharacterTextSplitter. This splits on a single separator only (by default "\n\n"). You can use it in exactly the same way.
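To make the single-separator behavior concrete, here is a minimal plain-Python sketch. It is not the library's implementation: the function name `split_on_separator` and the `chunk_size` parameter are assumptions chosen for illustration.

```python
def split_on_separator(text: str, separator: str = "\n\n", chunk_size: int = 100) -> list[str]:
    """Split on one separator, then greedily merge pieces up to chunk_size characters.

    Hypothetical helper for illustration only, not the CharacterTextSplitter API.
    """
    # Drop empty or whitespace-only pieces produced by repeated separators.
    pieces = [p for p in text.split(separator) if p.strip()]
    chunks: list[str] = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Either the merged text still fits, or this is the first piece of a
            # new chunk. An oversized piece is kept whole, because there is no
            # second separator to fall back on.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


chunks = split_on_separator("aaa\n\nbbb\n\nccc", chunk_size=8)
print(chunks)
```

Because only one separator is available, a paragraph longer than `chunk_size` stays in one piece in this sketch.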
📄️ MarkdownTextSplitter
If your content is in Markdown format, use MarkdownTextSplitter. This class will split your content into documents based on the Markdown headers. For example, if you have the following Markdown content:
📄️ RecursiveCharacterTextSplitter
The recommended TextSplitter is the RecursiveCharacterTextSplitter. It splits documents recursively on a list of separators, trying "\n\n" first, then "\n", then " ". This is useful because it tries to keep semantically related content together in the same chunk for as long as possible.
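The recursive strategy can be sketched in plain Python as follows. This is an illustrative approximation under stated assumptions, not the library's code: the name `recursive_split`, the `chunk_size` parameter, and the hard-cut fallback when no separator applies are all hypothetical.

```python
def recursive_split(
    text: str,
    chunk_size: int = 100,
    separators: tuple[str, ...] = ("\n\n", "\n", " "),
) -> list[str]:
    """Split on the coarsest separator first; recurse with finer ones when a piece is too long.

    Hypothetical helper for illustration only.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # Assumption: with no separators left, fall back to fixed-width cuts.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    if sep not in text:
        return recursive_split(text, chunk_size, finer)

    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        if not piece:
            continue
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # still fits, keep merging
            continue
        if current:
            chunks.append(current)  # flush what has been merged so far
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry it with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


text = "one two three\nfour five\n\nsix seven eight nine"
print(recursive_split(text, chunk_size=14))
```

In this sketch, paragraphs are kept whole when they fit, lines are tried next, and individual words are only separated when nothing coarser works.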
📄️ TokenTextSplitter
Finally, TokenTextSplitter splits a raw text string by first converting the text into BPE tokens, then splitting these tokens into chunks and converting the tokens within each chunk back into text.
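The encode, chunk, decode pipeline described above can be sketched as below. To stay self-contained, the example uses a toy tokenizer that maps each character to its code point; this is a stand-in and not BPE. The names `split_by_tokens`, `toy_encode`, and `toy_decode` are hypothetical, and a real BPE tokenizer's encode and decode functions could be passed in instead.

```python
from typing import Callable


def split_by_tokens(
    text: str,
    encode: Callable[[str], list[int]],
    decode: Callable[[list[int]], str],
    chunk_size: int = 4,
) -> list[str]:
    """Tokenize, cut the token list into fixed-size windows, and decode each window.

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    tokens = encode(text)
    return [
        decode(tokens[i:i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]


# Toy tokenizer (assumption): one token per character, NOT a BPE tokenizer.
def toy_encode(text: str) -> list[int]:
    return [ord(ch) for ch in text]


def toy_decode(tokens: list[int]) -> str:
    return "".join(chr(t) for t in tokens)


chunks = split_by_tokens("hello world", toy_encode, toy_decode, chunk_size=4)
print(chunks)
```

With a real BPE tokenizer, chunk boundaries would fall between tokens rather than between characters, so each chunk has a predictable token count for a model's context limit.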