Additional Functionality: Chat Models

We offer a number of additional features for chat models. In the examples below, we'll be using the ChatOpenAI model.

Additional Methods

LangChain provides a number of additional methods for interacting with chat models:

import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanChatMessage, SystemChatMessage } from "langchain/schema";

export const run = async () => {
  const chat = new ChatOpenAI({ modelName: "gpt-3.5-turbo" });
  // Pass in a list of messages to `call` to start a conversation. In this simple example, we only pass in one message.
  const responseA = await chat.call([
    new HumanChatMessage(
      "What is a good name for a company that makes colorful socks?"
    ),
  ]);
  console.log(responseA);
  // AIChatMessage { text: '\n\nRainbow Sox Co.' }

  // You can also pass in multiple messages to start a conversation.
  // The first message is a system message that describes the context of the conversation.
  // The second message is a human message that starts the conversation.
  const responseB = await chat.call([
    new SystemChatMessage(
      "You are a helpful assistant that translates English to French."
    ),
    new HumanChatMessage("Translate: I love programming."),
  ]);
  console.log(responseB);
  // AIChatMessage { text: "J'aime programmer." }

  // Similar to LLMs, you can also use `generate` to generate chat completions for multiple sets of messages.
  const responseC = await chat.generate([
    [
      new SystemChatMessage(
        "You are a helpful assistant that translates English to French."
      ),
      new HumanChatMessage(
        "Translate this sentence from English to French. I love programming."
      ),
    ],
    [
      new SystemChatMessage(
        "You are a helpful assistant that translates English to French."
      ),
      new HumanChatMessage(
        "Translate this sentence from English to French. I love artificial intelligence."
      ),
    ],
  ]);
  console.log(responseC);
  /*
  {
    generations: [
      [
        {
          text: "J'aime programmer.",
          message: AIChatMessage { text: "J'aime programmer." },
        }
      ],
      [
        {
          text: "J'aime l'intelligence artificielle.",
          message: AIChatMessage { text: "J'aime l'intelligence artificielle." }
        }
      ]
    ]
  }
  */
};
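Each inner array of generations corresponds to one set of input messages, so you can pull the plain text back out of the result by indexing into it. A minimal sketch, continuing from the responseC value in the example above (the response shape is the one printed in the comment):

  // Inside the `run` function above, once `responseC` is available:
  const firstTranslation = responseC.generations[0][0].text; // "J'aime programmer."
  const secondTranslation = responseC.generations[1][0].text; // "J'aime l'intelligence artificielle."
  console.log(firstTranslation, secondTranslation);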

Streaming

As with LLMs, you can stream responses from a chat model. This is useful for chatbots that need to respond to user input in real time.

import { CallbackManager } from "langchain/callbacks";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanChatMessage } from "langchain/schema";

export const run = async () => {
  const chat = new ChatOpenAI({
    maxTokens: 25,
    streaming: true,
    callbackManager: CallbackManager.fromHandlers({
      async handleLLMNewToken(token: string) {
        console.log({ token });
      },
    }),
  });

  const response = await chat.call([new HumanChatMessage("Tell me a joke.")]);

  console.log(response);
  // { token: '' }
  // { token: '\n\n' }
  // { token: 'Why' }
  // { token: ' don' }
  // { token: "'t" }
  // { token: ' scientists' }
  // { token: ' trust' }
  // { token: ' atoms' }
  // { token: '?\n\n' }
  // { token: 'Because' }
  // { token: ' they' }
  // { token: ' make' }
  // { token: ' up' }
  // { token: ' everything' }
  // { token: '.' }
  // { token: '' }
  // AIChatMessage {
  //   text: "\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything."
  // }
};
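If you would rather collect the streamed output as a single string than log each token, you can accumulate tokens inside the same handleLLMNewToken handler. A minimal sketch:

import { CallbackManager } from "langchain/callbacks";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanChatMessage } from "langchain/schema";

export const run = async () => {
  let streamedText = "";
  const chat = new ChatOpenAI({
    streaming: true,
    callbackManager: CallbackManager.fromHandlers({
      async handleLLMNewToken(token: string) {
        // Append each token as it arrives instead of logging it.
        streamedText += token;
      },
    }),
  });

  await chat.call([new HumanChatMessage("Tell me a joke.")]);
  console.log(streamedText); // The full joke, assembled token by token.
};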

Adding a timeout

By default, LangChain will wait indefinitely for a response from the model provider. If you want to add a timeout, you can pass a timeout option, in milliseconds, when you instantiate the model. For example, for OpenAI:

import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanChatMessage } from "langchain/schema";

export const run = async () => {
  const chat = new ChatOpenAI(
    { temperature: 1, timeout: 1000 } // 1s timeout
  );

  const response = await chat.call([
    new HumanChatMessage(
      "What is a good name for a company that makes colorful socks?"
    ),
  ]);
  console.log(response);
  // AIChatMessage { text: '\n\nRainbow Sox Co.' }
};

Currently, the timeout option is only supported for OpenAI models.
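If the timeout elapses before the provider responds, the call rejects instead of resolving, so you may want to catch that rejection. A minimal sketch (the exact error surfaced depends on the underlying HTTP client):

import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanChatMessage } from "langchain/schema";

export const run = async () => {
  const chat = new ChatOpenAI({ timeout: 1000 }); // 1s timeout

  try {
    const response = await chat.call([
      new HumanChatMessage("Write a detailed history of sock manufacturing."),
    ]);
    console.log(response);
  } catch (e) {
    // Reached if the request does not complete within 1 second (or otherwise fails).
    console.error("Request timed out or failed:", e);
  }
};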

Dealing with Rate Limits

Some providers have rate limits. If you exceed the rate limit, you'll get an error. To help you deal with this, LangChain provides a maxConcurrency option when instantiating a chat model. This option allows you to specify the maximum number of concurrent requests you want to make to the provider. If you exceed this number, LangChain will automatically queue up your requests to be sent as previous requests complete.

For example, if you set maxConcurrency: 5, then LangChain will only send 5 requests to the provider at a time. If you send 10 requests, the first 5 will be sent immediately, and the next 5 will be queued up. Once one of the first 5 requests completes, the next request in the queue will be sent.

To use this feature, simply pass maxConcurrency: <number> when you instantiate the chat model. For example:

import { ChatOpenAI } from "langchain/chat_models/openai";

const model = new ChatOpenAI({ maxConcurrency: 5 });
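To see the queueing in action, you can fire off more requests than the concurrency limit at once; the extra calls simply wait until a slot frees up. A minimal sketch (the prompt and request count are just for illustration):

import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanChatMessage } from "langchain/schema";

export const run = async () => {
  const model = new ChatOpenAI({ maxConcurrency: 5 });

  // Ten calls issued at once; LangChain sends five immediately and
  // queues the remaining five until earlier requests complete.
  const prompts = Array.from({ length: 10 }, (_, i) => [
    new HumanChatMessage(`Give me company name idea #${i + 1} for colorful socks.`),
  ]);
  const responses = await Promise.all(
    prompts.map((messages) => model.call(messages))
  );
  console.log(responses.length); // 10
};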

Dealing with API Errors

If the model provider returns an error from their API, by default LangChain will retry up to 6 times with exponential backoff. This enables error recovery without any additional effort from you. If you want to change this behavior, you can pass a maxRetries option when you instantiate the model. For example:

import { ChatOpenAI } from "langchain/chat_models/openai";

const model = new ChatOpenAI({ maxRetries: 10 });
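If every retry fails, the call ultimately rejects, so you may still want a try/catch around it to handle that final failure. A minimal sketch:

import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanChatMessage } from "langchain/schema";

export const run = async () => {
  const model = new ChatOpenAI({ maxRetries: 10 });

  try {
    const response = await model.call([
      new HumanChatMessage("What is a good name for a company that makes colorful socks?"),
    ]);
    console.log(response);
  } catch (e) {
    // Reached only after all retries have been exhausted.
    console.error("Request failed after retries:", e);
  }
};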