References
LLM Response Handler
llm_response_handler
An asynchronous function that returns the LLM's response. It is called every time a response is expected from the agent during a session. Inside this function, you can truncate the chat history, call a RAG pipeline, use any LLM provider, or add any other logic that controls the LLM's response.
Example usage
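A minimal sketch of a handler that streams a response with the official openai package. The parameter name `chat_history` and the message format are assumptions for illustration; adapt them to whatever your session actually passes in.

```python
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment


async def llm_response_handler(chat_history):
    """Yield the LLM's response as a stream of chat completion chunks.

    `chat_history` is assumed to be a list of OpenAI-style message dicts,
    e.g. [{"role": "user", "content": "Hello"}].
    """
    # Any custom logic can run here: truncate the history, call a RAG
    # pipeline, switch providers, and so on.
    recent_messages = chat_history[-20:]

    # The official `openai` client already yields chunks that conform to
    # OpenAI's streamed chat completion chunk specification.
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=recent_messages,
        stream=True,
    )
    async for chunk in stream:
        yield chunk
```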
Parameters
Example input
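Assuming the handler receives the conversation's chat history as OpenAI-style messages (an assumption; the exact parameter shape may differ), the input might look like:

```json
[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "What's the weather like today?"}
]
```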
Returns
A stream of chat completion chunks. Each chunk must be in the format shown below, which conforms to OpenAI's streamed chat completion chunk specification. This means that if you use an LLM client that conforms to OpenAI's API, such as the official openai package, the generated responses will be in the correct format automatically.
Example returned chunks:
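For illustration, streamed chunks in OpenAI's chat completion chunk format look like the following (field values are placeholders): a content delta first, followed by a final chunk carrying a finish_reason.

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion.chunk",
  "created": 1700000000,
  "model": "gpt-4o-mini",
  "choices": [
    {"index": 0, "delta": {"role": "assistant", "content": "Hello"}, "finish_reason": null}
  ]
}
```

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion.chunk",
  "created": 1700000000,
  "model": "gpt-4o-mini",
  "choices": [
    {"index": 0, "delta": {}, "finish_reason": "stop"}
  ]
}
```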