llm_response_handler
An asynchronous function that returns the LLM’s response. It is called every time a response is
expected from the agent during a session. Inside this function, you can truncate the chat history,
call a RAG pipeline, use any LLM provider, or add any other logic that controls the LLM’s response.
Example usage
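A minimal sketch of a handler built on the official openai Python client. The input.chat_history attribute, the model name, and the async-generator shape are illustrative assumptions, not part of this reference; adapt them to however your session exposes the conversation.

```python
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment


async def llm_response_handler(input):
    # `input.chat_history` is a hypothetical attribute; replace it with however
    # your session exposes prior messages as OpenAI-style message dicts.
    messages = input.chat_history

    # Optional custom logic: truncate history, run a RAG pipeline, inject a
    # system prompt, etc. Here we simply keep the most recent messages.
    messages = messages[-20:]

    # The official openai client streams OpenAI-format chat completion chunks,
    # so each chunk can be yielded back as-is.
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        stream=True,
    )
    async for chunk in stream:
        yield chunk
```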
Parameters
input — the input passed to the handler for this call.
Returns
A stream of chat completion chunks. Each chunk must be in the format shown below, which conforms to OpenAI’s streamed chat completion chunk specification. This means that if you use an LLM client that conforms to OpenAI’s API, such as the official openai
package, the generated responses will already be in the correct format.
Example returned chunks:
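A hedged illustration of two streamed chunks in OpenAI’s chat.completion.chunk format (the id, timestamp, model, and content values are placeholders): a content delta followed by the final chunk that signals completion.

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "created": 1700000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "delta": { "role": "assistant", "content": "Hello" },
      "finish_reason": null
    }
  ]
}
```

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "created": 1700000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "delta": {},
      "finish_reason": "stop"
    }
  ]
}
```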