This guide briefly introduces the core components of an agent. We’ll show you how to:

  • Configure sessions by setting fields such as your speech-to-text provider and initial messages.
  • Define how your LLM responds to users during sessions.
  • Define your agent.

A basic agent, which includes each of these components, looks like this:

import os

from openai import AsyncOpenAI
# NOTE: the framework module path below is an assumption — check the
# Quickstart guide for the exact import used in your project.
from jay_ai import (
    Agent,
    ConfigureSessionInput,
    LLMResponseHandlerInput,
    SessionConfig,
    STT,
    TTS,
    VAD,
)

async def configure_session(input: ConfigureSessionInput):
    return SessionConfig(
        initial_messages=[],
        vad=VAD.Silero(),
        stt=STT.Deepgram(api_key=os.environ["DEEPGRAM_API_KEY"]),
        tts=TTS.ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"]),
    )

async def llm_response_handler(input: LLMResponseHandlerInput):
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = input["messages"]
    completion = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True,
    )
    return completion

agent = Agent(
    id="agent-id-1234",
    configure_session=configure_session,
    llm_response_handler=llm_response_handler,
)

We’ll explain each of these components below.

Configuring Sessions

Sessions are configured in the configure_session function, which is called at the beginning of every session. It’s where you set fields such as your speech-to-text provider and initial messages. Here’s a simple example:

async def configure_session(input: ConfigureSessionInput):
    return SessionConfig(
        initial_messages=[],
        vad=VAD.Silero(),
        stt=STT.Deepgram(api_key=os.environ["DEEPGRAM_API_KEY"]),
        tts=TTS.ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"]),
    )

In the configure_session function, you can also set session_data, a dictionary of arbitrary fields that remains available throughout the session. This is particularly useful when your LLM logic needs user-specific values, such as a user ID. Here’s how you can make a my_user_id field available throughout a session:

async def configure_session(input: ConfigureSessionInput):
    return SessionConfig(
        initial_messages=[],
        vad=VAD.Silero(),
        stt=STT.Deepgram(api_key=os.environ["DEEPGRAM_API_KEY"]),
        tts=TTS.ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"]),
        session_data={
            "my_user_id": "test-1234",
        },
    )

The configure_session function can also receive custom fields sent from wherever you start the session, via input["custom_data"]. For example, here’s how you can read a my_user_timezone field and keep it available for the rest of the session:

async def configure_session(input: ConfigureSessionInput):
    # Read a custom field sent by the client that started the session.
    user_timezone = input["custom_data"]["my_user_timezone"]
    return SessionConfig(
        initial_messages=[],
        vad=VAD.Silero(),
        stt=STT.Deepgram(api_key=os.environ["DEEPGRAM_API_KEY"]),
        tts=TTS.ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"]),
        session_data={
            "my_user_timezone": user_timezone,
        },
    )

If you’d like to see how to start sessions (and specify custom data when doing so), see the Starting Sessions guide.

For more details about the configure_session function, see the Configuring Sessions reference.

LLM Responses

The llm_response_handler function is responsible for returning the LLM’s response during a session. It’s called every time a response is expected from the agent.

Here’s an example:

async def llm_response_handler(input: LLMResponseHandlerInput):
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = input["messages"]
    completion = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True,
    )
    return completion

Inside the llm_response_handler, you can truncate the chat history, call a RAG pipeline, use any LLM provider, or add any other logic that controls the LLM’s response.
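
For instance, here’s a minimal sketch that truncates the history to the most recent messages and prepends retrieved context as a system message before calling the model. The retrieve_context helper and the 20-message window are hypothetical stand-ins for your own RAG pipeline and truncation policy:

async def llm_response_handler(input: LLMResponseHandlerInput):
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

    # Keep only the 20 most recent messages (hypothetical truncation policy).
    messages = input["messages"][-20:]

    # Hypothetical RAG step: fetch context relevant to the latest message and
    # prepend it as a system message.
    latest = messages[-1]["content"] if messages else ""
    context = await retrieve_context(latest)  # your own retrieval function
    messages = [{"role": "system", "content": f"Relevant context:\n{context}"}] + messages

    completion = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True,
    )
    return completion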

The llm_response_handler can also access the session_data fields that you define when configuring the session. For example, if you stored a user ID field called my_user_id when configuring the session, you can read it like this in your llm_response_handler:

async def llm_response_handler(input: LLMResponseHandlerInput):
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    # Read the session_data field set in configure_session.
    user_id = input["session_data"]["my_user_id"]
    messages = input["messages"]
    completion = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True,
    )
    return completion

For more details about this function, see the LLM Response Handler reference.

Agent

The Agent object ties together all of your agent’s functionality.

The Agent below is the simplest possible agent. It wires up the two functions defined above (configure_session and llm_response_handler) along with your agent ID, which was added automatically in the Quickstart guide.

agent = Agent(
    id="agent-id-1234",
    configure_session=configure_session,
    llm_response_handler=llm_response_handler,
)

The Agent is also where you’ll define event handlers and function calls (i.e. tool calls), which are covered in other guides.

For a complete list of fields on the Agent, see the Agent reference.

Note: All of the logic related to your agent, including utility functions, must go in an agent folder. This is required because Jay packages this logic into a container that we host when you deploy your agent.
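
As a rough illustration, a project might be laid out like this (the file names here are hypothetical; only the agent folder requirement comes from the note above):

my-project/
├── agent/
│   ├── main.py     # configure_session, llm_response_handler, and the Agent
│   └── utils.py    # any helper functions your agent logic uses
└── ...             # the rest of your application (not packaged with the agent)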

Next Steps

If you’re ready to connect your application to your agent, follow the Starting Sessions guide. Otherwise, if you want to add more functionality to your agent first, here are some guides to help:

  • Function Calling: Also known as “tool calling”, function calling lets you connect LLMs to external data and systems.
  • Event Handlers: Event handlers are functions that react to events that occur throughout the lifecycle of a session. For example, an event is emitted every time the user starts talking.