[BUG] LiteLLM + disable streaming breaks for Azure OpenAI #477

@dkmiller

Description

@dkmiller

Checks

  • I have updated to the latest minor and patch version of Strands
  • I have checked the documentation and this is not expected behavior
  • I have searched ./issues and there are no duplicates of my issue

Strands Version

1.0.0

Python Version

3.11.13

Operating System

Linux

Installation Method

pip

Steps to Reproduce

  1. Start a fresh Docker image with Python 3.11
    docker run -it python:3.11 /bin/bash
  2. Install the latest strands-agents SDK:
    pip install "strands-agents[litellm]" strands-agents-tools
  3. Run the sample code below.
    import os
    from strands import Agent
    from strands.models.litellm import LiteLLMModel
    from strands_tools import calculator
    
    os.environ["AZURE_API_KEY"] = "REDACTED"
    os.environ["AZURE_API_BASE"] = "https://eastus2.api.cognitive.microsoft.com"
    _model = LiteLLMModel(model_id="azure/gpt-4.1", params={"stream": False, "stream_options": None})
    
    agent = Agent(
        model=_model,
        system_prompt="You are an AI agent who helps with information and calculations about numbers. Use the calculator for any complex arithmetic.",
        tools=[calculator],
    )
    
    agent("What is 123981723 + 234982734")

Expected Behavior

The agent calculates the sum of two numbers using a tool call.

Actual Behavior

The strands-agents SDK throws an exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/strands/agent/agent.py", line 379, in __call__
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/opentelemetry/instrumentation/threading/__init__.py", line 171, in wrapped_func
    return original_func(*func_args, **func_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/strands/agent/agent.py", line 375, in execute
    return asyncio.run(self.invoke_async(prompt, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/strands/agent/agent.py", line 400, in invoke_async
    async for event in events:
  File "/usr/local/lib/python3.11/site-packages/strands/agent/agent.py", line 510, in stream_async
    async for event in events:
  File "/usr/local/lib/python3.11/site-packages/strands/agent/agent.py", line 546, in _run_loop
    async for event in events:
  File "/usr/local/lib/python3.11/site-packages/strands/agent/agent.py", line 585, in _execute_event_loop_cycle
    async for event in events:
  File "/usr/local/lib/python3.11/site-packages/strands/event_loop/event_loop.py", line 182, in event_loop_cycle
    raise e
  File "/usr/local/lib/python3.11/site-packages/strands/event_loop/event_loop.py", line 130, in event_loop_cycle
    async for event in stream_messages(agent.model, agent.system_prompt, agent.messages, tool_specs):
  File "/usr/local/lib/python3.11/site-packages/strands/event_loop/streaming.py", line 318, in stream_messages
    async for event in process_stream(chunks):
  File "/usr/local/lib/python3.11/site-packages/strands/event_loop/streaming.py", line 273, in process_stream
    async for chunk in chunks:
  File "/usr/local/lib/python3.11/site-packages/strands/models/litellm.py", line 138, in stream
    async for event in response:
TypeError: 'async for' requires an object with __aiter__ method, got ModelResponse
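The final frame points at the root cause: with `stream=False`, litellm returns a plain `ModelResponse` object rather than an async stream, and `ModelResponse` does not implement `__aiter__`, so the `async for` in `strands/models/litellm.py` fails. A minimal sketch of the same failure mode, using a hypothetical stand-in class rather than the real litellm type:

```python
import asyncio

class ModelResponse:
    """Hypothetical stand-in for litellm's ModelResponse; like the real
    class in non-streaming mode, it defines no __aiter__ method."""
    pass

async def stream(response):
    # Same consumption pattern as strands/models/litellm.py line 138.
    async for event in response:
        pass

try:
    asyncio.run(stream(ModelResponse()))
except TypeError as exc:
    # Message matches the traceback above:
    # 'async for' requires an object with __aiter__ method, got ModelResponse
    print(exc)
```

This reproduces the exact `TypeError` without any Azure credentials, which suggests the bug is in how the SDK consumes a non-streaming response, not in the Azure endpoint itself.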

Additional Context

No response

Possible Solution

No response
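One direction worth exploring (an assumption about the SDK internals, not a confirmed fix): the litellm model's `stream` method could detect a non-async-iterable response and wrap it in a single-item async generator, so downstream `async for` loops work in both streaming and non-streaming modes. A hypothetical adapter sketch, with all names invented for illustration:

```python
import asyncio
from typing import Any, AsyncIterator

async def ensure_async_iterable(response: Any) -> AsyncIterator[Any]:
    """Hypothetical adapter (not part of the SDK): pass streaming
    responses through unchanged, and yield a non-streaming response
    exactly once so callers can always use 'async for'."""
    if hasattr(response, "__aiter__"):
        async for event in response:
            yield event
    else:
        yield response

# Usage sketch: a plain (non-iterable) response is wrapped into a
# one-item async stream.
async def demo():
    events = [e async for e in ensure_async_iterable("whole-response")]
    print(events)  # ['whole-response']

asyncio.run(demo())
```

The actual fix would also need to convert the single `ModelResponse` into the chunk events the rest of the event loop expects, which is beyond this sketch.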

Related Issues

#312
