Streaming is not supported for Ollama models

I am using an Ollama Qwen model for a RAG agent application. When I select the o3-mini model, the agent's response streams in as chunks, but when I select the Ollama Qwen model, the response arrives as one complete string.

This behaviour makes the wait before anything appears on screen very long, especially when the response is a large block of text.

Here is the code:

    if last_message and last_message.get("role") == "user":
        question = last_message["content"]
        with st.chat_message("assistant"):
            # Create container for tool calls
            tool_calls_container = st.empty()
            resp_container = st.empty()
            with st.spinner("🤔 Thinking..."):
                response = ""
                try:
                    # Run the agent and stream the response
                    run_response = agentic_rag_agent.run(question, stream=True)
                    for _resp_chunk in run_response:
                        # Display tool calls if available
                        if _resp_chunk.tools and len(_resp_chunk.tools) > 0:
                            display_tool_calls(tool_calls_container, _resp_chunk.tools)

                        print(_resp_chunk, _resp_chunk.content)
                        # Display response
                        if _resp_chunk.content is not None:
                            response += _resp_chunk.content
                            resp_container.markdown(response)

                    add_message(
                        "assistant", response, agentic_rag_agent.run_response.tools
                    )
                except Exception as e:
                    error_message = f"Sorry, I encountered an error: {str(e)}"
                    add_message("assistant", error_message)
                    st.error(error_message)
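
To rule out the model server itself, it helps to check whether Ollama streams tokens at all outside of Agno. Below is a minimal sketch using the official `ollama` Python client; the model id is illustrative, substitute whatever you have pulled locally:

    # Sanity check: does the Ollama server stream tokens on its own?
    import ollama

    stream = ollama.chat(
        model="qwen2.5:72b-instruct-q2_K-16k",  # illustrative; use a model you have pulled
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
    )
    for chunk in stream:
        # Each chunk carries an incremental piece of the assistant message
        print(chunk["message"]["content"], end="", flush=True)

If this prints token by token, the Ollama server is streaming fine and the buffering happens higher up in the stack.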

o3-mini response:

    RunResponse(content='', content_type='str', thinking=None, event='RunResponse', messages=None, metrics=None, model='o3-mini', run_id='e8e64837-3e68-4231-b304-7581d6c9a97b', agent_id='86090364-1c65-4c49-926a-0ad725c0dad4', session_id='8baae117-b6a8-4e5f-9c1a-50d35c7b0bc5', workflow_id=None, tools=None, images=None, videos=None, audio=None, response_audio=None, extra_data=None, created_at=1744093287)
    RunResponse(content='Hello', content_type='str', thinking=None, event='RunResponse', messages=None, metrics=None, model='o3-mini', run_id='e8e64837-3e68-4231-b304-7581d6c9a97b', agent_id='86090364-1c65-4c49-926a-0ad725c0dad4', session_id='8baae117-b6a8-4e5f-9c1a-50d35c7b0bc5', workflow_id=None, tools=None, images=None, videos=None, audio=None, response_audio=None, extra_data=None, created_at=1744093287) Hello
    RunResponse(content='!', content_type='str', thinking=None, event='RunResponse', messages=None, metrics=None, model='o3-mini', run_id='e8e64837-3e68-4231-b304-7581d6c9a97b', agent_id='86090364-1c65-4c49-926a-0ad725c0dad4', session_id='8baae117-b6a8-4e5f-9c1a-50d35c7b0bc5', workflow_id=None, tools=None, images=None, videos=None, audio=None, response_audio=None, extra_data=None, created_at=1744093287) !
    RunResponse(content=' Welcome', content_type='str', thinking=None, event='RunResponse', messages=None, metrics=None, model='o3-mini', run_id='e8e64837-3e68-4231-b304-7581d6c9a97b', agent_id='86090364-1c65-4c49-926a-0ad725c0dad4', session_id='8baae117-b6a8-4e5f-9c1a-50d35c7b0bc5', workflow_id=None, tools=None, images=None, videos=None, audio=None, response_audio=None, extra_data=None, created_at=1744093287)  Welcome

Ollama Qwen response:

RunResponse(content="Hello! I'm here to assist you with any IT-related issues. Whether it's a login problem, software installation, system crash, or anything else, I can help create a ticket and get it resolved. How can I assist you today? If this is your first time interacting with me, just let me know what you need help with! 😊", content_type='str', thinking=None, event='RunResponse', messages=None, metrics=None, model='qwen2.5:72b-instruct-q2_K-16k', run_id='8917c70c-cc1d-4f8d-9200-883ea30c5848', agent_id='192d3fcd-9dba-47ea-a704-e30d4c265f14', session_id='20673cd6-7b0b-4d50-838e-13b7156ef628', workflow_id=None, tools=None, images=None, videos=None, audio=None, response_audio=None, extra_data=None, created_at=1744093287) Hello! I'm here to assist you with any IT-related issues. Whether it's a login problem, software installation, system crash, or anything else, I can help create a ticket and get it resolved. How can I assist you today? If this is your first time interacting with me, just let me know what you need help with! 😊

Hi @Ahmad

Thanks for reaching out and supporting Agno. I've shared this with the team; we're working through all requests one by one and will get back to you soon.
If it's urgent, please let us know. We appreciate your patience!

Hi @Ahmad

In our experience, performance with these models is not always consistent. We recommend trying LMStudio to see if it works better for you.
There are instructions in the LMStudio cookbook README:

cookbook/models/lmstudio/README.md
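
For anyone following along, here is a rough sketch of what pointing the agent at an LMStudio-served model could look like. The `LMStudio` class name, import path, and model id are assumptions for illustration; follow the cookbook README above for the exact setup (LMStudio's local server exposes an OpenAI-compatible endpoint, by default at http://localhost:1234/v1):

    # Sketch only: swap the Ollama model for a model served by LMStudio
    from agno.agent import Agent
    from agno.models.lmstudio import LMStudio  # assumed import path; verify against the cookbook README

    # The id must match a model loaded in the LMStudio local server
    agent = Agent(model=LMStudio(id="qwen2.5-7b-instruct"))

    for chunk in agent.run("Hello", stream=True):
        if chunk.content is not None:
            print(chunk.content, end="", flush=True)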