How can I use a process queue and rate limiter with Agno? Say I have a rate limit of 3, one agent makes 2 LLM calls, and another makes 1. With the default rate limiter, a third agent will either fail or get exponentially backed off, so multiple API calls will be attempted and wasted.
Hey @JoeMukherjee, thank you for your question.
Agno doesn’t have a built-in global queue for this yet, so the recommended approach is to add a shared process queue + rate limiter around your agents/teams.
Instead of each agent calling the LLM directly, they submit requests into the queue. A single worker then pulls tasks out and executes them under the shared rate limit (e.g., 3 calls/sec).
So the fix is to introduce a queue that enforces the global limit, rather than letting each agent back off on its own.
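As a rough illustration of that pattern, here is a minimal asyncio sketch. It is not an Agno API: `RateLimitedQueue`, `fake_llm_call`, and `demo` are hypothetical names made up for this example, and `fake_llm_call` stands in for whatever async call your agent actually makes. The worker uses a simple sliding window over call start times and runs calls one at a time.

```python
import asyncio
import time
from collections import deque
from typing import Any, Awaitable, Callable


class RateLimitedQueue:
    """Hypothetical helper: one worker starts at most `max_calls` calls per `period` seconds."""

    def __init__(self, max_calls: int, period: float = 1.0) -> None:
        self.max_calls = max_calls
        self.period = period
        self._queue: asyncio.Queue = asyncio.Queue()
        self._starts: deque = deque()  # start times of the most recent calls

    async def submit(self, make_call: Callable[[], Awaitable[Any]]) -> Any:
        """Enqueue a call and wait until the worker has executed it."""
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((make_call, future))
        return await future

    async def _wait_for_slot(self) -> None:
        now = time.monotonic()
        # Forget calls that started more than one period ago.
        while self._starts and now - self._starts[0] >= self.period:
            self._starts.popleft()
        if len(self._starts) >= self.max_calls:
            # Window is full: sleep until the oldest call leaves it.
            await asyncio.sleep(self.period - (now - self._starts[0]))
            self._starts.popleft()
        self._starts.append(time.monotonic())

    async def run(self) -> None:
        """Worker loop: the only place where calls are actually executed."""
        while True:
            make_call, future = await self._queue.get()
            await self._wait_for_slot()
            try:
                result = await make_call()
            except Exception as exc:
                if not future.done():
                    future.set_exception(exc)
            else:
                if not future.done():
                    future.set_result(result)


async def fake_llm_call(name: str) -> str:
    """Assumption: placeholder for a real agent/model call."""
    await asyncio.sleep(0.01)
    return f"response for {name}"


async def demo(max_calls: int = 3, period: float = 1.0):
    limiter = RateLimitedQueue(max_calls, period)
    worker = asyncio.create_task(limiter.run())
    started = []  # when each call actually began, for inspection

    async def tracked(name: str) -> str:
        started.append(time.monotonic())
        return await fake_llm_call(name)

    async def agent(name: str, n_calls: int):
        # Each agent submits to the shared queue instead of calling the LLM itself.
        return [await limiter.submit(lambda i=i: tracked(f"{name}-{i}"))
                for i in range(n_calls)]

    results = await asyncio.gather(agent("a", 2), agent("b", 1), agent("c", 1))
    worker.cancel()
    return results, started


if __name__ == "__main__":
    print(asyncio.run(demo())[0])
```

With a limit of 3 per second, the fourth call simply waits in the queue until the window frees up, so nothing is rejected or retried. Because this worker awaits each call before taking the next one, throughput is also bounded by call latency; dispatching each call as its own task after `_wait_for_slot()` would be one way to let calls overlap.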
Let me know if you have any follow-up questions.