TPM: Tokens per minute limiting

CrewAI has RPM (requests per minute) options for crews, but not TPM (tokens per minute), which is necessary for services such as Groq. Here’s an example error:

2024-12-10 16:50:11,862 - 135614723733312 - llm.py-llm:170 - ERROR: LiteLLM call failed: litellm.RateLimitError: RateLimitError: GroqException - {"error":{"message":"Rate limit reached for model llama-3.3-70b-versatile in organization org_xxxx on tokens per minute (TPM): Limit 6000, Used 7599, Requested 1777. Please try again in 33.764s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"tokens","code":"rate_limit_exceeded"}}

Are there any plans to implement this, or are there any workarounds that people have used?
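One workaround, while there is no built-in option, is to track token usage yourself in a sliding 60-second window and sleep before a call would push you over the budget. This is a minimal sketch, assuming you can estimate the token count of each request before sending it; `TPMLimiter` is a hypothetical helper, not part of CrewAI:

```python
import time
from collections import deque

class TPMLimiter:
    """Sliding-window tokens-per-minute limiter (hypothetical helper).

    Call wait(n) before each LLM request with an estimate of the tokens
    it will consume; it blocks until the request fits under the budget.
    """

    def __init__(self, tokens_per_minute: int):
        self.limit = tokens_per_minute
        self.events = deque()  # (timestamp, tokens) pairs within the last 60s

    def _used(self, now: float) -> int:
        # Drop events that have aged out of the 60-second window.
        while self.events and now - self.events[0][0] >= 60:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)

    def wait(self, tokens: int) -> None:
        while True:
            now = time.monotonic()
            if self._used(now) + tokens <= self.limit:
                self.events.append((now, tokens))
                return
            # Sleep until the oldest event leaves the window, then re-check.
            time.sleep(max(0.0, 60 - (now - self.events[0][0])))
```

For the error above (limit 6000 TPM), you would create `TPMLimiter(6000)` and call `limiter.wait(estimated_tokens)` before each LLM call. Estimating tokens ahead of time is the hard part; overestimating slightly is safer than underestimating.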


That’d be awesome to have. Are there any plans to implement the limit, or at least a backoff?

Welcome to the community

Yes, TPM (tokens per minute) sounds like a good idea to have.
In the meantime, this might be useful: Agents - CrewAI

@tonykipkemboi Is there an approved way to make suggestions?

Thanks!

Just FYI: I’ve used all of these recommendations in my project from the start. I’ve only hit the TPM rate limit once so far, so they don’t guarantee it won’t happen, but I’m sure they reduce the risk.