TPM: Tokens per minute limiting

CrewAI has RPM (requests per minute) options for crews, but not TPM (tokens per minute), which is necessary for services such as Groq. Here’s an example error:

2024-12-10 16:50:11,862 - 135614723733312 - llm.py-llm:170 - ERROR: LiteLLM call failed: litellm.RateLimitError: RateLimitError: GroqException - {"error":{"message":"Rate limit reached for model llama-3.3-70b-versatile in organization org_xxxx on tokens per minute (TPM): Limit 6000, Used 7599, Requested 1777. Please try again in 33.764s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"tokens","code":"rate_limit_exceeded"}}

Are there any plans to implement this, or are there any workarounds that people have used?
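One workaround I've seen is to throttle calls yourself with a sliding-window token budget before each LLM request. Below is a minimal sketch (the `TPMLimiter` name and the rough `len(text) // 4` token estimate are my own assumptions, not a CrewAI API); you would call `acquire()` in whatever wrapper you use around your LLM calls:

```python
import time
from collections import deque


def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 chars per token); swap in a real tokenizer
    # such as tiktoken for tighter estimates.
    return max(1, len(text) // 4)


class TPMLimiter:
    """Sliding-window tokens-per-minute limiter (illustrative sketch).

    Call acquire(n) before each LLM request with an estimate of the
    tokens the request will consume; it blocks until the usage in the
    last 60 seconds plus n fits under the limit.
    """

    def __init__(self, tokens_per_minute: int):
        self.limit = tokens_per_minute
        self.events = deque()  # (timestamp, token_count) pairs

    def _used(self, now: float) -> int:
        # Drop entries older than 60 seconds, then sum the remainder.
        while self.events and now - self.events[0][0] >= 60:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)

    def acquire(self, tokens: int) -> None:
        while True:
            now = time.monotonic()
            if self._used(now) + tokens <= self.limit:
                self.events.append((now, tokens))
                return
            # Sleep until the oldest entry ages out of the window.
            wait = 60 - (now - self.events[0][0])
            time.sleep(max(wait, 0.01))


# Usage: for Groq's 6000 TPM tier, gate each call on the estimate.
limiter = TPMLimiter(6000)
prompt = "Summarize the quarterly report..."
limiter.acquire(estimate_tokens(prompt))
# ...then make the actual LLM call here.
```

This under-counts because completions also consume tokens, so in practice you'd pad the estimate or set the limit below your actual tier. It also only coordinates within one process.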
