Error 'insufficient_quota' using Llama3

I am trying to create a crew with Llama3, which is already correctly installed on my machine; I have even run tests with LangChain where everything worked as expected.

I created a crew that worked well with OpenAI. Now that I am changing it to use Llama, the following error occurs:

2024-11-06 22:49:46,582 - 131459711406208 - llm.py-llm:161 - ERROR: LiteLLM call failed: litellm.RateLimitError: RateLimitError: OpenAIException - Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

I would like help to understand the reason and know how to solve it. I am creating the LLM like this:

from crewai import LLM

llm = LLM(model="ollama/llama3.1", base_url="http://localhost:11434")
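As a side note on my setup: assuming a default local Ollama install (port 11434), I confirmed the server is reachable and the model is pulled before pointing CrewAI at it:

```shell
# List the models the local Ollama server is currently serving
# (assumes Ollama's default endpoint at http://localhost:11434)
curl http://localhost:11434/api/tags
```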

I also tried with Groq, but the same error occurs.

from langchain_groq import ChatGroq

llm = ChatGroq(
    api_key="my key",
    model="llama3-8b-8192"
)

I would greatly appreciate it if someone could help.

FabrĂ­cio Silva

@Fabricio_Silva You’re seeing this error:

You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.

This happens because the code is using OpenAI’s API, but your OpenAI account doesn’t have any credit.

Why? Even though you’re using Llama via Ollama, the OpenAI API is still called somewhere in your CrewAI code, likely without you realizing it.

Where? A common place this happens is when memory is set to True in CrewAI. By default, CrewAI uses OpenAI’s embedding model (e.g., text-embedding-3-small) to manage memory.

How do you avoid using the OpenAI API for memory? You can customize the embedder; see the docs. If you want to use Ollama for embeddings, configure your crew like this:

from crewai import Crew, Agent, Task, Process

my_crew = Crew(
    agents=[...],
    tasks=[...],
    process=Process.sequential,
    memory=True,
    verbose=True,
    embedder={
        "provider": "ollama",
        "config": {
            "model": "mxbai-embed-large"
        }
    }
)
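One practical note, assuming a default local Ollama install: the embedding model named in the `embedder` config above must already be pulled locally, or the embedding calls will fail at runtime:

```shell
# Download the mxbai-embed-large embedding model so Ollama can serve it
ollama pull mxbai-embed-large
```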