CrewAI agent keeps triggering rate limits after tool completion

Thank you for your assistance.

I am running an agent to support RAG. I have one tool, 'Create_vector_store', that iterates through a file with txt documents and metadata, reads the txt files, splits/chunks the documents for embedding and storage, stores the documents in a local persistent Chroma database, and then ends. The storage and processing work perfectly, but the agent doesn't stop there: it then sends a request, with what I assume to be all the documents at once, to the LLM (gpt-4o-mini) and triggers a rate limit error:

"2025-02-20 16:28:12,056 - 471248 - llm.py-llm:161 - ERROR: LiteLLM call failed: litellm.RateLimitError: RateLimitError: OpenAIException - Error code: 429 - {'error': {'message': 'Request too large for gpt-4o-mini in organization org-6cu3xAqhtUTsvd4JH8zSbK4w on tokens per min (TPM): Limit 200000, Requested 598946."
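For context, the chunking step inside the tool looks roughly like this. This is a minimal sketch; `chunk_text` and the sizes are my own illustration, not the actual tool code:

```python
# Illustrative sketch of the splitting/chunking step inside
# Create_vector_store. The function name and sizes are assumptions.

def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split one document into overlapping chunks for embedding."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some overlap
    return chunks
```

Each chunk is then embedded and written to the persistent Chroma collection along with its metadata.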

The agent does not have to do anything except start the tool. I am running a crew with one agent, with these parameters:
verbose=False,
memory=False,
cache=True,
max_rpm=100,
share_crew=False,
max_iter=1,
max_retry_limit=1,
respect_context_window=True
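For reference, here is roughly where those parameters live in my setup: verbose, memory, cache, max_rpm, and share_crew are crew-level, while max_iter, max_retry_limit, and respect_context_window are agent-level. This is a sketch only; the agent/task names are placeholders:

```python
# Sketch of the single-agent crew; rag_agent, create_vector_store, and the
# role/goal strings are placeholders, not the exact code.
from crewai import Agent, Crew, Task

rag_agent = Agent(
    role="Vector store builder",
    goal="Ingest txt documents into a local Chroma store",
    backstory="Runs a single ingestion tool and stops.",
    tools=[create_vector_store],   # the Create_vector_store tool
    max_iter=1,                    # agent-level settings
    max_retry_limit=1,
    respect_context_window=True,
)

crew = Crew(
    agents=[rag_agent],
    tasks=[Task(
        description="Build the vector store",
        expected_output="Confirmation that the store was built",
        agent=rag_agent,
    )],
    verbose=False,                 # crew-level settings
    memory=False,
    cache=True,
    max_rpm=100,
    share_crew=False,
)
```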

I cannot find any way to stop the agent, or to stop it from sending that request.

Any help is greatly appreciated!

I did not figure out how to limit the returned tokens, but I did discover that my tool was returning an index, and that return value was what triggered the huge request. I now write the index to a JSON file inside the tool and deleted the return of the index, and that took care of it.
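For anyone hitting the same problem, the fix looks roughly like this (a sketch; the function and file names are illustrative): anything a tool returns gets fed back into the next LLM request, so persist large results to disk and return only a short confirmation string.

```python
import json

def finish_tool_run(index: dict, path: str = "vector_store_index.json") -> str:
    """Persist the index to disk instead of returning it to the agent.

    Returning the full index would be echoed into the next LLM request
    and can blow past the tokens-per-minute limit (the 429 above).
    """
    with open(path, "w") as f:
        json.dump(index, f)
    # Hand the agent only a small confirmation, not the data itself.
    return f"Stored {len(index)} entries in {path}"
```

The key point is the return value: a one-line string keeps the agent's follow-up request tiny, while the index stays available on disk for later retrieval steps.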