Hey there,
I’m relatively new to CrewAI, but I’m quite fascinated so far. I’m struggling to keep the process under control, though, mainly when it comes to custom-made tools.
My primary question is: how do I best control a Crew/Agent when Tasks or Tools fail?
Let’s assume I have a custom tool that queries some data from an endpoint. I got that working easily, but I noticed that it’s hard to get a reliable answer structure, and seemingly also hard to deal with failures. It’s also tough for me to get control over the number of retries/executions.
I ran into this in a rather big application already, so I tried to reproduce it in a simple test flow, and it behaves pretty much the same.
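To make "a reliable answer structure" concrete, here is a minimal plain-Python sketch, independent of CrewAI. The names `run_tool_safely` and `failing_fetch` and the `ok`/`data`/`error` keys are hypothetical choices for illustration, not part of any library API. The idea is that the fetch is called exactly once and the caller always gets back the same JSON shape, whether the call succeeded or not.

```python
import json
from typing import Callable


def run_tool_safely(fetch: Callable[[str], dict], resource_id: str) -> str:
    """Call `fetch` exactly once and always return a JSON string with the same keys."""
    try:
        data = fetch(resource_id)
        envelope = {"ok": True, "data": data, "error": None}
    except Exception as exc:  # deliberately broad: any failure becomes a structured result
        envelope = {"ok": False, "data": None, "error": f"{type(exc).__name__}: {exc}"}
    return json.dumps(envelope)


def failing_fetch(resource_id: str) -> dict:
    # Hypothetical stand-in for the real endpoint call; it always fails.
    raise ValueError(f"Could not find data for resource ID '{resource_id}'")


result = json.loads(run_tool_safely(failing_fetch, "item-123"))
print(result["ok"], result["error"])
```

Whether a tool should return such an envelope or raise is a design choice: returning it keeps the failure visible as data, but it then depends on the consumer actually checking the `ok` flag.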
Here’s my code - I hope it’s okay to post this here.
import os
from crewai import Agent, Crew, Process, Task
from crewai.tools import BaseTool
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
# Load environment variables from .env file
load_dotenv()
# Check if the key is loaded (optional debug step)
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    print("Warning: OPENAI_API_KEY not found after loading .env!")
else:
    print("OPENAI_API_KEY loaded successfully.")
# --- 1. Define a Tool that Simulates Data Fetching Failure ---
class DataFetcherTool(BaseTool):
    name: str = "DataFetcherTool"
    description: str = "A tool to fetch specific data based on an ID."
    raise_on_error: bool = True  # Ensure errors propagate

    def _run(self, resource_id: str) -> str:
        print(f"Tool: Attempting to fetch data for resource_id: {resource_id}")
        # Simulate a failure condition (e.g., data not found)
        raise ValueError(f"Resource Error: Could not find data for resource ID '{resource_id}'")
# --- 2. Initialize the Tool ---
data_fetcher_tool = DataFetcherTool()
# --- 3. Define the Agent ---
# Assumes OPENAI_API_KEY is set in the environment
llm = ChatOpenAI(model="gpt-4o")
data_retrieval_agent = Agent(
    role='Data Retrieval Specialist',
    goal=(
        'Use the DataFetcherTool to retrieve specific information. '
        'Handle potential issues gracefully.'
    ),
    backstory=(
        "You are an agent responsible for fetching data from various sources. "
        "You must use the provided tool to get the data for a specific resource ID. "
        "If the tool indicates the resource cannot be found or returns an error, "
        "you must stop immediately and report this outcome. Do not invent data."
    ),
    verbose=True,
    allow_delegation=False,
    llm=llm,
    max_retry_limit=0,  # Limit Agent retries
    max_iter=1,  # Limit tool iterations to ensure it stops on tool error
    tools=[data_fetcher_tool]
)
# --- 4. Define the Task ---
fetch_data_task = Task(
    description=(
        "Use the DataFetcherTool to fetch data for resource ID 'item-123'. "
        "If successful, the tool should return a JSON string with 'name' and 'value'. "
        "Your final answer MUST be this JSON string. "
        "If the tool cannot find the resource or encounters an error, report that the operation could not be completed."
    ),
    expected_output="A JSON string like '{\"name\": \"...\", \"value\": \"...\"}' or a confirmation that the data could not be fetched.",
    agent=data_retrieval_agent,
    max_retries=0,  # Limit task retries
    verbose=True
)
# --- 5. Define the Crew ---
data_crew = Crew(
    agents=[data_retrieval_agent],
    tasks=[fetch_data_task],
    process=Process.sequential,
    verbose=True  # Set verbose to boolean True
)
# --- 6. Execute the Crew with Error Handling ---
if __name__ == "__main__":
    print("\n--- Starting Minimal Crew Data Fetch Test ---")
    test_input = {'input_data': 'trigger'}  # Crew needs some input dict
    try:
        result = data_crew.kickoff(inputs=test_input)
        print("\n--- Crew Execution Finished ---")
        print(f"Result: {result}")
        print("Status: SUCCESS (Should not happen if tool fails correctly)")
    except Exception as e:
        print("\n--- Crew Execution FAILED ---")
        print(f"Caught Exception: {type(e).__name__}")
        print(f"Error Message: {e}")
        print("Status: FAILURE (Expected due to tool error)")
    print("\n--- Test Complete ---")
Here’s the result of the code above: Hastebin
From the execution you can see that the Agent uses the tool multiple times even though max_iter is set to 1 (it doesn’t use the tool at all if I set it to 0, so I guess 1 should be the right value if I only want it to execute once?). I also set the retries on both the Agent and the Task to 0, but the tool is still executed more than once.
And then, even though the tool call fails, the Task is reported as done, and so is the Crew.
I’m sure I’m just missing something very obvious, but I hope the foundation is good enough to get some help here.
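For comparison, this is roughly the behaviour I would expect, sketched in plain Python without CrewAI. `CallBudget` and `failing_fetch` are hypothetical names made up for this illustration and say nothing about how CrewAI handles retries internally. The wrapper allows a fixed number of executions, refuses any further ones, and records errors so the caller can tell that the run failed regardless of what gets reported upstream.

```python
from typing import Callable, List


class CallBudget:
    """Wrap a function, count its calls, and refuse calls beyond `max_calls`."""

    def __init__(self, func: Callable[[str], str], max_calls: int = 1) -> None:
        self.func = func
        self.max_calls = max_calls
        self.calls = 0
        self.errors: List[str] = []

    def __call__(self, resource_id: str) -> str:
        if self.calls >= self.max_calls:
            raise RuntimeError(f"Call budget of {self.max_calls} exhausted")
        self.calls += 1
        try:
            return self.func(resource_id)
        except Exception as exc:
            self.errors.append(str(exc))  # remember the failure, then re-raise it
            raise


def failing_fetch(resource_id: str) -> str:
    # Hypothetical stand-in for the endpoint call; it always fails.
    raise ValueError(f"Could not find data for resource ID '{resource_id}'")


guarded = CallBudget(failing_fetch, max_calls=1)
for attempt in range(3):  # simulate a caller that keeps retrying
    try:
        guarded("item-123")
    except (ValueError, RuntimeError) as exc:
        print(f"attempt {attempt + 1}: {type(exc).__name__}: {exc}")

# The caller decides whether the run failed, based on what was recorded.
print("calls:", guarded.calls, "failed:", bool(guarded.errors))
```

Only the first attempt reaches the wrapped function here; the later ones are rejected by the budget check before the fetch runs again.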
Thanks