I am using a CSV file with tabular data; I have a column called `product_name` which I pass as part of the inputs to the crew. Then I ask an Agent to use the `CSVSearchTool` to retrieve the data for this product, but the data does not match; it's as if the Agent retrieves data for other products. Is there documentation or an example that shows how to use CSV files or structured data, and how Agents can read and use it? I have set the LLM temperature to zero to reduce the risk of hallucinations, but it seems the Agent is not able to find the correct information in the CSV file.
@jets6276 Set `allow_code_execution` to `True` for the agent. This allows the agent to write and run code when executing tasks, which should help improve performance. The default is `False`.
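For reference, that flag is set directly on the `Agent`; here's a minimal sketch (the role, goal, and backstory are illustrative placeholders, not from the original post):

```python
from crewai import Agent

# Hypothetical agent; role/goal/backstory are made-up placeholders.
data_agent = Agent(
    role="Data Analyst",
    goal="Answer questions about products stored in a CSV file",
    backstory="You are meticulous about reading tabular data correctly.",
    allow_code_execution=True,  # lets the agent write and run code; default is False
)
```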
rokbenko’s solution didn’t seem to solve the problem for me. What I’ve done instead is follow this example, where the CSV search tool is initialized with the CSV path and then passed as a tool to the agent.
example:

```python
from crewai_tools import CSVSearchTool

csv_search_tool = CSVSearchTool(csv_file_path)
```

then inside the agent:

```python
@agent
def agent_name(self) -> Agent:
    return Agent(
        config=self.agents_config['agent_name'],
        tools=[csv_search_tool]
    )
```
Hi Leticia, even following this approach, the quality of the responses hasn’t improved at all. I’ve been using ChromaDB, and have tried different models, e.g. OpenAI and Gemini. Have you been able to achieve satisfactory performance with this tool? If so, is there any advice you could share? Many thanks, Alexandre
Hey Alexandre, welcome aboard!
I haven’t really dug into the `CSVSearchTool` code myself. So, while you wait for someone to give you a more spot-on answer for your use case, I’m going to take your question as a chance for us to reflect a bit on something Barry Zhang from Anthropic mentioned in this presentation:

> Think like your agents
Alright, so we’ve got a CSV file packed with information. But let me simplify things a bit. Here’s what our file looks like:
```
Name,Age,ID,Pet
Peter,40,89,Lucy
Susan,35,11,Buddy
David,28,22,Daisy
Laura,32,18,Rocky
Alexandre,30,23,Bella
Mary,25,56,Max
```
Now, say you ask your agent: “What’s Alexandre’s age?” or even “Who owns Max?” As part of the RAG (Retrieval-Augmented Generation) process (`CSVSearchTool` is one of those RAG tools), your file gets broken up into chunks. After a semantic search, the agent gets the following chunk as context for both questions:

```
Laura,32,18,Rocky
Alexandre,30,23,Bella
Mary,25,56,Max
```
Notice what’s going on here? Thinking like our agents, we find “Alexandre” and see the numbers 30 and 23. So, what’s Alexandre’s age? Then we also see “Mary” and “Max” on the same line, and with a bit of intelligence, we can guess there’s some kind of relationship there. But who exactly owns whom? See how, sometimes, by thinking like our agents, we start spotting weaknesses in how we’re tackling real-world use cases?
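To make the chunking problem concrete, here’s a toy splitter. It’s a deliberately naive sketch (real RAG pipelines use smarter splitters, and the 70-character chunk size is made up for illustration), but it shows how only the first chunk keeps the header row:

```python
# Toy illustration of RAG chunking: pack whole lines into chunks of at
# most ~70 characters. Only the first chunk ends up with the CSV header.
csv_text = """Name,Age,ID,Pet
Peter,40,89,Lucy
Susan,35,11,Buddy
David,28,22,Daisy
Laura,32,18,Rocky
Alexandre,30,23,Bella
Mary,25,56,Max"""

def split_into_chunks(text, chunk_size=70):
    """Naive splitter: never breaks a line, starts a new chunk when full."""
    chunks, current = [], ""
    for line in text.splitlines():
        if current and len(current) + len(line) + 1 > chunk_size:
            chunks.append(current)
            current = ""
        current = f"{current}\n{line}" if current else line
    if current:
        chunks.append(current)
    return chunks

chunks = split_into_chunks(csv_text)

# The second chunk is exactly the fragment from the example above:
# bare rows with no header, so 30 vs. 23 is ambiguous for Alexandre.
print(chunks[1])
```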
Now imagine instead that you received a JSON chunk made up of those same last three rows — still assuming your data got split during RAG:
```json
[
  {
    "Name": "Laura",
    "Age": 32,
    "ID": 18,
    "Pet": "Rocky"
  },
  {
    "Name": "Alexandre",
    "Age": 30,
    "ID": 23,
    "Pet": "Bella"
  },
  {
    "Name": "Mary",
    "Age": 25,
    "ID": 56,
    "Pet": "Max"
  }
]
```
Our content is still fragmented, right? But this time, each fragment carries enough information to answer those original questions, and answer them well. I’m not sure if this exactly lines up with your real use case, but this kind of thought process can definitely help whenever you’re trying to solve real problems with agentic systems, whether that’s workflows or agents.
By the way, here’s some Python code that converts a CSV file into a JSON format like the one above, which you can then use with the `JSONSearchTool`. Happy coding!
```python
import pandas as pd
import json


def csv_to_json(csv_filepath, json_filepath):
    """
    Reads a CSV file using pandas and saves its contents as a JSON file.

    Args:
        csv_filepath (str): Path to the input CSV file.
        json_filepath (str): Path where the output JSON file will be saved.
    """
    try:
        # Load CSV into a pandas DataFrame
        df = pd.read_csv(csv_filepath)

        # Convert DataFrame to a list of dictionaries
        json_data = df.to_dict(orient='records')

        # Write JSON data to file with indentation for readability
        with open(json_filepath, "w", encoding="utf-8") as f:
            json.dump(json_data, f, indent=2)

        print(f"Successfully converted '{csv_filepath}' to '{json_filepath}'")
    except FileNotFoundError:
        print(f"Error: CSV file not found at '{csv_filepath}'")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")


csv_file = "pets.csv"
json_file = "pets.json"

csv_to_json(csv_file, json_file)
```
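And to close the loop, a sketch of pointing the RAG tool at the generated file. I’m assuming `JSONSearchTool` accepts a `json_path` argument here, so double-check the `crewai_tools` docs for your installed version:

```python
from crewai_tools import JSONSearchTool

# Assumption: json_path is the keyword for a fixed input file; verify
# against the crewai_tools documentation for your version.
json_search_tool = JSONSearchTool(json_path="pets.json")

# The tool can then be passed to an agent via tools=[json_search_tool].
```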