CSVSearchTool always returns only 3 results - How to increase the limit?

Valclemir_Rodrigues · May 28, 2025, 5:18pm

Problem: I’m using CSVSearchTool to search for products in a CSV database with thousands of records, but regardless of the query, I always receive only 3 results, even when I know there are many more relevant products in the database.

Current code:

from crewai_tools import CSVSearchTool
import os
from dotenv import load_dotenv

load_dotenv(override=True)
os.environ["OPENAI_API_KEY"] = os.getenv("CHAVE_API")

csv_tool = CSVSearchTool(
    csv="base_produtos.csv",
    config=dict(
        llm=dict(
            provider="openai",
            config=dict(
                model="gpt-4.1-mini",
                temperature=0.0
            ),
        ),
        embedder=dict(
            provider="openai",
            config=dict(
                model="text-embedding-3-small"
            ),
        ),
    )
)

Example of the problem:

Query: “Do you have dipyrone?”
Current result: ["DIPYRONE 1G 10CP", "DIPYRONE 500MG 30CP", "DIPYRONE 500MG ENV 10CP"] (always 3)
Expected result: All available dipyrone products (should be 8-10 products)

Attempts made:

Tested with different queries - always 3 results
Verified there are more products in the CSV database - there are dozens of dipyrone variations
Tried changing the LLM temperature - no effect

Question: How can I configure the CSVSearchTool to return more than 3 results? Is there any parameter like k, limit, or results_limit that I can use in the configuration?

Thanks for any help!

Max_Moura · May 28, 2025, 8:24pm

Hey Valclemir,

Try using a custom Adapter like the one I’ve laid out below. I also changed how the search itself is done: the default Adapter relies on Embedchain’s .query() method, while this one calls .search() instead, which lets you set the number of results you get back.

I’ve also beefed up the text that’s returned to the LLM. You really want to think about the tool output as something that enriches the LLM’s understanding. Whenever you’re building custom tools, consider this output as part of the LLM’s actual prompt.

Finally, you’ll notice I got rid of the llm attribute you were passing in your configuration and just stuck with the embedder. Even the original Adapter doesn’t require the llm attribute. It’s only used (in the default Adapter) if you set summarize=True. In that case, Embedchain itself hands your Agent a result that’s been summarized by the LLM specified in that parameter. Since you’re using the raw data from your search, that llm parameter is pretty much never necessary.

from typing import Any
from crewai_tools.tools.rag.rag_tool import Adapter
from embedchain import App

class CustomEmbedchainAdapter(Adapter):
    embedchain_app: App

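    def query(self, question: str) -> str:
        # Build the text handed back to the Agent; treat it as part of the prompt.
        response = "---\n"
        response += f"**Additional Context for Query '{question}':**\n"

        # .search() returns the raw matching chunks; num_documents caps how many.
        search_results = self.embedchain_app.search(
            query=question,
            num_documents=5,  # Up to 5 relevant chunks; raise this to get more
        )

        if search_results:
            # Each result is a dict; this unpacking relies on its two values
            # arriving in (context, metadata) order.
            for context, metadata in (item.values() for item in search_results):
                response += f"**Context:** '{context}' "
                response += f"(**Metadata:** '{metadata}')\n"
        else:
            response += "No relevant context found for this query.\n"

        response += "---\n"
        return response.strip()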

    def add(self, *args: Any, **kwargs: Any) -> None:
        self.embedchain_app.add(*args, **kwargs)

#
# Test it out
#

from crewai_tools import CSVSearchTool
import os

os.environ["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>"

embedchain_config = {
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small"
        }
    }
}

csv_tool = CSVSearchTool(
    csv="/path/to/your/file.csv",
    config=embedchain_config,
    adapter=CustomEmbedchainAdapter(
        embedchain_app=App.from_config(config=embedchain_config)
    )
)

print(
    csv_tool.run("<YOUR_QUERY>")
)
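
If you'd rather not hard-code num_documents=5, one option is to pull the formatting into a small helper that takes the limit as an argument. What follows is only a rough sketch: build_context, SupportsSearch and FakeApp are names I made up for illustration, and FakeApp is a stand-in so the snippet runs without Embedchain or an API key. It also assumes each search result is a dict with "context" and "metadata" keys, which is what the unpacking in the adapter above suggests; adjust if your results look different.

```python
from typing import Any, Protocol


class SupportsSearch(Protocol):
    """Anything exposing a search(query=..., num_documents=...) method."""

    def search(self, query: str, num_documents: int) -> list[dict[str, Any]]: ...


def build_context(app: SupportsSearch, question: str, num_documents: int = 10) -> str:
    """Format up to `num_documents` search hits as text for the LLM."""
    lines = ["---", f"**Additional Context for Query '{question}':**"]
    results = app.search(query=question, num_documents=num_documents)
    if results:
        for item in results:
            # Assumed keys; change these if your results are shaped differently.
            context = item.get("context")
            metadata = item.get("metadata")
            lines.append(f"**Context:** '{context}' (**Metadata:** '{metadata}')")
    else:
        lines.append("No relevant context found for this query.")
    lines.append("---")
    return "\n".join(lines)


class FakeApp:
    """Hypothetical stand-in for the Embedchain app, for a dry run only."""

    def __init__(self, rows: list[str]) -> None:
        self.rows = rows

    def search(self, query: str, num_documents: int) -> list[dict[str, Any]]:
        # Naive substring match instead of a real vector search.
        hits = [row for row in self.rows if query.lower() in row.lower()]
        return [
            {"context": row, "metadata": {"source": "fake.csv"}}
            for row in hits[:num_documents]
        ]


if __name__ == "__main__":
    app = FakeApp([f"DIPYRONE {i}00MG" for i in range(1, 9)] + ["IBUPROFEN 400MG"])
    print(build_context(app, "dipyrone", num_documents=8))
```

Inside the adapter you could then declare a field such as num_documents: int = 10 next to embedchain_app and have query() return build_context(self.embedchain_app, question, self.num_documents). That assumes Adapter accepts extra declared fields the same way it accepts embedchain_app, so check it against your installed crewai_tools version.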

Valclemir_Rodrigues · May 29, 2025, 11:30pm

Max, thank you very much for your help
