How to handle empty results from Executor Agent and trigger QueryBuilder Agent to simplify the query?

I’m working with two agents in a pipeline:

  1. QueryBuilder: translates natural language input into a keyword-based query with logical conditions.
  2. Executor: calls an endpoint to perform a search using the query generated by QueryBuilder.

The issue I’m facing is that the QueryBuilder sometimes generates queries that are too complex or restrictive, so the Executor returns no results.

My idea is: after the Executor receives an empty result list, it should “pass the ball back” to the QueryBuilder and ask it to simplify the query to try again.

My questions are:

  • How can this mechanism be implemented effectively in CrewAI?
  • Are there better or more elegant ways to handle this scenario of empty results and query refinement?

Any suggestions, examples, or best practices would be much appreciated!

Thanks in advance!

If you find it better to treat each stage as a “step” or “node,” then you should use Flows with a flow control node placed after the Executor. This node is necessary because CrewAI doesn’t allow regular nodes to control the execution flow by returning a string; only flow control nodes have this capability. In this setup, QueryBuilder would listen for both the natural flow (i.e., the completion of its preceding node) and an error string emitted by the flow control node after Executor to restart the process. The data history should be maintained in the shared state for common access.
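To make the control flow concrete, here is a minimal, framework-free sketch of that loop. It deliberately does not use the CrewAI API: `SearchState`, `route`, and `run_pipeline` are hypothetical names, and `build_query` / `execute` stand in for the QueryBuilder and Executor steps. In a real Flow, `route` would play the role of the flow control node (a `@router` method returning a label), and the history would live in the Flow's shared state.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SearchState:
    """Shared state: the user prompt plus every (query, hit_count) tried so far."""
    prompt: str
    history: list[tuple[str, int]] = field(default_factory=list)


def route(hit_count: int, attempts: int, max_attempts: int) -> str:
    """Flow-control step: returns a label that decides what runs next."""
    if hit_count > 0:
        return "done"
    if attempts >= max_attempts:
        return "give_up"
    return "retry"


def run_pipeline(
    prompt: str,
    build_query: Callable[[SearchState], str],   # stand-in for QueryBuilder
    execute: Callable[[str], list[str]],         # stand-in for Executor
    max_attempts: int = 3,
) -> tuple[str, SearchState]:
    state = SearchState(prompt=prompt)
    while True:
        # QueryBuilder sees the full history, so it can avoid failed queries.
        query = build_query(state)
        results = execute(query)
        state.history.append((query, len(results)))
        label = route(len(results), len(state.history), max_attempts)
        if label != "retry":
            return label, state
```

Capping the number of attempts in the routing step matters: without it, a query that can never match would loop forever.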

Alternatively, if you prefer a more self-contained approach, Executor could be implemented as a tool that QueryBuilder itself executes and then evaluates the output. Since the same tool can be called multiple times, this method can likely handle the necessary retries.

In either case, I strongly advise you to adopt a clear and consistent return signature for Executor. Whether it’s a node or a tool, it should return both the original query and the result of its execution. This way, QueryBuilder can understand that regenerating the same failed query is pointless. I’ve adopted a similar approach of standardizing tool outputs in these sample tools.
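As a rough illustration of such a return signature (not the sample tools mentioned above), the sketch below wraps any search function so that it always returns the same JSON shape, including the query that was executed. `SearchOutcome`, `run_search`, and `search_fn` are hypothetical names introduced only for this example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable


@dataclass
class SearchOutcome:
    """One fixed shape for every execution, successful or not."""
    query: str
    success: bool
    article_ids: list[str] = field(default_factory=list)
    error_message: str = ""


def run_search(query: str, search_fn: Callable[[str], list[str]]) -> str:
    """Run search_fn(query) and serialize the outcome together with the query."""
    try:
        ids = list(search_fn(query))
        outcome = SearchOutcome(query=query, success=True, article_ids=ids)
    except Exception as exc:
        # A failed call still echoes the query back to the caller.
        outcome = SearchOutcome(query=query, success=False, error_message=str(exc))
    return json.dumps(asdict(outcome))
```

Because the failed query is always echoed back, the agent (or the next node) has what it needs to avoid resubmitting it.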

Here’s where I am now:

  • I’ve merged responsibilities: the QueryBuilder agent now both generates the query and uses a tool to execute it.
  • A guardrail function checks the result count, and when it’s too low or zero, it returns a message like:

“No results were found. The query might be too strict or overly specific. Please reformulate it by relaxing some conditions.”

This part works — the guardrail is triggered correctly.

The issue is: the QueryBuilder acknowledges that the query might be too strict… but then proceeds to reformulate the same exact query, or one that’s only minimally different. In practice, it’s not really relaxing the conditions or simplifying the logic.

Here’s the current behavior summarized:

  1. First query is generated and executed ✅
  2. Guardrail detects few/no results ✅
  3. Prompt sent back to QueryBuilder: “Please reformulate” ✅
  4. QueryBuilder replies with essentially the same query again ❌

My questions:

  • How can I encourage the agent to significantly revise the query, rather than just repeating or slightly tweaking it?
  • Are there any prompting patterns or CrewAI-specific strategies to push the agent to recognize failure and diversify its output?

Here’s the actual implementation:

crew.py

from crewai import Agent, Task, Crew
from crewai.project import CrewBase, agent, task, crew
from dotenv import load_dotenv
from crewai.tools import tool
from pydantic import BaseModel
from crewai.crew import CrewOutput
from scripts import PubmedHelper, ScopusHelper
from crewai.task import TaskOutput
from typing import Tuple

class TitleExtractor(BaseModel):
    title: str
class TranslationResult(BaseModel):
    english_translation: str
    
class ExecutionResult(BaseModel):
    query: str
    article_id_list: list[str]
    error_message: str

@CrewBase
class ResearchAssistantCrew():
    
    def __init__(self):
        load_dotenv("../.env")
        
    def validate_serach_result(self, task_output: TaskOutput) -> Tuple[bool, str]: 
        if len(task_output.pydantic.article_id_list) < 10:
            return False, "No results were found. The query might be too strict or overly specific. Please reformulate it by relaxing some conditions."
        return True, task_output
    
    @tool("scientific_serach")
    def scientific_serach(query: str) -> str:
        """
        Performs a scientific search based on the provided query.

        Args:
            query (str): The scientific search query string to use for retrieving information.

        Returns:
            tuple: (success: bool, data: list | str)
                If success = True, the article id list.
                If success = False, the error message.
        """
        
        pubmedHelper = PubmedHelper()
        return pubmedHelper.search_pubmed(query)
    
    
    
    @agent
    def title_extractor_agent(self) -> Agent:
        return Agent(
            config=self.agents_config["title_extractor_agent"]
        )
    
    @agent
    def english_translator_agent(self) -> Agent:
        return Agent(
            config=self.agents_config["english_translator_agent"],
        )

    @agent
    def query_builder_agent(self) -> Agent:
        return Agent(
            config=self.agents_config["query_builder_agent"],
            tools=[self.scientific_serach]
        )
        
    @task
    def title_extraction(self) -> Task:
        return Task(
            config=self.tasks_config["title_extraction"],
            async_execution=True,
            output_pydantic=TitleExtractor
        )
        
    @task
    def english_translation(self) -> Task:
        return Task(
            config=self.tasks_config["english_translation"],
            output_json=TranslationResult 
        )
    
    @task
    def query_building(self) -> Task:
        return Task(
            config=self.tasks_config["query_building"], 
            context=[self.english_translation()],
            output_pydantic=ExecutionResult,
            guardrail=self.validate_serach_result,
            max_retries=3
        )
        
    @crew
    def crew(self) -> Crew:
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            verbose=True
        )

    def run(self, inputs: dict) -> CrewOutput:
        try:
            return self.crew().kickoff(inputs=inputs)
        except Exception as e:
            raise Exception(f"Error while running the crew: {e}")

agents.yaml

title_extractor_agent:
  role: >
    Scientific Research Title Synthesizer
  goal: >
    To analyze the description of a research project and generate a concise, clear, and descriptive title that accurately reflects its content, using a professional and engaging tone.
  backstory: >
    Trained on thousands of academic abstracts, research reports, and publication titles, this agent specializes in distilling complex research ideas into impactful and accurate titles. With deep experience in scientific communication, it identifies key topics, recognizes domain-specific terminology, and produces titles that balance clarity, precision, and brevity. It plays a crucial role in ensuring that research outputs are easily discoverable and well-represented from the very first line.
  llm: together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free
  verbose: True

query_builder_agent:
  role: >
    Academic Query Builder Specialist
  goal: >
    Transform the user’s input — a plain-language research prompt — into a well-formulated query that can be directly used in academic search engines to retrieve relevant, high-quality scientific articles. Your queries must:
    - Reflect the user’s true research intent.
    - Use advanced search syntax (where applicable).
    - Include key concepts, synonyms, and filters (e.g., publication year, field).
    - Be engine-agnostic but adaptable to scholarly platforms.
  backstory: >
    You are a senior academic with over 15 years of experience in scientific research and information retrieval. You have authored peer-reviewed papers, conducted systematic reviews, and mentored graduate students in research methodology. Over time, you’ve developed a deep understanding of how to translate complex research ideas into actionable search strategies, navigating the nuances of different scholarly databases. You think like a researcher and search like a librarian. Your mission is to help others find exactly what they need — faster, smarter, and more precisely than ever.
  llm: together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free
  verbose: True

english_translator_agent:
  role: >
    English Translation Specialist
  goal: >
    Translate input written in any language into clear and natural English. Focus on:
    - Preserving the original meaning and tone.
    - Producing grammatically correct, idiomatic English.
    - Adapting expressions and sentence structure to feel natural to native English speakers.
    - Avoiding literal translations that sound awkward or unnatural.
  backstory: >
    You are a seasoned language expert with a background in translation studies and years of hands-on experience translating a wide variety of texts — from everyday conversations to formal documents. You are not limited to any specific field or jargon. Your strength lies in your ability to understand nuance, adapt tone, and ensure that the final English version reads as if it were originally written in English. You are trusted for your precision, elegance, and sensitivity to context.
  llm: together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free
  reasoning: True
  verbose: True

task.yaml

title_extraction:
  description: >
    Analyze the provided research description and extract a concise, accurate, and informative title. The title should clearly reflect the main topic and purpose of the research, using professional and engaging language. Avoid overly generic or vague titles; focus on relevance and clarity.

    research description: {sentence}
  expected_output: >
    A single sentence or phrase (max 15 words) representing the title of the research, without quotation marks or additional commentary. The title should be capitalized appropriately and written in a style suitable for a scientific paper, presentation, or publication. The output must be in {language}
  agent: title_extractor_agent

english_translation:
  description: >
    Given this sentence or passage written in a non-English language, produce a fluent and faithful English translation. The translation should preserve the intent, tone, and meaning of the original, while using natural and idiomatic English appropriate to the context.
    This task supports a wide range of source languages and handles general content, including conversational, descriptive, informative, and narrative text.

    sentence or passage: {sentence}
  expected_output: >
    A fluent English translation that:
    - Conveys the same message as the original text.
    - Sounds natural to a native English speaker.
    - Avoids word-for-word translations unless appropriate.
    - Preserves tone (formal, informal, neutral, emotional) and intent.
    Provide the translation without any additional commentary or explanation
  agent: english_translator_agent


query_building:
  description: >
    Given a natural language prompt describing a research objective or topic, transform it into a precise and structured search query tailored for use on academic search engines (e.g., Google Scholar, Scopus, PubMed, Semantic Scholar).
    The query must retain the core semantic intent, include relevant keywords and synonyms, and — when applicable — incorporate advanced search operators (such as Boolean logic, filters, date ranges, or quotation marks) to increase search accuracy and relevance.
    After generating the query, use the "scientific_search" tool to perform the search.
  expected_output: >
    A structured, engine-ready search query string that:
    - Captures the user's research intent.
    - Includes Boolean operators (AND, OR, NOT) where appropriate.
    - Wraps multi-word expressions in quotes for exact matching.
    - Suggests relevant synonyms or related terms when helpful.
    And the information about the tool execution results. 
    Indicate the result in this format:
    
    query: str # the generated query
    article_id_list: [] # the list of article ids returned by the tool
    error_message: str # the message returned by the tool, indicating whether the search and article retrieval succeeded or failed
  agent: query_builder_agent

Well, reviewing both your approach and your implementation, I don’t see any significant error that would explain the failure you’re encountering. A minor issue is that the tool is registered as “scientific_serach” (a misspelling of “scientific_search”), while the task description tells the agent to use “scientific_search”; this mismatch could cause the LLM to call the tool incorrectly.

Otherwise, everything seems to revolve around good ol’ Prompt Engineering. When you say, “Please reformulate it by relaxing some conditions,” what exactly does “relaxing some conditions” mean? Perhaps you could try a few-shot approach here, providing some examples. Improve this part of the failure communication; make it more robust. If needed, test this specific instruction in a chat session with your LLM. Interact with it until it responds in a way you consider sufficiently different, and then carry what you learned in the chat over to your guardrail’s error string.
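For instance, a more explicit failure message could echo the failed query and spell out what “relaxing” means. The sketch below is only a starting point under my own assumptions: the helper names, the list of relaxation rules, and the example queries are all made up for illustration and should be tuned against your LLM. The `(bool, value)` return shape mirrors the guardrail in your code.

```python
def build_retry_message(failed_query: str, hit_count: int, min_hits: int = 10) -> str:
    """Build guardrail feedback that names the failed query and gives concrete rules."""
    return (
        f"The query below returned {hit_count} results (at least {min_hits} are needed):\n"
        f"  {failed_query}\n"
        "Do NOT resubmit this query or a close variant of it. "
        "Relax it by applying at least two of these changes:\n"
        "  1. Remove one AND clause entirely.\n"
        "  2. Replace a quoted exact phrase with unquoted keywords.\n"
        "  3. Add synonyms with OR inside a parenthesized group.\n"
        "  4. Drop date, field, or publication-type filters.\n"
        "Example (illustrative only):\n"
        '  too strict: "pediatric asthma" AND "inhaled corticosteroids" AND "randomized controlled trial"\n'
        "  relaxed:    (asthma OR wheezing) AND (child OR pediatric) AND corticosteroids"
    )


def validate_search_result(article_ids: list[str], query: str, min_hits: int = 10):
    """Return (ok, payload): the id list on success, the feedback message on failure."""
    if len(article_ids) < min_hits:
        return False, build_retry_message(query, len(article_ids), min_hits)
    return True, article_ids
```

Stating the actual hit count also avoids telling the agent “no results were found” when the search returned a few results but fewer than the threshold.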

You could also include the output of a run here so we can see the execution dynamics.

I’m noticing that the tool is being called twice during a single execution, before the guardrail logic is even reached.

This is not expected. I would expect the tool to be called once, then the guardrail to evaluate the result.

Why might the tool be invoked twice during one run of the QueryBuilder? Is there any internal retry mechanism or automatic re-prompting that might explain this?

As you can see here:

🚀 Crew: crew
├── 📋 Task: fd9785af-891d-4c26-9fd8-f85bda48b8c6
│   Assigned to: Scientific Research Title Synthesizer
│
│   Status: ✅ Completed
│   └── ✅ Reasoning Completed
├── 📋 Task: 0fd6f28c-f984-4292-a189-89864e46236a
│   Assigned to: English Translation Specialist
│
│   Status: ✅ Completed
├── 📋 Task: 7dbb4e69-bc81-496a-b23f-d0b5e90fd037
│   Status: Executing Task...                  --> First execution
│   ├── 🔧 Used scientific_search (1)          --> The tool is called 2 times
│   └── 🔧 Used scientific_search (2)
├── 📋 Task: 7dbb4e69-bc81-496a-b23f-d0b5e90fd037
│   Status: Executing Task...                  --> Re-execution after guardrail validation
│   └── 🔧 Used scientific_search (3)
└── 📋 Task: 7dbb4e69-bc81-496a-b23f-d0b5e90fd037
    Status: Executing Task...
    └── 🔧 Used scientific_search (4)

Your suspicion is well-founded. Kudos for your diligence. It does seem like something isn’t working as expected.

We could set up a custom event listener to listen for and capture ToolUsage... events, but that would be a poor man’s solution. Instead, I’d rather encourage you to expand your AI engineering toolkit. That’s why I recommend adding a crucial tool for this optimization phase of your agentic system: a monitoring and observability platform.

I suggest AgentOps or Phoenix, as they both offer smooth integration with CrewAI and a decent enough free tier. This will give you a much clearer picture of what’s actually happening during your Crew’s execution.