How to Gracefully Terminate Hierarchical Crew Execution on LLM Failure?

Hello friends,

Hope everyone is doing great.

I’ve subclassed the LLM class in CrewAI to add custom error handling and logging (e.g., storing LLM errors and metrics in MongoDB). The custom class also supports retry logic and fallback models. Here’s a simplified version of what I’m doing:

from crewai import LLM

class CustomLLM(LLM):
    def call(self, messages, tools=None, callbacks=[], available_functions=None, retry_count=0):
        try:
            # Normal LLM call through the parent class
            return super().call(messages, tools=tools, callbacks=callbacks,
                                available_functions=available_functions)
        except Exception as e:
            # Log the error and its metrics (e.g., to MongoDB) before handling
            if retry_count < 3:
                # Retry, optionally switching to a fallback model
                return self.call(messages, tools=tools, callbacks=callbacks,
                                 available_functions=available_functions,
                                 retry_count=retry_count + 1)
            # Final failure after all retries: intend to stop the Crew here
            raise e

The issue I’m facing is that even after the exception is raised (once retries are exhausted), the Crew continues execution, particularly in hierarchical setups with a manager agent and sub-agents.
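
For context, the crew is wired up roughly like this. The agents, tasks, and model names below are simplified placeholders rather than my real config; the point is just that the manager LLM and every sub-agent go through the same CustomLLM wrapper:

from crewai import Agent, Crew, Process, Task

# Placeholder agents; the real ones have proper roles, goals, and backstories
researcher = Agent(role="Researcher", goal="Gather facts", backstory="...",
                   llm=CustomLLM(model="gpt-4o"))
writer = Agent(role="Writer", goal="Summarize findings", backstory="...",
               llm=CustomLLM(model="gpt-4o"))

crew = Crew(
    agents=[researcher, writer],
    tasks=[
        Task(description="Research the topic", expected_output="Notes", agent=researcher),
        Task(description="Write a summary", expected_output="Summary", agent=writer),
    ],
    process=Process.hierarchical,
    manager_llm=CustomLLM(model="gpt-4o"),  # the manager also uses the custom wrapper
)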

My question:
How can I terminate the entire Crew execution when an unrecoverable LLM failure occurs, even if it happens deep in a hierarchy of agents?

Any guidance on how to hook into Crew or manager-agent logic to intercept such terminal failures and gracefully stop would be highly appreciated.
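
To make the intent concrete, here is the kind of top-level handling I’d like to end up with. FatalLLMError is just an illustrative name (CustomLLM would raise it instead of a generic Exception after the last retry), and the sketch assumes the exception actually propagates out of kickoff(), which is exactly what isn’t happening for me today:

class FatalLLMError(Exception):
    """Raised by CustomLLM once retries and fallback models are exhausted."""

try:
    result = crew.kickoff()  # crew built as in the hierarchical sketch above
except FatalLLMError as exc:
    # Desired behavior: a failure deep in the hierarchy surfaces here,
    # so the manager agent and all sub-agents stop instead of carrying on
    print(f"Crew aborted: {exc}")
    raise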

Thanks in advance!

Oh, I would love to know this too… Right now I exit the Docker instance and restart, but it would be good to have a “crewai flow stop” feature.