How to run a Task outside of its crew? With manual context

Sure! Let’s find the best architecture together. Mine is more a set of drafts, but I managed to achieve what I wanted (more or less).

One thing is missing: in another of my projects I use neat unit tests with mocked LLM calls, so I can debug the algorithmic parts lightning fast and with no LLM fees. But that project is in TypeScript and built on a custom framework; for Python and CrewAI, I am still assembling a toolset to do the same.


prepare_idea_research is simply a task definition, nothing magical about it. It looks like this:

    @task
    def prepare_idea_research(self) -> Task:
        return Task(
            config=self.tasks_config['prepare_idea_research']
        )

By the way, one thing that is not clear from courses and documentation is that the Task object itself has an output field that receives the execution result, wrapped in the TaskOutput class.

    from crewai.tasks.task_output import TaskOutput

So the Task object is not just a kickoff-time definition, but also a runtime container for the task’s results. This is not obvious at all; I had to read the source code to figure it out.
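
For illustration, here is a minimal sketch of reading that field after a run, using the Publishing crew and the inputs helper shown further below. It relies on the @task-decorated methods being memoized, so the call returns the very Task instance the crew executed (the mock_tasks trick later in this post relies on the same behaviour):

    publishing = Publishing()
    publishing.crew().kickoff(inputs=assemble_regular_inputs())

    # After the run, the Task object carries its own result:
    task = publishing.prepare_idea_research()
    print(type(task.output))      # crewai.tasks.task_output.TaskOutput
    print(task.output.raw)        # the raw text produced by the agent
    print(task.output.pydantic)   # parsed model, if output_pydantic is configured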


Below are more practices I use to run subsets of the tasks or to test the tools separately.


Assembling partial crews

crew.py

@CrewBase
class Publishing():
...
    # I've extracted that finding from my previous post into a separate method
    def mock_tasks(self, context: Dict):
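        # Pre-fill .output on tasks that are not part of the crew being run,
        # so that tasks listing them as context still see their results.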
        for task_name, prev_result in context.items():
            prev_task = getattr(self, task_name)()
            pydantic_output, json_output = prev_task._export_output(prev_result)
            prev_task.output = TaskOutput(
                name=prev_task.name,
                description=prev_task.description,
                expected_output=prev_task.expected_output,
                raw=prev_result,
                pydantic=pydantic_output,
                json_dict=json_output,
                agent='Mock',
                output_format=prev_task._get_output_format(),
            )
  
    @crew
    def crew(self, mode: str = None) -> Crew:
        if not mode:
            return Crew(
                agents=self.agents,
                tasks=self.tasks,
                verbose=True,
            )
  
        if mode == 'ideate':
            return Crew(
                agents=[
                    self.idea_hunter(),
                ],
                tasks=[
                    self.ideate_free(),
                ],
                verbose=True,
            )

        ...    
  
        if mode == 'research_idea':
            return Crew(
                agents=[
                    self.scientific_researcher(),
                ],
                tasks=[
                    self.prepare_idea_research(),
                    self.research_idea(),
                ],
                verbose=True,
            )
   
        # An example of a completely detached agent for testing-only purposes
        if mode == 'test_tool':
            tool_tester = Agent(
                role="Tool Tester",
                goal="Test and validate the provided tool's functionality",
                backstory="I am an expert at testing tools and validating their outputs. I ensure tools work as expected. Use word 'dog' as a part of the product name and 'language learning' as domain to search for knowledge.",
                tools=[self.search_in_knowledge_base()],
                verbose=True,
                allow_delegation=False,
            )
  
            tool_test_task = Task(
                description="Test the {tool} tool by using it and validating its output",
                expected_output="A detailed report of the tool's functionality and performance",
                agent=tool_tester,
            )
  
            return Crew(
                agents=[tool_tester],
                tasks=[tool_test_task],
                process=Process.sequential,
                verbose=True,
            )
  
        raise ValueError(f"Unknown mode: {mode}")
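
Putting the two pieces together, a downstream task can be run against manually supplied context along these lines. This is a hedged sketch: the saved-output file name is hypothetical, and it assumes research_idea lists ideate_free in its context.

    publishing = Publishing()

    # Hypothetical: a previously saved raw result of the ideate_free task
    with open('current_run/ideate_output.md', 'r') as f:
        prev_result = f.read()

    # Pre-fill the upstream task's output, then run only the downstream tasks
    publishing.mock_tasks({'ideate_free': prev_result})
    publishing.crew(mode='research_idea').kickoff(inputs=assemble_regular_inputs())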

main.py

def assemble_regular_inputs():
    with open('current_run/input.md', 'r') as f:
        input = f.read()
    
    return {
        # Stuff for very different tasks is mixed together,
        # because CrewAI interpolates all the task strings, not only the used ones.
        'session_date': '2025-01-12',
        'task': 'Inspect content plan, suggest ideas to fill the gaps',
        'calendar_principles': get_publication_calendar_principles(),
        'input': input,
    }
  
def run():
    """
    Run the crew.
    """
    agentops.init()
  
    cmd = os.getenv('RUN_MODE')
  
    if not cmd:
        inputs = assemble_regular_inputs()
        Publishing().crew().kickoff(inputs=inputs)
        return
  
    if cmd == 'ideate':
        inputs = assemble_regular_inputs()
        Publishing().crew(mode='ideate').kickoff(inputs=inputs)
        return
  
    ...
  
    if cmd.startswith('test_tool:'):
        tool_name = cmd.split(':', 1)[1]
        inputs = {'tool': tool_name}
        Publishing().crew(mode='test_tool').kickoff(inputs=inputs)
        return
  
    raise ValueError(f"Unknown command: {cmd}")
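
The mode comes from the RUN_MODE environment variable, so a partial run is just a matter of setting it before launch, for example RUN_MODE=research_idea, or RUN_MODE=test_tool:<tool name> for the tool-testing mode (the exact launch command depends on how your project’s entry point is wired).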

Eventually we should find a way to split this into separate classes.


Typical tool tester

tests/test_articles_backlog.py (not the best decision to put a script into the unit-test directory, but I haven’t found a better place yet):

from dotenv import load_dotenv
from src.publishing.tools.get_articles_backlog import GetArticlesBacklog

def main():
    # Load environment variables
    load_dotenv()
    
    # Initialize the tool
    backlog_tool = GetArticlesBacklog()
  
    print("\n=== Testing Articles Backlog Tool ===")
    try:
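        # Call the tool's _run() directly, with no agent or LLM in the loop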
        results = backlog_tool._run()
        print(results)
        print(f"\nFound {len(results)} articles in backlog:")
        for i, article in enumerate(results, 1):
            print(f"\nArticle {i}:")
            for key, value in article.items():
                print(f"{key}: {value}")
    except Exception as e:
        print(f"Error occurred: {str(e)}")
  
if __name__ == "__main__":
    main()
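
Since the script imports the tool via src.publishing.tools, it needs the project root on the import path (for example by running it from the root, or with PYTHONPATH set); despite living under tests/, it is a plain script, not a pytest test.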