Task input/output models

New to CrewAI & I’m investigating how I can remove some of the ‘fuzziness’ in the interfaces between tasks by way of having a known input & output data schema.

Got this from Crew GPT Assistant:

from crewai import Agent, Task, Crew, Process

from pydantic import BaseModel, Field

# Define a Pydantic model to validate order data
class OrderInputModel(BaseModel):
    order_id: int
    product_name: str
    quantity: int = Field(gt=0, description="Quantity must be greater than zero")

# Define a Pydantic model for the output structure
class OrderAnalysisOutputModel(BaseModel):
    total_orders: int
    top_product: str
    total_quantity: int

# Create an agent to analyze orders
order_analyst = Agent(
    role='Order Analyst',
    goal='Analyze the list of orders and provide a summary report.',
    verbose=True,
    memory=True,
    backstory="You are an expert in processing and analyzing order data."
)

# Define the task using the Pydantic models for input and output validation
order_analysis_task = Task(
    description="Analyze the incoming list of orders and provide a report on total orders, top product, and total quantity.",
    input_model=OrderInputModel,  # Pydantic model for input validation
    output_model=OrderAnalysisOutputModel,  # Pydantic model for output structure
    agent=order_analyst
)

# Define the crew and process
order_analysis_crew = Crew(
    agents=[order_analyst],
    tasks=[order_analysis_task],
    process=Process.sequential
)

# Sample input data (list of orders)
input_data = [
    {"order_id": 1, "product_name": "Laptop", "quantity": 2},
    {"order_id": 2, "product_name": "Monitor", "quantity": 5},
    {"order_id": 3, "product_name": "Laptop", "quantity": 3}
]

# Kick off the task
result = order_analysis_crew.kickoff(inputs={'input_data': input_data})

# The result will be validated and structured by the output_model
print(result)

I am guessing that the ‘input_model/output_model’ params are hallucinations, as I cannot find another reference to them! However, it would be good if it could work that way: pydantic model → input, pydantic model → output.

Back to my question: How can I get a consistent data schema as output from a task & how to use that known schema within the next task?

I did find this from Joaom which seems to be related.

yes it is hallucinating attributes, the correct attribute would be output_pydantic or output_json

I would start building your own crews with the CLI command crewai create crew <Name>

This will give you the best project scaffold to work from. Also check out the docs at docs.crewai.com and our GitHub repository, crewAIInc/crewAI.
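To make the reply above concrete, here is a minimal sketch of the original order example rewritten with output_pydantic, plus context to hand the result to a second task. The model name OrderAnalysis, the second task, and the helper functions build_crew and run_example are made up for illustration; running the crew assumes crewai is installed and an LLM is configured.

```python
import json

from pydantic import BaseModel, Field


class OrderAnalysis(BaseModel):
    """Schema the first task is asked to produce (illustrative name)."""
    total_orders: int
    top_product: str
    total_quantity: int = Field(ge=0)


def build_crew():
    """Wire two tasks together. crewai is imported lazily so the schema
    above can be used without the library installed."""
    from crewai import Agent, Crew, Process, Task

    analyst = Agent(
        role="Order Analyst",
        goal="Summarize a list of orders.",
        backstory="You are an expert in processing order data.",
    )
    analysis = Task(
        description="Analyze these orders: {orders}",
        expected_output="Total orders, top product and total quantity.",
        agent=analyst,
        output_pydantic=OrderAnalysis,  # the attribute named in the reply above
    )
    report = Task(
        description="Write a short restocking note based on the analysis.",
        expected_output="A short plain-text note.",
        agent=analyst,
        context=[analysis],  # the first task's output is passed to this task
    )
    return Crew(agents=[analyst], tasks=[analysis, report], process=Process.sequential)


def run_example():
    """Not called here: needs crewai and a configured LLM."""
    orders = [{"order_id": 1, "product_name": "Laptop", "quantity": 2}]
    crew = build_crew()
    crew.kickoff(inputs={"orders": json.dumps(orders)})
    # The structured result of the first task should be on its output.
    print(crew.tasks[0].output.pydantic)
```

Passing the orders as a JSON string is a cautious choice on my part, since I am not sure every version interpolates lists into the task description.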


@matt this seems to be coming up everywhere in the posts here over the last few days. I agree that output_pydantic or output_json has helped me get more consistent results. Even a dumb dictionary in the narrative for the task’s expected output helps. It would be great to see a somewhat sophisticated example of this, along with best practices for consuming it in a subsequent task.
I was going to start some experiments with all caps, threatening the LLM, special delimiters around the data passed between tasks, etc., to compare how correctly and consistently these items are recognized in the jumble of context passed to the model as the prompt grows more complex. Does the team have any data on this?
I haven’t looked at the code, but how does CrewAI treat the data in the JSON object when it is output from a task? Do you embed it in any special way in the overall context you keep for the whole process?

Could we discuss the Task param output_pydantic?

From the docs:
Output Pydantic (optional), output_pydantic: “Outputs a Pydantic model object, requiring an OpenAI client. Only one output format can be set.”

  1. Outputs a Pydantic model object
  2. requiring an OpenAI client
  3. Only one output format can be set

All 3 points raise more questions than answers for me

This post is probably relevant

This is a straightforward example of structured output using crewai tools. What if you then use task.output, take what you want from that and pass it as context to the next task?
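A rough sketch of that idea, assuming the previous task returned its result as a JSON string: validate it against a known model, keep only the fields the next task needs, and wrap them in a delimiter before handing them on. The model, the to_context helper, and the analysis tags are all hypothetical; nothing here is required by CrewAI.

```python
from pydantic import BaseModel


class OrderAnalysis(BaseModel):
    total_orders: int
    top_product: str
    total_quantity: int


def to_context(raw_output: str) -> str:
    """Validate a task's raw JSON output, then keep only what the next task needs."""
    analysis = OrderAnalysis.model_validate_json(raw_output)
    wanted = analysis.model_dump(include={"top_product", "total_quantity"})
    lines = [f"{key}: {value}" for key, value in sorted(wanted.items())]
    # Delimiter tags are an arbitrary choice to make the block easy to spot in a prompt.
    return "<analysis>\n" + "\n".join(lines) + "\n</analysis>"


raw = '{"total_orders": 3, "top_product": "Laptop", "total_quantity": 10}'
context_block = to_context(raw)
print(context_block)
```

The resulting string could then go into the next task’s description or into the kickoff inputs. Whether delimiters like this actually improve recognition is exactly the open question raised earlier in the thread.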

I can see that what you suggest has its use case. Unfortunately, at present I don’t know enough about CrewAI to comment further.

This has now evolved into this thread

Hi,

I’m trying to get output_json to work as well. I’m struggling to find the right syntax.

In the task I’m setting expected_output to the string “Return the results as JSON following this schema [{ ‘url’: ‘url’, ‘location’: ‘location’, ‘expiration_date’: ‘expiration_date’}]”.

After the crew results are in, I’m trying

crew_output = crew.kickoff()
crew_ai_result_as_json = json.dumps(crew_output.json_dict, indent=2)

However, crew_output.json_dict returns nothing while crew_output.raw does.

I tried setting the following parameters on task

output_format="output_json"

That doesn’t work.

What’s the right syntax to be able to access the final results from the crew as pure JSON with a predefined schema?
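While json_dict is empty, one stopgap is to parse the raw text yourself. This is only a sketch of a fallback, not how CrewAI does it internally: parse_raw_json is a made-up helper, and it assumes the model returned valid JSON, possibly wrapped in a Markdown code fence.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, flags=re.DOTALL)


def parse_raw_json(raw: str):
    """Best-effort: pull a JSON value out of raw model text."""
    text = raw.strip()
    # Models often wrap JSON in a Markdown code fence; unwrap it if present.
    fenced = FENCED_JSON.search(text)
    if fenced:
        text = fenced.group(1).strip()
    return json.loads(text)  # raises json.JSONDecodeError if the text is not JSON


# Example raw output in the shape the expected_output string asks for.
raw = (
    FENCE + "json\n"
    '[{"url": "https://example.com", "location": "Berlin", '
    '"expiration_date": "2025-01-31"}]\n' + FENCE
)
rows = parse_raw_json(raw)
print(json.dumps(rows, indent=2))
```

With crew_output.raw in place of the sample string, this gives plain Python lists and dicts, but it does not enforce the schema; that still needs a model on the task.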

output_format='json',
output_json=ResearchReportList

print(f"""
Task completed!
Task: {task_output.description}
Output: {task_output.raw}
Type: {task_output.output_format}
Jsondict: {task_output.json_dict}
""")
print(lizer(task_output.json_dict))

where for example,

class ResearchReport(BaseModel):
    Title: str = Field(..., description="The title of the article.")
    URL: str = Field(..., description="The URL of the article.")
    Author: List[str] = Field(..., description="An author of the article.")
    Published_Date: str = Field(..., format="MM/YYYY", description="The publication date of the article using the format MM/YYYY.")
    Methodology: str = Field(..., description="The methods or procedures used in the experiment in the article in great DETAIL.")
    Drug: str = Field(..., description="The specific drug tested in the article.")
    Dosage: str = Field(..., description="The dosage amount and frequency of the drug used in the article.")
    Results: str = Field(..., description="The results or outcome of the experiment in the article in great DETAIL.")

class ResearchReportList(BaseModel):
    root: List[ResearchReport]


Hi Moto,

Thank you very much for your help.

That worked pretty well. I just had to change the last part to lower case list.

So

class ResearchReportList(BaseModel):
    root: list[ResearchReport]
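For anyone who wants to check the schema on its own before wiring it into a task, here is a small sketch using a cut-down version of the model above (only two of the fields, with made-up sample data). It assumes Pydantic v2 and Python 3.9 or later, where the built-in list can be used in annotations without importing List from typing.

```python
from pydantic import BaseModel, Field, ValidationError


class ResearchReport(BaseModel):
    # Trimmed to two fields for brevity; the full model is in the post above.
    Title: str = Field(..., description="The title of the article.")
    URL: str = Field(..., description="The URL of the article.")


class ResearchReportList(BaseModel):
    root: list[ResearchReport]  # built-in list, no typing import needed


good = {"root": [{"Title": "A study", "URL": "https://example.com/a"}]}
reports = ResearchReportList.model_validate(good)
print(reports.root[0].Title)

# A report without its required URL should be rejected.
try:
    ResearchReportList.model_validate({"root": [{"Title": "No URL"}]})
    missing_field_rejected = False
except ValidationError:
    missing_field_rejected = True
print(missing_field_rejected)
```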

Glad it worked. Yes, sorry about the capital L. I think that was from a long-ago experiment that GPT-4 wrote a lot of, since I am certainly no Python coder.
