Use more info from pydantic models

To better describe my initial suggestion I’ve created an ‘out of the box’ project via the CLI (crewai create crew test02), then added a couple of minor updates to show a working example.
The LLM used was GPT-4o-mini.

The crew.py

import json
import os
from typing import Optional, List

from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task
from langchain_openai import ChatOpenAI
from langchain_community.llms.ollama import Ollama
from langchain_groq import ChatGroq
from pydantic import BaseModel, Field


# Uncomment the following line to use an example of a custom tool
# from test02.tools.custom_tool import MyCustomTool

# Check our tools documentations for more information on how to use them
# from crewai_tools import SerperDevTool


class LLMS:
    """Convenience container for the LLM clients used by the crew."""
    def __init__(self):
        self.OpenAIGPT35 = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.7)
        self.OpenAIGPT4oMini = ChatOpenAI(model_name="gpt-4o-mini", temperature=0.8)
        self.OpenAIGPT4 = ChatOpenAI(model_name="gpt-4", temperature=0.8)
        self.Phi3 = Ollama(model="phi3:mini")
        self.Llama3_1 = Ollama(model="llama3.1")
        # self.Phi3 = Ollama(model="phi3.5:latest")
        # self.Phi3 = ChatOpenAI(model_name="phi3:medium-128k", temperature=0, api_key="ollama", base_url="http://localhost:11434")
        self.GroqLlama3_8B_8192 = ChatGroq(temperature=0.5, groq_api_key=os.environ.get("GROQ_API_KEY"),
                                           model_name="llama3-8b-8192")



class Stage1OutputModel(BaseModel):
    summary: Optional[str] = Field("", description="Explanation of how the report should be read")
    bullet_points: Optional[List[str]] = Field([], description="3 key bullet points from the report")

    def get_field_info(self) -> str:
        # One line per field: "<name>, described as: <description>"
        field_info = "\n"
        for field_name, field_instance in self.model_fields.items():
            field_info += field_name + ", described as: " + field_instance.description + "\n"
        return field_info


@CrewBase
class Test02Crew:
    """Test02 crew"""
    agents_config = 'config/agents.yaml'
    tasks_config = 'config/tasks.yaml'

    def __init__(self):
        self.llms = LLMS()

    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config['researcher'],
            llm=self.llms.OpenAIGPT4oMini,
            # tools=[MyCustomTool()], # Example of custom tool, loaded on the beginning of file
            verbose=True
        )

    @agent
    def reporting_analyst(self) -> Agent:
        return Agent(
            config=self.agents_config['reporting_analyst'],
            llm=self.llms.OpenAIGPT4oMini,
            verbose=True
        )

    @task
    def research_task(self) -> Task:
        model_info = Stage1OutputModel().get_field_info()
        return Task(
            # config=self.tasks_config['research_task'],
            description=" Conduct a thorough research about {topic}   Make sure you find any interesting and relevant information given the current year is 2024.",
            expected_output=f""" Your response must have these exact field names with values as described, : {model_info}""",
            agent=self.researcher()
        )

    @task
    def reporting_task(self) -> Task:
        return Task(
            # config=self.tasks_config['reporting_task'],
            description="""
			Review the summary you got and expand each bullet-point into a full section for a report.
    		Make sure the report is detailed and contains any and all relevant information.
    		""",
            expected_output="""
			 A fully fledge reports with the mains topics, each with a full section of information.
    		Formatted as markdown without '```'
    		""",
            #context=self.research_task(),
            agent=self.reporting_analyst(),
            output_file='report.md'
        )

    @crew
    def crew(self) -> Crew:
        """Creates the Test02 crew"""
        return Crew(
            agents=self.agents,  # Automatically created by the @agent decorator
            tasks=self.tasks,  # Automatically created by the @task decorator
            process=Process.sequential,
            verbose=True,
            memory=True
            # process=Process.hierarchical, # In case you wanna use that instead https://docs.crewai.com/how-to/Hierarchical/
        )
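
For completeness, the crew is kicked off through the standard scaffold entry point; roughly what the generated main.py does (the {topic} placeholder in the research_task description is filled from the inputs dict), launched with crewai run from the project root:

from test02.crew import Test02Crew

def run():
    inputs = {"topic": "AI LLMs"}
    Test02Crew().crew().kickoff(inputs=inputs)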

My lack of CrewAI/Python knowledge forced me to extract the Task description and expected_output values from the config files into the Task definition/creation methods of the crew class.
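
In hindsight, one way to keep the YAML config and still append the model info might be this (untested, assuming tasks_config loads as a plain dict, as it does in the scaffold):

    @task
    def research_task(self) -> Task:
        cfg = self.tasks_config['research_task']
        model_info = Stage1OutputModel().get_field_info()
        return Task(
            description=cfg['description'],
            # append the model-derived field list to whatever the YAML already says
            expected_output=cfg['expected_output'] + model_info,
            agent=self.researcher()
        )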

Stage1OutputModel
A simple Pydantic model with fields that represent ‘typical’ values of the Task output.

get_field_info
Returns a prompt-ready string describing the model field names and descriptions.

N.B. the result of get_field_info is used as the main component of the expected_output of the research_task.
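
For reference, the string get_field_info produces for the two-field model above looks like this:

model_info = Stage1OutputModel().get_field_info()
print(model_info)
# summary, described as: Explanation of how the report should be read
# bullet_points, described as: 3 key bullet points from the report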

Possible Uses

  1. The get_field_info method, applied to any Pydantic model, allows for a generalized expected_output prompt for whatever Pydantic model a task requires (see the sketch after this list).
  2. Change the model fields: add or remove some, re-run the crew, and get an updated output from the task.
  3. Change the model field descriptions to keep the same output structure but get different content from the Task, e.g. change the bullet-point count from 3 to any number.
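
A minimal sketch of that generalization, written as a standalone helper so it works with any BaseModel subclass (the function name and the fallback behaviour are my own, not anything from CrewAI):

from typing import Type

from pydantic import BaseModel


def get_model_field_info(model: Type[BaseModel]) -> str:
    """Prompt-ready list of field names and descriptions for any Pydantic model."""
    field_info = "\n"
    for field_name, field_instance in model.model_fields.items():
        # fall back to the bare field name if no description was given
        field_info += field_name + ", described as: " + (field_instance.description or field_name) + "\n"
    return field_info

# e.g. expected_output=f"Your response must have these exact field names,
#                        with values as described: {get_model_field_info(Stage1OutputModel)}"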

Example
Add a new model field ‘todo’

class Stage1OutputModel(BaseModel):
    summary: Optional[str] = Field("", description="Explanation of how the report should be read")
    bullet_points: Optional[List[str]] = Field([], description="3 key bullet points from the report")
    todo: Optional[str] = Field("", description="Suggestions for further research")

    def get_field_info(self) -> str:
        field_info = "\n"
        for field_name, field_instance in self.model_fields.items():
            field_info += field_name + ", described as: " + field_instance.description + "\n"
        return field_info

Above I have added a new field, ‘todo’, then reran the same crew to get the Task output below:

Thought: I now know the final answer  
Final Answer: {
  "summary": "This report synthesizes the latest advancements, applications, and trends in AI Large Language Models (LLMs) for the year 2024. It covers new models, innovative applications across various industries, and predictions for future developments, providing insights essential for stakeholders.",
  "bullet_points": [
    "In 2024, notable models like GPT-4.5 and BERT-Next have been released, showcasing improved contextual understanding and application effectiveness.",
    "AI LLMs are transforming sectors such as healthcare and finance, with applications ranging from diagnostic support to fraud detection, leading to significant operational efficiencies.",
    "Future trends indicate a move towards hyper-personalization, increased regulatory scrutiny, and the emergence of multimodal models that integrate various forms of data."
  ],
  "todo": "Suggestions for further research include exploring the ethical implications of AI LLMs, examining the impact of regulatory frameworks on AI development, and investigating the potential of multimodal models in various applications."
}

Fields addressable within the next Task
Take note of the mention of both bullet points and summary within the reporting_task description.
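
Taking that one step further, the reporting_task description could be generated from the same model, so the field names mentioned in the prompt can never drift out of sync with the model (a sketch, with my own prompt wording):

field_names = ", ".join(Stage1OutputModel.model_fields.keys())
reporting_description = f"""
Review the fields you received ({field_names}) and expand each bullet point
into a full section for a report.
Make sure the report is detailed and contains any and all relevant information.
"""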

Contexts
As can be seen from the commented-out context line in the reporting_task, I tried to get the values via the previous task’s context, but adding that just broke everything for me!
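
For what it’s worth, the CrewAI docs describe Task.context as a list of tasks whose outputs feed into this one, so passing a single Task there may fail validation; if that was my problem, the fix might be as simple as this (untested):

            context=[self.research_task()],  # a list of Tasks, not a single Task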

I hope this better describes how I believe the Pydantic Field description params can be put to use.

More of a learning exercise, all comments welcome.

Can anyone improve the expected_output prompt template? Challenge issued :muscle: