Could you please help me resolve an issue with the code below? Thanks in advance.
The issue is that the @tool lookup_activities_by_ids does not seem to receive inputs, so automation_activities_list is blank inside the tool. However, it correctly receives activity_ids.
The prepare_inputs function correctly returns automation_activities_list.
How should I refactor the code? Is the problem in the tool code, the agent, or the task?
I have tried changing the task description multiple times, and I don't think that is the answer.
Relevant code:
from crewai import Agent, Task, Crew, Process
from crewai.project import CrewBase, agent, task, crew, before_kickoff, after_kickoff
from crewai.tools import tool
from crew_agents.crew_llm import oai_llm
from langchain_openai import ChatOpenAI
from langchain.output_parsers import BooleanOutputParser
import os
import traceback
from typing import List, Dict
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
embedding = OpenAIEmbeddings()
llm = ChatOpenAI(model_name="gpt-4", temperature=0, openai_api_key=os.getenv("OPENAI_API_KEY"))
vector_store = Chroma(
    embedding_function=embedding,
    persist_directory="memory_store",  # Path relative to your project root
)
boolean_parser = BooleanOutputParser()
@tool
def lookup_activities_by_ids(activity_ids: List[int], inputs: dict) -> List[dict]:
    """Retrieve activities from AutomationActivities dataset based on activity IDs."""
    print(f"\ninputs in lookup_activities_by_ids: {inputs}\n")
    automation_activities_list = inputs.get("automation_activities_list", [])
    print(f"\nautomation_activities_list in lookup_activities_by_ids: {automation_activities_list}\n")
    print(f"\nactivity_ids in lookup_activities_by_ids: {activity_ids}\n")
    acts = [activity for activity in automation_activities_list if activity.get("activity_id") in activity_ids]
    print(f"\nacts in lookup_activities_by_ids: {acts}\n")
    return acts
@CrewBase
class DigitizationCrew:
    @before_kickoff
    def prepare_inputs(self, inputs: dict) -> dict:
        # … various methods…
        prepared_inputs = {
            "automation_activities_list": automation_activities_list,
            "digitized_forms_data": filtered_digitized_forms_data,
            "dfs_order": dfs_order,
            "subgraph_model": subgraph_model.model_dump(),
            "activity_dict": {k: v.model_dump() for k, v in activity_dict.items()},
            "teams_dict": {k: v.model_dump() for k, v in teams_dict.items()},
            "digitized_forms": unique_forms.model_dump(),  # converted to dict
            "automation_activities": [activity.model_dump() for activity in automation_activities],  # also convert
        }
        self.inputs = prepared_inputs
        print(f"\n***Self.input in prepare_inputs: {self.inputs}***\n")
        return prepared_inputs

    @after_kickoff
    def process_results(self, digitization_results: DigitizedForms) -> dict:
        inputs = self.inputs
        # … various methods…
        return {
            "digitized_forms": DigitizedForms(digitized_forms=merged_forms),
            "automation_activities": classified_activities,
        }
    @agent
    def form_digitization_analyst(self) -> Agent:
        return Agent(
            role="Form Digitization Analyst",
            goal="""Identify and recommend the best digitization strategies for non-digitized forms.""",
            backstory="""You specialize in document digitization and automation. Your expertise lies in determining the most efficient digitization method for each form, ensuring they become digital, structured, and machine-readable to facilitate automation.""",
            tools=[lookup_activities_by_ids],
            verbose=True,
            llm=oai_llm,
        )
    @task
    def assess_digitization_task(self) -> Task:
        return Task(
            description="""Analyze the provided forms dataset to recommend digitization plans.
{digitized_forms_data}
**Important Rules:**
1. You must use only the provided dataset.
2. You must determine the digitization need based on `digital`, `machinereadable`, and `structured` flags.
3. Recommend the best `digitization_method` (e.g., API, OCR, ICR, Speech to Text, NLP, AI Tagging or a combination of these).
4a. From inputs provided to this crew, access:
{automation_activities_list}
4b. Use the `lookup_activities_by_ids` tool to find digitized_forms_data form_activities in automation_activities_list for details on these form_activities.
4c. The tool, should return a list of activities that you should use to determine the `digitization_point` after which form activity the digitization should occur.
5. Define a clear `digitization_activity` name that explains what needs to be done. Include the digitization method as part of the digitization activity name.
6. Update the form's attributes (`digital`, `machinereadable`, `structured`) based on the chosen method.
7. Ensure all forms in the dataset requiring digitization are included in the final output.
8. Retain the form name and case sensitivity in the output.""",
            expected_output="A structured Pydantic object of type DigitizedForms with digitization recommendations.",
            output_pydantic=DigitizedForms,
            agent=self.form_digitization_analyst(),
        )
    @crew
    def digitization_crew(self):
        return Crew(
            planning=True,
            agents=[self.form_digitization_analyst()],
            tasks=[self.assess_digitization_task()],
            process=Process.sequential,  # Ensure the reviewer runs after the analyst
            memory=True,
            memory_config={"storage": vector_store},
        )
When I run the crew, the tool prints {} for inputs and therefore [] for automation_activities_list, although the list is correctly populated in self.inputs. The tool does receive the correct activity_ids. In my actual code, the @ decorators for tool, agent, task, crew, etc. are all in place.
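One refactor that might be worth trying is sketched below in plain Python, without CrewAI. It assumes the root cause is that the tool only receives arguments the LLM fills in itself (such as activity_ids), and that the kickoff inputs are never passed into the tool call; that assumption is not confirmed here. The factory name `make_lookup` and the sample data are hypothetical and only illustrate binding the list through a closure instead of an `inputs` parameter.

```python
from typing import Dict, List


def make_lookup(automation_activities_list: List[Dict]):
    """Hypothetical factory: binds the activities list via a closure."""

    def lookup_activities_by_ids(activity_ids: List[int]) -> List[Dict]:
        """Retrieve activities whose activity_id is in activity_ids."""
        wanted = set(activity_ids)
        return [
            activity
            for activity in automation_activities_list
            if activity.get("activity_id") in wanted
        ]

    return lookup_activities_by_ids


# Made-up sample data, only for illustration.
sample_activities = [
    {"activity_id": 1, "name": "Receive form"},
    {"activity_id": 2, "name": "Validate form"},
    {"activity_id": 3, "name": "Archive form"},
]

lookup = make_lookup(sample_activities)
result = lookup([1, 3])
print(result)
```

In the real crew, the inner function would presumably be wrapped with @tool and created only after prepare_inputs has produced the list; that wiring is not shown here and would need to be checked against the CrewAI version in use.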
Thanks much