Base64 String Gets Altered During Crew Execution

Hey everyone,

We’ve developed an agent and a task to extract information from a base64-encoded image string. The extraction is handled by a custom tool that decodes the base64 string, converts it into an image, and processes it using a vision model.

However, during Crew execution, we’ve noticed that the base64 string gets altered somehow, making it unreadable by our custom tool. This results in extraction failures.

We’ve tested multiple approaches, including trying different versions of the crewai library (including the latest), but the issue persists.

Here’s a snippet of our implementation:

@tool(“callback_extract_expense_details_fn”)
def callback_extract_expense_details_fn(base64_image: str) → dict:

   try:
    image_data = base64.b64decode(base64_image)  
    image = Image.open(io.BytesIO(image_data))  
    prompt = """Identify and extract only the following from the image:  
                Merchant Name, Invoice Number, Invoice Date, and Charges for the Period."""

    vision_model = genai.GenerativeModel("gemini-1.5-flash")  
    response = vision_model.generate_content([prompt, image])  

    return response.text  
except Exception as e:
    print(f"Error extracting details: {e}")
    return None

The Crew setup involves an agent assigned to process the base64-encoded image string:
image_data_extractor = Agent(
role=“Image Data Extractor”,
goal=“Extract expense details from an image.”,
tools=[callback_extract_expense_details_fn],
verbose=True,
llm=llm
)

extract_task = Task(
description=“Extract expense details from a base64 encoded image”,
agent=image_data_extractor,
input=base64_encoded_file,
expected_output=“dict containing extracted details or error”
)

crew = Crew(
agents=[image_data_extractor],
tasks=[extract_task],
verbose=True,
process=Process.sequential
)

crew_output = crew.kickoff()

Issue Faced:
The base64 string changes when passed through the Crew execution flow.

The modified string becomes unreadable by our custom tool.

Questions:

  1. Could CrewAI be modifying the string during task execution?
  2. Is there a best practice for passing binary data (base64) between agents without corruption?

Would really appreciate any insights or suggestions.

Thanks!

What do you mean when you say altered? Have you tried to print the string when it’s called in the tool comparing with the original you passed in.

How are you passing the base64 string in? The more details you share the better

Hi @zinyando , Thanks for commenting on my issue. I’ve printed the input base64 string. Also set verbose=True. During the crew execution it print the prompt I observed that input base64 string and the value available in the prompts are different. Also when it decoding gives the error that base64 is unable to decode.

Do you have any suggestion to check this issue further?

Thanks,

@liyan2dmn, if you’re trying to build a multimodal agent (especially one that needs to handle images), you should probably start by checking out CrewAI’s built-in capabilities for this. Take a look at the examples in “Using Multimodal Agents” in the official docs, and then you can tailor it to your specific needs from there.

Can you provide more details on what changes occur on the Base64 string?
Is the string truncated? Is it completely different?