[HELP] DALL-E Agent Not Downloading Images Correctly (197-byte file) in crewai run

Hi everyone,

I’m experiencing a persistent issue with a custom agent in CrewAI and was wondering if anyone in the community has encountered something similar or has insights into what might be happening.

Context: I have an agent whose primary task is to:

  1. Generate a prompt for an image.
  2. Use a custom tool (DalleGeneratorTool) to call the DALL-E 3 API, generate the image, and download it to a specific local path.

The Problem: The DalleGeneratorTool seems to execute all its logic successfully. My logs indicate that:

  • The DALL-E API call is successful.
  • A valid image URL is received.
  • The image download via requests appears to complete correctly, and the file size reported during the download is as expected (several MBs).
  • The file is explicitly closed, and handler.flush() and os.fsync() are used.
  • I’ve even added a time.sleep(5) call after the file is written and closed to give the operating system time to finalize the write.
  • The final verification using PIL.Image.open().load() also seems to indicate success in the tool’s logs.

However, when I try to open the generated image file after CrewAI finishes the agent’s execution, the image is corrupted or incomplete. Its final size on disk is consistently 197 bytes, despite my tool’s logs showing that several MBs were written.

What I’ve Tested and Confirmed:

  1. The tool’s code works outside of CrewAI: I’ve isolated the DalleGeneratorTool logic and run it in a standalone Python script (providing a fixed prompt and output path), and it works perfectly. The image downloads, saves, and opens without any issues, with the correct file size.
  2. Not a permissions or disk space issue: I’ve tried different output paths (including directories with broad permissions) and have ample disk space.
  3. Not a too-short time.sleep: I’ve increased the pause up to 5 seconds, without success.
  4. Text files work: If my agent creates text files, I don’t encounter any problems; this only happens with DALL-E generated images.

My Hypothesis: Since it works outside of CrewAI but fails within it, I suspect there might be a premature termination or a process/thread handling issue within CrewAI (or one of its underlying dependencies) that is truncating the image file before its physical write to disk is complete, despite the flush/fsync calls. It seems like the tool’s process might be exiting before the operating system’s buffers fully flush for large/binary files.
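
One way to test this hypothesis is to look at what the 197 bytes on disk actually are. A truncated download would still begin with the image's signature (for PNG, the eight bytes \x89PNG\r\n\x1a\n), whereas a file that begins with readable text was written by something other than the download loop. The sketch below is only a diagnostic aid; describe_file is a hypothetical helper, not part of CrewAI or of the tool shown below.

```python
from pathlib import Path

# Well-known file signatures (magic bytes) for two common image formats.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def describe_file(path: str, preview_bytes: int = 200) -> dict:
    """Report the size, detected image type, and a short preview of a file."""
    data = Path(path).read_bytes()
    kind = "unknown"
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            kind = name
            break
    return {
        "size": len(data),
        "kind": kind,
        # errors="replace" keeps the preview printable even for binary data.
        "preview": data[:preview_bytes].decode("utf-8", errors="replace"),
    }
```

Calling describe_file on the saved path after the crew finishes and printing the result shows whether the small file is a cut-off image or something else entirely.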

Relevant Tool Code (DalleGeneratorTool):

import os
import requests
import openai
from crewai.tools import BaseTool
from typing import Type
from pydantic import BaseModel, Field
import time
from PIL import Image

class DalleImageGeneratorInput(BaseModel):
    prompt: str = Field(..., description="The image prompt.")
    output_file: str = Field(..., description="Path where the generated image will be saved")

class DalleGeneratorTool(BaseTool):
    name: str = "DALL-E Image Generator"
    description: str = (
        "Generates an image from a prompt using the OpenAI DALL·E API and saves it to the output folder."
    )
    args_schema: Type[BaseModel] = DalleImageGeneratorInput

    def _run(self, prompt: str, output_file: str) -> str:
        try:
            client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
            
            print(f"DEBUG: Starting image generation for prompt: '{prompt}'")
            response = client.images.generate(
                model="dall-e-3",
                prompt=prompt,
                n=1,
                size="1024x1024"
            )

            print("DEBUG: Image generation response received.")
            image_url = response.data[0].url
            print(f"DEBUG: Generated image URL: {image_url}")

            print(f"DEBUG: Attempting to download image from {image_url}...")
            # A timeout keeps the tool from hanging forever on a stalled download.
            res = requests.get(image_url, stream=True, timeout=60)
            res.raise_for_status()

            output_dir = os.path.dirname(output_file)
            if output_dir and not os.path.exists(output_dir):
                os.makedirs(output_dir)
                print(f"DEBUG: Created output directory: {output_dir}")

            downloaded_size = 0
            with open(output_file, 'wb') as handler:
                for chunk in res.iter_content(chunk_size=8192):
                    if chunk:
                        handler.write(chunk)
                        downloaded_size += len(chunk)
                handler.flush()
                os.fsync(handler.fileno())

            final_file_size_after_write = os.path.getsize(output_file)
            print(f"DEBUG: Image saved to {output_file} with reported size: {final_file_size_after_write} bytes (Downloaded chunks total: {downloaded_size} bytes)")
            
            print("DEBUG: Pausing for 5 seconds to allow OS to finalize file write...")
            time.sleep(5) 
            print("DEBUG: Resuming after pause.")

            final_file_size_after_delay = os.path.getsize(output_file)
            print(f"DEBUG: File size after delay: {final_file_size_after_delay} bytes")

            try:
                with Image.open(output_file) as img_check:
                    img_check.load()
                    print(f"DEBUG: Pillow successfully loaded the image. Format: {img_check.format}, Size: {img_check.size}, Mode: {img_check.mode}")
            except Exception as pillow_load_err:
                print(f"ERROR: Pillow failed to load the image data for full verification: {pillow_load_err}")
            
            if final_file_size_after_delay == 0:
                return "ERROR: Image was downloaded but saved as a 0-byte file. Check permissions or disk space."
            elif final_file_size_after_delay < downloaded_size:
                return f"WARNING: Image was downloaded ({downloaded_size} bytes) but saved size is smaller ({final_file_size_after_delay} bytes) after delay. Possible truncation. Check disk space or permissions."
            else:
                return f"The image was generated and saved successfully at {output_file}. No further action is required."
            
        except requests.exceptions.RequestException as req_err:
            print(f"ERROR: Image download failed: {req_err}")
            return f"Failed to download the image due to network or HTTP error: {str(req_err)}"
        except openai.APIError as api_err:
            print(f"ERROR: OpenAI API call failed: {api_err}")
            return f"Image generation failed with OpenAI API error: {str(api_err)}"
        except Exception as e:
            print(f"ERROR: General operation failed: {e}")
            return f"Image generation failed: {str(e)}"
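
If partial writes are the concern, one common pattern is to write to a temporary file in the same directory and rename it into place only after the data has been flushed, so the final path never holds a half-written image. The following is a minimal sketch under that assumption; write_atomically is a hypothetical helper, and it would be fed res.iter_content(chunk_size=8192) from the tool above. It does not protect the file from being overwritten by a later step.

```python
import os
import tempfile
from typing import Iterable


def write_atomically(chunks: Iterable[bytes], output_file: str) -> int:
    """Write chunks to a temp file, fsync it, then rename it into place.

    Returns the number of bytes written.
    """
    output_dir = os.path.dirname(output_file) or "."
    os.makedirs(output_dir, exist_ok=True)

    # Create the temp file in the target directory so that os.replace
    # stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=output_dir, suffix=".part")
    written = 0
    try:
        with os.fdopen(fd, "wb") as handler:
            for chunk in chunks:
                if chunk:
                    handler.write(chunk)
                    written += len(chunk)
            handler.flush()
            os.fsync(handler.fileno())
        # The final path only appears once the data is fully written.
        os.replace(tmp_path, output_file)
    except BaseException:
        # Remove the partial file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return written
```

Inside _run, the download loop could then be reduced to downloaded_size = write_atomically(res.iter_content(chunk_size=8192), output_file).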

Versions:

  • CrewAI: 0.130.0
  • uv: 0.7.13
  • Python: 3.12.1

Any help or suggestions would be greatly appreciated. Thanks in advance!

There doesn’t seem to be anything wrong with your code, at least not in the part you showed, which is the definition of the custom tool.

Example code:

from crewai import Crew, Agent, Task, Process, LLM
import os

# DalleGeneratorTool is the custom tool class defined in the question above.

os.environ["OPENAI_API_KEY"] = "YOUR_KEY"

llm = LLM(
    model="openai/gpt-4o-mini",
    temperature=0.7
)

image_creator = Agent(
    role="Image Creator",
    goal="Generate an image from a user-provided text prompt",
    backstory=(
        "You are an AI expert at turning textual descriptions into "
        "compelling, high-quality images using generation tools."
    ),
    tools=[DalleGeneratorTool()],
    llm=llm,
    verbose=True,
    allow_delegation=False,
)

generation_task = Task(
    description="Generate an image of: '{image_prompt}'.",
    expected_output=(
        "A paragraph about what you did and where the image you made is."
    ),
    agent=image_creator,
)

image_crew = Crew(
    agents=[image_creator],
    tasks=[generation_task],
    process=Process.sequential,
    verbose=True,
)

result = image_crew.kickoff(
    inputs={
        "image_prompt": "Schrödinger's cat wondering if it's alive or dead"
    }
)

print(result.raw)

Text output:

I have generated an image based on the prompt "Schrödinger's cat wondering
if it's alive or dead." The image visually represents the famous thought
experiment in quantum mechanics, capturing the cat's curious expression and
the ambiguity of its situation. The image has been saved successfully at
the location output/schrodingers_cat.png.

Output image: [image not preserved]

That’s strange. It appears that something is wrong with my computer, then. Thanks for checking it out!
