Multimodal agents dont work with Gemini due to LiteLLM errors

Rajeev_SG · January 26, 2025, 12:47pm

This code works:

import os
from dotenv import load_dotenv
from crewai import Agent, Task, Crew, LLM
import google.generativeai as genai

# Load environment variables
load_dotenv()

# Initialize Gemini
GOOGLE_API_KEY = os.getenv('GOOGLE_API_KEY')
if not GOOGLE_API_KEY:
    raise ValueError("Please set GOOGLE_API_KEY environment variable")

genai.configure(api_key=GOOGLE_API_KEY)

# Configure CrewAI to use Gemini
llm = LLM(
    model="gemini/gemini-1.5-pro-latest",
    temperature=0.7,
    api_key=GOOGLE_API_KEY
)

def analyze_image():
    print("\nStarting image analysis...\n")
    
    # Validate image path
    image_path = "data/test2.png"
    abs_image_path = os.path.abspath(image_path)
    
    if not os.path.exists(abs_image_path):
        raise FileNotFoundError(f"Image not found at: {abs_image_path}")
    
    print(f"Found image at: {abs_image_path}")
    
    # Create a multimodal image analyst agent
    image_analyst = Agent(
        role='Image Analyst',
        goal='Analyze images and provide detailed, accurate descriptions with meaningful insights',
        backstory='''You are an expert image analyst with years of experience in visual content 
        interpretation. You have a keen eye for detail and can identify subtle elements that others 
        might miss. Your expertise spans across various types of images, from photographs to 
        technical diagrams, and you excel at providing clear, structured analysis.''',
        llm=llm,
        multimodal=True,  # Enable multimodal capabilities
        verbose=True,  # Enable detailed execution logs for debugging
        max_iter=1
    )

    # Create an image analysis task
    image_task = Task(
        description=f'''Analyze the image at {abs_image_path} and generate a comprehensive description.
        Focus on:
        1. Main subjects and objects in the image
        2. Visual composition, colors, and lighting
        3. Any text or notable details
        4. Context and setting
        5. Any notable patterns or relationships between elements''',
        expected_output='''A detailed, structured analysis of the image covering all the requested focus areas.
        The output should be clear, concise, and organized into sections for easy reading.''',
        agent=image_analyst
    )

    # Create and run the crew
    crew = Crew(
        agents=[image_analyst],
        tasks=[image_task],
        verbose=True  # Enable detailed execution logs
    )

    result = crew.kickoff()
    print("\nImage Analysis Result:")
    print(result)

if __name__ == "__main__":
    analyze_image()

However the console output shows either a 429, 500 or 503 error (regardless of how many requests are being sent):

I have tried to use max_iter=1, max_execution_time=5, max_rpm=5 and max_retry_limit - no matter what I try, the result is always the same: a literal spam of 429, 500 or 503 errors.

What confuses me, is when I make a similar request using the gemini SDK it works fine, I never get 429, 500 or 503 errors. Also using Crew AI with Gemini just for text also works fine - no errors.

Has anybody gotten multimodal agents working with Gemini? Do we know if it works?

Let me know if I can provide any additional information.

graindorgeanthony · January 27, 2025, 8:18am

Same here! I’ve got EXACTLY the same error (but for text only)… LiteLLM really starts to annoy me; it seems impossible to use gemini reliably with Crewai due to these kinds of errors.

Even when I use “gemini/gemini-1.5-pro”

    gemini_creative_1_5_pro = LLM(
        model="gemini/gemini-1.5-pro",
        api_key=os.environ["GOOGLE_API_KEY"],
        temperature=1,
        max_completion_tokens=8192
    )

Anyone know how to solve this?

Is it because we us the LLM object instead of the Agent(llm=“gemini/gemini-1.5-pro”)?

graindorgeanthony · February 6, 2025, 12:30pm

Follow-up; this works now since the 0.100 crewai version.

Topic		Replies	Views
I prove to use agent ai with gemini but give a value error and it want an openaikey CrewAI Community Support agent	2	137	January 23, 2025
Gemini not working since update CrewAI Community Support crewai	15	968	October 30, 2024
Gemini has stopped working again! LLMs gpt4o	2	264	January 22, 2025
Automated Project notebook Gemini CrewAI Community Support	4	892	January 5, 2025
CrewAI multimodal Capability CrewAI Community Support	4	114	April 26, 2025

Multimodal agents dont work with Gemini due to LiteLLM errors

Related topics