AzureOpenAI LiteLLM CrewAI Connection Error

This is the error I keep getting when I run the code below:

Error: litellm.APIError: APIError: OpenAIException - Connection error.

from crewai import Agent, Crew, Task, Process, LLM
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Azure OpenAI Configuration
AZURE_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT_RG")
API_VERSION = os.getenv("OPENAI_API_VERSION_RG")
DEPLOYMENT_NAME = os.getenv("AZURE_OPENAI_GPT4o_DEPLOYMENT")
API_KEY = os.getenv("AZURE_OPENAI_KEY_RG")

# Configure the LLM with all required Azure parameters
llm = LLM(
    provider="azure",
    model="gpt-4o",  # Your model name - don't use azure/ prefix
    api_key=API_KEY,
    azure_endpoint=AZURE_ENDPOINT,
    api_version=API_VERSION
)

# Define the Edge Case Identifier Agent
edge_case_identifier = Agent(
    role="Edge Case Identifier",
    goal="Identify Transaction ID's where the ML model's predictions are likely misclassified or uncertain. Do not analyse, only flag these cases.",
    backstory="An AI agent trained to identify potential edge cases based on misclassifications in the model predictions without further analysis.",
    llm=llm,  # Use the configured LLM
    verbose=True
)

# Define the Task for the Agent
edge_case_task = Task(
    description="Flag transactions with prediction scores between 0.7143434968407539 and 0.864343496840754 as potential edge cases that require further review.",
    agent=edge_case_identifier,
    expected_output="A list of transaction IDs that fall within the specified score range and might be edge cases.",
)

# Create the Crew with the defined agent and task
crew = Crew(
    agents=[edge_case_identifier],
    tasks=[edge_case_task],
    verbose=True
)

# Run the crew (the failure output below shows this call was reached)
result = crew.kickoff()

I suggest you run the code below and reply with the full exception traceback:

from crewai import LLM
import os
import litellm
litellm._turn_on_debug()

os.environ["AZURE_API_KEY"] = "" # "your-azure-api-key"
os.environ["AZURE_API_BASE"] = "" # "https://your-endpoint.openai.azure.com"
os.environ["AZURE_API_VERSION"] = "" # "2023-05-15"

azure_llm = LLM(
    model="azure/gpt-4o",
)

azure_response = azure_llm.call(
    "Hey, who are you?"
)

print(f'\nAzure Response:\n\n{azure_response}\n')
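If that minimal script fails for you with the same connection error, confirm the three variables are actually populated before the call; an empty or malformed AZURE_API_BASE will often surface as a bare "Connection error." A quick check (plain stdlib, nothing CrewAI-specific):

import os

for var in ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"):
    value = os.environ.get(var, "")
    # Report presence and length only, so the key itself is never printed
    print(f"{var}: set={bool(value)}, length={len(value)}")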

❌ Crew: crew
└── 📋 Task: 563d013c-f599-431c-a0af-e784d8ea7f71
    Assigned to: Edge Case Identifier
    Status: ❌ Failed
    └── 🤖 Agent: Edge Case Identifier
        Status: In Progress
        ├── ❌ LLM Failed
        ├── ❌ LLM Failed
        └── 🧠 Thinking…

16:42:43 - LiteLLM:DEBUG: utils.py:298 - Request to litellm:
16:42:43 - LiteLLM:DEBUG: utils.py:298 - litellm.completion(model='azure/gpt-4o', messages=[{'role': 'user', 'content': 'Hey, who are you?'}], stop=, stream=False)
16:42:43 - LiteLLM:DEBUG: litellm_logging.py:377 - self.optional_params: {}
16:42:43 - LiteLLM:DEBUG: utils.py:298 - SYNC kwargs[caching]: False; litellm.cache: None; kwargs.get('cache')['no-cache']: False
16:42:43 - LiteLLM:DEBUG: transformation.py:115 - Translating developer role to system role for non-OpenAI providers.
16:42:43 - LiteLLM:INFO: utils.py:2896 - LiteLLM completion() model= gpt-4o; provider = azure
16:42:43 - LiteLLM:DEBUG: utils.py:2899 - LiteLLM: Params passed to completion() {'model': 'gpt-4o', 'functions': None, 'function_call': None, 'temperature': None, 'top_p': None, 'n': None, 'stream': False, 'stream_options': None, 'stop': , 'max_tokens': None, 'max_completion_tokens': None, 'modalities': None, 'prediction': None, 'audio': None, 'presence_penalty': None, 'frequency_penalty': None, 'logit_bias': None, 'user': None, 'custom_llm_provider': 'azure', 'response_format': None, 'seed': None, 'tools': None, 'tool_choice': None, 'max_retries': None, 'logprobs': None, 'top_logprobs': None, 'extra_headers': None, 'api_version': None, 'parallel_tool_calls': None, 'drop_params': None, 'reasoning_effort': None, 'additional_drop_params': None, 'messages': [{'role': 'user', 'content': 'Hey, who are you?'}]}

16:42:44 - LiteLLM:DEBUG: utils.py:4441 - model_info: {'key': 'azure/gpt-4o-2024-05-13', 'max_tokens': 4096, 'max_input_tokens': 128000, 'max_output_tokens': 4096, 'input_cost_per_token': 5e-06, 'cache_creation_input_token_cost': None, 'cache_read_input_token_cost': None, 'input_cost_per_character': None, 'input_cost_per_token_above_128k_tokens': None, 'input_cost_per_query': None, 'input_cost_per_second': None, 'input_cost_per_audio_token': None, 'output_cost_per_token': 1.5e-05, 'output_cost_per_audio_token': None, 'output_cost_per_character': None, 'output_cost_per_token_above_128k_tokens': None, 'output_cost_per_character_above_128k_tokens': None, 'output_cost_per_second': None, 'output_cost_per_image': None, 'output_vector_size': None, 'litellm_provider': 'azure', 'mode': 'chat', 'supports_system_messages': None, 'supports_response_schema': None, 'supports_vision': True, 'supports_function_calling': True, 'supports_assistant_prefill': False, 'supports_prompt_caching': True, 'supports_audio_input': False, 'supports_audio_output': False, 'supports_pdf_input': False, 'supports_embedding_image_input': False, 'supports_native_streaming': None, 'tpm': None, 'rpm': None}
16:42:44 - LiteLLM:DEBUG: litellm_logging.py:846 - response_cost: 0.0006349999999999999

❌ Crew: crew
└── 📋 Task: 563d013c-f599-431c-a0af-e784d8ea7f71
    Assigned to: Edge Case Identifier
    Status: ❌ Failed
    └── 🤖 Agent: Edge Case Identifier
        Status: In Progress
        ├── ❌ LLM Failed
        └── ❌ LLM Failed

Azure Response:

Hello! I’m an AI language model created by OpenAI, here to help you with any questions you might have or to assist you with various tasks. How can I assist you today?

The fact that there was a valid response shows the template works: your provider (Azure) configuration must follow it, i.e. set AZURE_API_KEY, AZURE_API_BASE, and AZURE_API_VERSION in the environment and pass the model with the azure/ prefix.
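Assuming the .env variable names from your original snippet, a minimal sketch of that template applied to your setup could look like the following. The mapping of your RG-suffixed variables onto the AZURE_* names is my assumption from your code; verify each value against your Azure resource:

from crewai import LLM
from dotenv import load_dotenv
import os

load_dotenv()

# Map your existing .env names onto the variables LiteLLM reads for azure/
os.environ["AZURE_API_KEY"] = os.getenv("AZURE_OPENAI_KEY_RG", "")
os.environ["AZURE_API_BASE"] = os.getenv("AZURE_OPENAI_ENDPOINT_RG", "")
os.environ["AZURE_API_VERSION"] = os.getenv("OPENAI_API_VERSION_RG", "")

# Use the azure/ prefix with your *deployment* name, not the bare model name
deployment = os.getenv("AZURE_OPENAI_GPT4o_DEPLOYMENT", "gpt-4o")
llm = LLM(model=f"azure/{deployment}")

Pass this llm to your Agent exactly as before; the azure/ prefix is what tells LiteLLM to route the request through its Azure provider, as you can see in the custom_llm_provider field of the debug log above.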