LLM connection to a local server

Hi there,

I have the following situation: I can only use Llama 3.1 70B as the underlying LLM, and only via an API to a local server.

This code works for me:

from crewai import Agent, Task, Crew
from langchain_openai import ChatOpenAI
import os
import httpx
import litellm

llm = ChatOpenAI(
    model="llama3.1-70b",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    api_key=token,
    base_url="https://gpt4.localserver.com",
    http_client=httpx.Client(verify=False),
    default_headers=headers,
)

print(llm.invoke("Tell a joke?").content)

My API key is stored in the variable token (headers is defined elsewhere in my script).
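
For context, the wiring into CrewAI is essentially the following; the role/goal/backstory strings are just placeholders here, not the real ones:

from crewai import Agent, Task, Crew

# Placeholder agent definition; only llm=llm matters for the error below.
agent = Agent(
    role="Assistant",
    goal="Answer questions",
    backstory="A generic assistant",
    llm=llm,
)

task = Task(
    description="Tell a joke",
    expected_output="A short joke",
    agent=agent,
)

crew = Crew(agents=[agent], tasks=[task])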

After wiring llm into Agent(), Task(), and Crew() like that and then calling result = crew.kickoff(), I get this error:

AuthenticationError: litellm.AuthenticationError: AuthenticationError: OpenAIException - The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
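
Reading that message, litellm apparently falls back to the OPENAI_API_KEY environment variable when no key reaches it. I assume, without having verified it, that exporting the key and the base URL before kickoff() could be a workaround, roughly:

import os

# Assumption: litellm reads these for OpenAI-compatible endpoints.
os.environ["OPENAI_API_KEY"] = token
os.environ["OPENAI_API_BASE"] = "https://gpt4.localserver.com"

result = crew.kickoff()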

But is there a proper way to use a locally hosted LLM behind an API in CrewAI?

What are you using to serve the model? Ollama?

No, that's the thing: the model(s) are hosted on the server I have access to, and I can only reach them via its API.

What I don't understand: the llm I pass to the agent apparently has to be an instance of LLM() from CrewAI, which internally uses litellm.

My question is how I need to set up the llm; e.g. the following

llm = ChatOpenAI(
    model="llama3.1-70b",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    api_key=token,
    base_url="https://gpt4.localserver.com",
    http_client=httpx.Client(verify=False),
    default_headers=headers,
)

doesn't work. Do I need to use completion() from LiteLLM instead? I tried that as well, but then I get connection errors.
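
What I imagine the CrewAI-native setup should look like is something along these lines; the openai/ model prefix and the assumption that LLM() forwards base_url and api_key straight to litellm are my guesses, since I haven't got it working yet:

from crewai import LLM

# Assumption: LLM() hands these through to litellm, and the "openai/"
# prefix tells litellm to treat the endpoint as OpenAI-compatible.
llm = LLM(
    model="openai/llama3.1-70b",
    temperature=0,
    base_url="https://gpt4.localserver.com",
    api_key=token,
)

# Open question: I don't know how to replicate
# http_client=httpx.Client(verify=False) and default_headers with LLM().

Is something like this the intended way to point CrewAI at a custom OpenAI-compatible endpoint?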