Agent with Ollama on a remote server

Hi!
I have Llama3.1 (Ollama-based) running on a remote server, accessible via an HTTP API (an “Authorization” header must be passed with every request). I’m facing a problem using it as the LLM for my Agent.
This request works fine:

curl -X 'POST' \
  "$LLAMA_URL" \
  -H 'accept: application/json' \
  -H "Authorization: $LLAMA_AUTHORIZATION_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "llama3.1",
  "prompt": "Why is the sea blue?"
}'
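
For reference, a roughly equivalent request in Python (assuming the same LLAMA_URL and LLAMA_AUTHORIZATION_TOKEN environment variables are set; "stream": False is added so a single JSON object comes back instead of streamed chunks, in case the endpoint honors Ollama's stream flag):

import os
import requests

# Mirrors the curl call above: same headers, same body, plus "stream": False
response = requests.post(
    os.environ["LLAMA_URL"],
    headers={
        "accept": "application/json",
        "Authorization": os.environ["LLAMA_AUTHORIZATION_TOKEN"],
        "Content-Type": "application/json",
    },
    json={"model": "llama3.1", "prompt": "Why is the sea blue?", "stream": False},
)
print(response.json())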

But running the following code

    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config['researcher'],
            # tools=[MyCustomTool()], # Example of a custom tool, loaded at the beginning of the file
            verbose=True,
            llm=LLM(model="ollama/llama3.1",
                    base_url=os.environ.get('LLAMA_URL'),
                    api_key=os.environ.get('LLAMA_AUTHORIZATION_TOKEN'))
        )

I get litellm.exceptions.APIConnectionError: litellm.APIConnectionError: OllamaException - {"message": "Token is missing or invalid"}
The token is 100% correct.

I have also tried

from langchain_community.llms.ollama import Ollama

    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config['researcher'],
            verbose=True,
            llm=Ollama(model="llama3.1",  # also tried "ollama/llama3.1"
                       base_url=os.environ.get('LLAMA_URL'),
                       headers={"Authorization": f"{os.environ.get('LLAMA_AUTHORIZATION_TOKEN')}"})
        )

But in that case I get litellm.exceptions.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=Ollama Params: {'model': 'llama3.1', 'format': None, 'options': {'mirostat': None, 'mirostat_eta': None, 'mirostat_tau': None, 'num_ctx': None, 'num_gpu': None, 'num_thread': None, 'num_predict': None, 'repeat_last_n': None, 'repeat_penalty': None, 'temperature': None, 'stop': None, 'tfs_z': None, 'top_k': None, 'top_p': None}, 'system': None, 'template': None, 'keep_alive': None, 'raw': None} Pass model as E.g. For 'Huggingface' inference endpoints pass in completion(model='huggingface/starcoder',...) Learn more: https://docs.litellm.ai/docs/providers
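
(As I understand that error, LiteLLM needs the provider prefix in the model name. A minimal standalone check with the prefix would look roughly like this, assuming LLAMA_URL is the server's base URL - though it would presumably still hit the same Authorization problem, which is what I'm stuck on:)

import os
import litellm

# Standalone LiteLLM call with the "ollama/" provider prefix the error asks for;
# the custom Authorization header is not handled here
response = litellm.completion(
    model="ollama/llama3.1",
    messages=[{"role": "user", "content": "Why is the sea blue?"}],
    api_base=os.environ.get("LLAMA_URL"),
)
print(response.choices[0].message.content)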

Could you please give me a hint on what I should change so it works?

Did you run this before running the code?

export LLAMA_URL='http://your-ollama-server-url'
export LLAMA_AUTHORIZATION_TOKEN='your-auth-token'
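
You could also check which Authorization format your server actually expects - some gateways want the raw token, others want "Bearer <token>". Untested sketch, using the same env vars:

import os
import requests

url = os.environ["LLAMA_URL"]
token = os.environ["LLAMA_AUTHORIZATION_TOKEN"]
payload = {"model": "llama3.1", "prompt": "ping", "stream": False}

# Try the raw token and the "Bearer "-prefixed form to see which one the server accepts
for label, auth_value in (("raw token", token), ("Bearer-prefixed", f"Bearer {token}")):
    response = requests.post(
        url,
        headers={"Authorization": auth_value, "Content-Type": "application/json"},
        json=payload,
    )
    print(label, "->", response.status_code)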

Or try this as well. Haven’t tested it but would love to see what error you get:

import os
import requests
from crewai import Agent, LLM

class CustomLLM(LLM):
    def _post(self, payload):
        headers = {
            'Authorization': f'Bearer {os.environ["LLAMA_AUTHORIZATION_TOKEN"]}',
            'Content-Type': 'application/json',
            'accept': 'application/json'
        }
        response = requests.post(
            os.environ['LLAMA_URL'], headers=headers, json=payload
        )
        response.raise_for_status()
        return response.json()

agent = Agent(
    llm=CustomLLM(
        model="ollama/llama3.1",
        base_url=os.environ['LLAMA_URL']
    ),
    verbose=True
)
  1. I have the variables stored in a .env file, and in my code I call load_dotenv(), so the variables are imported correctly - I’ve checked it.
  2. I’ll check it today and let you know, thank you!

@tonykipkemboi Hi! I had to make a few changes to the code you proposed, but it seems that it finally works!
The changes were (just in case anyone struggles with this as well):

  1. The overridden method should be call, not _post.
  2. I needed to add a callbacks parameter to the method.
  3. In my case, the Authorization header had to contain just the token value itself, without “Bearer”. I think that is why I couldn’t use a plain LLM object: the api_key is probably passed differently, the way the OpenAI API expects it ('Authorization': 'Bearer xxxxxx').
  4. The POST request has to target the specific endpoint (/api/generate, /api/chat, etc.), not just the base_url.

import os
import requests
from crewai import Agent, LLM

class CustomLLM(LLM):
    # call() overrides LLM.call: it posts the chat messages to the remote /api/chat
    # endpoint with the custom Authorization header and returns the reply text
    def call(self, payload, callbacks):
        headers = {
            'Authorization': f'{os.environ["LLAMA_AUTHORIZATION_TOKEN"]}',
            'Content-Type': 'application/json',
            'accept': 'application/json'
        }

        data = {
            "model": os.environ["LLAMA_MODEL_NAME"],
            "messages": payload
        }

        response = requests.post(
            os.environ['LLAMA_URL'] + "/api/chat", headers=headers, json=data)
        response.raise_for_status()

        response_data = response.json()
        response_data = response_data["message"]["content"].strip()

        return response_data

agent = Agent(
    config=self.agents_config['researcher'],
    # tools=[MyCustomTool()], # Example of a custom tool, loaded at the beginning of the file
    verbose=True,
    llm=CustomLLM(
        model=os.environ['MODEL'],
        base_url=os.environ['LLAMA_URL']
    )
)

Also, os.environ['MODEL'] is "ollama/llama3.1", but os.environ["LLAMA_MODEL_NAME"] is just "llama3.1".
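
In case it helps, the relevant part of my .env roughly looks like this (values are placeholders):

LLAMA_URL='http://your-ollama-server-url'
LLAMA_AUTHORIZATION_TOKEN='your-auth-token'
MODEL='ollama/llama3.1'
LLAMA_MODEL_NAME='llama3.1'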

Thank you very much!


Awesome, glad it worked.
