Hi team,
I'm trying to use llama3.2-vision on Ollama to spin up a crew that extracts text from images. Calling the model directly works fine:
import ollama
response = ollama.chat(
    model='llama3.2-vision',
    messages=[{
        'role': 'user',
        'content': "What's in the image?",
        'images': ['test/test.jpg']
    }]
)
print(response['message']['content'])
However, I'm struggling to configure the agent so it processes images from a directory. I don't think using Tools is the right way to go here. Is there a way to configure this?
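For context, here is a rough sketch of the direction I'm exploring for the directory part, calling Ollama directly once per file (the `find_images` and `describe_images` helpers are just names I made up, not CrewAI APIs):

```python
from pathlib import Path

def find_images(directory, extensions=(".jpg", ".jpeg", ".png")):
    # Collect image files from a directory, sorted for stable ordering.
    return sorted(
        p for p in Path(directory).iterdir()
        if p.suffix.lower() in extensions
    )

def describe_images(directory, model="llama3.2-vision"):
    # Assumes a running Ollama server with the model already pulled.
    import ollama
    results = {}
    for image in find_images(directory):
        response = ollama.chat(
            model=model,
            messages=[{
                'role': 'user',
                'content': "What's in the image?",
                'images': [str(image)],
            }],
        )
        results[str(image)] = response['message']['content']
    return results
```

This works as a plain script, but I'd still like to know how to wire the same per-image loop into a crew properly.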
Thanks!