Hi team,
I am trying to use llama3.2-vision via Ollama to spin up a crew that extracts text from images. I use the following to do the job:
import ollama

response = ollama.chat(
    model='llama3.2-vision',
    messages=[{
        'role': 'user',
        'content': "What's in the image?",
        'images': ['test/test.jpg'],
    }],
)
print(response['message']['content'])
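For context, this is roughly how I would extend the single-image call above to a whole directory in plain Python (just a sketch: find_images and extract_text are names I made up, and it assumes the ollama package plus a running llama3.2-vision model):

```python
from pathlib import Path

IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png'}

def find_images(directory):
    # Return sorted image paths under `directory` with known extensions.
    return sorted(
        p for p in Path(directory).iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS
    )

def extract_text(image_path, model='llama3.2-vision'):
    # Ask the vision model to transcribe the text in one image.
    # Imported lazily so find_images works without the ollama package.
    import ollama
    response = ollama.chat(
        model=model,
        messages=[{
            'role': 'user',
            'content': 'Extract all text visible in this image.',
            'images': [str(image_path)],
        }],
    )
    return response['message']['content']

if __name__ == '__main__':
    for path in find_images('test'):
        print(f'--- {path} ---')
        print(extract_text(path))
```

But this bypasses the crew entirely, which is what I am trying to avoid.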
However, I am struggling to configure an agent to process images from a directory. I don't think Tools are the right way to go here. Is there any way to configure this?
Thanks!