Hi Team,
I need to use a multimodal LLM in CrewAI. For multimodal requests, the image usually has to be converted to base64 and that encoded value sent to the LLM as an "image_url" entry in the message request body. I would like to know whether CrewAI has an option to send that encoded image value via CrewAI's LLM() class.
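For reference, the base64 step I mean is just this (a minimal sketch using only the standard library; the file path and MIME type are placeholders):

```python
import base64

def image_to_data_url(path: str, mime: str = "image/jpeg") -> str:
    """Read an image file and return an OpenAI-style base64 data URL."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"

# e.g. image_to_data_url("photo.jpg") gives a string starting with
# "data:image/jpeg;base64," followed by the encoded bytes
```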
Here is my request body:
import requests

request_body_sample = {
    "messages": [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": [
            {"type": "text", "text": user_text_input},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{imgBase64EncValue}"}},
        ]},
    ],
    "project_id": credentials.get("project_id"),
    "model_id": "watsonx/meta-llama/llama-3-2-90b-vision-instruct",
    "decoding_method": "sample",
    "random_seed": 568743,
    "temperature": 0,
    "top_k": 50,
    "top_p": 1,
    "repetition_penalty": 1,
    "max_tokens": 8000,
}

response = requests.post(
    credentials.get("url"),
    headers=headers,  # headers with the auth token, defined elsewhere
    json=request_body_sample,
)
if response.status_code != 200:
    raise Exception("Non-200 response: " + str(response.text))
data = response.json()
print(data["choices"][0]["message"]["content"])
Alternatively, please suggest a better way to use multimodal models in CrewAI.
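For what it's worth, this is roughly what I would expect the CrewAI side to look like. It is only a sketch under assumptions I have not verified: that the installed crewai release forwards an OpenAI-style content list unchanged through its LLM wrapper, and that Agent accepts a multimodal=True flag. The model id and message shape are taken from my request body above; everything else is hypothetical.

```python
def build_multimodal_messages(system_prompt: str, user_text: str,
                              image_b64: str) -> list:
    """Build the OpenAI-style message list that vision models expect."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": [
            {"type": "text", "text": user_text},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ]},
    ]

# Hypothetical CrewAI usage (unverified against the current release):
# from crewai import LLM
# llm = LLM(model="watsonx/meta-llama/llama-3-2-90b-vision-instruct",
#           temperature=0, max_tokens=8000)
# print(llm.call(messages=build_multimodal_messages(sys_p, txt, b64)))
#
# Or, if agents support a multimodal flag that equips them with an
# image-handling tool:
# from crewai import Agent
# agent = Agent(role="Vision analyst", goal="Describe the image",
#               backstory="...", multimodal=True, llm=llm)
```

If either route is supported, a confirmation of the exact parameter names would be very helpful.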
Also, when I ask the documentation chatbot a question about "multimodal support", it outputs a message like:
"CrewAI doesn't support multimodal."
Is that accurate?
Thank you.