How to run GGUF models with CrewAI Agents

I have already tried multiple approaches to load a GGUF model through different libraries, but when I pass the loaded LLM to an Agent I get this error:

ERROR:root:Failed to get supported params: argument of type 'NoneType' is not iterable

model_path = "/persistent/MBNL_MASS-MOVIL_EMAIL_AUTOMATION/Models/llama-2-7b-chat.Q4_K_M.gguf"

Attempt 1: llama_cpp.Llama directly

from llama_cpp import Llama
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = Llama(
    model_path=model_path,
    temperature=0.7,
    max_tokens=2000,
    n_gpu_layers=1,  # adjust based on your GPU
    n_batch=512,
    callback_manager=callback_manager,
    verbose=True,
)
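(As far as I can tell, temperature, max_tokens, and callback_manager are constructor arguments of LangChain's LlamaCpp wrapper, not of llama_cpp.Llama itself, which takes sampling parameters at call time instead. The equivalent attempt through the wrapper would look roughly like this; I have not verified that it fixes anything:)

from langchain_community.llms import LlamaCpp
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler

# Same settings as above, but on the LangChain wrapper, which does accept them
llm = LlamaCpp(
    model_path=model_path,
    temperature=0.7,
    max_tokens=2000,
    n_gpu_layers=1,
    n_batch=512,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    verbose=True,
)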

Attempt 2: CrewAI's LLM class

from crewai import LLM

llm = LLM(
    model=model_path,
)
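(If I read the CrewAI docs correctly, LLM expects a provider-prefixed model string that LiteLLM can resolve, not a local file path. So a variant still on my list to test, assuming the model is registered with Ollama as in attempt 5 below, would be:)

from crewai import LLM

llm = LLM(
    model="ollama/my_llama2_gguf_model",  # the "ollama/" prefix tells LiteLLM which provider to route to
    base_url="http://127.0.0.1:11434",
)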

Attempt 3: subclassing LLM around llama-cpp-python

class LlamaCPP_LLM(LLM):
    def __init__(self, model_path: str, **kwargs):
        self.model = Llama(model_path=model_path, **kwargs)

    def chat(self, messages, stop=None):
        prompt = "".join([msg["content"] for msg in messages])
        output = self.model(prompt, stop=stop)
        return output["choices"][0]["text"]

my_llm = LlamaCPP_LLM(model_path=model_path)  # , n_gpu_layers=1

Attempt 4: subclassing LLM with super().__init__() and a stop property

from typing import List, Optional

class LlamaCPP_LLM(LLM):
    def __init__(self, model_path: str, **kwargs):
        super().__init__(model=None)  # corrected line
        self.model = Llama(model_path=model_path, **kwargs)
        self._stop: Optional[List[str]] = None

    @property
    def stop(self) -> Optional[List[str]]:
        return self._stop

    @stop.setter
    def stop(self, stop: Optional[List[str]]) -> None:
        self._stop = stop

    def chat(self, messages, stop=None):
        prompt = "".join([msg["content"] for msg in messages])
        output = self.model(prompt, stop=stop)
        return output["choices"][0]["text"]

my_llm = LlamaCPP_LLM(model_path=model_path)
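(For completeness: newer CrewAI releases document a BaseLLM base class for custom backends, which may be the intended way to do what attempts 3 and 4 are trying. A minimal sketch based on my reading of those docs follows; the class name is mine, and the method signatures are my assumptions, not verified:)

from typing import Any, Dict, List, Optional, Union

from crewai import BaseLLM
from llama_cpp import Llama

class LlamaCppCrewLLM(BaseLLM):
    def __init__(self, model_path: str, **kwargs: Any):
        super().__init__(model="llama-cpp-local")  # label only, never sent to LiteLLM
        self.llama = Llama(model_path=model_path, **kwargs)

    def call(
        self,
        messages: Union[str, List[Dict[str, str]]],
        tools: Optional[List[dict]] = None,
        callbacks: Optional[List[Any]] = None,
        available_functions: Optional[Dict[str, Any]] = None,
    ) -> str:
        # CrewAI may pass a plain string or a list of chat messages
        if isinstance(messages, str):
            prompt = messages
        else:
            prompt = "\n".join(m["content"] for m in messages)
        output = self.llama(prompt, max_tokens=2000)
        return output["choices"][0]["text"]

    def supports_function_calling(self) -> bool:
        return False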

Attempt 5: Ollama

from langchain_community.llms import Ollama

ollama_llm = Ollama(model=model_path)
ollama_llm = Ollama(model="my_llama2_gguf_model")

First, I ran the commands below in a terminal, and the model loaded successfully with Ollama:

# apt-get update
# apt-get install -y curl
# curl -fsSL https://ollama.com/install.sh | sh
ollama create my_llama2_gguf_model -f Modelfile
# Modelfile is a text file containing only this line:
#   FROM /persistent/Models/llama-2-7b-chat.Q4_K_M.gguf
# ollama serve
# ollama run my_llama2_gguf_model
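(To double-check the registration before wiring anything up, either of these should list the model:)

ollama list                              # should show my_llama2_gguf_model
curl http://127.0.0.1:11434/api/tags     # REST equivalent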

Now the following runs and produces output:

ollama_llm = Ollama(model="my_llama2_gguf_model", base_url="http://127.0.0.1:11434")
print(ollama_llm("Say hello"))

Dustin is joining us as a Junior Front-End Developer and will be working on various projects with our team. He has a passion for creating modern and user-friendly web applications using cutting-edge technologies like React and Angular. When he’s not coding, you can find him hiking or playing video games.
Get to know Dustin better by reading his bio below… (and so on)

So the LLM is loaded and responding, but I get an error when creating a CrewAI Agent with it:

from crewai import Agent

generate_sql_agent = Agent(
    role="Text to SQL Generator",
    goal="Generate the SQL for this question - {question}",
    backstory="You are working as a telecom sql agent responsible for natural language to sql conversion "
              "based on this table schema - {schema} "
              "And below are the few shot examples for your reference - "
              "{few_shot_examples}",
    allow_delegation=False,
    llm=ollama_llm,
    verbose=True,
)

and I get the same error:

ERROR:root:Failed to get supported params: argument of type 'NoneType' is not iterable
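(My current suspicion is that this message comes from LiteLLM failing to map the model name to a provider, i.e. the problem is the model string rather than the model itself. Based on the CrewAI docs, the pattern I plan to try next is below; not yet verified on my side:)

from crewai import Agent, LLM

ollama_llm = LLM(
    model="ollama/my_llama2_gguf_model",  # provider prefix so LiteLLM can route the request
    base_url="http://127.0.0.1:11434",
)

generate_sql_agent = Agent(
    role="Text to SQL Generator",
    goal="Generate the SQL for this question - {question}",
    backstory="...",  # same backstory as above
    allow_delegation=False,
    llm=ollama_llm,
    verbose=True,
)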

Is it possible to load a GGUF model with llama-cpp-python and pass that loaded model to an Agent? Any help would be appreciated.