Trying to run a Llama model that is downloaded locally on my server, using the Hugging Face Hub

Hi, I am trying to build a RAG pipeline for PDFs. I'm using Llama 3.1 as my LLM, which has been downloaded locally on my server with an L4 GPU. How do I use that model with CrewAI?
PS: I want everything to run locally.