How to embed a knowledge source with Ollama

Hi,

I am trying to use a local knowledge source like the TextFileKnowledgeSource together with an Agent using the llama3.2 model.

from crewai import Agent, LLM
from crewai.knowledge.source.text_file_knowledge_source import TextFileKnowledgeSource

# Inside my crew class, in the method that builds the agent
llm = LLM(model="ollama/llama3.2", base_url="http://localhost:11434")
ks = TextFileKnowledgeSource(
    file_path="user_preference.txt", metadata={"foo": "bar"}
)
return Agent(
    config=self.agents_config["dummy_agent"],
    verbose=True,
    knowledge_sources=[ks],
    llm=llm,
    embedder={"provider": "ollama", "config": {"model": "llama2"}},
)

I already got this running with the default OpenAI model. However, it looks like the embedding does not work, since the model cannot access the information located in the text file.

I see that others use a custom embedder configuration when they use a different model, but I can’t find any information on how to embed the knowledge source together with Ollama.

Can anyone help?

I think I found a solution. I need to do two things:

  1. Configure the LLM with a fairly low temperature. I’m not sure why this matters, but I guess with a higher temperature the model tends to get more creative and rely on its own knowledge instead.
  2. Explicitly tell the agent in its configuration (goal) to make use of its local knowledge sources. :thinking:

When setting the temperature to <= 0.3, I get the expected response pretty reliably.
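For reference, this is roughly how my setup looks after those two changes (role, goal and backstory below are simplified placeholders, not my real config):

from crewai import Agent, LLM
from crewai.knowledge.source.text_file_knowledge_source import TextFileKnowledgeSource

# Low temperature so the model sticks to the knowledge instead of improvising
llm = LLM(
    model="ollama/llama3.2",
    base_url="http://localhost:11434",
    temperature=0.1,
)

ks = TextFileKnowledgeSource(file_path="user_preference.txt")

agent = Agent(
    role="User preference expert",
    goal="Answer questions about the user, always consulting your local knowledge sources first.",
    backstory="You know the user's preferences from the provided text file.",
    knowledge_sources=[ks],
    llm=llm,
    verbose=True,
)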

Is this behaviour expected?

Further, I would like to learn more about custom embedders. I found out that I did not need the embedder at all in my configuration. So when do I need a custom embedder, and where do I find information on how to configure it? I found the config I used by accident here in the forum.

Hey Torsten! As you’ve experienced, a lower temperature should be better for embedding/RAG-related projects where we need the information reproduced accurately. OpenAI’s documentation describes temperature as controlling randomness: lower values make the output more focused and deterministic, while higher values make it more creative.

I’m also still trying to better understand the embedding side of things, so I can’t say much there, but good luck! :slight_smile:


Hey Ali, thank you for your answer.

Since CrewAI uses Chroma as the underlying vector database to make knowledge available to the LLM, I found the Chroma docs useful as a starting point. At least I was able to find the embedder configuration there.
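In case it helps anyone else, this is the shape of the Ollama embedder config I ended up with. The url key is my assumption based on Chroma’s Ollama embedding function, so double-check it against the docs for your versions:

embedder = {
    "provider": "ollama",
    "config": {
        # an embedding model you have pulled locally, e.g. `ollama pull nomic-embed-text`
        "model": "nomic-embed-text",
        # assumption: the Ollama embeddings endpoint Chroma should call
        "url": "http://localhost:11434/api/embeddings",
    },
}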


Hi,
Do you have sample code showing how you got the embedding working with an Ollama model?
I have a custom knowledge source that I created in my application. When I create the crew like this,
crew = Crew(
    agents=[json_analyst, summarizer_agent],
    tasks=[analysis_task, summary_task],
    verbose=True,
    knowledge_sources=[knowledge_source],
    embedder={
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text"
        }
    },
    process=Process.sequential
)
the script fails while creating the embeddings with the following error:
Failed to upsert documents: Expected Embedings to be non-empty list or numpy array, got in upsert.

Any pointers around this?

Hi @Anusha09

Did you manage to solve your issue?

Regards,

Yes, using a newer version of CrewAI helped.
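For reference, nothing special on my side, just upgrading the package:

pip install --upgrade crewai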


Without any code adjustment?

The embedder setting allows you to use the embedding model of your choice.

The default is OpenAI. Here’s an example for Ollama:

from crewai import Agent, Task, Crew, Process, LLM
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource


# Create a knowledge source
content = "Users name is John. He is 30 years old and lives in San Francisco."
string_source = StringKnowledgeSource(content=content)

# Create an LLM with a temperature of 0 to ensure deterministic outputs
llm = LLM(
    model="ollama/llama3.2:latest", # run !ollama list to see models you have
    temperature=0, 
    api_key=""
)

# Create an agent with the knowledge store
agent = Agent(
    role="About User",
    goal="You know everything about the user.",
    backstory="""You are a master at understanding people and their preferences.""",
    verbose=True,
    allow_delegation=False,
    llm=llm,
)
task = Task(
    description="Answer the following questions about the user: {question}",
    expected_output="An answer to the question.",
    agent=agent,
)

crew = Crew(
    agents=[agent],
    tasks=[task],
    verbose=True,
    process=Process.sequential,
    knowledge_sources=[string_source],
    embedder={
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text",
            "api_key": ""
        }
    }
)

result = crew.kickoff(inputs={"question": "What city does John live in and how old is he?"})

Also, setting temperature=0 basically makes the model less creative and more deterministic; higher values do the opposite.


I tried implementing this and I got an error.

from crewai import Crew, Process
from crewai.memory import LongTermMemory, ShortTermMemory, EntityMemory
from crewai.memory.storage.ltm_sqlite_storage import LTMSQLiteStorage
from crewai.memory.storage.rag_storage import RAGStorage

embedder_config_ollama = {
    "provider": "ollama",
    "config": {
        "model": "nomic-embed-text",
    },
}

Rag_agent = Crew(
    agents=[context_retriver_agent, senior_api_developer_agent],
    tasks=[context_retrieval_task, api_development_task],
    memory=True,
    verbose=True,
    knowledge_sources=[crewdocling],
    embedder=embedder_config_ollama,
    long_term_memory=LongTermMemory(
        storage=LTMSQLiteStorage(db_path="/content/long_term/mydatabase.db")
    ),
    short_term_memory=ShortTermMemory(
        storage=RAGStorage(type="short_term", path="./short", embedder_config=embedder_config_ollama)
    ),
    entity_memory=EntityMemory(
        storage=RAGStorage(type="entity_storage", path="./entity", embedder_config=embedder_config_ollama)
    ),
)

[ERROR]: Failed to upsert documents: timed out in upsert

I don’t know what I did wrong. I installed Ollama in Colab and ran it like this.

 !pip install colab-xterm   
 %load_ext colabxterm
 %xterm

And in the xterm I started the ollama server by running

ollama serve

Then I pulled the models like this.

!ollama pull llama3.1:8b  &
!ollama pull nomic-embed-text
!ollama list
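
Next I want to rule out that the Ollama server simply isn’t reachable from the notebook when the upsert runs. A quick sanity check (a sketch, assuming the default port and the standard /api/embeddings endpoint) would be:

import requests

# Ask Ollama directly for one embedding; if this times out or errors,
# the crew's upsert will likely fail the same way.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "ping"},
    timeout=30,
)
resp.raise_for_status()
print(len(resp.json().get("embedding", [])), "dimensions returned")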