String Knowledge sources not working with Gemini

subbu · December 15, 2024, 4:20pm

Hey guys,

Has anyone tried using StringKnowledgeSource with Gemini? It looks like there is a bug: although the Gemini model is configured correctly, the knowledge source always tries to create the default embedder, looks for an OpenAI key, and fails.

../../.venv/lib/python3.12/site-packages/crewai/project/crew_base.py:26: in __init__
    super().__init__(*args, **kwargs)
hr_slack_bot/crew.py:46: in __init__
    self.string_source = StringKnowledgeSource(
../../.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py:51: in __init__
    self._set_embedder_config(embedder_config)
../../.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py:174: in _set_embedder_config
    else self._create_default_embedding_function()
../../.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py:158: in _create_default_embedding_function
    return OpenAIEmbeddingFunction(

Any solution or workaround is welcome.

Thanks.

rokbenko · December 15, 2024, 4:33pm

@subbu It’s not a bug. CrewAI uses the OpenAI embedding LLM by default, and that’s why the code searched for the OpenAI API key that couldn’t be found. However, you can customize the embedding LLM to your liking by configuring the embedder for the knowledge store. See the docs.

# ...

string_source = StringKnowledgeSource(
    content="Users name is John. He is 30 years old and lives in San Francisco.",
)

crew = Crew(
    ...,
    knowledge_sources=[string_source],
    embedder={
        "provider": "openai", # Set the embedding LLM provider here
        "config": {"model": "text-embedding-3-small"}, # Set the embedding LLM here
    },
)

# ...

subbu · December 15, 2024, 4:44pm

Thanks for the quick reply @rokbenko . Actually I wanted to use the google embedder.

My code looks like this:

 crew = Crew(
            ....,
            verbose=True,
            knowledge_sources=[self.string_source],
            embedder={
                "provider": "google",
                "config": {"model": "models/embedding-001"},
            },
        )

For this code, I’m getting this error

  File "./.venv/lib/python3.12/site-packages/crewai/project/crew_base.py", line 26, in __init__
    super().__init__(*args, **kwargs)
  File "./src/demo_crew/crew.py", line 31, in __init__
    self.string_source = StringKnowledgeSource(
                         ^^^^^^^^^^^^^^^^^^^^^^
  File "./.venv/lib/python3.12/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "./.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 51, in __init__
    self._set_embedder_config(embedder_config)
  File "./.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 174, in _set_embedder_config
    else self._create_default_embedding_function()
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "./.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 158, in _create_default_embedding_function
    return OpenAIEmbeddingFunction(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "./.venv/lib/python3.12/site-packages/chromadb/utils/embedding_functions/openai_embedding_function.py", line 56, in __init__
    raise ValueError(
ValueError: Please provide an OpenAI API key. You can get one at https://platform.openai.com/account/api-keys

What I suspect is that, irrespective of the provider passed, the knowledge store always tries to create the default embedder.

Can you try the embedder config above and let me know your results, please?
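The traceback suggests a simple fallback pattern: when the storage object is built without an embedder config, it constructs the default OpenAI embedder. The sketch below is a toy model of that pattern, not CrewAI's actual source. `ToyKnowledgeStorage`, `build_embedder`, and `default_embedder` are hypothetical names used only for illustration, and the returned vectors are dummies.

```python
import os
from typing import Callable, Optional


def default_embedder() -> Callable[[str], list]:
    """Stand-in for the default OpenAI embedder: it refuses to start without a key."""
    if not os.environ.get("OPENAI_API_KEY"):
        raise ValueError("Please provide an OpenAI API key.")
    return lambda text: [float(len(text))]  # dummy vector, illustration only


def build_embedder(config: dict) -> Callable[[str], list]:
    """Stand-in for a provider-specific embedder built from an explicit config."""
    provider = config["provider"]
    return lambda text: [float(len(text)), float(len(provider))]


class ToyKnowledgeStorage:
    """Hypothetical storage that mirrors the fallback seen in the traceback."""

    def __init__(self, embedder_config: Optional[dict] = None):
        # No config at construction time means the default embedder is used,
        # regardless of what is configured elsewhere later on.
        self.embedder = (
            build_embedder(embedder_config)
            if embedder_config
            else default_embedder()
        )
```

In this toy model, creating the storage with no config fails when `OPENAI_API_KEY` is unset, while passing a config at creation time skips the default path. Whether CrewAI exposes an equivalent hook in the versions discussed here is not confirmed in this thread.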

rokbenko · December 15, 2024, 7:08pm

@subbu Can you please share your full code?

subbu · December 16, 2024, 2:37am

Here you go

crew.py

from crewai import Agent, Crew, Process, Task
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
from crewai.project import CrewBase, agent, crew, task
from pydantic import BaseModel


class Response(BaseModel):
    content: str


@CrewBase
class HrSlackBotCrew():
    """HrSlackBot crew"""

    def __init__(self, knowledge: str = None):
        if knowledge:
            self.string_source = StringKnowledgeSource(
                content=knowledge,
            )
        else:
            self.string_source = None

    @agent
    def hr_agent(self) -> Agent:
        return Agent(
            config=self.agents_config['hr_agent'],
        )

    @task
    def hr_task(self) -> Task:
        return Task(
            config=self.tasks_config['hr_task'],
            output_json=Response,
        )

    @crew
    def hr_crew(self) -> Crew:
        """Creates the HR Agent crew"""
        return Crew(
            agents=[self.hr_agent()],  # Automatically created by the @agent decorator
            tasks=[self.hr_task()],  # Automatically created by the @task decorator
            process=Process.sequential,
            verbose=True,
            knowledge_sources=[self.string_source] if self.string_source else None,
            embedder={
                "provider": "google",
                "config": {
                    "model_name": "models/embedding-001"
                }
            }
        )

1. Execution without Knowledge

def run():
    """
    Run the crew.
    """
    inputs = {
        "user": "Subbu",
        "query": "Who is John?",
        'context': "Subbu: Hi, good morning."
    }
    HrSlackBotCrew().hr_crew().kickoff(inputs=inputs)

Output:

# Agent: HR Associate
## Final Answer: 
```json
{
  "content": "Hi Subbu! 👋 Good morning to you too!\n\nI'm afraid I don't have access to personal information about other employees.  It's important to protect everyone's privacy. 😊\n\nIf you have any HR policy questions, I'd be happy to help!  For this particular question, you might want to ask someone else.  Let me know if there's anything else I can assist you with!"
}
```

2. Execution with Knowledge

def run():
    """
    Run the crew.
    """
    inputs = {
        "user": "Subbu",
        "query": "Who is John?",
        'context': "Subbu: Hi, good morning."
    }
    HrSlackBotCrew("Employee name is John. He is 30 years old and lives in San Francisco.").hr_crew().kickoff(inputs=inputs)

Error:

File "/crewai/hr_slack_bot/.venv/lib/python3.12/site-packages/crewai/project/crew_base.py", line 26, in __init__
    super().__init__(*args, **kwargs)
  File "/crewai/hr_slack_bot/src/hr_slack_bot/crew.py", line 24, in __init__
    self.string_source = StringKnowledgeSource(
                         ^^^^^^^^^^^^^^^^^^^^^^
  File "/crewai/hr_slack_bot/.venv/lib/python3.12/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/crewai/hr_slack_bot/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 51, in __init__
    self._set_embedder_config(embedder_config)
  File "/crewai/hr_slack_bot/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 174, in _set_embedder_config
    else self._create_default_embedding_function()
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/crewai/hr_slack_bot/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 158, in _create_default_embedding_function
    return OpenAIEmbeddingFunction(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/crewai/hr_slack_bot/.venv/lib/python3.12/site-packages/chromadb/utils/embedding_functions/openai_embedding_function.py", line 56, in __init__
    raise ValueError(
ValueError: Please provide an OpenAI API key. You can get one at https://platform.openai.com/account/api-keys
An error occurred while running the crew: Command '['uv', 'run', 'run_crew']' returned non-zero exit status 1.

rokbenko · December 16, 2024, 7:51am

@subbu Will let the CrewAI staff know about the issue and get back to you. It might be a bug.

fredzolio · December 16, 2024, 12:55pm

I’m having the same issue here. Whatever I try in order to fix it, nothing solves it in the end.

subbu · December 18, 2024, 2:29pm

Hi @rokbenko, can I get an update on this issue, please?

rokbenko · December 18, 2024, 2:38pm

@subbu They couldn’t reproduce the error. After that, the conversation died. I pinged CrewAI staff about it.

subbu · December 18, 2024, 3:05pm

@rokbenko The issue is easily reproducible with the code shared above. Let me know if any more details are required; I’m more than happy to help the team reproduce the issue. Thanks.

Indrajit_Haridas · December 18, 2024, 5:44pm

I am having the same issue. Using the code from the guide in the documentation throws the ValueError if I use any embedder other than OpenAI.

crewai version is 0.85.0
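Since the reports in this thread span more than one release, it can help to include exact package versions when reporting. A minimal sketch using only the standard library follows; `installed_version` is a hypothetical helper name, and `chromadb` is included only because it appears in the traceback.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the versions of the packages that show up in the traceback.
    for name in ("crewai", "chromadb"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```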

Mike_Watson · December 19, 2024, 10:24am

I also get the same issue using Amazon Bedrock…


	@crew
	def crew(self) -> Crew:
		"""Creates the Testone crew"""

		content = "Users name is John. He is 30 years old and lives in San Francisco."
		string_source = StringKnowledgeSource(
			content=content,
		)


		return Crew(
			agents=self.agents,
			tasks=self.tasks,
			process=Process.sequential,
			verbose=True,
			knowledge_sources=[string_source],
			embedder=dict(
				provider="bedrock",
				config=dict(
					model="bedrock/amazon.titan-embed-text-v2:0",
					region="us-east-1"
				)
			)
		)



  File "/root/development/crewai-review/testone/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 51, in __init__
    self._set_embedder_config(embedder_config)
  File "/root/development/crewai-review/testone/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 174, in _set_embedder_config
    else self._create_default_embedding_function()
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/development/crewai-review/testone/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 158, in _create_default_embedding_function
    return OpenAIEmbeddingFunction(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/development/crewai-review/testone/.venv/lib/python3.12/site-packages/chromadb/utils/embedding_functions/openai_embedding_function.py", line 56, in __init__
    raise ValueError(
ValueError: Please provide an OpenAI API key. You can get one at https://platform.openai.com/account/api-keys

crewai version: 0.86.0

subbu · December 19, 2024, 1:19pm

I came across this issue on GitHub:

github.com/crewAIInc/crewAI

crewAI is asking me for a openAI API key while im using gemini as model

opened 10:57PM - 12 Jun 24 UTC

closed 12:17PM - 22 Aug 24 UTC

LucasCBT

no-issue-activity

from crewai import Agent, Task, Crew
from google.cloud import bigquery
from langchain_google_vertexai import VertexAI
from crewai_tools import SerperDevTool

llm = VertexAI(
    temperature=0.0,
    max_output_tokens=4096,
    model_name="gemini-1.5-pro",
    top_k=1,
    top_p=0.0
)

search_tool = SerperDevTool()

researcher = Agent(
    role='Senior Researcher',
    goal='Uncover groundbreaking technologies in {topic}',
    verbose=True,
    memory=True,
    llm=llm,
    backstory=(
        "Driven by curiosity, you're at the forefront of"
        "innovation, eager to explore and share knowledge that could change"
        "the world."
    ),
    tools=[search_tool],
    allow_delegation=True
)

writer = Agent(
    role='Writer',
    goal='Narrate compelling tech stories about {topic}',
    verbose=True,
    memory=True,
    llm=llm,
    backstory=(
        "With a flair for simplifying complex topics, you craft"
        "engaging narratives that captivate and educate, bringing new"
        "discoveries to light in an accessible manner."
    ),
    tools=[search_tool],
    allow_delegation=False
)

research_task = Task(
    description=(
        "Identify the next big trend in {topic}."
        "Focus on identifying pros and cons and the overall narrative."
        "Your final report should clearly articulate the key points,"
        "its market opportunities, and potential risks."
    ),
    expected_output='A comprehensive 3 paragraphs long report on the latest AI trends.',
    tools=[search_tool],
    agent=researcher,
)

write_task = Task(
    description=(
        "Compose an insightful article on {topic}."
        "Focus on the latest trends and how it's impacting the industry."
        "This article should be easy to understand, engaging, and positive."
    ),
    expected_output='A 4 paragraph article on {topic} advancements formatted as markdown.',
    tools=[search_tool],
    agent=writer,
    async_execution=False,
)

crew = Crew(
    agents=[writer],
    tasks=[write_task],
    memory=True,
    cache=True,
    max_rpm=100
)

result = crew.kickoff(inputs={'topic': 'AI in healthcare'})
print(result)

AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: fake. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}

Can someone help me with that? What am i doing wrong? All the code is running without problem except for the "result = ..."

Not sure how the issue got closed without any fix.
@rokbenko could you help to re-open the issue? Thanks.

rokbenko · December 19, 2024, 1:51pm

No. I’m not part of CrewAI staff. I pinged them that the issue hasn’t been solved, but there has been no response so far.

Mike_Watson · December 19, 2024, 2:24pm

I did some more digging.
CrewAI uses embedchain.ai under the covers for all embedding.
In my case (AWS Bedrock), assigning the embedding to an Agent works when configuring it according to embedchain.ai. The provider’s name is aws_bedrock, not bedrock (as per the CrewAI docs).

So, the following code works for me.


from crewai import Agent, Crew, Process, Task
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
from crewai.project import CrewBase, agent, crew, task


@CrewBase
class Red01():
	"""Red01 crew"""
	agents_config = 'config/agents.yaml'
	tasks_config = 'config/tasks.yaml'

	@agent
	def researcher(self) -> Agent:

		content = "Users name is freddy12. He is 32 years old and lives in Lincon,UK."
		string_source = StringKnowledgeSource(
			content=content,
			metadata={"source": "user"},
		)
		return Agent(
			config=self.agents_config['researcher'],
			verbose=True,
			knowledge_sources=[string_source],
			embedder={
				"provider": "aws_bedrock",
				"config": {"model": "amazon.titan-embed-text-v2:0"},
			},
		)

	@task
	def research_task(self) -> Task:
		return Task(
			config=self.tasks_config['research_task'],
		)


	@crew
	def crew(self) -> Crew:
		"""Creates the Red01 crew"""

		content = "Users name is freddy12. He is 32 years old and lives in Lincon,UK."
		string_source = StringKnowledgeSource(
			content=content,
			metadata={"source": "user"},
		)	
			
		return Crew(
			agents=self.agents,
			tasks=self.tasks,
			process=Process.sequential,
			verbose=True,
			# knowledge_sources=[string_source],
			# embedder={
			# 	"provider": "aws_bedrock",
			# 	"config": {"model": "amazon.titan-embed-text-v2:0"},
			# },
		)

However, uncommenting the knowledge_sources and embedder lines in the crew function fails: CrewAI appears to validate the provider name against an internal list that contains bedrock, not aws_bedrock.
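The mismatch described here can be pictured as a lookup against a fixed set of provider names. The sketch below is purely illustrative: `SUPPORTED_PROVIDERS` and `validate_provider` are hypothetical, and the actual list and check inside CrewAI are not shown in this thread.

```python
# Hypothetical allow-list; the real list in CrewAI is not shown in this thread.
SUPPORTED_PROVIDERS = {"openai", "google", "bedrock"}


def validate_provider(embedder: dict) -> str:
    """Return the provider name if it is in the allow-list, else raise ValueError."""
    provider = embedder.get("provider")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(
            f"Unsupported embedding provider: {provider!r}. "
            f"Expected one of {sorted(SUPPORTED_PROVIDERS)}."
        )
    return provider
```

Under this assumption, a config with `"provider": "bedrock"` passes the check while `"provider": "aws_bedrock"` is rejected, which would match the behaviour observed at the crew level.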

tonykipkemboi · December 19, 2024, 3:37pm

For Bedrock, make sure you add Boto3 by running:

uv add boto3

You also need a MODEL= env variable.

The docs will be updated on this as well.
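A quick way to check both prerequisites before running the crew is sketched below. `bedrock_preflight` is a hypothetical helper, not part of CrewAI; the `MODEL` variable name is taken from this post, and its expected value is not specified here.

```python
import importlib.util
import os
from typing import List, Mapping, Optional


def bedrock_preflight(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    env = os.environ if env is None else env
    problems = []
    # boto3 must be importable (installed with `uv add boto3`).
    if importlib.util.find_spec("boto3") is None:
        problems.append("boto3 is not installed")
    # A MODEL variable is required according to this post; its value is not checked.
    if not env.get("MODEL"):
        problems.append("MODEL environment variable is not set")
    return problems


if __name__ == "__main__":
    for problem in bedrock_preflight():
        print(f"Preflight problem: {problem}")
```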

tonykipkemboi · December 19, 2024, 3:39pm

Hi @subbu - Just catching up to this now. I will test this today and update you.

tonykipkemboi · December 19, 2024, 3:40pm

@Mike_Watson, how are you setting your keys in the .env file?

Mike_Watson · December 20, 2024, 7:50am

Using the default profile in .env… Boto3 is installed.

fredzolio · December 20, 2024, 12:45pm

Hey @tonykipkemboi, any update about this issue?
