Hi!
I would greatly appreciate any insights on this issue.
I’m trying to use a Markdown (.md) file as a knowledge source in a crew that runs inside a flow.
The entire environment is local, and I’m working with Ollama.
The knowledge file is in a folder named knowledge, which sits in the same directory as main.py.
Knowledge integration works fine in a standalone single-file crew, where I can also use a .md file successfully.
Here is the error:
[ERROR]: Failed to upsert documents: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'
[WARNING]: Failed to init knowledge: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'
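The wording of this message is that of a TypeError raised while an exception object is being constructed. That suggests (as an assumption, not something confirmed here) that some code path tried to rebuild an APIStatusError from a message alone and thereby hid the original failure. The sketch below reproduces the wording with a stand-in class; the APIStatusError defined here is a local illustration, not the real client library class, and the ConnectionError is a made-up original failure.

```python
class APIStatusError(Exception):
    """Stand-in with the keyword-only constructor shape named in the error text."""

    def __init__(self, message: str, *, response: object, body: object) -> None:
        super().__init__(message)
        self.response = response
        self.body = body


def reraise_badly(original: Exception) -> None:
    """Mimic a wrapper that rebuilds the exception from the message only."""
    # Fails with TypeError because 'response' and 'body' are not supplied.
    raise APIStatusError(str(original))


def masked_message() -> str:
    """Return the text of the TypeError that replaces the original failure."""
    try:
        # Hypothetical original failure; the real one is unknown.
        reraise_badly(ConnectionError("embedding endpoint not reachable"))
    except TypeError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    print(masked_message())
```

If this is what happens, the useful information is in whatever exception was being wrapped, so checking the embedder and the Ollama server directly is likely more productive than changing chunk sizes.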
Versions:
Crewai: 0.100.1
crewai-tools: 0.33.0
Python 3.12.8
Using Ollama only, with local LLMs. I have tried deepseek-r1:7b, llama3.2:3b, and others,
at different temperatures.
Here is my code:
# /crews/poem_crew/poemcrew.py
from crewai.knowledge.source.crew_docling_source import CrewDoclingSource

knowledge_source = CrewDoclingSource(
    file_paths=["knowledge.md"]
)
And
# /crews/poem_crew/poemcrew.py
from crewai import LLM, Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task


@CrewBase
class PoemCrew:
    """Poem Crew"""

    agents_config = "config/agents.yaml"
    tasks_config = "config/tasks.yaml"

    # Local model served by Ollama
    llm = LLM(model="ollama/deepseek-r1:7b", temperature=0.7)

    @agent
    def poem_writer(self) -> Agent:
        return Agent(
            config=self.agents_config["poem_writer"],
            # memory=True,
            llm=self.llm,
            # LLM=LLM(model="ollama/llama3-70b-8192", temperature=0.3),
        )

    @task
    def write_poem(self) -> Task:
        return Task(
            config=self.tasks_config["write_poem"],
        )

    @crew
    def crew(self) -> Crew:
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,
            # memory=True,
            verbose=True,
            # knowledge_source is the module-level CrewDoclingSource defined above
            knowledge_sources=[knowledge_source],
            embedder={
                "provider": "ollama",
                "config": {
                    "model": "mxbai-embed-large"
                }
            }
        )
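Because the crew relies on an Ollama embedding model, it can help to confirm that the model answers embedding requests outside of CrewAI. This is a minimal sketch using only the standard library, assuming Ollama listens on its default address and exposes the /api/embeddings endpoint; the function names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default address; adjust if yours differs


def build_embedding_request(
    model: str, text: str, base_url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a single embedding."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embedding_length(model: str, text: str = "hello", timeout: float = 30.0) -> int:
    """Send one embedding request and return the length of the returned vector."""
    request = build_embedding_request(model, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return len(json.loads(response.read())["embedding"])


# Example (needs a running Ollama server with the model pulled):
# print(embedding_length("mxbai-embed-large"))
```

If the call fails or times out, the embedding model is probably not pulled or the server is not reachable, and that would need fixing before the knowledge source can be stored.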
I have also tried the following:
# /crews/poem_crew/poemcrew.py
from crewai.knowledge.source.crew_docling_source import CrewDoclingSource

knowledge_source = CrewDoclingSource(
    file_paths=["knowledge.md"],
    chunk_size=4000,    # Characters per chunk (default)
    chunk_overlap=200,  # Overlap between chunks (default)
)
and
# /crews/poem_crew/poemcrew.py
from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
from crewai.knowledge.storage.knowledge_storage import KnowledgeStorage

knowledge_source = CrewDoclingSource(
    file_paths=["knowledge.md"],
    storage=KnowledgeStorage(
        embedder_config={
            "provider": "ollama",
            "model": "nomic-embed-text",
            "base_url": "http://localhost:11434"
        }
    )
)
The above gives a slightly different error:
Failed to upsert documents: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'
I also tried upgrading:
pip install --upgrade crewai crewai-tools transformers tokenizers docling docling-core
from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
from crewai.knowledge.source.text_file_knowledge_source import TextFileKnowledgeSource
Does anyone have an example of a flow with a crew.py that uses Knowledge, CrewDoclingSource, and a markdown.md file with only Ollama?
Is this a flow problem?
opened 08:50PM - 14 Jan 25 UTC
bug
### Description
I am trying to use markdown files as knowledge for the crew, but when I use CrewDoclingSource with markdown files, the line snippet_text = str(element.children[0].children[0].children) in docling\backend\md_backend.py throws an error:
IndexError: list index out of range
An error occurred while running the crew: Command '['uv', 'run', 'run_crew']' returned non-zero exit status 1.
If I convert a file to a Word document it works fine, but I'd rather not have to do that for all incoming documents.
### Steps to Reproduce
Create any basic crew project and use a CrewDoclingSource knowledge source for a markdown file. That's all I did.
### Expected behavior
Markdown files should be usable by the crew as a knowledge base.
### Screenshots/Code snippets
content_source = CrewDoclingSource(
    file_paths=['0-introductory-overview.md', '3-priesthood-principles.md'],
)

@crew
def crew(self) -> Crew:
    """Creates the Search crew"""
    return Crew(
        agents=self.agents,
        tasks=self.tasks,
        process=Process.sequential,
        verbose=True,
        knowledge_sources=[
            content_source
        ]
    )
### Operating System
Windows 11
### Python Version
3.12
### crewAI Version
0.95.0
### crewAI Tools Version
0.25.8
### Virtual Environment
Venv
### Evidence
PS C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search> crewai run
Running the Crew
[2025-01-14 13:16:40][ERROR]: Error loading content: list index out of range
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Scripts\run_crew.exe\__main__.py", line 5, in <module>
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\src\search\main.py", line 5, in <module>
from search.crew import Search
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\src\search\crew.py", line 21, in <module>
content_source = CrewDoclingSource(
^^^^^^^^^^^^^^^^^^
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\crewai\knowledge\source\crew_docling_source.py", line 33, in __init__
super().__init__(*args, **kwargs)
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\pydantic\main.py", line 214, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\pydantic\_internal\_model_construction.py", line 126, in wrapped_model_post_init
original_model_post_init(self, context)
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\crewai\knowledge\source\crew_docling_source.py", line 66, in model_post_init
self.content = self._load_content()
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\crewai\knowledge\source\crew_docling_source.py", line 80, in _load_content
raise e
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\crewai\knowledge\source\crew_docling_source.py", line 70, in _load_content
return self._convert_source_to_docling_documents()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\crewai\knowledge\source\crew_docling_source.py", line 92, in _convert_source_to_docling_documents
return [result.document for result in conv_results_iter]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\document_converter.py", line 212, in convert_all
for conv_res in conv_res_iter:
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\document_converter.py", line 247, in _convert
for item in map(
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\document_converter.py", line 288, in _process_document
conv_res = self._execute_pipeline(in_doc, raises_on_error=raises_on_error)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\document_converter.py", line 311, in _execute_pipeline
conv_res = pipeline.execute(in_doc, raises_on_error=raises_on_error)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 52, in execute
raise e
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 44, in execute
conv_res = self._build_document(conv_res)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\pipeline\simple_pipeline.py", line 41, in _build_document
conv_res.document = conv_res.input._backend.convert()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\backend\md_backend.py", line 340, in convert
self.iterate_elements(parsed_ast, 0, doc, None)
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\backend\md_backend.py", line 306, in iterate_elements
self.iterate_elements(child, depth + 1, doc, parent_element)
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\backend\md_backend.py", line 306, in iterate_elements
self.iterate_elements(child, depth + 1, doc, parent_element)
File "C:\Users\jred\OneDrive - Church of Jesus Christ\Desktop\githubRepos\aei_agent_exploration\crewai\search\.venv\Lib\site-packages\docling\backend\md_backend.py", line 212, in iterate_elements
snippet_text = str(element.children[0].children[0].children)
~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
An error occurred while running the crew: Command '['uv', 'run', 'run_crew']' returned non-zero exit status 1.
### Possible Solution
None. I think the sub-library just needs to be fixed.
### Additional context
There appears to be a disconnect between the open source documentation (https://docs.crewai.com/concepts/knowledge) and actual behavior: the documentation says text is supported, but when running it the library says md is supported.