I am working with files in the /knowledge directory, using a crew that relies on DOCXSearchTool to take the contents of .docx files and reformat them.
I think the implementation of DOCXSearchTool is not working correctly. When I run `docx2txt knowledge/file1.docx >> text.txt` I get the right output. However, when I look at the verbose output from the task, it does not contain all of it.
Is there a way i can inspect it?
Is anyone else having this problem?
My only workaround is to use the command-line tool to pre-process the file and then use the resulting text, but I'd like to use the built-in tool.
Thoughts?
`DOCXSearchTool` is one of CrewAI's `RagTool`s, as we can see in the first lines of the code:
```python
from typing import Any, Optional, Type
from embedchain.models.data_type import DataType
from pydantic import BaseModel, Field
from ..rag.rag_tool import RagTool
```
So basically, the process is:
Read the file → Create chunks → Embed the chunks → Store the embeddings
And, when the tool is used by your Agent, we'll have:
Embed your query → Semantic search → Prompt augmenting
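The two phases above can be sketched in plain Python. This is a toy illustration, not CrewAI's actual implementation: the real tool delegates to embedchain with a neural embedding model and a vector store, while this sketch fakes the embeddings with bag-of-words counts.

```python
import math
import re
from collections import Counter

def chunk(text, size=40):
    # Fixed-size character chunks; real tools split more intelligently.
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text):
    # Toy "embedding": word counts instead of a neural embedding model.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Index phase: read the file -> create chunks -> embed them -> store the embeddings.
document = "CrewAI agents can use tools. RAG retrieves only the relevant chunks."
store = [(c, embed(c)) for c in chunk(document)]

# Query phase: embed the query -> semantic search -> prompt augmenting.
query = "which chunks are relevant?"
best_chunk, _ = max(store, key=lambda item: cosine(embed(query), item[1]))
prompt = f"Context:\n{best_chunk}\n\nQuestion: {query}"
```

Note that only the best-matching chunk ends up in the prompt, which is exactly why the verbose output never shows the whole file.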
If you were hoping (for whatever reason) to get the entire text content of your .docx file back, this isn't the tool for that job.
This is interesting. I did not see it as a RAG tool. Thank you.
So you would convert it to text and load it in as context for the Agent to use?
Yes, if your analysis of the specific case really leads you to believe that you need to load the entire content, then you can simply pass it in the inputs of your crew. The point is, you should examine whether you truly need the whole content for context. Just because you can, doesn't mean you should. "'All things are lawful for me,' but not all things are helpful." (1 Corinthians 6:12)
RAG is a technique that allows your Agent to perform a semantic search on chunks of your document (knowledge base) and automatically add these relevant chunks to your prompt. This way, your Agent receives content that's more relevant to the task it's trying to perform, make sense? Almost always, this approach (RAG) yields better, more scalable, and more reliable results.
If you really need to pass the full file content, convert it to text/markdown and inject it into the prompt. In this thread, I did exactly that, loading a 52k character file directly into the prompt, but that was for a specific reason: I needed to demonstrate that the LLM could handle a large context. It was just a test; I wouldn’t recommend doing this in production.
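If you do go the convert-to-text route, you don't strictly need an external CLI: a .docx file is just a zip archive whose visible text sits in `<w:t>` elements of `word/document.xml`, which is roughly what `docx2txt` extracts. Here's a minimal stdlib sketch; the `demo.docx` it builds is a synthetic stand-in (not a fully valid Word file), and the `crew.kickoff` mention in the comment is just the generic inputs pattern from my previous point.

```python
import re
import zipfile

def docx_to_text(path):
    # A .docx is a zip; visible text lives in <w:t> elements of word/document.xml.
    with zipfile.ZipFile(path) as z:
        xml = z.read("word/document.xml").decode("utf-8")
    return "".join(re.findall(r"<w:t(?:\s[^>]*)?>(.*?)</w:t>", xml, flags=re.S))

# Build a tiny stand-in .docx so the sketch is self-contained.
with zipfile.ZipFile("demo.docx", "w") as z:
    z.writestr(
        "word/document.xml",
        "<w:document><w:body><w:p><w:r>"
        "<w:t>Full document text.</w:t>"
        "</w:r></w:p></w:body></w:document>",
    )

text = docx_to_text("demo.docx")
# Inject the whole content into the prompt, e.g. via crew.kickoff(inputs={"document": text}).
prompt = f"Reformat the following document:\n\n{text}"
```

Again: reach for this only after confirming RAG genuinely doesn't fit your case.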
If you want to learn more about RAG and use cases, I recommend the following videos: