pdfRAGtool not finding PDFs

Hi there,

Is there a special folder or method for adding the PDFs?
Does it only accept absolute paths, or do I need to place them in a specific folder first?
Also, I see that LangChain is the one embedding the documents, even though I asked it to use an Ollama text-embedding model.

I also see that my LangChain tokens ran out, and I would like to use a local solution.
My PDF tool looks like this:


PDFtool = PDFSearchTool(
    config=dict(
        llm=dict(
            provider="ollama",  # or google, openai, anthropic, llama2, ...
            config=dict(
                model="llama3.1",
                temperature=0.01,
                base_url='http://127.0.0.1:11434',
            ),
        ),
        embedder=dict(
            provider="ollama",  # or openai, ollama, ...
            config=dict(
                model="nomic-embed-text:latest",
                base_url='http://127.0.0.1:11434',
                # title="Embeddings",
            ),
        ),
    )
)

and my output is:

I encountered an error while trying to use the tool. This was the error: File path /path/to/Ruben_Casillas.CV.pdf is not a valid file or url.
 Tool Search a PDF's content accepts these inputs: Search a PDF's content(query: 'string', pdf: 'string') - A tool that can be used to semantic search a query from a PDF's content. query: 'Mandatory query you want to use to search the PDF's content', pdf: 'Mandatory pdf path you want to search'

Can someone tell me what I am doing wrong, and whether there is a way to avoid using LangChain?

Many thanks and greetings.

I think it is api_base= rather than base_url= here;
I'm not sure, but take a look at the docs.
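
If you want to try that rename, here is a minimal sketch of just the embedder block as a standalone dict, with api_base swapped in for base_url. This is only the substitution suggested above and is not verified against the docs, so check which key your version of crewai_tools/embedchain actually expects:

embedder_config = dict(
    provider="ollama",
    config=dict(
        model="nomic-embed-text:latest",
        api_base='http://127.0.0.1:11434',  # unverified: other versions may expect base_url instead
    ),
)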

If it did make the call to the tool, it looks like the LLM had trouble formulating the tool call correctly; maybe the model is not capable enough.

On the tool instantiation, if you are looking at a specific PDF, are you adding the

pdf='path/to/your/document.pdf' parameter to the instantiation you show above?
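
Putting that suggestion together with the config from the first post, a minimal sketch might look like this (assuming PDFSearchTool accepts a pdf= keyword alongside config; the path is a placeholder to replace with the real location of your file):

from crewai_tools import PDFSearchTool

pdf_tool = PDFSearchTool(
    pdf='path/to/your/document.pdf',  # placeholder: point this at the actual PDF
    config=dict(
        llm=dict(
            provider="ollama",
            config=dict(
                model="llama3.1",
                temperature=0.01,
                base_url='http://127.0.0.1:11434',
            ),
        ),
        embedder=dict(
            provider="ollama",
            config=dict(
                model="nomic-embed-text",  # use the exact model name your Ollama install reports
                base_url='http://127.0.0.1:11434',
            ),
        ),
    ),
)

With the pdf argument fixed at instantiation, the agent should only need to pass the query, so it cannot mangle the file path when it formulates the tool call.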

Hey Ruben,

I use the FileReadTool often with my agents, which is a little different, but try this:

from crewai import Agent
from crewai_tools import FileReadTool
from langchain_openai import ChatOpenAI

file_read_tool = FileReadTool(file_path='memory.txt')

# Define agents
architect_agent = Agent(
    role='Architect Agent',
    goal=f'Design a fully self-contained AI crew focused on {topic} with all agents, tasks, and tools within main.py.',
    verbose=True,
    memory=True,
    llm=ChatOpenAI(model_name="gpt-4o-mini"),  # Updated to use gpt-4o-mini
    backstory=(
        "As an experienced architect within CrewAI, the Architect Agent specializes in creating modular, "
        "self-contained AI systems. This agent excels in blueprinting AI systems that are flexible and efficient."
    ),
    tools=[file_read_tool],
)

I drop the file I would like the agents to look at into the project folder, and it works fine.
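
Since the tool gets a relative path there, it is resolved against the process's current working directory, which is why dropping the file into the project folder (the directory you launch the crew from) is enough. A quick, purely illustrative way to check where a relative path will land:

import os

# Prints the absolute location that 'memory.txt' resolves to from the current working directory.
print(os.path.abspath('memory.txt'))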

Let me know if it works 🙂

I want to thank you both for the answers.
The fix was actually to modify the model name, as it came with ":latest" at the end.
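
For anyone landing here later, that means the embedder block from the first post ends up with the untagged model name, roughly like this (same settings otherwise):

embedder_config = dict(
    provider="ollama",
    config=dict(
        model="nomic-embed-text",  # was "nomic-embed-text:latest"
        base_url='http://127.0.0.1:11434',
    ),
)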

Many thanks!!