PDF RAG with File Streams

I’m experimenting with RAG over a large number of PDF files that are already hosted in the cloud.
To share these with my agents, it currently seems I need to create multiple instances of the PDFSearchTool and pass those to my agents. To do this, I need to download each PDF to my local file system.
I receive the files as file streams, so saving them to my local file system feels unnecessary. Is it possible to use a file stream directly, or would that mean going down the path of a custom tool?

Hi @seanrobbins, you would need to create a custom tool for sure, for the following reasons:

  1. The current implementation of our RAG tools is tightly coupled with embedchain’s App class, which expects file paths.
  2. The PDFEmbedchainAdapter passes its arguments directly to embedchain’s add method, which doesn’t support file streams as far as I am aware.

A custom tool is definitely the way to go, though.
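
As a rough starting point, here is a minimal sketch of one thing a custom tool could do internally: spool the incoming stream to a short-lived temporary file, so that path-based code still gets a path and nothing is left on disk afterwards. It uses only the Python standard library. The helper name stream_as_temp_pdf and the chunk_size parameter are made up for illustration and are not part of CrewAI or embedchain.

```python
import io
import os
import tempfile
from contextlib import contextmanager
from typing import BinaryIO, Iterator


@contextmanager
def stream_as_temp_pdf(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[str]:
    """Copy a binary stream into a temporary .pdf file and yield its path.

    Hypothetical helper: the file is removed again when the block exits.
    """
    # delete=False so the file can be closed and then reopened by path.
    tmp = tempfile.NamedTemporaryFile(suffix=".pdf", delete=False)
    try:
        with tmp:
            # Copy in chunks so a large PDF is never held fully in memory.
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    break
                tmp.write(chunk)
        yield tmp.name
    finally:
        os.remove(tmp.name)


# Stand-in bytes for illustration only; a real cloud file stream goes here.
incoming = io.BytesIO(b"%PDF-1.4\n% stand-in bytes\n")

with stream_as_temp_pdf(incoming) as pdf_path:
    # Inside this block, pdf_path can be handed to any path-based ingestion
    # code. Ingestion has to finish before the block exits.
    size_on_disk = os.path.getsize(pdf_path)
```

This does still touch the local disk briefly, but the cleanup is automatic. Skipping the disk entirely would mean extracting and embedding the text from the stream yourself inside the custom tool.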
