Hey guys, I have just 2 agents, and neither is that complex: one is a basic agent that outputs which agent should be called, and the second provides the answer. The agent has a RAG tool. The response time is around 20 to 35 seconds. How can this be resolved? This is not acceptable for a chatbot-style application.
Try reducing the retrieval chunk size and the number of retrieved docs, use a fast embedding model like text-embedding-3-small, use faster LLMs, stream the response, and use caching.