I have some (potentially basic) questions about memory, but I’m having trouble finding clear answers in the documentation or threads.
I’m not sure if anyone has already looked into these topics.
What I understand so far:
Short-term memory & entity memory: Allow agents to maintain context, share their results, and collaborate within a single crew execution.
Long-term memory: Stores results from the crew or agents over time and retrieves them so future executions can improve on them.
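For reference, this is how I am enabling these memories, in case my setup matters (a minimal sketch; the agent and task are placeholders for my real definitions):

```python
from crewai import Agent, Crew, Process, Task

researcher = Agent(
    role="Researcher",
    goal="Answer user questions",
    backstory="A helpful assistant.",
)

task = Task(
    description="Answer the user's question: {question}",
    expected_output="A short answer.",
    agent=researcher,
)

# memory=True enables short-term, long-term and entity memory together
crew = Crew(
    agents=[researcher],
    tasks=[task],
    process=Process.sequential,
    memory=True,
)
```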
Maintaining context across multiple executions
The documentation mentions that combining these different memory types enables the crew to maintain context within a single execution as well as across multiple executions (essentially within a conversation). However, when I queried the CrewAI codebase about memory with the Cursor chatbot, it said that the primary goal of these memory systems is not to maintain a conversational history or contextual memory between the user and the crew.
Given this, would it be advisable to complement CrewAI's memory system with separate conversation-history tracking (for example, passing the last 30 messages between the user and the assistant to the crew), in order to guarantee each user a consistent conversation context with a minimal context window?
Memory isolation in multi-user environments
In a multi-user setup, are CrewAI’s memory systems isolated per user?
For instance, if:
User A states that they are looking for a primary residence in France, and
User B states that they are looking for a secondary residence in Spain,
and both requests happen at the same time or close together, could there be a risk of context mixing between A and B?
Would this lead to execution issues, where Crew A and Crew B could mix their context and task results, causing incorrect answers to be given to users?
Would it make sense to implement a filtering system based on user_id or session_id when handling CrewAI memory?
Alternatively, would it be best to run one execution per instance, ensuring that there is only one active crew per user at a time, and to reset short-term and entity memory after each execution (sketched below)? For example, several Cloud Run instances, each handling one execution at a time.
This way, STM and entity memory would only persist for a single execution per user, reducing the risk of cross-contamination.
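Something like this is what I have in mind for the reset part (a sketch; I'm assuming Crew.reset_memories() accepts a command_type like the crewai reset-memories CLI does, so please correct me if the API differs):

```python
# Sketch: one crew execution per user at a time, clearing the volatile
# memories afterwards so nothing leaks into the next user's execution.
# Assumes crew.reset_memories(command_type=...) mirrors the
# `crewai reset-memories` CLI options.
def run_isolated(crew, inputs: dict):
    try:
        return crew.kickoff(inputs=inputs)
    finally:
        crew.reset_memories(command_type="short")
        crew.reset_memories(command_type="entities")
```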
The built-in memory system is not the best fit for conversational chat applications. You probably need two additional types of memory, like you mentioned:
Chat history - the conversation history between your user and the chatbot, saved to persistent storage/a database. Like you said, you would pass e.g. the last 30 messages.
User memory - per-user facts and preferences kept across conversations, e.g. via a provider like mem0.
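For the chat history part, the usual pattern is to fetch the recent messages yourself and hand them to the crew as a plain input (a sketch; fetch_messages is a placeholder for your own database query, and your task description would need to reference the {chat_history} and {question} placeholders):

```python
# Sketch: inject the last N chat messages into the crew as a normal input.
# fetch_messages() is a hypothetical helper for your own storage layer.
def run_with_history(crew, user_id: str, question: str, n: int = 30):
    messages = fetch_messages(user_id, limit=n)
    chat_history = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    # CrewAI interpolates these keys into {placeholders} in task descriptions
    return crew.kickoff(inputs={"chat_history": chat_history, "question": question})
```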
Thanks for your answer and the blog posts. I had the opportunity to test mem0 thanks to this.
Okay, I understand that adding additional memory sources can be useful for providing more context to the crew about its conversation with a user.
Do you have any information on how CrewAI memories (short-term, long-term, entity) are isolated per user?
If multiple users interact with the crew at the same time, how can we ensure that there are no conflicts in retrieving information? (For example, the crew might use the result of a task performed for User 1 as context while processing a task for User 2).
Right now, it seems that there is no filtering by user or session when the crew retrieves memory elements. The inputs and results from all users appear to be mixed together.
Should we implement filtering by user_id or session_id to avoid conflicts between different users' contexts and information?
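To make the question concrete, this is the kind of isolation I'm considering: pointing CREWAI_STORAGE_DIR (the environment variable CrewAI reads to locate its memory databases) at a per-user path before building the crew. A sketch, assuming one crew per process, since the variable is process-wide (the project module name is hypothetical):

```python
import os

# Sketch: give each user their own storage directory so the short-term,
# long-term and entity memory files never overlap between users.
# Caveat (assumption): this only isolates cleanly if each process/instance
# serves one user at a time, because the env var applies to the whole process.
def build_crew_for_user(user_id: str):
    os.environ["CREWAI_STORAGE_DIR"] = f"/tmp/crewai_storage/{user_id}"
    from my_project.crew import DataCollectorCrew  # hypothetical module path
    return DataCollectorCrew().crew()
```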
Have you found any solution for user management with the short-term memory built into the CrewAI platform?
Or are you using a custom solution for the session-management logic, where on every crew call you retrieve each user's data from the database and pass it to the agent as an input?
I am trying to work on this too, so if you can share your research or ideas, it would be helpful.
Hi @zinyando,
How do you suggest implementing a multi-user environment? I am using FastAPI to reach the endpoint. From there, I run a flow which starts a new crew every time, but I am getting a data-mixing problem when multiple users call the endpoint.
Can you tell me a bit more about what you are trying to achieve, what you have so far, and the errors you are getting? Information like whether you are using HTTP requests or WebSockets, which memory providers, etc., will help.
Thanks @zinyando for the rapid answer. I will try to explain my situation.
A user can upload a file and get back some details. It works fine for a single user, but when multiple users call the API and upload different documents, a user can get back details from another user's document.
I am using asyncio to avoid the 504 HTTP error, and the answer is sent back through a WebSocket.
The kickoff method will run the flow:
dataCollector_flow = DataCollectorFlow()
result = dataCollector_flow.kickoff(inputs)
At some point the flow will call the Crew:
crew = DataCollectorCrew().crew()
result = await crew.kickoff_async(inputs=inputs)
In the Crew I tried disabling the memory, and also using an external memory with mem0 (with a different userId for each execution), but in both cases I get the same problem.
I am using PDFSearchTool. Could this be a problem related to the tool’s cache behaviour?
Looking at the logs, I notice some incorrect behaviour on the part of the tool.
For example, say we have three users (A, B and C) who each upload a different document (X, Y and Z respectively). After user A uploads document X, the PDFSearchTool returns the content of document X for all subsequent users, ignoring the documents they uploaded.
I also tried disabling the cache on the Crew and on the Agent, but it didn't work.
I don't know how users will use your app, but if processing is taking too much time, have you tried running the uploads as a background job instead? This allows the request to complete (i.e. no more 504s), and you can send the results back to the user when the processing is finished.
You could use a WebSocket for that if you want; you just need to make sure you send the right results to the right connection. This is not really a CrewAI problem but more of a software engineering/architecture problem.
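For example, with FastAPI you could accept the upload, kick the processing off in the background, and push the result over the user's WebSocket connection when it finishes (a sketch; process_document and the connection registry are placeholders for your own logic):

```python
import asyncio
from fastapi import FastAPI, UploadFile, WebSocket

app = FastAPI()
connections: dict[str, WebSocket] = {}  # user_id -> open socket

@app.websocket("/ws/{user_id}")
async def ws_endpoint(websocket: WebSocket, user_id: str):
    await websocket.accept()
    connections[user_id] = websocket
    try:
        while True:
            await websocket.receive_text()  # keep the connection alive
    finally:
        connections.pop(user_id, None)

@app.post("/upload/{user_id}")
async def upload(user_id: str, file: UploadFile):
    data = await file.read()
    # Run the long crew execution without blocking the request (no more 504s)
    asyncio.create_task(process_and_notify(user_id, data))
    return {"status": "processing"}

async def process_and_notify(user_id: str, data: bytes):
    result = await process_document(user_id, data)  # placeholder for your crew call
    ws = connections.get(user_id)
    if ws:  # send the result back to the right user's connection
        await ws.send_json({"result": result})
```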
What you described is exactly how it works, as it currently stands.
In the CrewAI logs, I can see that after the first document, the response from the PDFSearchTool always remains the same, even for different documents, as if the tool were retrieving data from a cache. So I don't know if I'm using the tool incorrectly.
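For now I'm planning to test whether creating a fresh tool per request, with its own vector collection, avoids the stale results. A sketch; the vectordb keys follow embedchain's config schema (which PDFSearchTool uses under the hood), so treat the exact structure as an assumption on my part:

```python
from crewai_tools import PDFSearchTool

# Sketch: one tool instance per request, scoped to that user's document and
# its own Chroma collection, so searches can't hit another user's embeddings.
# The vectordb config keys follow embedchain's schema (assumption).
def make_pdf_tool(user_id: str, pdf_path: str) -> PDFSearchTool:
    return PDFSearchTool(
        pdf=pdf_path,
        config={
            "vectordb": {
                "provider": "chroma",
                "config": {
                    "collection_name": f"pdf_{user_id}",
                    "dir": f"/tmp/embeddings/{user_id}",
                },
            }
        },
    )
```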