GraphRAG Live demo. Real Time Context Generation

Hi People,
For the past two weeks I’ve been learning about graph databases and how to use them. I’ve created a YouTube video (link below) where I give an overview of what I have learned.
In the video I also demonstrate a GraphRAG set-up.

I explain how you can create the same set-up for ‘FREE’.
I recommend that you watch at least the part of the video where I show how to set this up.
This is NOT a presentation video, it’s very informal, a ‘chat with the lads’ type thing!
IMPORTANT
The link below is a new, updated version as of 07/10/2024. New content from 38 min 44 seconds onwards. It also covers the NEW codebase.
The YouTube video (change the YT setting → Quality to 720)

The links as mentioned in the vid:
Neo4j: the GraphDB that we will use
Neo4j Desktop (Linux; Windows; Mac)

The Github repo for the code

The purpose of this post is to stimulate discussion on the subject of GraphRAG.

All comments, good or bad, are welcome :grin:

I am getting requests for help with setting up. Please watch the ‘how to set up’ and ‘how to install’ sections of the video first (some have not). If the vid does not help, then come back to me.

Review:
Since I uploaded the vid, which was a demonstration of a concept using a GraphDB, I’ve looked at the code in more detail. The following are my initial observations:

  1. I captured the context, which can be found in the repo under Misc -> context.txt. Looking at the context, it appears that the code has issues with amalgamating the full-text search results with those from the vector search. Note the ‘Structured’ & ‘Unstructured’ markers.
  2. I demonstrate how to set up on Windows 11, and I know that the procedure is exactly the same on my Linux box. Those with Macs are suggesting that locating the neo4j HOME DIR is not the same, and as a result they are having issues.
  3. MAC USERS: A fellow member had issues with getting neo4j installed on his Mac. In the .env file, change bolt://localhost:7687 → bolt://127.0.0.1:7687.
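For anyone who would rather not edit the .env by hand, the swap in observation 3 can be sketched in a few lines of Python. This is only an illustration: the helper name `use_loopback_ip` is made up here and is not part of the repo, and it just rewrites a URI string rather than touching any file.

```python
from urllib.parse import urlsplit, urlunsplit


def use_loopback_ip(uri: str) -> str:
    """Swap a 'localhost' host for 127.0.0.1, keeping scheme, port and credentials.

    Hypothetical helper for illustration only; not taken from the repo.
    """
    parts = urlsplit(uri)
    if parts.hostname != "localhost":
        return uri  # nothing to change
    # netloc may look like 'user:pass@localhost:7687'; only the host part is replaced
    userinfo, at, hostport = parts.netloc.rpartition("@")
    hostport = "127.0.0.1" + hostport[len("localhost"):]
    return urlunsplit(parts._replace(netloc=userinfo + at + hostport))


print(use_loopback_ip("bolt://localhost:7687"))
```

A URI that already points at an IP address or another host is returned unchanged.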

1: I am taking a look at this today.
2: If anyone who is Mac based can give more details or lend a hand, I’m sure that it would be appreciated by other Mac users.
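On point 1, a rough sketch of one way the two result sets could be amalgamated into a single context. This is not the repo's code: the function name, the exact marker strings (`Structured:` / `Unstructured:`) and the choice to de-duplicate across both lists are all assumptions made for illustration.

```python
def build_context(structured: list[str], unstructured: list[str]) -> str:
    """Join graph/full-text hits and vector hits under explicit markers.

    Blank entries are dropped, and text that appears in both lists is kept
    only once (the structured copy wins). Marker wording is an assumption.
    """
    seen: set[str] = set()

    def clean(items: list[str]) -> list[str]:
        kept = []
        for item in items:
            text = item.strip()
            if text and text not in seen:
                seen.add(text)
                kept.append(text)
        return kept

    structured_clean = clean(structured)
    unstructured_clean = clean(unstructured)

    sections = []
    if structured_clean:
        sections.append("Structured:\n" + "\n".join(structured_clean))
    if unstructured_clean:
        sections.append("Unstructured:\n" + "\n".join(unstructured_clean))
    return "\n\n".join(sections)
```

Keeping the markers in the final context makes it easy to see, when reading a captured context file, which retrieval path each line came from.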

During my research I have seen some concern raised over the number of OpenAI calls made when building knowledge graphs. Below is my OpenAI usage for this month. **REM While debugging/developing I run my code maybe 20-30 times a day!

After having done some major re-writing of the code and fixing many issues, I have decided to redo the video based on the new code base. I will keep the sections on how to set up, the basics of what a GraphDB is, and simple usage. The code component of the video will be replaced within the next couple of days.

Code base repo now updated

Next Steps
After some thought I have decided to update the code to use local ollama based models for both embeddings & other semantic processes.
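As a rough idea of what the embeddings side of that might look like, here is a sketch that talks to a locally running Ollama server using only the standard library. The assumptions: Ollama is listening on its default port 11434, the `/api/embeddings` endpoint accepts `model` and `prompt` and returns an `embedding` list (check the Ollama API docs for your version), and an embedding model such as `nomic-embed-text` has already been pulled. None of this is taken from the repo.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_embedding_request(text: str, model: str = "nomic-embed-text") -> urllib.request.Request:
    """Build (but do not send) the POST request for one embedding."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Send the request to the local server and return the embedding vector."""
    with urllib.request.urlopen(build_embedding_request(text, model), timeout=60) as resp:
        return json.load(resp)["embedding"]
```

Calling `embed("some chunk of text")` only works while the Ollama server is running; building the request separately makes it possible to inspect what would be sent without a server.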

For anyone who has watched the ‘new’ video and listened to my concerns about arbitrary chunk splitting, there may be a solution :grinning:: https://www.youtube.com/watch?v=tmiBae2goJM
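Separately from whatever the linked video proposes, one simple alternative to cutting text at a fixed character count is to only break on sentence boundaries. The sketch below is a minimal illustration of that idea, not the repo's splitter; the function name and the `max_chars` default are assumptions, and the regex is a naive sentence detector that will trip over abbreviations.

```python
import re


def chunk_by_sentence(text: str, max_chars: int = 500) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars is kept intact as its own chunk.
    """
    # Naive split: whitespace that follows '.', '!' or '?'
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)   # close the current chunk
            current = sentence       # start a new one with this sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "Neo4j stores nodes. Nodes have labels. Edges link nodes."
print(chunk_by_sentence(text, max_chars=40))
# ['Neo4j stores nodes. Nodes have labels.', 'Edges link nodes.']
```

No chunk ends mid-sentence, which is the main complaint about arbitrary splitting.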

Numpty Alert
I removed all the generated context and still got an answer from the LLM: not perfect, but an answer. Either the knowledge graph needs to be populated with ‘specialised custom’ data, or I remove the LLM from the final stage, improve the Cypher queries and give the output direct from the GraphDB. Investigating. :roll_eyes:
UPDATE: Take the output from the GraphDB and ask the LLM to summarise. See repo.
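To illustrate the ‘GraphDB output first, LLM only summarises’ idea, here is a minimal sketch of a prompt builder. It is not the code in the repo: the function name, the prompt wording and the row format (a list of dicts, one per query result) are assumptions. The point is that when the graph returns nothing, no prompt is built at all, so the LLM never gets the chance to answer from its own general knowledge.

```python
from typing import Optional


def build_summary_prompt(question: str, rows: list[dict]) -> Optional[str]:
    """Turn GraphDB result rows into a summarisation prompt.

    Returns None when there are no rows, so the caller can report
    'nothing found' instead of calling the LLM with an empty context.
    """
    if not rows:
        return None
    facts = "\n".join(
        "- " + ", ".join(f"{key}: {value}" for key, value in row.items())
        for row in rows
    )
    return (
        "Summarise the facts below to answer the question. "
        "Use only these facts; if they are insufficient, say so.\n\n"
        f"Question: {question}\n\nFacts:\n{facts}"
    )


rows = [{"person": "Ann", "company": "Acme"}]  # made-up example row
print(build_summary_prompt("Who works at Acme?", rows))
```

The returned string can then be passed to whichever LLM client is in use.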

Diagram from new video:

PLEASE NOTE:
This post/thread IS NOT a project, it’s an engineering workshop/learning type thing. It’s more of a reflection of my own learning curve in getting to understand and use CrewAI and associated technologies more efficiently. The repo is public, and if you have further interest in anything that I discuss: clone the repo, copy the code, ideas and concepts. I have no problem with any of this, I am an open book! I am also a NOOB with regards to Python, neo4j & CrewAI!
As and when I update the repo I will put a note here: local ollama branch hopefully sometime this week, etc. **REM my aim is solely to prove a concept while I’m learning.
Any lengthier discussions/comment about the concept I will confine to the repo.


Great, but what’s the direct link with CrewAI? Are you using CrewAI to create your graph?

Hi @Yannick
There’s none, other than it may be useful to have as a ‘tool’ for crews to access pre-loaded knowledge. I mention in the video that you could construct a GraphDB, populate it with ALL property sales/trends data, and then make that available via a tool to Tasks & Agents.
Purely conceptual at present.
As I write this I am updating the video (code part) to explain: A) the updated code base, B) new concepts of how such tech can be used.

I will be updating the link to a new video later today. I explain more in this video.
I’ll PM you when I have uploaded the new video.

Thanks for your interest :grinning:


Anyone who watched the video before now (15:20 on 07/10/2024) should watch the new content from 38:44 onwards:

PLEASE READ THE ORIGINAL POST ABOVE
Especially the final NOTE