Graph Grounding of LLMs
Three steps to boost your large language model with your organization's structured and unstructured data
Grounding LLMs is the approach of incorporating specific data or context into language models to provide more accurate and domain-specific answers. While plain language models are trained on large amounts of general textual data, they may lack the ability to generate accurate answers based on specific knowledge sources. By grounding language generation with a knowledge graph, we can build hybrid systems that use trusted, traceable data while leveraging an LLM’s strong abilities in formulating text.
Graph Grounding – Pairing two Technologies for a Perfect Fit
Large Language Models (LLMs) and knowledge graphs complement each other perfectly in their strengths and weaknesses. Where LLMs lack a contextual basis through implicit knowledge, knowledge graphs offer a structured truth. And where knowledge graphs lack readability and comprehensibility, LLMs can help with their linguistic interpretation of facts.
PoolParty Knowledge Graph augments your LLM with your organization's structured and unstructured data so that the LLM can answer internal, organization-specific questions. You can effectively ground a commercial LLM, an open-source LLM or your own LLM. The results are more relevant, accurate and trustworthy, and they are enriched with links to the source documents, enabling further knowledge discovery.
Three Steps to Your Graph-Grounded LLM
Our years of experience in creating enterprise-wide knowledge graphs, combined with a premium-class semantic suite, enable us to quickly and efficiently ground LLMs in your organization's structured and unstructured data.
Step 1
Model Creation on the Fly
The starting point is the domain model, which contains key entities, concepts, relations and hierarchies relevant to your industry, sector, product and business activities.
What used to be an extensive process in terms of time and effort is now done in no time at all with the help of the PoolParty Semantic Suite. Labour-saving methods for modelling your company's knowledge range from buying ready-made taxonomies, through adapting free industry models, to harvesting concepts from Wikipedia. Finally, GenAI has brought yet another quantum leap in time and effort savings. We can now truly speak of model creation on the fly.
Step 2
Quick-Start Knowledge Graph
The PoolParty Semantic Suite can now create a knowledge graph from your data and documents without further interaction. Based on the domain model, the knowledge graph maps relationships and links between your data objects in a machine-readable way, regardless of whether they are structured or unstructured.
This fluid structure allows your knowledge graph to grow organically as new data is added, without further human intervention. And the more relationships are created and kept up to date, the more context your data has and the broader the grounding base for your LLM.
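The idea of mapping relationships between data objects in a machine-readable way can be sketched as a simple entity-linking step over documents. This is a minimal illustration only; the document names, concept labels and the "mentions" predicate are invented, and a real PoolParty pipeline would use its own extraction and linking services.

```python
# Invented sample corpus and a tiny domain model (label -> concept ID).
documents = {
    "report_2024.pdf": "Quarterly maintenance report for machine XYZ ...",
    "specs_abc.docx": "Technical specification of machine ABC ...",
}
domain_model = {"machine XYZ": "machine-XYZ", "machine ABC": "machine-ABC"}

def link_documents(docs: dict, model: dict) -> list[tuple[str, str, str]]:
    """Create machine-readable 'mentions' triples wherever a modelled
    concept label appears in a document's text."""
    triples = []
    for doc, text in docs.items():
        for label, concept in model.items():
            if label.lower() in text.lower():
                triples.append((doc, "mentions", concept))
    return triples

print(link_documents(documents, domain_model))
```

Because the linking is driven by the domain model, newly added documents are connected to the graph automatically, which is what lets the graph grow without human intervention.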
Step 3
Graph-based Grounding of your LLM
LLMs do what they were built for: Find coherent formulations for statistical correlations. But they lack the contextual grounding to shine. It is the explicit, structured and semantically defined data that knowledge graphs can contribute to fully realize the potential of LLMs.
With the PoolParty Semantic Suite it is easy to ground an LLM in context and facts. The PoolParty Knowledge Graph (KG) provides RAG architectures, prompt solutions and other guardrail solutions with the necessary contextual knowledge. In addition, the interlinking of KG and LLM offers added value where other grounding solutions fall short: traceability, fact fidelity and adaptability, as well as the capability for further post-processing.
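A graph-grounded prompt can be sketched as follows: retrieve the subgraph around the entity in question and inject the explicit triples into the LLM prompt as trusted context. Everything here is illustrative; the triples, entity names and prompt template are invented, and a real deployment would query the knowledge graph via its API rather than a Python list.

```python
# Invented mini knowledge graph as (subject, predicate, object) triples.
KG = [
    ("MachineXYZ", "hasPart", "FilterUnit-A"),
    ("FilterUnit-A", "maintainedBy", "Service Team North"),
    ("MachineXYZ", "documentedIn", "manual_xyz.pdf"),
]

def retrieve_subgraph(entity: str, hops: int = 2) -> list[tuple[str, str, str]]:
    """Collect all triples within `hops` steps of the entity."""
    seen = {entity}
    triples: list[tuple[str, str, str]] = []
    for _ in range(hops):
        new = [t for t in KG if (t[0] in seen or t[2] in seen) and t not in triples]
        triples += new
        for s, _, o in new:
            seen |= {s, o}
    return triples

def build_grounded_prompt(question: str, entity: str) -> str:
    """Embed explicit graph facts into the prompt as trusted, traceable context."""
    facts = "\n".join(f"- {s} {p} {o}" for s, p, o in retrieve_subgraph(entity))
    return (
        "Answer using ONLY the facts below and cite the source document.\n"
        f"Facts:\n{facts}\n\nQuestion: {question}"
    )

prompt = build_grounded_prompt("Who maintains the filter unit of machine XYZ?", "MachineXYZ")
print(prompt)
```

Because each injected fact is an explicit triple, the answer remains traceable back to the graph and its source documents, which is the added value the text describes.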
Should you use Graphs or Vector Databases to ground your corporate LLM?
Retrieval-augmented generation (RAG) is the leading approach to overcoming the known deficits of LLMs, and knowledge graphs and vector databases are the two most discussed and practiced solutions for it. But which of the two provides more accurate, reliable and explainable grounding for your LLM, and at what cost and effort?
Structured and Unstructured Data
Queries supported by vector databases make it possible to retrieve relevant parts of unstructured data based on a posed query, but applying the same process to structured data is difficult due to the loss of contextual meaning and correlation between instance data and schema. Knowledge graphs are capable of ingesting both structured and unstructured information and are able to maintain the semantic understanding.
Example: You want your RAG system to answer questions like: “How many spare parts for machine XYZ are in stock today?” Only Graph RAG can do this, because it combines your company’s structured and unstructured data silos.
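This kind of question can be sketched as a join between structured inventory data and graph relations. All data, identifiers and the "hasSparePart" predicate below are invented for illustration; in practice the stock levels would come from an ERP system and the part-to-machine relations from the knowledge graph.

```python
# Structured data: current stock level per part (e.g. an ERP export).
stock = {"part-001": 4, "part-002": 0, "part-003": 7}

# Graph relations: which spare parts belong to which machine
# (e.g. extracted from maintenance manuals and linked to the domain model).
kg = [
    ("machine-XYZ", "hasSparePart", "part-001"),
    ("machine-XYZ", "hasSparePart", "part-003"),
    ("machine-ABC", "hasSparePart", "part-002"),
]

def spare_parts_in_stock(machine: str) -> int:
    """Join graph relations with structured stock data for one machine."""
    parts = [o for s, p, o in kg if s == machine and p == "hasSparePart"]
    return sum(stock.get(part, 0) for part in parts)

print(spare_parts_in_stock("machine-XYZ"))  # 4 + 7 = 11
```

A pure vector search over document chunks has no natural way to perform this join, because the stock table and the part relations live in different sources that only the graph connects.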
The Truth between the Chunks
Where a graph provides the appropriate facts and connections to the LLM, a vector database can only cut suitable chunks of text out of a given text block. This limits the completeness and complexity of the LLM’s output.
Example: Simple questions such as “Which flowers bloom in spring?” are easy for both Graph and Vector RAG. Combinatorial questions such as “Which flowers that bloom in spring need a lot of sun?” show that Vector RAG struggles to combine the two parts of the question in vector space, whereas Graph RAG can simply link them.
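In a graph, both constraints of the combinatorial question are explicit edges, so answering it reduces to intersecting two sets of subjects. The flower data and predicate names below are invented for illustration.

```python
# Invented graph: (subject, predicate) -> object.
kg = {
    ("Tulip", "bloomsIn"): "spring",
    ("Crocus", "bloomsIn"): "spring",
    ("Aster", "bloomsIn"): "autumn",
    ("Tulip", "needsSun"): "full",
    ("Crocus", "needsSun"): "partial",
    ("Aster", "needsSun"): "full",
}

def flowers_matching(season: str, sun: str) -> set[str]:
    """Answer 'which flowers bloom in <season> AND need <sun> sun' by
    intersecting the subjects of the two explicit relations."""
    blooming = {s for (s, p), o in kg.items() if p == "bloomsIn" and o == season}
    sunny = {s for (s, p), o in kg.items() if p == "needsSun" and o == sun}
    return blooming & sunny

print(flowers_matching("spring", "full"))  # {'Tulip'}
```

A vector retriever would instead have to hope that some single chunk happens to mention both properties together; the graph simply links them.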
Persistent in context and connections
When human-readable text is deconstructed into machine-readable vectors, the general scope and the connecting context are lost. Even if one tries to mitigate this loss with methods such as contextual and hierarchical chunking and embed the chunks back into the context from which they originate, these complex procedures are no substitute for the explicitly available relations of a knowledge graph.
Example: There are cases where the exact wording matters. In HR, for example, it makes a difference whether you are talking about any manager of a department or about one exact position. Such distinctions are blurred in the vector but remain explicit in the knowledge graph.
Second-layer information included
One of the biggest advantages of graph grounding is that it can provide knowledge that cannot be represented in the vector. A knowledge graph always holds additional definitional knowledge about the concepts representing a text. Thus, concept definitions, alternative labels and synonyms can enrich the text, where simple vectors only represent chunks of text provided for vectorisation.
Example: Grounding an LLM on an ambiguous term, where both the training data and your knowledge base remain ambiguous, is where knowledge graphs shine: they have the definitions at hand to tell the LLM which meaning to focus on.
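Disambiguation with this second-layer knowledge can be sketched as matching the query context against SKOS-style concept metadata (preferred labels, alternative labels, definitions). The concept scheme, labels and the simple word-overlap scoring are all invented for this example; a production system would use the actual concept scheme and a proper similarity measure.

```python
# Invented SKOS-style concept scheme for the ambiguous term "jaguar".
concepts = {
    "jaguar-animal": {
        "prefLabel": "Jaguar (animal)",
        "altLabels": {"jaguar", "panthera onca", "big cat"},
        "definition": "A large cat species native to the Americas.",
    },
    "jaguar-car": {
        "prefLabel": "Jaguar (car brand)",
        "altLabels": {"jaguar", "jaguar cars", "jlr"},
        "definition": "A British manufacturer of luxury vehicles.",
    },
}

def disambiguate(term: str, context: str) -> str:
    """Pick the concept whose definition shares the most words with the
    query context; its definition can then be injected into the prompt."""
    ctx = set(context.lower().split())

    def score(concept: dict) -> int:
        return len(ctx & set(concept["definition"].lower().split()))

    candidates = [c for c in concepts.values() if term.lower() in c["altLabels"]]
    return max(candidates, key=score)["prefLabel"]

print(disambiguate("jaguar", "spotted a jaguar in the rainforest, a wild cat"))
```

The selected concept's definition is then handed to the LLM alongside the retrieved text, telling it explicitly which meaning to focus on.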
As we know from practice, Graph RAGs produce excellent results in most cases. However, we are not blind to the additional possibilities offered by vector RAGs. When it comes to providing non-explicitly modeled facts to the LLM to complete the grounding, the inclusion of vector data in the LLM input can be beneficial.
How your grounding strategy becomes a success
Do your GenAI tools or your grounding approaches fit your respective use case? This is ultimately the question that determines the success or failure of your GenAI strategy.
Built on top of your proven Knowledge Management
Companies want (or need) to offer their employees the amazing new ways provided by GenAI to interact with corporate knowledge. GenAI promises intuitive, conversational interactions, moving away from sifting through long lists of documents to a more engaging, real-time dialogue tailored to the user’s understanding.
The good news is that this does not require a completely new approach to technology and data. Your company’s knowledge, stored in a wide variety of systems, is “rewired” in a knowledge graph and thus made available for the GenAI. With Graph Grounding, the data remains where it has always been and the semantic layer ensures that GenAI can work with all available data. GenAI can thus become a comprehensive new power tool for your company.
Look at the total cost of investment
What initially looks like a turnkey solution often turns out to be unsuitable for your own requirements. In recent months, providers of LLM solutions have appeared on the market at breakneck speed, offering RAG solutions based on simple brute-force approaches. Simple and cheap, but also full of risks: quite a few operators of such RAG offerings are confronted with consequential harms, complaints and claims for damages.
At the end of the day, it is the total cost of implementation that counts, and the investment in Graph Grounding pays for itself many times over through reduced costs for operational risks and mitigated liability issues.
Keep track of what your GenAI talks about
One of the main criticisms of using LLMs professionally is that they cannot attribute the sources used to generate a particular output. This can lead to a loss of credibility or even to licensing problems.
Wherever a monetary value is attached to a fact the LLM outputs, the explicitly stated and traceable facts provided by Graph Grounding are superior to all other approaches. In other words: it is better to be able to trace what your LLM is talking about when the facts you provide carry a price tag.
Useful Resources
Optimizing LLMs with RAG: Key Technologies and Best Practices
Watch KMWorld’s webinar on how to optimize the LLM experience with the help of Retrieval Augmented Generation (RAG).
Document Object Model Graph RAG
Learn more about Document Object Model (DOM) Graph RAG which helps ground LLMs to build Conversational AI applications in this white paper.
Get the Starter Kit
Want to see Graph Grounding in your own environment? Check out our Generative AI Starter Kit!