Semantic Retrieval Augmented Generation
Generative AI powered by Knowledge Graphs
Consumers and businesses are all talking about Large Language Models (LLMs), and many companies want to use this promising technology to retain a competitive advantage. However, LLMs are not without flaws, as anyone who has witnessed a Generative AI “hallucination” can attest. Retrieval Augmented Generation (RAG) is a framework designed to make LLMs more reliable by incorporating relevant, up-to-date knowledge directly related to a user’s query.
The Semantic RAG design pattern enriches RAG with semantic search. It combines symbolic AI (knowledge graphs) and generative AI (LLMs) for better domain fidelity and fewer hallucinations.
What is Semantic Retrieval Augmented Generation?
Language models are trained on huge data sets, but they are by definition domain-agnostic and “frozen” at a specific training state. If you want to bring in your own domain knowledge or current data, you need to train or retrain a model yourself. Depending on the frequency and scope, this can be costly (100k +). These and other circumstances can lead to output with a hallucination level of up to 45%. There are many techniques to improve the performance of language models, but what if language models could access real facts and data outside of their training set without the need for extensive retraining?
This is where Retrieval Augmented Generation (RAG) comes in with a simple and captivating concept. Whenever the language model is called, it is provided with additional relevant context. This can be the latest news, research results, new statistics or even legacy content. With RAG, an LLM is able to retrieve “fresh” information in order to produce higher-quality answers with fewer hallucinations.
With our Semantic RAG, however, we go a few steps further to keep domain fidelity and response focus high and hallucinations low. Often referred to as “Advanced RAG” in the literature, our Semantic RAG is a cascade of the following context-infused methods and LLM calls.
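The basic RAG idea can be illustrated with a minimal sketch. It assumes a tiny in-memory document list and a simple word-overlap ranking; the names `DOCUMENTS`, `STOPWORDS`, `retrieve` and `build_prompt` are hypothetical and only show the principle. A production retriever would use a search index or a knowledge graph, and the resulting prompt would be sent to an LLM of your choice.

```python
from typing import List, Set

# Hypothetical in-memory knowledge base; a real system would query a search index.
DOCUMENTS = [
    "The 2024 travel policy caps hotel costs at 180 EUR per night.",
    "Expense reports must be submitted within 30 days of travel.",
    "The cafeteria is open from 8:00 to 15:00 on weekdays.",
]

# Very small, illustrative stopword list so common words do not drive the ranking.
STOPWORDS = {"the", "is", "in", "what", "of", "a", "an", "to", "at", "on", "be"}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text, strip punctuation and drop stopwords."""
    words = {word.strip(".,?!:;").lower() for word in text.split()}
    return words - STOPWORDS - {""}


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Return the documents that share the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query: str, context: List[str]) -> str:
    """Place the retrieved context in front of the user question."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}"
    )


question = "What is the hotel limit in the travel policy?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
print(prompt)
```

The model itself is never retrained in this sketch; only the prompt changes with each query, which is what allows "fresh" information to reach the model.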
Step 1: Smart Query Builder
An assistant is available as soon as the user starts formulating the search. Autocomplete and concept suggestions help to formulate the question in a targeted and domain-specific manner.
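As a rough illustration of concept suggestions, the following sketch matches what the user has typed against preferred and alternative labels of concepts. The `CONCEPTS` dictionary and the `suggest` function are assumptions made for this example and are not the PoolParty API; a real query builder would read labels from the knowledge graph.

```python
from typing import Dict, List

# Hypothetical concept labels: preferred label -> alternative labels (synonyms, acronyms).
CONCEPTS: Dict[str, List[str]] = {
    "Retrieval Augmented Generation": ["RAG"],
    "Large Language Model": ["LLM", "language model"],
    "Knowledge Graph": ["KG", "semantic graph"],
}


def suggest(prefix: str, concepts: Dict[str, List[str]] = CONCEPTS, limit: int = 5) -> List[str]:
    """Return preferred labels of concepts that have any label starting with the prefix."""
    needle = prefix.strip().lower()
    if not needle:
        return []
    matches = []
    for preferred, alternatives in concepts.items():
        labels = [preferred, *alternatives]
        if any(label.lower().startswith(needle) for label in labels):
            matches.append(preferred)
    return sorted(matches)[:limit]


print(suggest("rag"))
print(suggest("l"))
```

Because the acronym "RAG" resolves to its preferred label, the query carries the domain term rather than an ambiguous string.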
Step 2: Knowledge Retriever
The text mining implemented in the retriever extracts the semantic context of the query and provides the LLM with a list of directly identified and related concepts.
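A simplified sketch of this step is shown below. It assumes a small lookup table of surface forms and a toy graph of related concepts; `SURFACE_FORMS`, `RELATED` and `extract_context` are hypothetical names, and real text mining would rely on linguistic processing rather than plain pattern matching.

```python
import re
from typing import Dict, List

# Hypothetical mapping from words found in text to knowledge graph concepts.
SURFACE_FORMS: Dict[str, str] = {
    "hallucination": "Hallucination",
    "hallucinations": "Hallucination",
    "knowledge graph": "Knowledge Graph",
    "knowledge graphs": "Knowledge Graph",
    "llm": "Large Language Model",
    "llms": "Large Language Model",
}

# Hypothetical "related concept" edges of a toy knowledge graph.
RELATED: Dict[str, List[str]] = {
    "Hallucination": ["Large Language Model", "Fact Checking"],
    "Knowledge Graph": ["Taxonomy", "Ontology"],
    "Large Language Model": ["Prompt Engineering"],
}


def extract_context(query: str) -> Dict[str, List[str]]:
    """Find concepts mentioned in the query and collect their related concepts."""
    text = query.lower()
    identified: List[str] = []
    for form, concept in SURFACE_FORMS.items():
        # Word boundaries avoid matching a label inside a longer word.
        if re.search(rf"\b{re.escape(form)}\b", text) and concept not in identified:
            identified.append(concept)
    related: List[str] = []
    for concept in identified:
        for neighbour in RELATED.get(concept, []):
            if neighbour not in identified and neighbour not in related:
                related.append(neighbour)
    return {"identified": identified, "related": related}


print(extract_context("How do knowledge graphs reduce hallucinations?"))
```

The two lists correspond to the directly identified and the related concepts that are passed on to the LLM as context.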
Step 3: Conversational Generation
The LLM processes the user query and context from a knowledge graph to produce an answer that is enriched with background information as summarized text. Depending on what the user is looking for, the dialog can be deepened or generalized by means of prompting the machine with additional questions – in other words, by having a conversation with the LLM.
Step 4: Document Recommender
A recommendation algorithm identifies documents of the company knowledge base that best match the result of the human-machine dialog and returns them as a list of summaries. This does not require sharing the knowledge base with the LLM provider, ensuring that company data remains secure.
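One possible way to picture such a recommender is sketched here, under the assumption that documents carry concept tags and that the dialog has been reduced to a set of concepts. The Jaccard overlap used for ranking is a stand-in chosen for this example, not necessarily the algorithm used in the product, and `Document`, `KNOWLEDGE_BASE` and `recommend` are hypothetical names. The ranking runs locally, so no document content is sent to an LLM provider.

```python
from typing import List, NamedTuple, Set


class Document(NamedTuple):
    title: str
    summary: str
    concepts: Set[str]  # concept tags assigned to the document


# Hypothetical company knowledge base.
KNOWLEDGE_BASE = [
    Document("Graph basics", "Introduces taxonomies and graphs.",
             {"Knowledge Graph", "Hallucination", "Taxonomy"}),
    Document("LLM risks", "Lists typical failure modes.",
             {"Large Language Model", "Hallucination"}),
    Document("Travel policy", "Rules for business trips.",
             {"Travel Policy"}),
]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of concepts that two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(dialog_concepts: Set[str], documents: List[Document], top_k: int = 3) -> List[str]:
    """Rank documents by concept overlap with the dialog and return their summaries."""
    scored = [(jaccard(dialog_concepts, doc.concepts), doc) for doc in documents]
    ranked = sorted((item for item in scored if item[0] > 0),
                    key=lambda item: item[0], reverse=True)
    return [f"{doc.title}: {doc.summary}" for _, doc in ranked[:top_k]]


for line in recommend({"Hallucination", "Knowledge Graph"}, KNOWLEDGE_BASE):
    print(line)
```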
Step 5: Conclusion
Our Advanced RAG delivers relevant and actionable results at each step. Finally, a last LLM stage processes these results to produce an easily understood conclusion.
Example applications
GraphSearch Generative Experience
The best choice if you want to harness the developments of Generative AI for your company. This option provides all the benefits explained above including recommendations and conversational generation.
A completely new search experience for the workplace of the future using the Semantic RAG. Shorten the time to insight and provide domain-savvy querying even for the untrained.
PoolParty Meets ChatGPT
A simple, basic RAG that utilizes prompt engineering to focus results and our PoolParty concept tagging methods to provide contextual knowledge. Answers are enriched with the knowledge graph, but this option does not include recommendations of additional articles, further interaction with the chat, or a final conclusion.
Overcome the Limits of Large Language Models
The possibility of hallucinations with LLMs can never be ruled out, and some forms are easier for people to recognize than others. We can limit the frequency of their occurrence with our Semantic RAG methods.
TYPE OF HALLUCINATION: Nonsensical output. The LLM generates responses that lack logical coherence and comprehensibility.
PROBLEM WITH CONVENTIONAL LLM: LLMs sometimes have problems understanding context. They may not be able to distinguish between different meanings of a word and may use it in the wrong context. The more ambiguous a query, the higher the probability of leading the LLM down the wrong path.
MITIGATION WITH SEMANTIC RAG: The Smart Query Builder injects the semantics of a word when the query is formulated and thus determines its meaning unambiguously for the LLM.
TYPE OF HALLUCINATION: Factual contradiction. The LLM generates fictional and misleading content that is nevertheless presented as coherent despite its inaccuracy.
PROBLEM WITH CONVENTIONAL LLM: The data on which the LLM was originally trained is not current enough or not relevant enough in context to answer the question posed. The LLM begins to fill in the data gaps with hallucinations.
MITIGATION WITH SEMANTIC RAG: The contextual and domain-specific knowledge provided in the Semantic RAG fills in data gaps and leads the LLM to meaningful answers.
TYPE OF HALLUCINATION: Prompt contradiction. The LLM generates a response that contradicts the prompt used to generate it, raising concerns about reliability and adherence to the intended meaning or context.
PROBLEM WITH CONVENTIONAL LLM: LLMs follow rules, policies and strategies set by their parent company. These prevent the LLM from distributing unwanted content, even if it is contained in the training data. If the LLM detects a violation of these rules, its response may be decoupled from the request.
MITIGATION WITH SEMANTIC RAG: The Smart Query Builder guides the formulation of the prompt and can take the rules of the LLM into account in advance. Of course, changing the LLM provider or fine-tuning can also shift the rules.
Unlocking the Business Potential of Large Language Models
Across all industries, there is a consensus that the use of LLMs can increase productivity in almost all areas of a company. According to a study by Deloitte, 82% of managers believe that AI will improve the performance of their employees. Gartner predicts that companies will save at least 20% by using Generative AI in the coming years.
Shorten the Time to Insight
According to IDC, a knowledge worker spends around 30% of their working day searching for information – primarily reviewing search results and processing them.
Our combination of AI assistants ensures that usable knowledge is available as soon as the query is entered. Instead of long lists of documents to be processed, our AI solution delivers summarized facts. We are therefore talking about an increase in efficiency of 15-20% through the use of Semantic RAG in everyday operations.
Savvy Querying for the Untrained
That’s a dilemma! Companies want to familiarize their employees with a topic quickly. However, in order to make successful search queries, domain knowledge (jargon and terminology) of a subject area is required.
Our AI-guided search assistant now helps inexperienced users to formulate search queries correctly. The LLM-supported human-machine dialog picks up employees at their current level of knowledge and enables on-the-fly exploration and learning.
And faster onboarding to topics and jobs means faster and greater productivity for your employees.
Low Cost of Implementation and Maintenance
It is common knowledge that even the best pretrained LLMs might not always meet your specific needs. To adapt a model to your requirements in terms of expertise, vocabulary and timeliness, you need to optimize it. There are currently four known optimization methods: full fine-tuning, Parameter-Efficient Fine-Tuning (PEFT), prompt engineering and RAG.
Fine-tuning, even parameter-efficient fine-tuning, requires a significant amount of computing power, time and ML expertise, which you need to invest regularly to integrate new relevant data into the model. That’s why we rely on a combination of prompt engineering and RAG. Both are independent of costly LLM customization, and the ongoing effort is limited to the cost-saving maintenance of the knowledge base and graphs.
With our Semantic RAG, we can cut the costs of LLM implementation and maintenance by 70%. This creates an ROI increase of 3x or more.
Want to see Semantic RAG in your own environment? Check out our Generative AI Starter Kit!
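One way to relate the two figures is a simple back-of-the-envelope calculation. It assumes, purely for illustration, that the business benefit stays the same while only the cost changes; the source does not state how the ROI figure was derived, and `benefit_per_cost_multiplier` is a hypothetical helper.

```python
def benefit_per_cost_multiplier(cost_reduction: float) -> float:
    """Factor by which benefit per unit of cost grows when cost drops and benefit is unchanged."""
    if not 0 <= cost_reduction < 1:
        raise ValueError("cost_reduction must be in the range [0, 1)")
    return 1 / (1 - cost_reduction)


# Assumption: the same benefit is achieved at 30% of the original cost.
print(round(benefit_per_cost_multiplier(0.70), 2))
```

Under this assumption, a 70% cost reduction means the same benefit is obtained for 30% of the spend, which corresponds to a factor of roughly 3.3.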
Useful Resources
Webinar: Responsible AI based on LLMs
Webinar: Intelligent Content Services based on Knowledge Graphs & Large Language Models
Helmut Nagy (Semantic Web Company) and Eric Tengstrand (Etteplan) show the benefits of combining knowledge graphs and intelligent structured content to create better results with generative technologies.
Guidebook: Generative AI You Can Rely On
Reference Architecture of PoolParty Semantic Retrieval Augmented Generation
Often referred to as “Advanced RAG” in the literature, our Semantic RAG is a cascade of context-infused methods and LLM calls.
By 2027, more than 40% of digital workplace operational activities will be performed using management tools that are enhanced by GenAI, dramatically reducing the labor required.
Predicts for Generative AI, Cameron Haight, Chris Matchett 2024