How to build a Knowledge Graph?
Five Steps to Building an Enterprise Knowledge Graph
Knowledge graphs are at the core of many of the tools that we use in our daily lives, such as voice assistants (Alexa, Siri or Google Assistant), intuitive search applications and even online store recommenders. Not only internet giants but also companies from other industries such as BBC, Capital One, Electronic Arts or AstraZeneca have already integrated the technology and are using knowledge graphs to harness the power of all of the data they have accumulated over the years.
The technology’s central promise is that it can harmonize and link structured and unstructured data, resulting in higher data quality that is ideal for machine learning. Knowledge graphs are, so to speak, the ultimate linking engine for the management of enterprise data and a driver for new approaches in artificial intelligence, which is expected to create trillions of dollars in value throughout the economy.
This article will show you the essential steps to building a knowledge graph. By following them, you will enable your company to join the global tech giants and benefit from precise search and analytics, semantic data catalogs, deep text analytics, agile data integration and other applications.
The tools and data you will add to your information management practices by building your knowledge graph, such as semantic metadata enrichment, taxonomies and ontologies, will also serve as the perfect foundation for many AI applications.
A semantic knowledge graph can be used to power data management tasks such as data integration in helping automate a lot of redundant and recurring activities.
Gartner, Inc: ‘Augmented Data Catalogs: Now an Enterprise Must-Have for Data and Analytics Leaders’, Ehtisham Zaidi and Guido De Simoni, September 2019
What exactly is a Knowledge Graph?
A knowledge graph is a model of a knowledge domain created by subject-matter experts with the help of intelligent machine learning algorithms. It provides a structure and common interface for all of your data and enables the creation of smart multilateral relations throughout your databases.
Structured as an additional virtual data layer, the knowledge graph lies on top of your existing databases or data sets to link all your data together at scale – be it structured or unstructured. The fluidity of the structure also allows for your knowledge graph to grow organically each time new data is introduced. The more relations created, the more context your data has – allowing you to get a bigger picture of the whole situation and helping you to make informed decisions with connections you may have never found.
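To make the idea of a linking layer more concrete, here is a minimal sketch in plain Python. It assumes a graph can be reduced to subject-predicate-object triples; the `TinyGraph` class and all identifiers such as `customer:42` are hypothetical and only illustrate how facts from structured and unstructured sources can sit side by side and grow over time. It is a toy, not a substitute for a real graph database.

```python
from collections import defaultdict


class TinyGraph:
    """A toy in-memory triple store (illustrative only)."""

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        """Record one subject-predicate-object fact."""
        triple = (subject, predicate, obj)
        self.triples.add(triple)
        self.by_subject[subject].add(triple)

    def neighbours(self, subject):
        """Return every (predicate, object) pair recorded for a subject."""
        return sorted((p, o) for _, p, o in self.by_subject.get(subject, ()))


graph = TinyGraph()
# Fact taken from a structured source (hypothetical CRM row).
graph.add("customer:42", "hasName", "Acme Corp")
# Fact extracted from an unstructured source (hypothetical email).
graph.add("customer:42", "mentionedIn", "email:2020-03-17")
# New data arrives later; the graph grows without changing a schema.
graph.add("email:2020-03-17", "aboutProduct", "product:widget")

context = graph.neighbours("customer:42")
```

Each added triple gives `customer:42` more context, which is the effect described above: the more relations, the fuller the picture.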
Because knowledge graphs can be understood by both humans and machines, they serve as the perfect foundation for artificial intelligence, or Semantic AI, as the fusion between machine learning and knowledge graphs is often called. This approach allows organizations to develop optimized solutions to achieve their business objectives, either through automation or through enhanced cognitive capabilities.
Five Steps To Building A Knowledge Graph
Start Small and Grow
Your efforts to implement these technologies will probably have to compete with other initiatives for resources and funds. Start by building a solid business case for knowledge graphs and semantic AI. To do that, select a small and concrete use case that shows the business value a knowledge graph can bring to your organization. This will help you gain support and buy-in.
Remember that effective business use cases are driven by strategic goals. Clearly define the business value of your use case by explaining how it makes processes or services more efficient and intelligent for the enterprise.
Some of the most relevant use cases for implementing knowledge graphs and AI are:
Get To Know Your Data
The next thing you need to do is gain a good overview of your data landscape. There are different approaches for inventorying and organizing enterprise data. To determine which types of content are relevant to your use case, consult with subject matter experts and analyze your data. If you are faced with a large number of items, there are automation tools that can help you.
Most companies work with large amounts of unstructured data, such as emails, reports, presentations and other text files. Taxonomies help to classify content and to organize your data and are the starting point for a data catalog! When based on machine-readable standards like SKOS, taxonomies also lay the foundation for even richer semantic models such as ontologies to automate data integration. A business taxonomy provides structure to otherwise unstructured information. It can help you capture, manage, and derive meaning from large amounts of data and content.
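As a rough illustration of how a taxonomy can give structure to unstructured text, the following sketch tags a document with concepts from a tiny hand-made taxonomy. The field names `prefLabel`, `altLabels` and `broader` are borrowed from SKOS terminology, but the dictionary layout, the concepts and the `tag_document` function are assumptions made for this example. Matching is limited to single-word labels; a real system would use a proper SKOS store and text-mining tools.

```python
import re

# Hypothetical mini-taxonomy using SKOS-like fields.
TAXONOMY = {
    "c1": {"prefLabel": "Finance", "altLabels": [], "broader": None},
    "c2": {"prefLabel": "Invoice", "altLabels": ["bill"], "broader": "c1"},
    "c3": {"prefLabel": "Contract", "altLabels": ["agreement"], "broader": None},
}


def tag_document(text, taxonomy=TAXONOMY):
    """Return preferred labels of matched concepts and of their broader concepts."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    matched = set()
    for concept_id, concept in taxonomy.items():
        labels = [concept["prefLabel"], *concept["altLabels"]]
        if any(label.lower() in words for label in labels):
            matched.add(concept_id)
            # Walk up the hierarchy so the document is also found via broader terms.
            parent = concept["broader"]
            while parent is not None:
                matched.add(parent)
                parent = taxonomy[parent]["broader"]
    return sorted(taxonomy[c]["prefLabel"] for c in matched)


tags = tag_document("Please find the attached bill for March.")
```

Because "bill" is an alternative label of "Invoice", and "Invoice" has "Finance" as its broader concept, the email is classified under both, even though neither word appears in the text.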
When selecting data for your prototype, make sure that it:
- contains both structured and unstructured data so you learn to work with both,
- is not too volatile so you do not have to deal with synchronization at the beginning,
- is not too big so you do not have to deal with performance at the beginning,
- can, once its sources are connected, do or show something that was not possible before.
Form a Working Team
A precise and detailed view of the roles involved, such as taxonomists, helps to define the skills and tasks needed to bridge the different mindsets of departments that focus on data-driven practices on the one hand and those that focus on documents and knowledge-based work on the other.
The team setup also has to address how subject matter experts with strong domain knowledge (and possibly little technical understanding) can work together with data engineers, who use strongly ontology-driven approaches to automate data processes as efficiently as possible.
It is also essential to involve business users and ‘citizen data scientists’ as early as possible, since users become an integral part of the continuous knowledge graph development process, nurturing the graph with change requests and suggestions for improvement.
Map Relationships Across Your Data
Once you have a well-defined prototype and know exactly what data you want to use, it is time for your team to start creating taxonomies and ontologies. But before you start, see what is already available. There are many well-developed taxonomies and ontologies out there for different domains, commercial and non-commercial. Do not start building something from scratch before evaluating if there is something out there you can reuse.
With the help of ontologies, connections between information and data from different sources can be created automatically. Ontologies enable you to map relationships between concepts in a single location at varying levels of detail. This sets the groundwork for intelligent AI capabilities, such as text mining and context-based recommendations. Here are some other things you can do with ontologies:
- Reuse hidden and unknown information
- Manage content more efficiently
- Optimize metadata management to improve search
- Create relationships between disparate and distributed data
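The following sketch shows one simple way such connections between sources might be created automatically. It assumes two hypothetical systems ("crm" and "support") whose field names are mapped to shared ontology properties (`name`, `email`), and it links records by matching email addresses. All names and data are invented for illustration; real ontology-based integration is considerably richer.

```python
# Hypothetical mapping: source-specific field names -> shared ontology properties.
FIELD_MAPPING = {
    "crm": {"cust_name": "name", "cust_mail": "email"},
    "support": {"requester": "name", "contact_email": "email"},
}


def to_ontology(source, record):
    """Rename a record's fields to the shared ontology properties."""
    mapping = FIELD_MAPPING[source]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def link_records(records_by_source):
    """Group harmonized records that share the same (lower-cased) email."""
    linked = {}
    for source, records in records_by_source.items():
        for record in records:
            harmonized = to_ontology(source, record)
            key = harmonized["email"].lower()
            linked.setdefault(key, []).append((source, harmonized["name"]))
    return linked


data = {
    "crm": [{"cust_name": "Ada Lovelace", "cust_mail": "ada@example.com"}],
    "support": [{"requester": "A. Lovelace", "contact_email": "ADA@example.com", "ticket": 7}],
}
links = link_records(data)
```

Neither source system had to change its own data model; only the mapping to the shared vocabulary was added.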
Taxonomies and ontologies are a powerful method to map the actual business logic to all existing data models without having to significantly change the existing data landscape. This allows you to link your domain knowledge with your data in an agile way and analyze it as a whole. Ontologies also support the ongoing development of the knowledge graph, as they can be used to perform automatic data quality and consistency checks.
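An automatic consistency check of this kind could look roughly like the sketch below. It assumes a hypothetical ontology in which each property declares the expected class of its subject (domain) and object (range), and it treats those declarations as validation constraints, which is a simplification of how formal ontology languages define them. The properties, entities and the `check_triples` function are invented for this example.

```python
# Hypothetical ontology: expected subject class (domain) and object class (range).
ONTOLOGY = {
    "worksFor": {"domain": "Person", "range": "Organization"},
    "authorOf": {"domain": "Person", "range": "Document"},
}

# Hypothetical instance data: entity -> class.
TYPES = {
    "alice": "Person",
    "acme": "Organization",
    "report-1": "Document",
}


def check_triples(triples, ontology=ONTOLOGY, types=TYPES):
    """Return human-readable violations of the domain/range declarations."""
    problems = []
    for subject, predicate, obj in triples:
        rule = ontology.get(predicate)
        if rule is None:
            problems.append(f"unknown property: {predicate}")
            continue
        if types.get(subject) != rule["domain"]:
            problems.append(f"{subject} is not a {rule['domain']} (domain of {predicate})")
        if types.get(obj) != rule["range"]:
            problems.append(f"{obj} is not a {rule['range']} (range of {predicate})")
    return problems


violations = check_triples([
    ("alice", "worksFor", "acme"),     # consistent with the ontology
    ("acme", "authorOf", "report-1"),  # subject has the wrong class
    ("alice", "manages", "acme"),      # property is not defined
])
```

Running such checks whenever new data is linked helps catch modelling errors early, before they spread through the graph.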
Most likely you will be successful with your first pilot application built on graphs. You have excited several stakeholders in your company, and even non-technical people have quickly grasped the beauty of graph technologies. People from other departments start asking what is in it for them.
Now you are in a critical phase, as you may want to try to make the big change and plan it for the next 20 years. Don’t do that! Experiment in order to make valid decisions based on experience. Learn that experiments are not bad things or even a sign of immaturity, but rather the only chance to learn, to become better, to improve continuously and to develop skills. Agile is everywhere these days.
We know that, but we also need agile access to data to make better use of it. So we need agile data management. A knowledge graph project must always be an agile data management project. Knowledge is a living thing that is constantly changing.
“Change is the only constant in life.” (Heraclitus of Ephesus)
Four Examples of Knowledge Graph Implementations
One of the top 20 companies in the pharmaceutical industry uses the extensive capabilities of Enterprise Knowledge Graphs to provide a unified view of all their research activities.
IT & IT Services
A large IT services enterprise uses Enterprise Knowledge Graphs to link all of its unstructured (legal) documents to its structured data, which helps the enterprise automatically evaluate risks that are often hidden in common legal documents.
A large governmental organization provides trusted health information to its citizens by using several industry-standard knowledge graphs (such as MeSH and DBpedia). The governmental health platform links more than 200 trusted medical information sources that help to enrich search results and provide accurate answers.
Do you want more?
If what you need is a simple guide that makes building knowledge graphs as easy as cooking your favorite dish, watch Andreas Blumauer, CEO and Founder of Semantic Web Company, at the Book Launch Webinar, which took place on Wednesday, April 22, 2020.