What is Intelligent Content?
The Power of Semantic AI in Structured Content Authoring
Many are aware of the content lifecycle, the ebb and flow of how a piece of content is introduced, worked on, published, and finally “retired.” What most enterprises struggle with is finding the right tools to consolidate and manage this lifecycle. As companies accumulate more and more data and documents, a traditional content management system (CMS) becomes less and less capable of getting the job done.
Now, a growing number of industries are embracing intelligent content authoring to streamline their work and deliver more valuable content to external audiences. Intelligent content fuses technology with human capability and knowledge so that it is findable (via understanding human search intent), reusable and machine readable (via structured components and metadata).
Navigating the content lifecycle
Put simply, intelligent content is the practice of transforming unstructured content into material that is fit for machines so that it can be effectively used by humans later. Unstructured content refers to the plethora of presentations, reports, PDFs, videos, audio files, social media data, survey responses, webpages, and more that organizations generate and share on a daily basis.
As humans, we have the unique ability to navigate through PDFs, reports, emails, and other forms of information because we understand the meaning and context behind them. However, machines cannot discern this unless they are told how to with metadata. Additionally, though humans can read through content to understand it, the process of sifting through documents is typically not a productive one.
In a typical content lifecycle, content is planned, created, and published, either by making it public for an external audience or by distributing it to colleagues internally. The goal of a piece of content is to deliver something, so once it has been delivered, there is often no reason to touch it again. What was valuable in the moment is suddenly reduced to something that collects dust in an obscure folder and is never accessed again.
When content simply sits after it has been published for its original use, it becomes outdated, which is especially problematic if it was made available to a public audience. Consider a product leaflet that has been posted to a company website. It attracts valuable SEO traffic that directs prospects to the page, but it still shows the old company logo and old pricing information, and it makes the business case for trends that were on the market three years earlier. The prospect immediately sees that the leaflet has not been updated and bounces from the page.
Internally, you don’t want to base a time estimate for a project you are managing on the specifications of an old report. Perhaps personnel have changed, or the technology used to complete the previous project has evolved. Though this content was once valuable, the fact that it has not been maintained or updated means that it cannot be trusted for the tasks of today.
The often forgotten step in the content lifecycle is reusing and repurposing text.
When this step is not factored in, content not only becomes outdated; the lack of reusability also contributes to workplace inefficiency. Employees spend a lot of time planning and creating content only to use it once. That time would be better spent if the content could be used across various channels and formats and for different stakeholders. An additional side effect is the inability to find documents and data. It is very common for employees to grow frustrated looking for a specific document in a company CMS: they know it exists, but they don’t know what it’s called, where it’s located, or who created it. In this case, even the most important documents in a CMS are effectively junk if they are not labelled or sorted well. The employee ends up producing a new article from scratch, when the document they were looking for would have reduced their work significantly.
The solution to all these headaches is structured content, which involves breaking content down into manageable chunks (sections, headings, lists, images, and so on) that can be repurposed and reused across multiple channels.
Some of the major benefits of structured content include:
Increased Efficiency: Content reuse allows teams to quickly create new content without having to start from a blank slate. It reduces the amount of time spent researching and writing, ultimately freeing up resources for other tasks.
Consistency: Reused components keep messaging, tone, and style aligned across multiple channels.
Improved Quality: Reusing content can help to ensure that content is accurate and up-to-date.
Risk Management: As in the prospect example above, content that is maintained properly helps reduce the risk of low-quality content reaching people who expect more.
Adding Semantic AI to the mix
Semantic AI is the ultimate solution to overcome these challenges and harness the power of structured content creation.
But what’s the secret? Let’s explore the core principles of Semantic AI together.
Taxonomies are the first step in achieving structured knowledge and data management based on a semantic framework. Creating a taxonomy based on a standard like SKOS is a great way to make accumulated knowledge more accessible and reusable.
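To make the idea concrete, here is a minimal sketch of a SKOS-style taxonomy held in plain Python. It only mirrors three SKOS properties (`skos:prefLabel`, `skos:altLabel`, `skos:broader`); the `Concept` class, the `ex:` identifiers, and all labels are invented for illustration and are not taken from any particular product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A simplified stand-in for a skos:Concept."""
    uri: str
    pref_label: str                                       # skos:prefLabel
    alt_labels: list[str] = field(default_factory=list)   # skos:altLabel (synonyms)
    broader: Optional[str] = None                         # skos:broader (URI of the parent)


# Hypothetical concepts; URIs and labels are assumptions made for this example.
TAXONOMY = {
    c.uri: c
    for c in [
        Concept("ex:content", "Content"),
        Concept("ex:marketing-content", "Marketing content", broader="ex:content"),
        Concept("ex:leaflet", "Product leaflet", ["brochure", "flyer"],
                broader="ex:marketing-content"),
    ]
}


def ancestors(uri: str) -> list[str]:
    """Follow skos:broader links from a concept up to the top of the tree."""
    path = []
    parent = TAXONOMY[uri].broader
    while parent is not None:
        path.append(TAXONOMY[parent].pref_label)
        parent = TAXONOMY[parent].broader
    return path


print(ancestors("ex:leaflet"))
```

Because every concept carries its synonyms and a link to its parent, the same structure can later drive tagging, search, and recommendations without each system keeping its own vocabulary.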
Auto classification serves as an umbrella term for tagging – the practice of extracting and assigning descriptors to your data to create enriched “metadata” – and classification – the practice of categorizing and clustering content. The metadata and clusters help a company organize its CMS and remove the hassle of locating documents or creating new ones.
Along with extracting tags from documents and data, a successful auto classification strategy requires a strong taxonomy to facilitate the categorization of these tags. An entity extractor tool pulls keywords and labels from documents and matches them against the taxonomy. These tags can be automatically sorted into their corresponding classes and concept schemes in the taxonomy through predefined rules that have been set up in the thesaurus structure, or refined after manual review. The benefit of maintaining tags in a taxonomy is the consistency it provides through its hierarchical structure and controlled vocabularies.
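The extraction step can be sketched in a few lines. This is a deliberately naive illustration that matches taxonomy labels in text with regular expressions; real entity extractors typically add linguistic processing and disambiguation. The `LABELS` dictionary and the `extract_concepts` function are hypothetical names used only here.

```python
import re

# Hypothetical taxonomy excerpt: concept ID -> labels (preferred label first, then synonyms).
LABELS = {
    "ex:knowledge-model": ["knowledge model", "ontology"],
    "ex:esg-risk": ["ESG risk", "sustainability risk"],
    "ex:semantic-search": ["semantic search"],
}


def extract_concepts(text: str) -> dict[str, int]:
    """Return the concepts mentioned in the text with a simple mention count."""
    found = {}
    for concept_id, labels in LABELS.items():
        count = 0
        for label in labels:
            # Whole-phrase, case-insensitive match; the optional "s" covers simple plurals.
            pattern = r"\b" + re.escape(label) + r"s?\b"
            count += len(re.findall(pattern, text, flags=re.IGNORECASE))
        if count:
            found[concept_id] = count
    return found


doc = ("Our ontology helps analysts assess sustainability risks. "
       "The knowledge model is updated weekly.")
print(extract_concepts(doc))
```

Note that both "ontology" and "knowledge model" count toward the same concept, which is the consistency benefit of tagging against a controlled vocabulary rather than against raw keywords.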
Semantic concept tagging
In the world of semantics, metadata is derived from “semantic concepts.”
Companies often have their own way of tagging documents, but typically their methods are both manual and based on simple keywording. When a tagging strategy is driven by simple text-based tags, the search engine can only retrieve information based on the exact terminology. Therefore, every word that the user enters in a search field must be extremely precise and relevant.
Concept tags, however, bundle synonyms, multilingual labels, and the concept’s position in the taxonomy hierarchy to broaden the scope of what can be searched for. The advantage of semantic concept tagging is that users can enter imprecise language or multiple keywords, and the search engine can still retrieve the precise results that they want. Concept tagging can return results based on a much more diverse profile of attributes.
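The difference from exact keyword matching can be illustrated with a small sketch. Documents are tagged with concept IDs, and a query is resolved to a concept through any of its labels before the hierarchy is used to widen the result set. All concept IDs, labels, and file names below are invented for the example.

```python
# Hypothetical concepts with synonyms, a German label, and a parent concept.
CONCEPTS = {
    "ex:documentation": {"labels": ["documentation", "docs", "Dokumentation"],
                         "broader": None},
    "ex:quick-start": {"labels": ["quick-start guide", "getting started", "Schnellstart"],
                       "broader": "ex:documentation"},
    "ex:release-notes": {"labels": ["release notes", "changelog"],
                         "broader": "ex:documentation"},
}

# Documents carry concept IDs rather than free-text keywords.
DOCUMENTS = {
    "install.pdf": {"ex:quick-start"},
    "v2-changes.html": {"ex:release-notes"},
    "pricing.pdf": set(),
}


def resolve(query):
    """Map the user's wording to a concept ID via any of its labels."""
    q = query.strip().lower()
    for concept_id, concept in CONCEPTS.items():
        if q in (label.lower() for label in concept["labels"]):
            return concept_id
    return None


def narrower(concept_id):
    """Return the concept plus everything below it in the hierarchy."""
    result = {concept_id}
    for other_id, concept in CONCEPTS.items():
        if concept["broader"] == concept_id:
            result |= narrower(other_id)
    return result


def search(query):
    """Return documents tagged with the queried concept or any narrower concept."""
    concept_id = resolve(query)
    if concept_id is None:
        return []
    wanted = narrower(concept_id)
    return sorted(name for name, tags in DOCUMENTS.items() if tags & wanted)


print(search("changelog"))
print(search("Docs"))
```

A search for "changelog" finds the document tagged as release notes even though that word never appears in its tags, and a search for the broader term "Docs" returns everything filed beneath it.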
More about how this applies to components can be found below.
Semantic search and recommender systems
Taxonomies and concept tags are the building blocks for intelligent search platforms, virtual assistants such as recommender systems, or even enterprise-wide content and knowledge hubs.
Since findability of documents and data is so crucial, semantic search and semantically-powered recommendations are a defining feature of an intelligent structured content authoring system.
Semantic search uses the synonyms in concepts to retrieve results, performing faster and more intelligently than common search engines. A semantic recommender system takes it a step further by precisely tailoring results to user search queries, and can suggest “further reading” content to get a deeper dive into a product explanation. The semantic recommender system does not simply recommend “more of the same,” but enriches the experience with additional information.
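One way to picture a recommender that avoids “more of the same” is sketched below. It ranks documents by the overlap of their concept tags (Jaccard similarity) and, as a simplifying assumption made for this example, skips documents of the same type as the one being read. The document names, types, and tags are hypothetical, and production recommenders use considerably richer signals.

```python
# Hypothetical tagged documents; names, types, and tags are invented.
DOCS = {
    "quick-start": {"type": "guide", "tags": {"ex:export", "ex:pdf"}},
    "export-guide-v1": {"type": "guide", "tags": {"ex:export", "ex:pdf"}},
    "release-notes-3.2": {"type": "release notes", "tags": {"ex:export", "ex:api"}},
    "service-report-17": {"type": "service report",
                          "tags": {"ex:export", "ex:pdf", "ex:printing"}},
    "pricing": {"type": "leaflet", "tags": {"ex:licensing"}},
}


def jaccard(a, b):
    """Share of concepts two tag sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(current, limit=3):
    """Suggest related documents of a different type, best overlap first."""
    source = DOCS[current]
    scored = []
    for name, doc in DOCS.items():
        # Assumption: skipping the same document type stands in for
        # "do not recommend more of the same".
        if name == current or doc["type"] == source["type"]:
            continue
        score = jaccard(source["tags"], doc["tags"])
        if score > 0:
            scored.append((score, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


print(recommend("quick-start"))
```

Here the reader of the quick-start guide is pointed to a service report and release notes on the same topic, while the near-duplicate guide and the unrelated pricing leaflet are left out.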
The semantic search/recommender system contributes to these aspects of structured content authoring, from both the employee and the customer perspective:
- Relevant content can be retrieved regardless of the user’s personal input in the search field
- Additional related content can be suggested
- Information can be accessed from everywhere, regardless of its source
- Content creation resources are maximized through reusability of existing content
- Content is accessible and up-to-date
- Better user experiences through personalized recommendations
By harnessing the power of these complementary approaches, organizations can revolutionize their content discoverability, extract valuable insights, offer knowledge as a service, and establish intelligent content hubs.
PoolParty & Microsoft Docs
Microsoft Docs has used intelligent content to efficiently govern and scale up an enterprise AI strategy.
Achieving content reuse with intelligent content authoring
Altogether, these technologies ensure that content can be reused and repurposed from the start. A user only has to think of the document in terms of components, which are an enhanced version of the outline they are likely already using. Beyond headings and subheadings, each section should be considered for its meaning. In other words: “this section is about intelligent content, so it should explain what it is and list the benefits.” The point of breaking it down like this is that the section, which serves as part of this webpage, could then be reused as material for a leaflet, a slide in a pitch deck, and a social media post.
The tagging process explained in the previous section helps facilitate reusability. A component content management system (CCMS) like Tridion allows the user to componentize the text as it is being written, and its integrated Semantic AI features can assign unique metadata to these components. This metadata not only describes the component, it also helps to indicate the different “angles” included within the content itself. Since a document is a collection of different pieces of information, these various components can be represented by the information they set out to convey.
If we take a look at our leaflet, for example, the document has been broken down into ESG Risks and Knowledge Model. These components have been tagged with their respective topics which can be searched for in the future. Perhaps a person wants to write a product description of the Knowledge Model in an offer to a prospect – they can search “knowledge model” in their CMS and be taken to this specific section in the pre-existing leaflet so that it can be copied into their offer.
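The leaflet example can be sketched as a simple data model. This is only an illustration of the component idea, not how Tridion or any other CCMS stores content; the `Component` and `Document` classes, the IDs, and the placeholder body text are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A reusable chunk of content with its own semantic tags."""
    component_id: str
    title: str
    body: str
    tags: set = field(default_factory=set)


@dataclass
class Document:
    """A document is an ordered collection of components."""
    title: str
    components: list


# The leaflet from the example, with placeholder body text.
leaflet = Document("Product leaflet", [
    Component("c1", "ESG Risks", "Placeholder text about ESG risks.", {"esg risks"}),
    Component("c2", "Knowledge Model", "Placeholder text about the knowledge model.",
              {"knowledge model"}),
])


def find_components(documents, query):
    """Return every component whose tags contain the query (case-insensitive)."""
    q = query.strip().lower()
    return [c for d in documents for c in d.components if q in c.tags]


# Reuse the matching section in a new document instead of rewriting it.
offer = Document("Offer for a prospect", find_components([leaflet], "Knowledge Model"))
print([c.title for c in offer.components])
```

The search lands on the individual section rather than the whole leaflet, and the offer references the very same component, so a later update to that component would be reflected wherever it is reused in this model.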
The semantic metadata used in structured content authoring represents the multi-dimensional nature of documents.
Practical Use Cases: Intelligent Content for Technical Documentation and Support
One area of business that is actively seeking out intelligent content is product documentation, where teams can benefit from content reuse and semantic search/recommendations.
Search and ticketing systems
Too often, Help documentation portals are built on simple search engines that lack the ability to retrieve accurate results or recommend relevant Help articles. Technical documentation gives consumers comprehensive step lists and instructions to reference when questions about complicated products and software arise. But what if the article a user is looking for cannot be found?
Source: “Reducing Support Costs Through Semantic Search in Technical Documentation” product leaflet
Simple search engines of this kind force organizations to spend additional money and personnel on solving customer issues through support and ticketing systems.
With a semantic search engine in place, the Help portal would return more precise results based on the semantic metadata. The user would likely find what they are looking for, plus additional helpful documents.
Even if the issue did get escalated to a ticketing system, the semantic metadata could ensure that tickets are routed to the representative who has the most knowledge about the specific issue. Customer tickets can be tagged according to a topic and sent along the right channels to the representative also tagged with this topic. Sophisticated rules based on metadata can help deliver the best quality interactions and services between organizations and customers.
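A metadata-based routing rule of this kind might look like the sketch below. It simply picks the representative whose expertise tags overlap most with the ticket's topic tags; the names, tags, and the tie-breaking rule (alphabetical order) are assumptions made for the example.

```python
# Hypothetical support staff, each tagged with the topics they know best.
REPRESENTATIVES = {
    "dana": {"ex:billing", "ex:licensing"},
    "lee": {"ex:export", "ex:pdf"},
    "sam": {"ex:export"},
}


def route(ticket_tags):
    """Return the representative sharing the most topics with the ticket, or None."""
    # sorted() makes ties deterministic: the alphabetically first name wins.
    best = max(sorted(REPRESENTATIVES),
               key=lambda name: len(REPRESENTATIVES[name] & ticket_tags))
    if not REPRESENTATIVES[best] & ticket_tags:
        return None  # nobody is tagged with any of the ticket's topics
    return best


print(route({"ex:export", "ex:pdf"}))
print(route({"ex:unknown-topic"}))
```

Returning `None` when no topics match leaves room for a fallback, such as a general support queue, instead of assigning the ticket to an arbitrary person.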
Recommending and reusing content in technical documentation
Say a technical writer is writing a quick-start guide for an existing feature that contains a general summary of the feature plus detailed screenshots. A recommender system can pair the text that’s being worked on with release notes that were published when the feature was initially launched. These release notes help the writer understand the motivation behind this feature so that they can summarize the benefits and the “why” factor. The technical writer is able to form a more compelling text and they can also link back to the release notes to give the user more information about the feature as a whole.
Further along the content lifecycle, a hardware technician might use the product documentation written by this technical writer to fix a malfunction in the product. The quick-start guide helps them get oriented, but based on their experience they find that more information could be added to the documentation, so they fill out a service report with these details. Should a similar issue occur in the future, a recommender system will suggest both pieces of text to the next employee who encounters the problem, enabling better action on the issue.
In this case, the original piece of content not only feeds new content, it’s also useful in a different workflow.
With its ability to handle multi-format, multi-channel, multi-lingual, and multi-authored content, intelligent content authoring has become indispensable. Companies do not have to worry about content being lost and can increase the shelf-life of a piece of content, making it all the more valuable.