The European Data Conference on Reference Data and Semantics
On March 17th, 2021, Semantic Web Company (SWC) COO Helmut Nagy and SWC taxonomy, ontology, and knowledge graph expert Heather Hedden will host a two-part workshop titled “Building, enhancing, and integrating taxonomies” as part of the ENDORSE conference.
Here is a summary of the workshop submitted by its presenters:
Knowledge organization systems, and specifically controlled vocabularies (including taxonomies, thesauri, term lists, ontologies), play a key role in making content and data easier to find, whether within an organization or published externally. Knowledge organization systems are utilized in many ways in information management and retrieval: topic browsing, search support, discovery, filtering results, sorting results, curated content and alerts, content management workflows, data analysis, recommendations, etc. Many organizations already have various controlled vocabularies designed for different purposes but lack an enterprise-wide taxonomy, have taxonomies that are out of date, or have taxonomies that are under-utilized.
This workshop focuses on several basic methods to get started in either building a new taxonomy or in combining and enhancing existing controlled vocabularies.
The workshop’s first section is an introduction to knowledge organization systems with a focus on the suitability of different types for different purposes: term lists, name authorities, classification systems, hierarchical and faceted taxonomies, thesauri, and ontologies. Standards (especially ISO 25964 and SKOS) will briefly be mentioned. Participants will be quizzed on which kind of knowledge organization system is suitable for different kinds of situations and for different kinds of concepts (subjects, names, etc.).
The next section, which is the focus of the workshop, addresses different methods for identifying concepts for a knowledge organization system (KOS), whether building a new KOS or extending and enhancing an existing one. Because knowledge organization systems connect users to content, their design needs to take into account both the users’ needs and inputs and the specific corpus of content. So, we will consider both users and content as sources for taxonomy concepts and their labels. Methods of obtaining inputs for concepts from users include brainstorming workshops, stakeholder and sample user interviews, search log analysis, and card sorting exercises. A brainstorming session is more suitable for participants from the same organization, so our workshop’s main interactive activity will be a collaborative card sorting exercise through an online tool. Methods of deriving concepts from content include a manual content audit, in which concepts are identified as a trained indexer would identify them, and automated term extraction. We will have one interactive exercise of manual content analysis for concept identification and then present a demo of a tool for term extraction from a large corpus of documents.
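To give a rough idea of what automated term extraction involves, here is a minimal Python sketch. It is not the tool demonstrated in the workshop: the function name `candidate_terms`, the tiny stopword list, and the thresholds are all assumptions made for illustration. It simply counts in how many documents each one- or two-word phrase appears and keeps the phrases that recur.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "for",
             "is", "are", "on", "with"}


def candidate_terms(documents, max_words=2, min_docs=2):
    """Rank candidate terms by the number of documents they occur in.

    Returns (term, document_count) pairs, highest count first, with ties
    broken alphabetically. Phrases may span sentence boundaries, which a
    real extraction tool would normally prevent.
    """
    doc_freq = Counter()
    for text in documents:
        tokens = re.findall(r"[a-z]+", text.lower())
        seen = set()  # count each phrase at most once per document
        for n in range(1, max_words + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Skip phrases that start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                seen.add(" ".join(gram))
        doc_freq.update(seen)
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    return [(term, count) for term, count in ranked if count >= min_docs]


if __name__ == "__main__":
    sample_docs = [
        "Knowledge graphs link data.",
        "A knowledge graph needs a taxonomy.",
        "Taxonomy management and knowledge graphs.",
    ]
    for term, count in candidate_terms(sample_docs):
        print(count, term)
```

The output of a script like this is only a list of candidates. A taxonomist still has to decide which candidates are real concepts, merge variants such as singular and plural forms, and choose preferred labels.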
The final section of the workshop will briefly consider the issue of combining existing knowledge organization systems in overlapping subject areas. They can either be merged into one, if there is no business need to keep them distinct, or they can be kept separate and linked at the individual concept level. If there is considerable overlap, they may be “mapped”, which means the concepts are designated as equivalent or nearly equivalent, and one controlled vocabulary may be used for the other, such as one for tagging at the back end and one for retrieval at the front end. Sometimes controlled vocabularies that have related rather than equivalent concepts are linked with related-type relationships instead. Participants will learn when each method (merging, mapping, or linking) is suitable and what each involves.
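The difference between mapping and linking can be sketched in a few lines of Python. This is an illustrative assumption, not workshop material: the concept labels are invented, the relation names are borrowed from the SKOS mapping properties (`skos:exactMatch`, `skos:closeMatch`, `skos:relatedMatch`), and the helper functions `backend_tags` and `related_concepts` are hypothetical.

```python
# Hypothetical mappings from a front-end (retrieval) vocabulary to a
# back-end (tagging) vocabulary. All labels are invented for illustration.
MAPPINGS = [
    ("Cars", "exactMatch", "Automobiles"),
    ("Electric cars", "closeMatch", "Electric vehicles"),
    ("Car repair", "relatedMatch", "Automobiles"),
]

# Relations treated here as "equivalent or nearly equivalent" (mapping).
EQUIVALENCE = {"exactMatch", "closeMatch"}


def backend_tags(front_label, mappings=MAPPINGS):
    """Back-end concepts that can be used in place of a front-end concept."""
    return [target for source, relation, target in mappings
            if source == front_label and relation in EQUIVALENCE]


def related_concepts(front_label, mappings=MAPPINGS):
    """Back-end concepts that are only related (linking, not mapping)."""
    return [target for source, relation, target in mappings
            if source == front_label and relation == "relatedMatch"]


if __name__ == "__main__":
    print(backend_tags("Electric cars"))
    print(related_concepts("Car repair"))
```

In this sketch only the equivalence relations let one vocabulary stand in for the other at search time. A related-type link can suggest further content to explore, but it should not be used to substitute one concept for another.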
This workshop is suitable for both those who are building new knowledge organization systems and those who need to integrate or enhance existing knowledge organization systems.