Within artificial intelligence and data-driven technologies, one concept that has gained significant attention and prominence is the knowledge graph. With its ability to represent and connect vast amounts of structured and semantically rich information, the knowledge graph has emerged as a fundamental framework for organising and harnessing knowledge. In this blog post, we will delve into the essence of knowledge graphs and ontologies, exploring their definition, components, and the role they play in achieving a deeper understanding of data and explainability of AI models.
A knowledge graph is a flexible, reusable data layer used for answering complex queries across data silos. It connects contextualised data, represented and organised in the form of a graph. Built to capture the ever-changing nature of knowledge, knowledge graphs readily accept new data, definitions, and requirements.
In 2012, Google announced that soon, users of its search engine would be able to search for “things, not strings” (of text). In other words, Google would return what is known as structured information about people, places, events, movies, and other concepts, not just the traditional list of web links containing words matching the search terms. This information is drawn from what is known as a knowledge graph. [1]
It’s not just Google. Across the data science community, knowledge graphs have become a growing phenomenon in recent years, driving many applications, including virtual assistants like Siri and Alexa. At the same time, there’s some debate about what actually constitutes a knowledge graph (they’re a bit of a buzzword). But one common definition describes a knowledge graph as knowledge bases plus data integration. [2]
A knowledge graph comprises three main components: entities, attributes, and relationships. Entities represent real-world objects, concepts, or instances, such as people, places, or events. Attributes describe properties or characteristics of these entities, providing additional contextual information. Relationships, on the other hand, establish connections and associations between entities.
At the heart of knowledge graph representation lies the concept of a node-edge structure. Nodes represent entities or concepts, while edges capture the relationships or connections between these entities. When visualised as a graph, it allows for a clear and intuitive depiction of the interconnectedness and dependencies within the knowledge domain.
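As a minimal sketch of the node-edge structure described above, the following Python keeps entities as nodes with attributes and relationships as labelled edges. The class names (`Node`, `KnowledgeGraph`), the method names, and the identifiers `ada` and `london` are illustrative assumptions, not part of any particular graph product.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity with a type label and free-form attributes."""
    label: str
    attributes: dict = field(default_factory=dict)


class KnowledgeGraph:
    """A tiny in-memory graph: nodes keyed by id, edges as (source, relation, target)."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **attributes):
        self.nodes[node_id] = Node(label, attributes)

    def add_edge(self, source, relation, target):
        # Only connect entities that already exist in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, relation, target))

    def neighbours(self, node_id):
        """Return (relation, target) pairs for edges leaving node_id."""
        return [(rel, tgt) for src, rel, tgt in self.edges if src == node_id]


kg = KnowledgeGraph()
kg.add_node("ada", "Person", name="Ada Lovelace", born=1815)
kg.add_node("london", "Place", name="London")
kg.add_edge("ada", "born_in", "london")
```

Calling `kg.neighbours("ada")` lists the outgoing relationships for that entity, while the attributes stay on the node itself. A production system would more likely use a graph database or triple store than an in-memory list.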
A widely adopted representation format for knowledge graphs is the Resource Description Framework (RDF). RDF represents knowledge as triples, consisting of subject-predicate-object statements. The subject represents the entity, the predicate denotes the relationship, and the object signifies the related entity or value.
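To make the subject-predicate-object idea concrete, here is a small sketch that stores triples as plain Python tuples and queries them by pattern. Plain tuples stand in for a real RDF library, the `ex:` prefix is a hypothetical namespace, and the `match` helper is an illustrative assumption rather than a standard API.

```python
# Each statement is a (subject, predicate, object) triple.
triples = [
    ("ex:Ada_Lovelace", "rdf:type", "ex:Person"),
    ("ex:Ada_Lovelace", "ex:bornIn", "ex:London"),
    ("ex:Ada_Lovelace", "ex:birthYear", 1815),
    ("ex:London", "rdf:type", "ex:Place"),
]


def match(pattern, store):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        triple for triple in store
        if all(p is None or p == t for p, t in zip(pattern, triple))
    ]


# All subjects typed as a person.
people = [s for s, _, _ in match((None, "rdf:type", "ex:Person"), triples)]

# Everything the store says about one entity.
about_ada = match(("ex:Ada_Lovelace", None, None), triples)
```

Note that the object of a triple can be another entity (`ex:London`) or a literal value (`1815`), which is how attributes and relationships share one representation.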
In knowledge representation and organisation, ontologies play a crucial role in structuring information and enabling effective data integration and reasoning. An ontology holds the master knowledge scaffolding of an organisation: a complete, consistent data model of the business. It's a framework designed to make AI work more effectively. In this section, we will explore the concept of ontology, its purpose, and its significance for knowledge graphs.
Ontologies facilitate data integration by providing a shared understanding of the underlying concepts and relationships. They are also instrumental in knowledge engineering, allowing for the formal representation and organisation of domain-specific knowledge.
With the speed and effectiveness of a company at the forefront of industrial competition, building an ontology is not just another IT project. It is one that should matter to every CEO, CMO, and senior manager inside the company. Once you create the framework for the ontology, you can get more from your current investments in technology and apply emerging artificial intelligence techniques to drive your business. The ontology is the tool that teaches intelligent machines how your business runs. Without it, neither your systems nor your employees can truly understand how to access and organise the lifeblood of the business – the knowledge and information that provide value for your customers and the marketplace. [3]
Ontologies provide a strategic advantage in transforming your company into an AI-powered enterprise, enabling accelerated operations and enhanced decision-making. By leveraging AI capabilities supported by an ontology, your business can gain a competitive edge. It enables faster product and service delivery, efficient customer service, and the ability to capitalise on emerging market opportunities. The streamlined information flows facilitated by the ontology empower both automated systems and human decision-makers, driving improved outcomes.
Your ontology has the potential to accelerate your enterprise whether you work in retail, finance, healthcare, government, or any other sector. Additionally, it provides deep insights rooted in individuals’ underlying motivations rather than surface-level attributes like age, leading to product enhancements that align with those motivations.
Ontologies begin as a holistic understanding of the language of the business and the customer, and are then designed into processes, applications, navigational structures, content, data models, and the relationships between concepts. They contain language variations, alternative spellings, translations, acronyms, and technical terms. They can describe “is-ness” and “about-ness” – this is a contract, it is about a services engagement, it is also about this vendor, for example.
Ontologies can also support advanced capabilities to drive intelligent virtual assistants (bots). They can form the basis for inference engines – mechanisms to essentially answer a question that has not been preprogrammed into the bot. Bots powered by ontologies are faster to deploy, more scalable, and more cost-effective. Every aspect of business requires contextualised knowledge. The role of AI is to use the ontology to assist with this contextualisation.
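One simple form of inference is following a class hierarchy to answer a question that was never stored explicitly. The sketch below assumes a hypothetical hierarchy (`ServicesContract`, `Contract`, `LegalDocument`), a made-up document id `doc-42`, and single inheritance with no cycles; real inference engines handle far richer rules.

```python
# Hypothetical class hierarchy: each class maps to its direct superclass.
SUBCLASS_OF = {
    "ServicesContract": "Contract",
    "Contract": "LegalDocument",
}

# The only fact stored about the document is its most specific class.
INSTANCE_OF = {"doc-42": "ServicesContract"}


def ancestors(cls):
    """Walk up the subclass chain, returning every superclass of cls."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def is_a(instance, cls):
    """True if instance belongs to cls directly or via inherited classes."""
    direct = INSTANCE_OF.get(instance)
    if direct is None:
        return False
    return cls == direct or cls in ancestors(direct)
```

Here `is_a("doc-42", "LegalDocument")` holds even though that fact was never entered: it is derived from the two subclass statements.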
At its core, an ontology is a formal and explicit specification of a shared conceptualisation. It provides a structured framework for representing knowledge by defining a set of concepts, their properties, and the relationships between them. Essentially, an ontology serves as a common vocabulary that enables effective communication and understanding between humans and machines.
An ontology consists of several key components, including concepts, properties, and relationships. Concepts represent the various entities or classes within a domain, defining their characteristics and behaviours. Properties describe the attributes or features associated with these concepts, specifying their roles and relationships. Relationships establish connections and dependencies between concepts, capturing the associations and interactions within the knowledge domain.
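The three components can be sketched as a small schema that declares concepts, their properties, and which relationships are allowed between which concepts. The `Ontology` and `Relationship` classes and the `Contract`/`Vendor` example are assumptions made for illustration; standards such as OWL express the same ideas far more completely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    name: str
    domain: str   # concept allowed as the subject
    range_: str   # concept allowed as the object


class Ontology:
    def __init__(self):
        self.concepts = {}        # concept name -> set of property names
        self.relationships = {}   # relationship name -> Relationship

    def add_concept(self, name, properties=()):
        self.concepts[name] = set(properties)

    def add_relationship(self, name, domain, range_):
        # A relationship may only link concepts that have been declared.
        if domain not in self.concepts or range_ not in self.concepts:
            raise ValueError("domain and range must be declared concepts")
        self.relationships[name] = Relationship(name, domain, range_)

    def allows(self, subject_concept, relationship, object_concept):
        """Check whether a statement fits the declared domain and range."""
        rel = self.relationships.get(relationship)
        return (rel is not None
                and rel.domain == subject_concept
                and rel.range_ == object_concept)


onto = Ontology()
onto.add_concept("Contract", ["start_date", "value"])
onto.add_concept("Vendor", ["name"])
onto.add_relationship("signed_with", "Contract", "Vendor")
```

With this in place, `onto.allows("Contract", "signed_with", "Vendor")` is true, while the reversed statement is rejected. That kind of check is what lets an ontology keep incoming data consistent with the shared model.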
Knowledge engineering refers to the process of capturing, organising, and modelling knowledge to create intelligent systems or applications. It involves a systematic and structured approach to acquiring, representing, and maintaining human knowledge within a knowledge-based system. The overall goal of knowledge engineering is to support dynamic and intelligent information architectures, so that systems can reason, learn, and make decisions based on the captured knowledge.
Knowledge graphs are highly relevant to artificial intelligence (AI) because they provide a structured and interconnected representation of knowledge, enabling AI systems to access and reason over vast amounts of information. By capturing entities, attributes, and relationships, knowledge graphs facilitate a deeper understanding of data and enhance the interpretability and explainability of AI models.
Knowledge graphs play a vital role in achieving explainability in AI systems by providing transparency and justifications for AI-based decisions. They enable semantic search, data integration, and knowledge discovery, making AI systems more intelligent and capable of making informed decisions based on interconnected knowledge.
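One way a graph can supply a justification is by returning the chain of relationships that links two entities. The sketch below uses a breadth-first search over a hypothetical retail graph; the entity ids, relation names, and the `explain` function are all illustrative assumptions.

```python
from collections import deque

# Hypothetical facts about a customer and a product catalogue.
EDGES = [
    ("customer:17", "purchased", "product:tent"),
    ("product:tent", "in_category", "category:camping"),
    ("product:stove", "in_category", "category:camping"),
]


def explain(start, goal, edges):
    """Breadth-first search returning the shortest chain of edges, or None."""
    # Edges are traversable in both directions; the original triple is kept
    # so the explanation can be read back as stated facts.
    adjacency = {}
    for src, rel, tgt in edges:
        adjacency.setdefault(src, []).append((tgt, (src, rel, tgt)))
        adjacency.setdefault(tgt, []).append((src, (src, rel, tgt)))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for neighbour, edge in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [edge]))
    return None


justification = explain("customer:17", "product:stove", EDGES)
```

The returned list reads as an explanation for a recommendation: the customer purchased a tent, the tent is in the camping category, and the stove is in the same category. Each step is an explicit fact that a person can inspect or challenge.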
“there can be no meaningful AI without IA”
The importance of information retrieval for generative large language models (LLMs) is widely acknowledged in the data science community. While LLMs rely on retrieving information to enhance their reasoning abilities, their stored knowledge is often unreliable and difficult to update.
These models excel at reasoning based on the context provided during inference. However, a bias emerged in the first half of 2023 suggesting that vector search is the only way to achieve information retrieval for LLMs. This is incorrect, and it steers practitioners towards a single approach (vector databases) rather than the broader capabilities of knowledge graphs for information retrieval. This bias appears to be impeding advances in the interpretability of AI decision-making. [4]
While vector-based retrieval techniques, such as word embeddings or neural network-based models, have their merits in representing and retrieving information, knowledge graphs offer some distinct advantages:
Knowledge graphs offer a structured representation of interconnected knowledge, fostering contextual understanding, explainability, reasoning, and data integration within AI systems. While vector-based retrieval techniques have their strengths in certain applications, knowledge graphs provide a distinct framework for capturing, organising, and leveraging knowledge to enhance AI capabilities.
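As a rough sketch of graph-based retrieval for an LLM, the code below selects facts connected to entities named in a question and formats them as prompt context. The fact list, the function names, and the naive substring matching used to spot entities are simplifying assumptions; a real pipeline would use proper entity linking and a graph query language.

```python
FACTS = [
    ("Ada Lovelace", "born in", "London"),
    ("Ada Lovelace", "collaborated with", "Charles Babbage"),
    ("Charles Babbage", "designed", "Analytical Engine"),
]


def retrieve_context(question, facts):
    """Return facts whose subject or object is named in the question."""
    q = question.lower()
    # Naive entity spotting: an entity counts if its name appears verbatim.
    entities = {e for s, _, o in facts for e in (s, o) if e.lower() in q}
    return [f for f in facts if f[0] in entities or f[2] in entities]


def build_prompt(question, facts):
    """Render the retrieved facts as plain-text context ahead of the question."""
    lines = [f"- {s} {p} {o}." for s, p, o in retrieve_context(question, facts)]
    return "Known facts:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


prompt = build_prompt("Who did Charles Babbage work with?", FACTS)
```

Because every line of context is an explicit fact from the graph, it is possible to trace exactly which statements the model was given, something that is harder to do with nearest-neighbour matches over embeddings.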
NB: A nice introduction to vector-based embeddings, written by our colleague Narin Meher, can be found here: https://www.bigspark.dev/word-embedding-word2vec-algorithm-implementation-using-sparkmllib/
References
[1] Singhal, A. (2012) Introducing the Knowledge Graph: things, not strings. https://blog.google/products/search/introducing-knowledge-graph-things-not/
[2] OCCRP (2020) Things not strings. https://medium.com/occrp-unreported/things-not-strings-knowledge-graphs-for-investigative-reporting-9d8a26913f65
[3] Earley, S. (2020) The AI powered Enterprise. LifeTree Media.
[4] Harman, C. (2023) Beware Tunnel Vision in AI retrieval. https://colinharman.substack.com/p/beware-tunnel-vision-in-ai-retrieval?sd=pf