Semantics is not just a branch of linguistics; it goes much further than that. It can offer us a set of resources that facilitate data management, helping us in the search for and understanding of the relationships that exist amongst data.
According to the Dictionary of the Royal Spanish Academy, semantics is a discipline that studies the meaning of linguistic units and of their combinations. By this definition, semantics deals with the meaning of language. It also provides us with a universal framework to describe and link concepts.
You may be asking yourself what’s the deal with this language arts class. Aren’t you reading a document management blog? Don’t worry, you’re not at the wrong blog. It turns out that this subject that we studied in secondary school is applicable to document management.
The combination of semantics and document management opens the door to a whole world of applications in the context of metadata and the relationships between documents. Semantic technology allows you to reduce any type of information to simple pieces that can be easily linked to one another and processed by computer.
How does semantics change the way data are used?
It’s very common for companies and organisations to have data or documents that are isolated. It’s true that some document management systems are designed around a repository in which entities are defined. This design is preferred over turning the document management system into a big junk drawer where we stuff one-off documents.
Let’s look at an example so that this is clearer. In the case of insurance companies, everything can be saved in a document management system under a single document type in which we have a giant form with fields or metadata that, depending on the information we receive, can be used or not. For some documents, some metadata will be applicable; for other documents, other metadata. But normally, given that it makes much more sense for users, documents are grouped by entity.
In this very basic way, we are starting to use semantics. We define an aggregating entity called “accident”, made to store documents related to a specific procedure that occurs when a client makes a claim. Each supporting document carries a metadata field holding the accident’s unique identifier, and that identifier is what links all the documents in the file.
But we can go further with semantics. There is an accident because there is a client who has taken out a policy with us. Without that policy, there is no coverage for the accident. Therefore, another entity emerges—the “policy”—which is related to a “policyholder” who, in turn, is another entity. We can manage all of these relationships between entities as metadata, defining keys that link entities.
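To make this more concrete, here is a minimal Python sketch of how entities might be linked through key metadata. Everything in it is an assumption made up for illustration (the class names, the field names, the sample identifiers); it is not how any particular document management system stores its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policyholder:
    holder_id: str
    name: str


@dataclass(frozen=True)
class Policy:
    policy_id: str
    holder_id: str  # key linking the policy to its policyholder


@dataclass(frozen=True)
class Accident:
    accident_id: str
    policy_id: str  # key linking the accident to the policy that covers it


@dataclass(frozen=True)
class Document:
    doc_id: str
    accident_id: str  # key linking a supporting document to its accident file
    title: str


# Hypothetical sample data, indexed by key for easy lookup.
holders = {"H1": Policyholder("H1", "Ana Example")}
policies = {"P1": Policy("P1", "H1")}
accidents = {"A1": Accident("A1", "P1")}
documents = [
    Document("D1", "A1", "Claim form"),
    Document("D2", "A1", "Repair invoice"),
]


def holder_for_accident(accident_id: str) -> Policyholder:
    """Follow the keys: accident -> policy -> policyholder."""
    policy = policies[accidents[accident_id].policy_id]
    return holders[policy.holder_id]


def documents_for_accident(accident_id: str) -> list[Document]:
    """Collect every supporting document that carries the accident's key."""
    return [d for d in documents if d.accident_id == accident_id]
```

Starting from accident "A1", a user can reach the policyholder and the supporting documents just by following the keys stored as metadata.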
This makes sense for users. It also makes sense that from the accident, they would want to consult the policy and policyholder.
As I was saying, we can make a nearly infinite number of links between metadata, seeking a 360° view of our business. What policies does a policyholder have? How many accidents? What products are applicable to them?
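Those three questions can be answered by walking the links. The following is only a sketch, assuming hypothetical records held in plain Python lists; the field names and sample values are invented for the example.

```python
# Hypothetical records; in a real system these would come from the repository.
policies = [
    {"policy_id": "P1", "holder_id": "H1", "product": "car"},
    {"policy_id": "P2", "holder_id": "H1", "product": "home"},
    {"policy_id": "P3", "holder_id": "H2", "product": "car"},
]
accidents = [
    {"accident_id": "A1", "policy_id": "P1"},
    {"accident_id": "A2", "policy_id": "P1"},
    {"accident_id": "A3", "policy_id": "P3"},
]


def holder_view(holder_id: str) -> dict:
    """Build a small all-round view of one policyholder from the linking keys."""
    own_policies = [p for p in policies if p["holder_id"] == holder_id]
    policy_ids = {p["policy_id"] for p in own_policies}
    return {
        "policies": sorted(policy_ids),
        "products": sorted({p["product"] for p in own_policies}),
        "accident_count": sum(1 for a in accidents if a["policy_id"] in policy_ids),
    }


view = holder_view("H1")
```

For policyholder "H1" the view lists two policies, two products and two accidents, all derived from the keys rather than stored separately.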
But just imagine, what if we didn’t have to define these keys and values that link the entities, but rather they were found automatically?
By means of computational language analysis, semantic technology can find entities of interest to us as well as the relationships amongst them.
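As a very rough illustration of the idea, the sketch below finds people and policy numbers in a piece of text and relates the ones that appear together. It is a toy stand-in: real semantic technology relies on much richer language analysis, and the list of known names, the "POL-" policy number format and the "mentioned_with" relation are all assumptions invented for this example.

```python
import re

# Hypothetical policy number format and list of known people.
POLICY_PATTERN = re.compile(r"\bPOL-\d{4}\b")
KNOWN_PEOPLE = ["Ana Example", "Luis Sample"]


def extract_entities(text: str) -> dict:
    """Find the people and policy numbers mentioned in a document's text."""
    people = [name for name in KNOWN_PEOPLE if name in text]
    policy_numbers = POLICY_PATTERN.findall(text)
    return {"people": people, "policies": policy_numbers}


def extract_relations(text: str) -> list[tuple[str, str, str]]:
    """Relate every person to every policy mentioned in the same text."""
    found = extract_entities(text)
    return [
        (person, "mentioned_with", policy)
        for person in found["people"]
        for policy in found["policies"]
    ]


text = "Ana Example reported an accident covered by policy POL-1234."
relations = extract_relations(text)
```

Nobody typed in a key linking Ana Example to policy POL-1234; the link comes out of the text itself.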
Semantics can help us build a 360° view much more efficiently
Semantics offers us a way to understand the context of the documents and the data. But beyond the automatic cataloguing and relationship-building that allow us to contextualise documents, what are the possibilities?
Let’s continue with the example of insurance companies. Let’s suppose that an insurance company keeps a blacklist of people who are suspected of having committed fraud.
By carrying out a computational semantic analysis of the text of our documents, we can find entities such as people (policyholders, claimants, intermediaries, etc.). Let’s suppose that, by analysing millions of policies against this blacklist, we can automatically detect some type of relationship between a policyholder and a person on the blacklist. Could we automatically trigger a fraud alert in our system?
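Here is a minimal sketch of what such an alert could look like, assuming the people mentioned in each policy have already been extracted. The function names, the sample data and the matching rule (exact match after stripping accents, case and extra spaces) are all assumptions made for illustration; a real system would need far more careful identity matching.

```python
import unicodedata


def normalise(name: str) -> str:
    """Strip accents, case and extra whitespace so name variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.lower().split())


def fraud_alerts(policy_people: dict, blacklist: list) -> list:
    """Return (policy_id, person) pairs where an extracted person is blacklisted."""
    blocked = {normalise(name) for name in blacklist}
    alerts = []
    for policy_id, people in policy_people.items():
        for person in people:
            if normalise(person) in blocked:
                alerts.append((policy_id, person))
    return alerts


# Hypothetical output of the entity extraction step, keyed by policy number.
policy_people = {
    "POL-1234": ["Ana Example", "José  Pérez"],
    "POL-5678": ["Luis Sample"],
}
blacklist = ["jose perez"]

alerts = fraud_alerts(policy_people, blacklist)
```

In this sample only policy POL-1234 raises an alert, because one of the people found in it matches a blacklisted name once accents and spacing are normalised.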
We are talking about using semantics to build a view that is too complex for people to create by hand, given that the sheer amount of data makes it humanly impossible. Semantics, however, offers us this possibility.
To sum up, the context that semantics provides is a means of enriching and improving document management. The full range of its applications is yet to be seen, but it is promising.