Semantic knowledge base
Reference data is about using common resources. It brings economies of
scale and makes the exchange of data between systems easier.
This matters for efficiency, but it can present some
difficulties when it comes to implementation in particular
systems.
Let’s take a specific case like the Language authority
table (https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/language).
You can see how many concepts are in a concept scheme using a SPARQL query like this:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Count the distinct concepts found in the Language authority table
SELECT (COUNT(DISTINCT ?c) AS ?count)
FROM <http://publications.europa.eu/resource/authority/language>
WHERE {
  ?c skos:inScheme ?scheme .
}
Each of those concepts inside the table has a preferred label that
will be displayed if you use the Language authority table as your
reference list of languages for the new application you are working on
(that can be an information system, website, or mobile app).
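To see what such a reference list would look like in practice, the preferred labels can be listed with a query along these lines. This is a minimal sketch: it assumes the table carries English labels tagged "en", and the LIMIT is only there to keep the output short.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# List concepts of the Language authority table with their English preferred label.
# Assumption: labels are language-tagged with "en".
SELECT DISTINCT ?concept ?label
FROM <http://publications.europa.eu/resource/authority/language>
WHERE {
  ?concept skos:inScheme ?scheme ;
           skos:prefLabel ?label .
  FILTER (lang(?label) = "en")
}
ORDER BY ?label
LIMIT 20
```

Removing the LIMIT clause returns the full list, which is where its length becomes apparent.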
Of course, in many cases such a long list is impractical.
Depending on the target audience of the application, only a limited
number of languages need to be listed.
But each institution or system has its own needs. How
can you serve all of them with a single authority table?
How to organise the list to serve this purpose
There are multiple ways to group concepts, and we will go through some of the options.
The first option that comes to mind, and a standards-based one, is to use separate concept schemes. We do this for EuroVoc (https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/eurovoc). Yet this option is not sufficiently flexible and becomes impractical in some usage scenarios (the use of concept schemes will be treated in a different article).
Collections are also a standardised mechanism, provided by SKOS to
group and organise concepts.
The definition given in the
specification of the SKOS data model (https://www.w3.org/TR/skos-reference/#collections)
says:
“Collections are useful where a group of concepts shares
something in common, and it is convenient to group them under a common
label, or where some concepts can be placed in a meaningful
order.”
As this definition shows, collections could be used to group
languages, but the approach does not scale to the case described
here. Systems and applications might need sub-lists that span
multiple collections, or that use only parts of some collections.
Many collections would need to be created, and managing each one of them
would be complex.
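For comparison, this is roughly how collection membership is queried with the standard SKOS terms skos:Collection and skos:member. It is only an illustrative sketch: whether the Language authority table defines any collections at all is an assumption, so the query may well return no rows.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# List every SKOS collection in the graph together with its members.
# Assumption: the graph may contain no skos:Collection resources at all.
SELECT ?collection ?member
FROM <http://publications.europa.eu/resource/authority/language>
WHERE {
  ?collection a skos:Collection ;
              skos:member ?member .
}
ORDER BY ?collection ?member
```

A system that needs parts of several collections would have to combine and filter such results itself, which is the maintenance burden described above.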
What to do then? A new mechanism to group concepts, one that can be customised for each system, had to be defined, and for that we added the useContext property. This kind of approach is widely used (see for example the custom properties in Wikidata).
We see useContext as a way to let each system define its own specific list of concepts, based on a logic that might not be relevant for everyone else. This sounds like a custom list, but because it is integrated with the standard authority table, it still offers at least some of the advantages of a list that is centrally maintained and agreed upon, as is the case for reference data.
To formalise this mechanism, a new so-called technical table was
created: the Use context authority table (https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/use-context).
This is where we define the values that the
useContext property can take (e.g., EURLEX, EUDOR, TED). In practice,
every major system that decides to use a subset of an authority table
can define its own context, and so opt in or out of individual values
in any of the tables.
We already have over 50 concepts in the Use context table and this
shows how useful the property is.
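The available contexts can be listed in the same way as the concepts of any other table. The sketch below assumes that the Use context table is published in a graph named by analogy with the Language table (http://publications.europa.eu/resource/authority/use-context) and that English labels are present. Neither point is confirmed here.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# List the use contexts and, where available, their English label.
# Assumption: the graph IRI below follows the same pattern as the Language table.
SELECT DISTINCT ?useContext ?label
FROM <http://publications.europa.eu/resource/authority/use-context>
WHERE {
  ?useContext skos:inScheme ?scheme .
  OPTIONAL {
    ?useContext skos:prefLabel ?label .
    FILTER (lang(?label) = "en")
  }
}
ORDER BY ?useContext
```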
The next question, of course, is how to make use of useContext in specific authority tables. The obvious answer is “choose a context and select values from the desired table based on the respective property content”.
Let’s say we want our system to use a list containing the French labels of the languages that are also recognised by the TED platform. The corresponding useContext is TED.
The query listing all the French preferred labels (skos:prefLabel) found in the Language authority table under the TED context takes the following form:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX lemon: <http://lemon-model.net/lemon#>

SELECT ?concept ?label ?definition
# choose the table you are looking into
FROM <http://publications.europa.eu/resource/authority/language>
WHERE {
  # select the useContext
  VALUES (?useContext) {
    (<http://publications.europa.eu/resource/authority/use-context/TED>)
  }
  ?concept lemon:context ?useContext .
  # indicate the properties and the associated language
  OPTIONAL {
    ?concept skos:prefLabel ?label .
    FILTER (lang(?label) = "fr")
  }
  OPTIONAL {
    ?concept skos:definition ?definition .
    FILTER (lang(?definition) = "fr")
  }
}
ORDER BY ?concept
By adjusting three values, namely the authority table named in the FROM clause, the language tag in the two FILTER expressions, and the useContext listed under VALUES, you can adapt the output to your own needs.
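Before picking a context, it can help to see which contexts are actually applied in a given table and how many concepts each one covers. A possible sketch, reusing the lemon:context property from the query above and assuming it is applied in the same way across the table:

```sparql
PREFIX lemon: <http://lemon-model.net/lemon#>

# Count how many concepts of the Language authority table carry each useContext.
SELECT ?useContext (COUNT(DISTINCT ?concept) AS ?concepts)
FROM <http://publications.europa.eu/resource/authority/language>
WHERE {
  ?concept lemon:context ?useContext .
}
GROUP BY ?useContext
ORDER BY DESC(?concepts)
```

Changing the graph in the FROM clause gives the same overview for another authority table.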
That is not all. If the specific needs of your system, speaking for European Union institutions, require filtering that is different from that of the other major EU platforms already listed in the Use context table, you can also request the creation of a new use context. You can do this by sending a request to OP-EU-VOCABULARIES@publications.europa.eu, describing the reasoning behind the particular needs of your context. Following the Publications Office’s governance policy for the Use context authority table, the new value can be added to the table, giving you the possibility to define your own context for specific tables.
What should I choose?
The power offered to you as the owner of a particular use context also brings responsibilities. You will have to define the exact rules and exceptions for the application of the useContext property to all the authority tables. Moreover, you will need to follow the publication process of the authority tables (see schedule) and provide feedback on the treatment of new or changed concepts associated with each table in your specific useContext. A decision to have your own useContext therefore has to be taken carefully, because it requires the allocation of resources for the maintenance of that context across the authority tables.
You should also bear in mind that if you choose to use the specific context of another platform, that context might diverge over time from your own needs. The file types recognised by a platform like EUR-Lex are, for example, more limited than what is accepted in the context of TED, and it is your responsibility to monitor whether new concepts that are of use for your system have been added to a table but are not yet included in the useContext you have chosen to follow.
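One way to support this monitoring is to list, from time to time, the concepts of a table that do not carry the context you follow. The sketch below uses the Language table and the TED context from the earlier example as stand-ins. Both should be replaced with the table and context relevant to your system, and the English label filter is an assumption.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX lemon: <http://lemon-model.net/lemon#>

# List concepts of the table that are NOT tagged with the followed useContext.
SELECT DISTINCT ?concept ?label
FROM <http://publications.europa.eu/resource/authority/language>
WHERE {
  ?concept skos:inScheme ?scheme .
  FILTER NOT EXISTS {
    ?concept lemon:context <http://publications.europa.eu/resource/authority/use-context/TED> .
  }
  OPTIONAL {
    ?concept skos:prefLabel ?label .
    FILTER (lang(?label) = "en")
  }
}
ORDER BY ?concept
```

Comparing the result between two publications of the table shows which newly added concepts fall outside the context.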
As seen, this powerful property offers advantages but there are also some negative aspects to be considered. The alternative option is to assume the generic advantages of the reference data and avoid using useContext altogether. This will give you the benefit that your system will be continuously aligned with the concepts that are maintained by the Publications Office of the European Union following the Corporate Reference Data policy that governs our activity.