Publications Office of the EU
How to extract a list of concepts from a vocabulary - EU Vocabularies
Semantic knowledge base

How to extract a list of concepts from a vocabulary

This is a tutorial for beginners in the use of the Publications Office vocabularies and of semantic technologies.

You are working on a report or a study, or developing a web application. You need a list of values about a specific subject.

It can be a list of languages, measurement units, countries, statuses and so on.

You can either build the list yourself or get it from a reference data vocabulary that is carefully maintained by the Publications Office of the European Union in its catalogue.

 

Getting a list of concepts from a particular vocabulary is the most basic need you can have in relation to the datasets stored in our catalogue.

This article introduces the process of extracting such a list, assuming no prior experience.

 

Consider, for example, a situation where you need a list of procurement procedure types.

Our catalogue provides a carefully maintained vocabulary, available as an independent dataset, covering the different types of procurement procedures.

You can find it here: Procurement procedure type - EU Vocabularies - Publications Office of the EU (europa.eu)

 

The first option is of course to copy/paste the information or download one of the files associated with it (there are multiple distribution formats available here). The website interface can help you in this regard.

 

Yet there are situations where you need the information presented differently, want to limit the fields shown, or need to automate your process.

The way to do that is to directly and dynamically extract data from our data repository (CELLAR) using a semantic query.

Queries like this can be embedded directly in your application, website and even in Excel tables.

 

Such a query is written in the SPARQL query language.

 

We assume that you need a list of procurement procedure types with their standard names and definitions.

 

First, find the vocabulary in the catalogue and open its page (the Browse content page).

On the vocabulary page, we need to find the graph URI that identifies the list in the repository.

It appears under the name: Concept scheme URI. For this vocabulary it is http://publications.europa.eu/resource/authority/procurement-procedure-type

You can see it below marked in yellow.


 

Now you can open the interface that lets you query the catalogue.

To do that, you need to go to the following address: Virtuoso SPARQL Query Editor (europa.eu)

You will be presented with a standard interface that looks like this:

Interface of the SPARQL endpoint of the Publications Office
 

The main elements of the interface are marked again in yellow.

First, copy the script listed below into the Query text box of the interface.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

#1. Here we indicate the columns that are needed in the resulting table
SELECT ?Label ?Definition

#2. On the line below we identify the vocabulary of interest
FROM <http://publications.europa.eu/resource/authority/procurement-procedure-type>

WHERE {
       # The following lines extract the data
       ?c skos:prefLabel ?prefLabel .
       OPTIONAL { ?c skos:definition ?prefDefinition }
       BIND (str(?prefLabel) AS ?Label) .
       BIND (str(?prefDefinition) AS ?Definition) .

       #3. Here we choose the language to be used.
       #   A different language can be selected by replacing "en"
       #   with the respective two-letter ISO code
       FILTER (lang(?prefLabel) = "en") .
}

ORDER BY ?prefLabel

In the script, some elements are marked in yellow; they correspond to the numbered comments 1 to 3.

These are parameters that you can change as follows:

  1. The fields that shall be displayed in the list (here we have selected the Label and the Definition). Depending on the vocabulary, other properties may be available, but these two are accessible in most cases. Other articles in our knowledge base explain how to add further properties here.
  2. The target vocabulary. Most vocabularies in the catalogue can be indicated here using the Concept scheme URI from the vocabulary page, as shown before.
  3. The language to be shown. Most of the vocabularies are available in all the official EU languages, and you choose one by its language code (some examples being: en, fr, de, it).
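
The vocabulary and language parameters can also be filled in by a program instead of by hand. The following is a minimal Python sketch, not part of the original tutorial, that assembles the same query text from a Concept scheme URI and a language code. The function name `build_concept_list_query` and its arguments are assumptions made for illustration only.

```python
import re

# Query template mirroring the script above. The two placeholders are the
# target vocabulary (parameter 2) and the language code (parameter 3).
QUERY_TEMPLATE = """PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?Label ?Definition
FROM <{scheme_uri}>
WHERE {{
    ?c skos:prefLabel ?prefLabel .
    OPTIONAL {{ ?c skos:definition ?prefDefinition }}
    BIND (str(?prefLabel) AS ?Label) .
    BIND (str(?prefDefinition) AS ?Definition) .
    FILTER (lang(?prefLabel) = "{lang}") .
}}
ORDER BY ?prefLabel
"""


def build_concept_list_query(scheme_uri: str, lang: str = "en") -> str:
    """Return the SPARQL query text for one vocabulary and one language.

    Hypothetical helper: the name and signature are assumptions made
    for this illustration.
    """
    # Reject characters that cannot appear inside a SPARQL IRI reference.
    if re.search(r'[<>"{}|^`\\\s]', scheme_uri):
        raise ValueError(f"not a valid URI: {scheme_uri!r}")
    # Accept only two-letter lowercase codes such as "en", "fr", "de", "it".
    if not re.fullmatch(r"[a-z]{2}", lang):
        raise ValueError(f"not a two-letter language code: {lang!r}")
    return QUERY_TEMPLATE.format(scheme_uri=scheme_uri, lang=lang)


if __name__ == "__main__":
    # Print the French-language variant of the query for the example vocabulary.
    print(build_concept_list_query(
        "http://publications.europa.eu/resource/authority/procurement-procedure-type",
        lang="fr",
    ))
```

The printed text can be pasted into the Query text box exactly like the hand-edited script. Validating both inputs before substitution is a deliberate choice: it keeps a mistyped URI or language code from silently producing an empty result.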

 

Once the script is in the query window, we can run the query by clicking the Run query button.

The result is a table displayed on the screen. You can now go back to the query interface (use the browser's back button), adjust the language or choose another vocabulary, and run the query again. Once you are accustomed to the process, you will find that it is not at all difficult, and in time you will learn to use it extensively. You are more than welcome to check the other, more advanced articles in the knowledge base to develop this skill further.
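
Beyond the web interface, the same query can be run from a program, which is what embedding it in an application amounts to. The Python sketch below is an illustration under stated assumptions, not the Publications Office's own client. It assumes the endpoint accepts the query as a `query` parameter over HTTP GET and can return the standard SPARQL JSON results format, as the W3C SPARQL specifications describe. The endpoint address is left as a parameter (use the address of the query editor linked earlier), and all function names are hypothetical.

```python
import csv
import io
import json
import urllib.parse
import urllib.request


def run_query(endpoint_url: str, query: str, timeout: float = 30.0) -> dict:
    """Send a SPARQL query over HTTP GET and return the decoded JSON document.

    Assumption: the endpoint honours an Accept header asking for
    application/sparql-results+json.
    """
    url = endpoint_url + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def rows_from_results(results: dict) -> list:
    """Flatten SPARQL JSON results into one plain dict per table row."""
    columns = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        # An OPTIONAL value (such as a missing definition) is simply absent
        # from the binding, so fall back to an empty string.
        rows.append({c: binding.get(c, {}).get("value", "") for c in columns})
    return rows


def rows_to_csv(rows: list, columns: list) -> str:
    """Serialise rows as CSV text, for example to open in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hand-written placeholder in the SPARQL JSON results layout (not real
    # vocabulary data), so the parsing step can be tried without a network.
    sample = {
        "head": {"vars": ["Label", "Definition"]},
        "results": {"bindings": [
            {"Label": {"type": "literal", "value": "Example label A"},
             "Definition": {"type": "literal", "value": "Example definition A"}},
            {"Label": {"type": "literal", "value": "Example label B"}},
        ]},
    }
    print(rows_to_csv(rows_from_results(sample), sample["head"]["vars"]))
```

To query the live repository, pass the endpoint address and the text of the script above to `run_query`, then hand its return value to `rows_from_results`. Keeping the network call separate from the parsing makes the parsing easy to check on its own.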

Tags: cellar, authority tables, reference data, controlled vocabularies, semantic technologies, sparql