Semantic knowledge base

Resource alignments: Main elements of a proper methodology

This article describes the outcomes, deliverables and methodology of a resource alignment. The main idea is to automatically align two RDF (mainly SKOS) datasets by comparing their lexical content. The expected result is a set of resource pairs, with one resource from each dataset, that are considered the same or similar with various degrees of confidence.

Goals

One or more alignment files (in SKOS or EDOAL format)
One or more files containing evaluation samples
A report describing the preliminary dataset assessment, the designed process and parameters, the output alignment files and a final assessment

Methodology

Preliminary assessment

In this step the asset pair, or set of asset pairs, is established and the initial state of each asset is assessed to decide whether it is suitable as input for the automatic alignment software. Attention shall be paid to technical and content quality, available languages, presence of duplicates, encoding, estimated pre-processing operations and other aspects. At this step it is important to document the initial state of the resources, their business relevance, some of their history and their internal structure, then to describe the final outcomes, followed by an enumeration of the operations to be performed.
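Part of this assessment can be automated. The following is a minimal Python sketch, not the method prescribed by this article, that reports how many labels exist per language and which labels are shared by more than one concept. The flat `(concept_uri, label, language)` record shape, the `profile_labels` name and the `ex:` identifiers are assumptions made for the example.

```python
from collections import Counter, defaultdict


def profile_labels(records):
    """Summarise language coverage and duplicate labels.

    `records` is an iterable of (concept_uri, label, language) tuples;
    this flat shape is an assumption made for the example.
    """
    per_language = Counter()
    concepts_by_label = defaultdict(set)
    for uri, label, lang in records:
        per_language[lang] += 1
        # Trimmed, case-insensitive key so "Water" and "water " collide.
        concepts_by_label[(label.strip().casefold(), lang)].add(uri)
    # Keep only the labels that are attached to more than one concept.
    duplicates = {
        key: sorted(uris)
        for key, uris in concepts_by_label.items()
        if len(uris) > 1
    }
    return {"labels_per_language": dict(per_language), "duplicates": duplicates}


# Hypothetical sample data.
sample = [
    ("ex:c1", "Water", "en"),
    ("ex:c2", "water ", "en"),
    ("ex:c1", "Agua", "es"),
]
report = profile_labels(sample)
```

A report of this kind can feed directly into the documentation of the initial state of the resources.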

Pre-processing

Based on the initial assessment the input datasets are cleaned up, normalised and transformed into a form suitable for the automatic alignment software.

Useful tools during the pre-processing phase are:

VocBench3: Sheet2RDF tool
KNIME
LinkedPipes ETL
SKOS Play from Sparna
OpenRefine
Custom Python scripts
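As an illustration of the custom Python scripts option, the sketch below normalises a label before comparison. It assumes that Unicode normalisation, whitespace collapsing and case folding are wanted for the datasets at hand; the `normalise_label` name and the `strip_accents` switch are hypothetical, and the right set of operations depends on the preliminary assessment.

```python
import re
import unicodedata


def normalise_label(label, strip_accents=False):
    """Return a comparison-friendly form of a label (illustrative only)."""
    # NFKC folds compatibility characters such as ligatures and full-width forms.
    text = unicodedata.normalize("NFKC", label)
    # Collapse runs of whitespace and trim both ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Case folding is a more aggressive, language-neutral lowercasing.
    text = text.casefold()
    if strip_accents:
        # Decompose characters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return text


cleaned = normalise_label("  Café   au  Lait ")
cleaned_ascii = normalise_label("  Café   au  Lait ", strip_accents=True)
```

Stripping accents is kept optional because it can merge labels that are distinct words in some languages.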

Alignment design

The following parameters of the project are established in this step:

Main inputs: a pair of datasets or, in the case of batch alignments, many-to-one or one-to-many sets (many-to-many alignments are to be avoided)
Main outputs: SKOS and/or EDOAL formats
Matching rules:
- Exact matches: based only on a perfect equality operator (one output expected), OR
- Close matches: based on a designed comparison operator (multiple outputs expected, one per degree of confidence: high, medium, low)
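The two matching rules can be sketched in Python as follows. This is an illustration under stated assumptions rather than the project's actual operator: the similarity measure (`difflib.SequenceMatcher`), the threshold values and the `classify_pair` name are all hypothetical, and the mapping to `skos:exactMatch` and `skos:closeMatch` is one plausible choice for the SKOS output.

```python
from difflib import SequenceMatcher

# Hypothetical confidence thresholds, ordered from strictest to loosest.
THRESHOLDS = (("high", 0.95), ("medium", 0.85), ("low", 0.70))


def classify_pair(label_a, label_b):
    """Return (mapping property, confidence level, score), or None if no match."""
    # Exact rule: perfect equality only.
    if label_a == label_b:
        return ("skos:exactMatch", "exact", 1.0)
    # Close rule: a similarity score bucketed into confidence levels.
    score = SequenceMatcher(None, label_a, label_b).ratio()
    for level, minimum in THRESHOLDS:
        if score >= minimum:
            return ("skos:closeMatch", level, score)
    return None


exact = classify_pair("water", "water")
close = classify_pair("drinking water", "drinking waters")
unrelated = classify_pair("water", "xyz")
```

Grouping the results by confidence level then yields one output file per level, as required for close matches.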

Comparison operator(s) design

The operators are encoded in the Silk Workbench as a linking task.

The main fields considered by the alignment comparison operator are linguistic in nature. This means that aspects such as language, word, spacing, sequencing, capitalisation, script, encoding, transliteration and others shall be taken into consideration. In the case of SKOS datasets (most of the datasets are expected to be SKOS), the following properties are considered of primary relevance (with various weights):

skos:prefLabel, skos:altLabel
skos:definition, skos:scopeNote
rdfs:label, rdfs:comment
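To make the idea of weighted properties concrete, here is a small Python sketch of a weighted comparison over the properties listed above. It does not reproduce how the operator is configured in Silk; the weight values, the dictionary-based concept representation and the function names are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical weights; real values come out of the alignment design.
WEIGHTS = {
    "skos:prefLabel": 0.40,
    "skos:altLabel": 0.20,
    "rdfs:label": 0.15,
    "skos:definition": 0.15,
    "skos:scopeNote": 0.05,
    "rdfs:comment": 0.05,
}


def best_similarity(values_a, values_b):
    """Highest pairwise string similarity between two lists of values."""
    return max(
        (SequenceMatcher(None, a, b).ratio() for a in values_a for b in values_b),
        default=0.0,
    )


def weighted_score(concept_a, concept_b, weights=WEIGHTS):
    """Weighted average similarity over the properties present on both sides."""
    total = 0.0
    used_weight = 0.0
    for prop, weight in weights.items():
        values_a = concept_a.get(prop, [])
        values_b = concept_b.get(prop, [])
        if not values_a or not values_b:
            continue  # a property missing on either side is left out
        total += weight * best_similarity(values_a, values_b)
        used_weight += weight
    return total / used_weight if used_weight else 0.0


# Hypothetical concepts, each mapping a property to its list of values.
concept_a = {"skos:prefLabel": ["water"]}
concept_b = {"skos:prefLabel": ["water"], "skos:altLabel": ["H2O"]}
score = weighted_score(concept_a, concept_b)
```

Renormalising by the weights actually used keeps the score comparable between concepts that carry different sets of properties; whether that is desirable is a design decision for the project.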

In designing the alignment procedure please consider the relevant factors from the systematisation presented below.
