LOTERRE - Linked Open TERminology REsources

Loterre ?

Browse through this FAQ to find answers to commonly raised questions about Loterre.

Presentation of Loterre

What is Loterre?

Loterre (acronym for Linked open terminology resources) is a platform for exposing and sharing multidisciplinary and multilingual scientific terminology.
Based on a triplestore, it complies with linked open data (LOD) web standards and with the FAIR principles, which aim to make data Findable, Accessible, Interoperable and Reusable.

Who is Loterre for?

The terminology available in Loterre can meet the needs of text-mining, semantic annotation, information retrieval or translation.
Access to the resources stored in Loterre is open to all; subsequent use of each resource is subject to the license that governs it.

What is Loterre used for?

Loterre gives access to scientific terminology resources and lets users:

  • browse them
  • search them via a search engine, a SPARQL endpoint and an API
  • download them in several formats

Loterre also offers online services, the content of which is detailed in the “Loterre services” paragraph of this page.

Who can expose a terminology in Loterre?

Loterre is not restricted to Inist terminological resources. It offers its services to other producers of terminological data, provided they have requested the service using the proposition form.

Interested producers are invited to read the Loterre Charter.

Loterre and FAIR principles?

The FAIR principles (Findability, Accessibility, Interoperability, Reusability) applicable to scientific data were developed by Force11 and published by Wilkinson et al. in 2016 (The FAIR Guiding Principles for scientific data management and stewardship). The steps involved in FAIRification have been explained by GO FAIR.

These principles form a guide to good practice for the management and reuse of data and metadata by both machines and humans. However, they do not constitute a specification because they do not recommend any particular standard, technology or data format.

In addition, FAIR data are not necessarily “open” and may have different degrees of FAIRness and/or openness. LOD (based on semantic web standards) and FAIR (based on principles) should not be confused: on this subject, see “Cloudy, increasingly FAIR; revisiting the FAIR Data guiding principles for the European Open Science Cloud” (2017).

The terminology (meta)data presented in Loterre meet all the FAIR principles. Indeed, they are:

  • Findable: concept associated with a unique and persistent identifier (URI)
  • Accessible: free and open access protocol (http)
  • Interoperable: language for knowledge sharing and representation (SKOS)
  • Reusable: clearly defined user license (CC or Etalab)

They can also participate in the FAIRification of (meta)research data by promoting their semantic interoperability (through vocabulary or thesaurus concepts).

Similarly, the Check, Transform and Align services aim to facilitate the creation and enrichment of terminology in SKOS/RDF-XML format according to FAIR practices.

Loterre and the Inist terminologies that the platform hosts are reported in the FAIRSharing portal: https://fairsharing.org/collection/Loterre

Loterre and Linked Open Data?

Loterre aims to comply with the principles of LOD (Linked Open Data) as presented in 2006 by W3C (Tim Berners-Lee): terminology resources are considered here as organized sets of terms (designating concepts) that are freely accessible via semantic web technologies.

Linked Data or “web of linked data” is based on 4 basic rules:

  1. Identify each resource (or concept) by a URI (Uniform Resource Identifier)
  2. Use HTTP URIs (dereferenceable) to access information on these resources
  3. When dereferencing a URI, return structured data using the W3C family of standards: a model (RDF triples) and languages (SKOS, RDFS, OWL…) to describe them; SPARQL to query them
  4. Link concepts (RDF data) belonging to different vocabularies or terminologies through alignments via their URIs, in order to create a network of RDF links and thus discover new relationships

By adding open licenses for the distribution and reuse of resources published on the web, Loterre complies with the rules of “Linked Open Data”.

T. Berners-Lee also proposed a progressive classification of LODs with 5 stars according to the following criteria:

* Data freely available on the web, with mention of an open license

** Data in a structured, machine-readable format

*** Non-proprietary formats (CSV, JSON, …)

**** W3C open standards (RDF, XML, SPARQL) and URI as resource identifier

***** Data linked to other RDF data via alignments in the LOD Cloud

Finally, the resources integrated into the triplestore are intended, as far as possible, to comply with W3C best practices for the publication of linked data (W3C, 2014).

All the resources exposed in Loterre are compatible with these rules and principles and can be classified as 4 or 5 stars:

  • Free (lo) and/or open (CC BY) license
  • Standard formats: SKOS, XML, JSON-LD, CSV
  • Alignable URIs

Loterre terminologies

What types of terminology are integrated into Loterre?

Wherever possible, the resources integrated into the triplestore should conform to the W3C good practices for the publication of linked data.

Ideally, they should respect the criteria allocated to “5 star” data: Linked Data – Design Issues

  • Data freely accessible on the web (with the mention of an open licence);
  • Data in a structured, machine-readable format;
  • Non-proprietary format (CSV, RDF, etc.);
  • W3C standards and URIs to identify each resource;
  • Data linked to other RDF data via alignments.

Schematically, the terminological resources exposed on Loterre can be of the type:

  • Vocabularies in the wider sense of the term: glossaries, lists of terms, thesauri;
  • Taxonomies (classification plans and similar elements).

Lexical resources, content-analysis resources and ontologies can only be integrated into Loterre if they are converted into the SKOS format, which can involve a loss of information compared to the original content.

The resources may:

  • Present a hierarchy: simple, multiple, or none;
  • Include groupings (collections, groups, fields, facets).

What is the linguistic coverage of the terminologies exposed in Loterre?

The resources exposed in Loterre may:

  • Be monolingual, provided the language is French or English;
  • Be multilingual, provided that one of the languages is either French or English.

The possibilities for displaying and searching in a given language depend on the characteristics of the display and query tools connected to the triplestore.

What types of licenses should govern the terminologies exposed?

The resources exposed in Loterre must have a license authorizing the availability and reuse of data, such as:

  • Creative Commons: http://creativecommons.fr/licences/
  • ODC-By: https://opendatacommons.org/licenses/by/1-0/
  • ODbL: https://opendatacommons.org/licenses/odbl/1.0/
  • Open Licence (Etalab): https://www.etalab.gouv.fr/licence-ouverte-open-licence
  • PDDL: https://opendatacommons.org/licenses/pddl/1-0/
  • PDM: https://creativecommons.org/publicdomain/mark/1.0/

What is the data model for the terminologies in Loterre?

The terminological data integrated in the triplestore are expressed according to an “extended SKOS” model, which supplements the SKOS standard with a number of elements borrowed from other formats or languages (SKOS-XL, Dublin Core, Isothes, OWL, RDFS, etc.).

Figure: Ontology of Loterre

How are the terminologies displayed in Loterre selected?

Terminologies proposed by third parties are subject to a review based on quality and format criteria. The owners of the Loterre site reserve the right to moderate the proposals received. They may refuse to integrate a terminology if they consider that it does not meet the criteria governing the platform.

What is the scientific coverage of the terminologies exposed in Loterre?

The scientific coverage of the data exposed in Loterre is multidisciplinary and spans one or several of the following scientific fields:

  • Sciences and technology
    • Mathematics
      • Algebra
      • Mathematical analysis
      • Numerical analysis, scientific computation
      • Combinatorics, ordered structures
      • Geometry
      • Probability and statistics
      • Dynamic systems, global analysis and analysis on manifolds
      • Group theory
      • Topology, manifolds and cell complexes
    • Physics
      • Acoustics
      • Crystallography
      • Electromagnetism, optics
      • Solid mechanics, fluid dynamics, rheology
      • Metrology
      • Atomic and molecular physics
      • Condensed-matter physics
      • Physics of gases and plasmas
      • Nuclear physics
      • Classical physics, quantum physics, statistical physics, relativity and gravitation
      • Thermodynamics, heat transfer
    • Earth and universe sciences
      • Aeronomy, meteorology, climatology
      • Astronomy
      • Geology, internal geophysics
      • Glaciology
      • Oceanography
    • Chemistry
      • Analytical chemistry
      • Theoretical chemistry, general chemistry and physical chemistry
      • Inorganic chemistry
      • Organic chemistry
    • Engineering
      • Aeronautics, transportation
      • Operational research, control theory
      • Electronics, computer sciences
      • Energy, electrical engineering, electrical power engineering
      • Chemical engineering, chemical and parachemical industry
      • Civil engineering, buildings and public works
      • Mechanical engineering
      • Metallurgy
      • Polymer industry, paints, wood
      • Telecommunication, signal theory and processing
    • Nanosciences, nanotechnology
  • Life sciences, environmental sciences
    • Biology, health
      • Molecular and structural biology, biochemistry
      • Genetics, genomics, bioinformatics
      • Animal cellular and developmental biology, zoology, veterinary sciences
      • Physiology, pathophysiology, systemic medical biology
      • Neurobiology
      • Immunology, microbiology, virology, parasitology
      • Epidemiology, public health, clinical research, biomedical technologies, pharmacology, toxicology, medical sciences
    • Environmental sciences
      • Plant cellular and developmental biology, botany
      • Evolution, ecology, population biology
      • Biotechnologies, environmental sciences, synthetic biology, agronomy, forestry, food industry
  • Humanities and social sciences
    • Markets and organisations
      • Economy
      • Finance, management
    • Standards, institutions and social behaviour
      • Law
      • Political sciences
      • Anthropology and ethnology
      • Sociology, demography
      • Information and communication sciences
    • Space, environment and society
      • Geography
      • Land and urban planning
      • Architecture
    • Human spirit, language, education
      • Linguistics
      • Psychology
      • Educational sciences
      • Science and technology in physical activities and sport
    • Languages, texts, arts and cultures
      • Ancient and French languages and literature, comparative literature
      • Foreign literatures and languages, civilisations, regional cultures and languages
      • Arts
      • Philosophy, religion sciences, theology
    • Ancient and contemporary worlds
      • History
      • History of art
      • Archeology

Loterre design

What architecture supports Loterre?

The architecture of Loterre is based on a triplestore connected with a browsing tool and searchable via a SPARQL interface and an API.

To offer its users access to terminologies, Loterre calls upon various open-source tools:

  • an Apache Jena Fuseki triplestore
  • Skosmos, to allow resources to be browsed online
  • the Skosmos REST API, to allow a remote software agent to query and retrieve data
  • YASGUI for the formulation of SPARQL queries
  • ezArk to simplify the distribution of ARK identifiers where necessary

Who created Loterre?

Loterre was designed by Inist-CNRS.

Loterre services

What services are offered in Loterre?

In addition to downloading terminological data, Loterre offers a range of services for terminology producers, whether or not they want to share their work in Loterre.
The Check, Transform and Align services aim at facilitating the creation and enrichment of SKOS/RDF-XML terminological files, in accordance with the FAIR data principles (Findable, Accessible, Interoperable, Reusable).

n.b.: The user data are not stored by Inist-CNRS.

What can be done with the “Align” service?

This service aligns (maps) a valid SKOS/RDF-XML file (source) with a Loterre terminology resource (target) specified by the user.
It checks whether a term (preferred label or synonym) of a concept A of the source vocabulary is identical to a term (preferred label or synonym) of a concept B of the target vocabulary (for the same language code). It processes files containing “skos:Concept” (short form of the RDF/XML syntax) or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#Concept’]]”.
Note that alignment is based on a comparison of strings, which must be identical on both sides, without taking into account the context; the alignments will therefore have to be validated by the user.

Two variants of this service are available:

  • service “Align a valid SKOS/RDF-XML file with a terminology hosted in Loterre by inserting the property ‘skos:exactMatch’ in the source file”:
    if a term (preferred label or synonym) of a concept A of the source vocabulary is identical to a preferred label or synonym of a concept B of the target vocabulary, a “skos:exactMatch” property is inserted in concept A with the URI of the target concept in the “rdf:resource” attribute.
  • service “Align a valid SKOS/RDF-XML file with a Loterre vocabulary by producing an alignment file”:
    if a term (preferred label or synonym) of a concept A of the source vocabulary is identical to a preferred label or synonym of a concept B of the target vocabulary, a new file is created (RDF-XML format) and then, for each alignment:

    • a “rdf:Description” record is created,
    • the URI of the source concept is set in the “rdf:about” attribute of this record,
    • a “skos:exactMatch” property is created in this record,
    • the URI of the target concept is set in the “rdf:resource” attribute of this property.

The records in the alignment file can be added as they are at the beginning or end of the source file.
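
The string comparison and the alignment records described above can be sketched as follows. This is only an illustration under stated assumptions, not the code behind the service: the `align` and `alignment_file` functions, the dictionary layout and the `example.org` URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def align(source, target):
    """Return (source URI, target URI) pairs that share an identical label.

    Both arguments map a concept URI to a set of (label, language) pairs
    covering preferred labels and synonyms. Strings must be strictly
    identical and carry the same language code.
    """
    index = {}
    for uri, labels in target.items():
        for key in labels:
            index.setdefault(key, set()).add(uri)
    pairs = set()
    for uri, labels in source.items():
        for key in labels:
            for target_uri in index.get(key, ()):
                pairs.add((uri, target_uri))
    return sorted(pairs)


def alignment_file(pairs):
    """Build one rdf:Description record per alignment (second variant)."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)
    root = ET.Element(f"{{{RDF}}}RDF")
    for source_uri, target_uri in pairs:
        record = ET.SubElement(root, f"{{{RDF}}}Description",
                               {f"{{{RDF}}}about": source_uri})
        ET.SubElement(record, f"{{{SKOS}}}exactMatch",
                      {f"{{{RDF}}}resource": target_uri})
    return ET.tostring(root, encoding="unicode")


# Hypothetical data: the URIs below are placeholders.
source = {"http://example.org/src/1": {("hormone", "en"), ("hormone", "fr")}}
target = {"http://example.org/tgt/A": {("hormone", "en")},
          "http://example.org/tgt/B": {("Hormone", "en")}}  # differs by case
pairs = align(source, target)
print(alignment_file(pairs))
```

In this sketch the second target concept is not aligned because its label differs by case, which illustrates why the proposed alignments still need to be validated by the user.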

What can be done with the “Annotate” service?

The “Annotate” service allows you to search, in a portion of text written in French, English or Spanish, for the presence of terms (preferred labels and synonyms) from a terminology hosted in Loterre.

It displays the text in which the detected terms have been highlighted and a table of corresponding terms and concepts, with their URI.
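
The kind of lookup this service performs can be illustrated with a minimal sketch. The matching rules used here (whole words only, case-insensitive) are assumptions made for the example, and the `annotate` function, the labels and the URIs are hypothetical.

```python
import re


def annotate(text, labels):
    """Return (matched text, concept URI, start offset) tuples.

    `labels` maps a label (preferred label or synonym) to a concept URI.
    Matching is case-insensitive and limited to whole words; both choices
    are assumptions of this sketch.
    """
    hits = []
    for label, uri in labels.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.group(0), uri, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


labels = {"hormone": "http://example.org/c/1",
          "drug": "http://example.org/c/2"}
hits = annotate("A hormone can act as a drug; hormones vary.", labels)
for term, uri, offset in hits:
    print(offset, term, uri)
```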

What can be done with the “Transform” Service?

The “Transform” service makes it possible to obtain a terminology in SKOS-XML format or to convert a terminology initially in SKOS-XML format into another format.
The modules offered by this service can be grouped into three types: correction, enrichment and conversion.

Correction

These modules are intended in particular to correct anomalies previously detected by the “Check” service.

Remove term duplicates in a SKOS/RDF-XML file

Files containing “skos:Concept” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#Concept’]]” are processed by this service.

At the level of each concept, this service operates as follows:

  • All the preferred labels are kept;
  • Alternative labels are compared with each other and with the preferred label of the same language:
    • if the same term appears several times as an alternative label of the same language, only one occurrence is kept;
    • if the same term appears both as an alternative label and the preferred label of the same language, only the preferred label is kept.
  • Hidden labels are compared with each other, with the preferred label, and with alternative labels of the same language:
    • if the same term appears several times as a hidden label of the same language, only one occurrence is kept;
    • if the same term appears both as a hidden label and the preferred label of the same language, only the preferred label is kept;
    • if the same term appears both as a hidden label and an alternative label of the same language, only the alternative label is kept.

At the end of this process, check the file again using the “Controlling a SKOS/RDF-XML file at the concept level” service to ensure that there are no more duplicates.
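
The rules above can be sketched for a single concept as follows. This is an illustration only: the `remove_duplicates` function and the data layout are hypothetical, and comparing strings exactly (including case) is an assumption.

```python
def remove_duplicates(pref, alt, hidden):
    """Deduplicate the labels of one concept.

    Each argument is a list of (term, language) pairs, so two labels are
    only compared when they share the same language. Preferred labels are
    all kept; an alternative label is dropped if it repeats or equals a
    preferred label; a hidden label is dropped if it repeats or equals a
    preferred or alternative label.
    """
    kept_alt = []
    for label in alt:
        if label not in pref and label not in kept_alt:
            kept_alt.append(label)
    kept_hidden = []
    for label in hidden:
        if (label not in pref and label not in kept_alt
                and label not in kept_hidden):
            kept_hidden.append(label)
    return pref, kept_alt, kept_hidden


pref = [("hormone", "en"), ("hormone", "fr")]
alt = [("hormone", "en"), ("messenger", "en"), ("messenger", "en")]
hidden = [("messenger", "en"), ("hormon", "en"), ("hormon", "en")]
result = remove_duplicates(pref, alt, hidden)
print(result)
```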

Correction of symmetry anomalies of related concepts in a SKOS/RDF-XML file

Files containing “skos:Concept” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#Concept’]]” are processed by this service.

If a concept A is associated with a concept B through the skos:related property, the concept B must be associated with the concept A because the relation is symmetrical. Cf. SKOS Reference (Axiom S23).

If this condition is not met, this service inserts the missing “skos:related” property.

Note that this treatment does not apply to any sub-properties of the skos:related property.
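
A minimal sketch of the symmetry check, assuming the relations have already been read into a dictionary (the `missing_related` function and the concept names are hypothetical):

```python
def missing_related(related):
    """Return the (concept, target) skos:related links that must be added.

    `related` maps a concept URI to the set of concept URIs it points to
    through skos:related.
    """
    missing = []
    for concept, targets in related.items():
        for target in targets:
            if concept not in related.get(target, set()):
                missing.append((target, concept))
    return sorted(missing)


# A -> B and C -> A are declared in one direction only.
related = {"A": {"B"}, "B": set(), "C": {"A"}}
print(missing_related(related))
```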

Insertion of specific concepts

Files containing “skos:Collection” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#Collection’]]” or “rdf:Description[rdf:type[@rdf:resource=’http://purl.org/iso25964/skos-thes#ConceptGroup’]]” are processed by this service.

The hierarchical relationship between a collection A and a collection B is expressed using the “isothes:superGroup” property. The presence of an “isothes:subGroup” property (which is the inverse relationship) at the level of collection B is not mandatory because it is inferred from the “isothes:superGroup” property.

However, the proper functioning of some applications (like Skosmos) requires the presence of both relationships. This service inserts, at the level of the broader collection, one “isothes:subGroup” property for each of its narrower collections.

Enrichment

Insertion of a ‘ConceptScheme’ class and a licence in a SKOS/RDF-XML file

Files containing “skos:Concept” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#Concept’]]” are processed by this service.

This service inserts two classes at the beginning of a SKOS/RDF-XML file:

– a “cc:License” class with the default CC-BY 4.0 Creative Commons license that should be changed if the resource is released under a different license.

– a “skos:ConceptScheme” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#ConceptScheme’]]” class with:

  • a URI derived from concept identifiers;
  • properties for metadata to be completed / modified by the user at the output file level:
    • English, French and Spanish titles (dc:title),
    • English, French and Spanish descriptions (dc:description),
    • English, French and Spanish subjects (dc:subject),
    • creator name (dc:creator),
    • license name (cc:license),
    • English, French and Spanish names of organization / institution to which the resource must be attributed (cc:attributionName),
    • web site of organization / institution to which the resource must be attributed (cc:attributionURL),
    • top-concepts (skos:hasTopConcept) if the resource is highly structured,
    • resource languages as calculated from language tags of preferred labels of concepts (dcterms:language with lexvo/ISO 639-3 code attribute),
    • creation date (dcterms:created),
    • last modification date (dcterms:modified),
    • version (owl:versionInfo).

After the fields have been generated, their textual content must be completed and validated by the user.

Insertion of a property ‘hasTopConcept’

Files containing “skos:ConceptScheme” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#ConceptScheme’]]” are processed by this service.

This service inserts a “skos:hasTopConcept” property into the “ConceptScheme” block for each concept that does not have a “skos:broader” property.

Do not use this service for unstructured or loosely structured resources.

Insertion of a property ‘topConceptOf’

Files containing “skos:Concept” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#Concept’]]” are processed by this service.

This service inserts a “skos:topConceptOf” property into each concept that does not have a “skos:broader” property.

Do not use this service for unstructured or loosely structured resources.
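
Both the ‘hasTopConcept’ and ‘topConceptOf’ insertions rely on the same test: a concept without “skos:broader” is treated as a top concept. The sketch below (hypothetical `top_concepts` function and data) shows that test, and why it gives poor results on unstructured resources.

```python
def top_concepts(broader):
    """Return the concepts that have no skos:broader property.

    `broader` maps each concept URI to the set of its broader concepts.
    """
    return sorted(uri for uri, parents in broader.items() if not parents)


# Structured resource: a single root, so a single top concept.
structured = {"root": set(), "child": {"root"}, "leaf": {"child"}}
print(top_concepts(structured))

# Flat list of terms: every concept would be declared a top concept.
flat = {"a": set(), "b": set(), "c": set()}
print(top_concepts(flat))
```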

Insertion of narrower collections

Files containing “skos:Collection” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#Collection’]]” or “rdf:Description[rdf:type[@rdf:resource=’http://purl.org/iso25964/skos-thes#ConceptGroup’]]” are processed by this service.

The hierarchical relationship between a collection A and a collection B is expressed using the “isothes:superGroup” property. The presence of an “isothes:subGroup” property (which is the inverse relationship) at the level of collection B is not mandatory because it is inferred from the “isothes:superGroup” property.

However, the proper functioning of some applications (like Skosmos) requires the presence of both relationships. This service inserts, at the level of the broader collection, one “isothes:subGroup” property for each of its narrower collections.
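
Deriving the inverse relationship amounts to inverting a mapping. A minimal sketch, assuming the “isothes:superGroup” links have been read into a dictionary (the `sub_groups` function and the collection names are hypothetical):

```python
def sub_groups(super_group):
    """Invert isothes:superGroup links into isothes:subGroup links.

    `super_group` maps a collection URI to the set of its broader
    collections; the result maps each broader collection to the set of
    its narrower collections.
    """
    result = {}
    for narrower, broaders in super_group.items():
        for broader in broaders:
            result.setdefault(broader, set()).add(narrower)
    return result


super_group = {"mammals": {"animals"}, "birds": {"animals"}, "animals": set()}
print(sub_groups(super_group))
```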

Assignment of ARK identifiers

Files containing “skos:Concept” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#Concept’]]” are processed by this service.

This service replaces the identifiers (URI) of a SKOS/RDF-XML file with ARK identifiers built according to the recommendations of the California Digital Library (CDL).

An ARK identifier has the following syntax:

  • The NMA (Name Mapping Authority), whose role is to make the URL clickable in a web browser,
  • The actual ARK identifier, which consists of:
    • the “ark:/” label,
    • a NAAN (Name Assigning Authority Number) identifying the naming organization, which is assigned on request by the CDL.

The transformation is performed in two stages:

1- Replacement of the resource URI (at the level of the concept scheme) by the following generic URI: http://my_site.fr/ark:/NAAN/ABC. The old URI is kept in a “dc:identifier” field.

At the concept level, an 8-character alphanumeric sequence followed by a dash and then a “check sum” completes this prefix and constitutes a unique ARK identifier for each concept of the resource.

Prefix: http://my_site.fr/ark:/NAAN/ABC
Unique identifier: -CGT6ZZBQ-F

2- URI recalculation for:

  • each of the “skos:broader”, “skos:narrower”, “skos:related” relations,
  • the possible “skosxl:prefLabel”, “skosxl:altLabel”, “skosxl:hiddenLabel” properties,
  • the members of possible collections,
  • the possible “skosxl:Label” elements.

To generate ARK identifiers that comply with CDL recommendations (see details here), the generic URI must be replaced as follows:

  • Replace the sequence “http://my_site.fr” (Name Mapping Authority) with the correct URL,
  • Keep the “/ark:/” label,
  • Replace “NAAN” (Name Assigning Authority Number) with the organization's NAAN,
  • Replace “ABC” with a short alphanumeric code corresponding to the resource itself.

Here is a real example: http://data.loterre.fr/ark:/67375/1WB

Note that in the absence of a NAAN, the URI cannot be considered an ARK identifier but can nevertheless be used without the ark:/NAAN/ part, the last part being a unique identifier.
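
The structure of such a URI can be illustrated with a small sketch that splits it into its parts and rebuilds it. The `split_ark` and `build_ark` helpers are hypothetical, the regular expression only covers the simple layout shown above, and the checksum is not computed because its algorithm is not described here.

```python
import re

# NMA (scheme and host), "ark:/" label, NAAN, then the rest of the name.
ARK_PATTERN = re.compile(
    r"^(?P<nma>https?://[^/]+)/ark:/(?P<naan>[^/]+)/(?P<name>.+)$"
)


def split_ark(uri):
    """Split an ARK-style URI into its NMA, NAAN and name parts."""
    match = ARK_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"not an ARK-style URI: {uri}")
    return match.groupdict()


def build_ark(nma, naan, short_code, suffix=""):
    """Assemble a URI from its parts; `suffix` is the per-concept part."""
    return f"{nma}/ark:/{naan}/{short_code}{suffix}"


print(split_ark("http://data.loterre.fr/ark:/67375/1WB"))
print(build_ark("http://my_site.fr", "NAAN", "ABC", "-CGT6ZZBQ-F"))
```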

Conversion

Loterre offers various conversion modules.

Transform a CSV file into a SKOS/RDF-XML file

This transformation generates a SKOS file from a spreadsheet (Excel, OpenOffice, etc.) saved as CSV.

Loterre offers two variants of this service, depending on whether the field separator in the CSV file is a semicolon or a comma:

  • Transform a CSV file whose separator is a semicolon into a SKOS/RDF-XML file
  • Transform a CSV file whose separator is a comma into a SKOS/RDF-XML file

n.b.: in a CSV file whose separator is a semicolon, use double quotation marks (") as the text delimiter for fields that contain semicolons as punctuation, so that the text is not split at the semicolon. If the text itself contains quotation marks, they must be doubled.
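
This quoting convention is the one understood by standard CSV parsers. As an illustration (the sample content is invented), Python's `csv` module reads such a line as follows:

```python
import csv
import io

# One header line and one data line; the definition contains both a
# semicolon and quotation marks, so it is quoted and its quotes doubled.
raw = ('prefLabel_en;definition_en\n'
       '"hormone";"A ""messenger""; acts at a distance"\n')

rows = list(csv.reader(io.StringIO(raw), delimiter=";", quotechar='"'))
for row in rows:
    print(row)
```

The second row is read as two fields only, and the doubled quotation marks are restored as single ones.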

The input file must:

  • use this separator «§§» for multi-valued fields (example: hormone§§drug),
  • use the following labels for the different fields:
Terminological data | Label to use (xx = 2-letter ISO language code (*)) | Comment
Preferred label | prefLabel_xx | A “prefLabel_en” is expected
Alternative label | altLabel_xx |
Hidden label | hiddenLabel_xx |
Definition | definition_xx |
Note | note_xx |
Scope note | scopeNote_xx |
Editorial note | editorialNote_xx |
History note | historyNote_xx |
Change note | changeNote_xx |
Example | example_xx |
Broader term | broader_xx | A “broader_en” is expected
Related term | related_xx | A “related_en” is expected
Group (collection) | group_xx | A “group_en” is expected
Exact match | exactMatch |
Close match | closeMatch |
Broad match | broadMatch |
Narrow match | narrowMatch |
Related match | relatedMatch |

(*) Replace “xx” with the 2-letter ISO language code; for example, “prefLabel_en” for the English preferred label. See the list of ISO 639-1 codes.

The data is transformed as follows:

  • A SKOS/RDF-XML file is created to hold the entire terminological resource.
  • Each line except the first one becomes a “skos:Concept”. If an identifier is present, it is assigned to the concept; otherwise, a temporary URI is assigned to it in the “rdf:about” attribute.
  • The labels in the first line are converted to their SKOS counterpart; for example, prefLabel_en becomes “skos:prefLabel” with the attribute xml:lang="en".
  • The content of each cell is put into the appropriate SKOS property. If the content is multi-valued, it is split into as many properties as there are values separated by “§§”.
  • The related and broader relationships are processed in two stages: first, a “skos:related” or “skos:broader” property is generated for each related or broader term; then the URI of the concept corresponding to that term is placed in the “rdf:resource” attribute.
  • If the file has groups, a “skos:Collection” is created for each group.
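
The header mapping and the splitting of multi-valued cells can be sketched as follows. This is a simplified illustration, not the actual transformation: the `row_to_concept` function is hypothetical, the concept URI is a placeholder, and the broader, related and group columns are skipped because they require the second pass described above.

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML = "http://www.w3.org/XML/1998/namespace"
SEPARATOR = "§§"
# Columns that need a second pass (term -> URI) or separate handling.
DEFERRED = {"broader", "related", "group"}


def row_to_concept(row, uri):
    """Turn one CSV row (header -> cell) into a skos:Concept element."""
    concept = ET.Element(f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": uri})
    for header, cell in row.items():
        name, _, lang = header.partition("_")
        if not cell or name in DEFERRED:
            continue
        for value in cell.split(SEPARATOR):
            prop = ET.SubElement(concept, f"{{{SKOS}}}{name}")
            if lang:
                # Textual property: keep the text and its language tag.
                prop.set(f"{{{XML}}}lang", lang)
                prop.text = value
            else:
                # Mapping property: the cell holds a URI.
                prop.set(f"{{{RDF}}}resource", value)
    return concept


ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)
raw = ("prefLabel_en;altLabel_en;exactMatch\n"
       "hormone;endocrine messenger§§hormonal agent;http://example.org/tgt/A\n")
rows = list(csv.DictReader(io.StringIO(raw), delimiter=";"))
concept = row_to_concept(rows[0], "http://www.mysite/vocabs/ABC/1")
print(ET.tostring(concept, encoding="unicode"))
```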

In addition, the transformation also inserts two blocks at the beginning of the SKOS/RDF-XML file:

– a “cc:License” block with the default Creative Commons CC-BY 4.0 license that should be changed if the resource is released under a different license.

– a “skos:ConceptScheme” block with:

  • a URI derived from concept identifiers;
  • properties for metadata to be completed / modified by the user at the output file level:
    • English, French and Spanish titles (dc:title),
    • English, French and Spanish descriptions (dc:description),
    • English, French and Spanish subjects (dc:subject),
    • creator name (dc:creator),
    • license name (cc:license),
    • English, French and Spanish names of organization / institution to which the resource must be attributed (cc:attributionName),
    • web site of organization / institution to which the resource must be attributed (cc:attributionURL),
    • top-concepts (skos:hasTopConcept) if the resource is highly structured,
    • resource languages as calculated from language tags of preferred labels of concepts (dcterms:language with lexvo/ISO 639-3 code attribute),
    • creation date (dcterms:created),
    • last modification date (dcterms:modified),
    • version (owl:versionInfo).

If the concepts do not have identifiers, the default URI of the resource is “http://www.mysite/vocabs/ABC”. It is also the root of the URI of concepts, relationships and possible collections. It must be replaced as follows:

  • Replace “http://www.mysite/” with the correct URL.
  • Keep “/vocabs/”.
  • Replace “ABC” with a short alphanumeric code that will identify the resource.

At the concept level, the URI is a concatenation of the resource’s URI with a unique identifier; at the collection level, the URI is a concatenation of the resource’s URI with the group name by replacing the spaces with “_”.

To switch to ARK identifiers, use the transformation “Assign ARK identifiers to a valid SKOS/RDF-XML file”.

Transform a SKOS/RDF-XML file to a CSV file

Files containing “skos:Concept” or “rdf:Description[rdf:type[@rdf:resource=’http://www.w3.org/2004/02/skos/core#Concept’]]” are processed by this service.

Loterre offers two variants of this service, depending on the field separator desired in the resulting CSV file:

  • Transform a SKOS/RDF-XML file to a semicolon-separated CSV file
  • Transform a SKOS/RDF-XML file to a comma-separated CSV file

The output file can be imported into a spreadsheet (Excel, LibreOffice, etc.) for editing (see the import procedure in Excel below).

The data are transformed as follows:

A first row “column headers” is created from the elements (skos or other properties) used to describe the different concepts of the SKOS/RDF-XML file:

  • An “ID” tag is created for concept identifiers.
  • Properties with an “xml:lang” attribute are listed by concatenating the element name (without namespace) with the language code (for example, “skos:prefLabel” with xml:lang="en" gives the label “prefLabel_en”).
  • For properties that have an attribute other than “xml:lang”:
    • those corresponding to the semantic relations (“skos:broader”, “skos:narrower” and “skos:related”) are translated into “broader_en”, “narrower_en” and “related_en”,
    • the others (mapping properties, etc.) are output with the name of the element only (without namespace, for example, “exactMatch” for “skos:exactMatch”).
  • Properties that have no attributes are output with the element name only (without namespace).
  • If the file contains collections, a “group_en” label is created. This label can be redundant if the concepts contain properties reflecting their belonging to groups (domain, microthesaurus, etc.).

Then, a line is generated for each concept of the file:

  • the value of the “rdf:about” attribute is put in the “ID” column,
  • the content of the textual elements (terms, definitions, notes, etc.) is put in the column corresponding to that element and to the language code of that element,
  • hierarchical and associative relations (links) are replaced by the corresponding English preferred terms,
  • the content of the other elements is output as is,
  • if the concept belongs to a collection, the English name of the collection is put in the “group_en” column.

It should be noted that:

  • the contents of the different fields are put between quotes (quotation marks) to avoid the problems of separation when these contents contain the semicolon as element of punctuation,
  • if the content of a field contains quotes, they are doubled to protect them,
  • the contents of multiple-occurrence fields (for example, “skos:altLabel”) are placed in the same “cell”, separated by “§§”.
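
These three conventions can be reproduced with Python's `csv` module, as in the sketch below. The `write_rows` function and the sample data are hypothetical, and the semicolon variant is shown.

```python
import csv
import io


def write_rows(header, rows):
    """Write rows as semicolon-separated CSV with every field quoted.

    A cell given as a list or tuple is treated as a multiple-occurrence
    field and joined with the "§§" separator.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", quoting=csv.QUOTE_ALL,
                        lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow(["§§".join(cell) if isinstance(cell, (list, tuple))
                         else cell for cell in row])
    return buffer.getvalue()


output = write_rows(
    ["ID", "definition_en", "altLabel_en"],
    [["http://example.org/c/1", 'A "messenger"; acts at a distance',
      ["endocrine messenger", "hormonal agent"]]],
)
print(output)
```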

To import a CSV file to Excel:

  • Create a new file in Excel (“File” / “New”).
  • Click on “Data” menu, choose “From Text” and then choose the file to import.
  • Import the file (“Import” button).
  • In the Text Import Wizard:
    • choose “Delimited”,
    • in the “File origin” menu, choose “65001 : Unicode (UTF-8)”,
  • Click the “Next” button:
    • In the “Delimiters” section, select “Semicolon”,
    • Keep quotes (“) as “Text qualifier”,
    • Check the imported data with the “Data Preview”,
  • Click the “Finish” button.

The file modified in Excel and saved as CSV file can be transformed into SKOS using the service “Transform a semicolon-separated CSV file into a SKOS/RDF-XML file” or “Transform a comma-separated CSV file into a SKOS/RDF-XML file” depending on the type of separator used while saving the CSV file.

Transform a SKOS-XML file into an HTML file

This transformation generates an HTML file from a valid SKOS file. It processes files containing “skos:Concept” or “rdf:Description” of type “Concept”.
Loterre offers two variants, depending on the language version chosen:

  • Convert a valid SKOS/RDF-XML file into an html file – French version
  • Convert a valid SKOS/RDF-XML file into an html file – English version

The terminology entries are presented in alphabetical order of their preferred terms (French or English). Each entry contains:

  • the terms (preferred and synonyms) in the chosen language
  • definitions and notes in the chosen language
  • relationships (generic, associated and specific terms)
  • preferred terms in other languages
  • any groups to which the concept belongs
  • alignments
  • any bibliographical references
  • the source(s) of the concept

The richness of the information displayed will depend on the content and structuring of the original SKOS file.

Transform a SKOS-XML file into a PDF file

This transformation generates a PDF file from a valid SKOS file.

Two variants are proposed by Loterre, depending on the language version chosen for the resource:

  • Transform a SKOS/RDF-XML file into a PDF file corresponding to the French version of the resource
  • Transform a SKOS/RDF-XML file into a PDF file corresponding to the English version of the resource

Several sections are produced depending on the content and structure of the file:

  • Alphabetical index
  • Detailed terminology entries (in French or English) with:
    • terms, definitions, notes
    • relationships (generic, associated and specific terms)
    • preferred terms in other languages
    • potential membership groups
    • alignments
    • any bibliographical references
    • the source(s) of the concept
  • The list of entries with:
    • the French preferred term
    • the English preferred term
    • the page number
  • The tree structure if the resource is structured.
  • Collections if the resource contains groups.

Additional pages are inserted:

  • simple cover page with the title of the resource (French or English)
  • cover page with:
    • title (French or English) of the resource
    • version
    • last update date
    • description (French or English)
    • legend for detailed entries
    • CC-BY 4.0 license plus logo
  • back cover with:
    • title (French or English) of the resource
    • description (French or English)

Note that the cover pages can be replaced by editing the final file with a PDF editor such as PDFsam Basic.

What can be done with the “Check” service?

The “Check” service performs an online validity check of a SKOS terminology file.
It offers three types of checks: collection validity, concept validity and concept scheme validity.

The color code of the detected anomalies indicates their degree of severity:

  • red background: critical anomaly
  • orange background: major anomaly
  • yellow background: minor anomaly

Check of the collections validity

The service “Controlling a SKOS/RDF-XML file at the collections level” processes files containing “skos:Collection” or “rdf:Description[rdf:type[@rdf:resource='http://www.w3.org/2004/02/skos/core#Collection']]” or “rdf:Description[rdf:type[@rdf:resource='http://purl.org/iso25964/skos-thes#ConceptGroup']]”.

It performs a preliminary analysis of the resource to determine:

  • the number of “Collection” blocks;
  • the nature and number of elements (properties) that compose them.

It then performs the checks and returns the results in the form of a table detailing the types of anomalies detected.

List of checks carried out:

Code  Anomaly description
Col-0 The resource contains no “skos:Collection”, “isothes:ConceptGroup”, “rdf:Description[rdf:type[@rdf:resource='http://www.w3.org/2004/02/skos/core#Collection']]” or “rdf:Description[rdf:type[@rdf:resource='http://purl.org/iso25964/skos-thes#ConceptGroup']]” element.
Col-@0 Missing URI identifier. A collection has no identifier.
Col-@N Unauthorized attribute. Only the “rdf:about” attribute is authorized for a “skos:Collection”, “rdf:Description[rdf:type[@rdf:resource='http://www.w3.org/2004/02/skos/core#Collection']]” or “rdf:Description[rdf:type[@rdf:resource='http://purl.org/iso25964/skos-thes#ConceptGroup']]” element.
Col-2 Collection structure anomaly. Although some collections have an “isothes:superGroup” property, no “isothes:subGroup” property was found in the corresponding super-collection. The menu “Insert narrower collections in a valid SKOS/RDF-XML file” can be used to fix the problem.
Col-3 Inconsistency at the resource identifier level. The value of “rdf:resource” attribute on “skos:inScheme” element of a collection is different from the resource identifier (ConceptScheme).
Col-4 Non-existent member. A member of a collection is not a concept of the resource. Create the corresponding concept or delete this member.
Col-5 Collection identifier contains an unauthorized character (white space, apostrophe, double quotation mark, left bracket, right bracket).
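
As an illustration, three of these checks (Col-@0, Col-4 and Col-5) can be sketched in Python with the standard library. This is not the code of the Loterre service: the function name `check_collections` is hypothetical, only the “skos:Collection” element form is handled (not the “rdf:Description” form), and “bracket” is assumed here to mean square brackets.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
ABOUT = f"{{{RDF}}}about"
RESOURCE = f"{{{RDF}}}resource"

# Characters reported by Col-5 (assumption: "bracket" means "[" and "]").
FORBIDDEN = set(" '\"[]")


def check_collections(xml_text):
    """Return a list of (code, value) anomalies for skos:Collection blocks."""
    root = ET.fromstring(xml_text)
    concept_uris = {c.get(ABOUT) for c in root.iter(f"{{{SKOS}}}Concept")}
    anomalies = []
    for collection in root.iter(f"{{{SKOS}}}Collection"):
        uri = collection.get(ABOUT)
        if not uri:
            anomalies.append(("Col-@0", None))  # missing identifier
            continue
        if FORBIDDEN & set(uri):
            anomalies.append(("Col-5", uri))  # unauthorized character
        for member in collection.findall(f"{{{SKOS}}}member"):
            target = member.get(RESOURCE)
            if target not in concept_uris:
                anomalies.append(("Col-4", target))  # non-existent member
    return anomalies


sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/c1"/>
  <skos:Collection rdf:about="http://example.org/group 1">
    <skos:member rdf:resource="http://example.org/c1"/>
    <skos:member rdf:resource="http://example.org/c2"/>
  </skos:Collection>
</rdf:RDF>"""

collection_anomalies = check_collections(sample)
print(collection_anomalies)
```

In this sample the collection identifier contains a white space (Col-5) and the member “http://example.org/c2” is not a concept of the resource (Col-4).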

Check of the concepts validity

The service “Controlling a SKOS/RDF-XML file at the concept level” processes files containing “skos:Concept” or “rdf:Description[rdf:type[@rdf:resource='http://www.w3.org/2004/02/skos/core#Concept']]”.

It performs a preliminary analysis of the resource to determine:

  • the number of “skos:ConceptScheme”, “skos:Concept”, “skos:Collection” and “skosxl:Label” blocks;
  • the nature and number of elements (properties) that compose them;
  • the languages of the resource.

It then performs the checks and returns the results in the form of a table detailing the types of anomalies detected.

List of checks carried out:

Code Anomaly description
D-Id Uniqueness of the identifier (URI) of a given concept. Two different concepts can’t have the same identifier.
E-0 Presence of an empty field (property). May interfere with the continuation of the control program. Can also prevent import into some terminology editors.
@-0 Presence of an empty attribute.
R-A1 Clash between semantic relations: the same concept is a related concept and a broader concept (direct or indirect) of the current concept. Cf. SKOS Reference (S27 Integrity Condition).
R-FX1 Reflexive hierarchical relation: a concept is its own broader concept.
R-FX2 Reflexive associative relation: a concept is its own related concept.
R-31 Clash between associative and hierarchical relations: if concept A has concept B as a narrower concept and concept C as a related concept, then C cannot be a narrower concept of B, because concept C cannot be linked simultaneously to concept A by the two disjoint relationships “skos:narrowerTransitive” and “skos:related”. See details in SKOS-Primer and SKOS Reference.
R-32 Clash between associative and hierarchical relations: if concept A has concept B as a broader concept and concept C as a related concept, then C cannot be a broader concept of B, because concept C cannot be linked simultaneously to concept A by the two disjoint relationships “skos:broaderTransitive” and “skos:related”. See details in SKOS-Primer.
R-B3 Cycle between semantic relations: the same concept is a narrower concept and a broader concept of the current concept.
R-A2 Clash between semantic relations: the same concept is a related concept and a narrower concept of the current concept. See SKOS Reference (S27 Integrity Condition).
R-NS Non-symmetric associative link (skos:related). See SKOS Reference (Axiom S23); see Symmetry correction
R-0 Relation (skos:related, skos:broader or skos:narrower) that targets a non-existent concept.
R-OR Orphan concept: a concept that is not a top-concept, and which has neither broader nor narrower concepts.
CS-0 Concept not linked to the concept scheme through “skos:inScheme” element.
CS-3 Inconsistency at the level of the resource identifier (URI). The value of “rdf:resource” attribute in “skos:inScheme” element is different from the resource identifier.
LP-0 Missing preferred label for one of the resource languages.
LP-N1 More than one preferred label per language tag. See SKOS Reference (S14 Integrity Condition)
LP-LA1 Preferred label / alternative label duplicate within the same concept. See SKOS Reference (S13 Integrity Condition)
LP-LC1 Preferred label / hidden label duplicate within the same concept. See SKOS Reference (S13 Integrity Condition)
LP-LP2 Same preferred label for two different concepts.
LP-LA2 Preferred label / alternative label duplicate between two different concepts.
LP-LC2 Preferred label / hidden label duplicate between two different concepts.
LA-LA1 Alternative label duplicate within the same concept.
LA-LA2 Same alternative label for two different concepts.
LA-LC1 Alternative label / hidden label duplicate within the same concept. See SKOS Reference (S13 Integrity Condition)
LA-LC2 Alternative label / hidden label duplicate between two different concepts.
LC-LC1 Hidden label duplicate within the same concept.
LC-LC2 Same hidden label for two different concepts.
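
To make a few of these codes concrete, the following Python sketch detects R-FX1, R-FX2, R-0, R-NS and LP-N1 in a small SKOS/RDF-XML sample. It is only an illustration under stated assumptions, not the implementation of the Loterre service: the names `check_concepts` and `targets` are hypothetical, and only the “skos:Concept” element form is handled.

```python
import xml.etree.ElementTree as ET
from collections import Counter

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
ABOUT = f"{{{RDF}}}about"
RESOURCE = f"{{{RDF}}}resource"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def targets(concept, prop):
    """Return the rdf:resource values of one SKOS property of a concept."""
    return [e.get(RESOURCE) for e in concept.findall(f"{{{SKOS}}}{prop}")]


def check_concepts(xml_text):
    """Return a list of (code, concept URI, detail) anomalies."""
    root = ET.fromstring(xml_text)
    concepts = {c.get(ABOUT): c for c in root.iter(f"{{{SKOS}}}Concept")}
    anomalies = []
    for uri, concept in concepts.items():
        related = targets(concept, "related")
        broader = targets(concept, "broader")
        narrower = targets(concept, "narrower")
        if uri in broader:
            anomalies.append(("R-FX1", uri, uri))  # own broader concept
        if uri in related:
            anomalies.append(("R-FX2", uri, uri))  # own related concept
        for target in related + broader + narrower:
            if target not in concepts:
                anomalies.append(("R-0", uri, target))  # non-existent target
        for target in related:
            # skos:related must be stated in both directions
            if target in concepts and uri not in targets(concepts[target], "related"):
                anomalies.append(("R-NS", uri, target))
        labels = concept.findall(f"{{{SKOS}}}prefLabel")
        per_language = Counter(label.get(XML_LANG) for label in labels)
        for lang, count in per_language.items():
            if count > 1:
                anomalies.append(("LP-N1", uri, lang))  # several prefLabels
    return anomalies


sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/c1">
    <skos:prefLabel xml:lang="en">cat</skos:prefLabel>
    <skos:prefLabel xml:lang="en">feline</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/c1"/>
    <skos:related rdf:resource="http://example.org/c2"/>
    <skos:related rdf:resource="http://example.org/c3"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/c2">
    <skos:prefLabel xml:lang="en">dog</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""

concept_anomalies = check_concepts(sample)
for anomaly in concept_anomalies:
    print(anomaly)
```

Checks that depend on indirect relations (R-A1, R-31, R-32, R-B3) would additionally require computing the transitive closure of “skos:broader”, which this sketch leaves out.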

Check of the concept scheme validity

The service “Controlling a SKOS/RDF-XML file at the concept scheme level” processes files containing “skos:ConceptScheme” or “rdf:Description[rdf:type[@rdf:resource='http://www.w3.org/2004/02/skos/core#ConceptScheme']]”.

It performs a preliminary analysis of the resource to determine:

  • the number of “ConceptScheme” blocks;
  • the nature and number of elements (properties) that compose them.

It then performs the checks and returns the results in the form of a table detailing the types of anomalies detected.

List of checks carried out:

Code Anomaly description
CS-N Missing ConceptScheme element. The file contains neither a “skos:ConceptScheme” element nor an “rdf:Description[rdf:type[@rdf:resource='http://www.w3.org/2004/02/skos/core#ConceptScheme']]” element. The menu “Insert a skos:ConceptScheme” can be used to fix the problem.
CS-0 Missing concept scheme identifier (URI).
CS-1 Unauthorized attribute in ConceptScheme element. Only rdf:about attribute is authorized.
CS-2 Missing skos:hasTopConcept properties. Although the resource is hierarchically structured, the ConceptScheme element has no skos:hasTopConcept properties listing its top concepts. The menu “Insert hasTopConcept properties” can be used to fix the problem.
CS-3 Inconsistency at the level of the resource identifier (URI). The value of rdf:resource attribute in skos:inScheme element of a given concept is different from the resource identifier.
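
The CS-N, CS-0 and CS-3 checks can be sketched in the same way. Again, this is a minimal illustration rather than the code of the service: the function name `check_concept_scheme` is hypothetical, only the “skos:ConceptScheme” element form is handled, and the first ConceptScheme found is assumed to be the resource identifier.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
ABOUT = f"{{{RDF}}}about"
RESOURCE = f"{{{RDF}}}resource"


def check_concept_scheme(xml_text):
    """Return a list of (code, value) anomalies at the concept scheme level."""
    root = ET.fromstring(xml_text)
    schemes = list(root.iter(f"{{{SKOS}}}ConceptScheme"))
    if not schemes:
        return [("CS-N", None)]  # no ConceptScheme element at all
    scheme_uri = schemes[0].get(ABOUT)
    if not scheme_uri:
        return [("CS-0", None)]  # ConceptScheme without identifier
    anomalies = []
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        for link in concept.findall(f"{{{SKOS}}}inScheme"):
            if link.get(RESOURCE) != scheme_uri:
                # skos:inScheme points to something other than the scheme URI
                anomalies.append(("CS-3", concept.get(ABOUT)))
    return anomalies


sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:ConceptScheme rdf:about="http://example.org/scheme"/>
  <skos:Concept rdf:about="http://example.org/c1">
    <skos:inScheme rdf:resource="http://example.org/scheme"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/c2">
    <skos:inScheme rdf:resource="http://example.org/other"/>
  </skos:Concept>
</rdf:RDF>"""

scheme_anomalies = check_concept_scheme(sample)
print(scheme_anomalies)
```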

What can be done with the “Download” service?

The “Download” service lets you download the full content of any resource stored in the Loterre triplestore, in PDF, CSV, SKOS/XML or JSON-LD format.

From the Loterre platform, you can also:

One of your questions still remains unanswered? Ask it directly with the contact form.