This repository contains a DCAT-AP v1.2 schema plugin for GeoNetwork 3.8.3.
This plugin is replaced by https://github.com/metadata101/dcat-ap.vl for version 4.
- W3C Data Catalog Vocabulary (DCAT), Fadi Maali, John Erickson, 2014.
- DCAT Application profile for data portals in Europe (DCAT-AP) v1.2
- Interoperability between metadata standards: a reference implementation for metadata catalogues, Geraldine Nolf, W3C SDSVoc 2016. [paper] [slides]
This plugin has the following features:
- XML Schema for DCAT-AP: GeoNetwork is capable of storing metadata in XML format. The plugin therefore defines its own XML Schema (see the schema folder) for DCAT-AP that is used for the internal representation of DCAT-AP fields. To limit the data conversion needed, the XML Schema was designed to fully resemble an XML/RDF syntax of DCAT-AP.
- Harvester: To import RDF metadata into GeoNetwork, something needs to be done to accomodate the many different formats (JSON-LD, Turtle, RDF/XML, etc.) and structure (nestings, ordering, etc.) that RDF data can take. Therefore, an DCAT-AP harvester was written to "normalize" the RDF metadata, such that it fits in the XML Schema for DCAT-AP that was defined for the plugin. The harvester works as follows: it downloads RDF metadata from a remote catalogue (curl), converts that into XML using a SPARQL SELECT query, converts that into DCAT-AP XML (XSL conversion), and imports this into GeoNetwork using the GeoNetwork API (curl).
- indexing: The plugin maximally populates GeoNetwork's existing index fields for a consistent search experience.
- editing: A custom form was created following the guidance in the GeoNetwork form customization guide. The form uses the controlled vocabularies required by DCAT-AP. These are located in the folderthesauri and can be imported in to GeoNetwork as SKOS classification systems using standard GeoNetwork functionality.
- directory support for licences: Licences are directory entries. The subtemplates folder contains sample licence templates. These licences can be imported via the 'Contribute' > 'Import new records' dialog. Select 'Directory entry' as type of record. Alternatively, templates can be edited via the 'Contribute' > 'Manage directory' form.
- viewing: A custom 'full view' to visualise DCAT-AP records.
- multilingual metadata support: The editor, view, and search benefit from the already existing multilingual capabilities of GeoNetwork.
- validation (XSD and Schematron): Validation steps are first XSD validation made on the schema, then the schematron validation defined in folder dcat-ap/schematron. Two rule sets are available: schematron-rules-dcat-ap, and schematron-rules-dcat-ap-recommendations.
- Export in DCAT-AP RDF format: The plugin exports DCAT-AP RDF metadata using the GeoNetwork API (/geonetwork/srv/api/0.1/records), which can in turn be harvested by e.g. CKAN. To get the records in DCAT-AP RDF format, use the standard /geonetwork/srv/api/0.1/record endpoint (the DCAT/RDF output to GeoNetwork). Consider cherry picking from this pull request to enable paging, or apply the patch as explained in the next section.
Use GeoNetwork version 3.8.3. See the instructions on software development in the core-geonetwork project.
git clone --recursive https://github.com/geonetwork/core-geonetwork.git
git checkout 3.8.x
To include this schema plugin in a build, copy the dcat-ap schema folder in the schemas folder, add it to the schemas/pom.xml and add it to the copy-schemas execution in web/pom.xml.
The best approach is to add the plugin as a submodule into GeoNetwork schema module.
cd schemas
git submodule add https://github.com/metadata101/dcat-ap1.1.git dcat-ap
git submodule init
git submodule update
Add the new module to the schemas/pom.xml:
<module>iso19139</module>
<module>dcat-ap</module>
</modules>
Add the dependency in the web module in web/pom.xml:
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>schema-dcat-ap</artifactId>
<version>${gn.schemas.version}</version>
</dependency>
Add the module to the webapp in web/pom.xml:
<execution>
<id>copy-schemas</id>
<phase>process-resources</phase>
...
<resource>
<directory>${project.basedir}/../schemas/dcat-ap/src/main/plugin</directory>
<targetPath>${basedir}/src/main/webapp/WEB-INF/data/config/schema_plugins</targetPath>
</resource>
Commit these changes.
Apply the patches to the geonetwork core. You may need to manually apply specific hunks of a patch.
cd .. (core-geonetwork)
git am --ignore-space-change --ignore-whitespace --reject --whitespace=fix schemas/dcat-ap/core-geonetwork-patches/*.patch
Build and run the application following the Software Development Documentation. You'll need to have Java JDK 1.8 and Maven installed.
mvn clean install -DskipTests
After build, you will find a WAR file in /web/target/geonetwork.war
that you can deploy in your Tomcat 8.x servket container. Alternatively, you can run GeoNetwork with the Maven Jetty plugin by executing the following command:
cd web
mvn jetty:run -Penv-dev
Samples and templates can be imported via the 'Admin Console' > 'Metadata and Templates' > 'DCAT-AP' menu.
Make sure to import the thesauri located in schemas/dcat-ap/src/main/plugin/dcat-ap/thesauri
as they are required for editing dcat-ap records.
Subtemplates for the dct:license
field are available at schemas/dcat-ap/src/main/plugin/dcat-ap/subtemplates
.
The plugin uses dct:identifier to store a uuid that is used as (internal) metadata identifier. The metadata identifier is stored in the element dcat:Dataset/dct:identifier. When saving a record, this uuid is appended to the dataet URI, provided that the metadata (template) contains a dataset URI that ends with a uuid and the record is not harvested. Admittedly, it would be more correct to use a dcat:CatalogRecord/dct:identifier as metadata identifier, leaving dcat:Dataset/dct:identifier as a pure dataset identifier.
<dcat:Dataset rdf:about="https://opendata.vlaanderen.be/dataset/9fde14b3-4654-44f5-970e-be0a986cf4eb">
<dct:identifier>9fde14b3-4654-44f5-970e-be0a986cf4eb</dct:identifier>
Comments and questions to the issue tracker.
This plugin would merit further improvements in at least the following areas:
- Default view: the default view currently does not display distribution information, this is only included in the full view.
- Dirk Debaere (Informatie Vlaanderen)
- Mathias De Schrijver (Informatie Vlaanderen)
- Geraldine Nolf (Informatie Vlaanderen)
- Bart Cosyn (Informatie Vlaanderen)
- Stijn Van Speybroeck (Informatie Vlaanderen)
- Gustaaf Vandeboel (GIM)
- Mathieu Chaussier (GIM)
- Stijn Goedertier (GIM)
- Mathieu Van Wambeke (GIM)
- An Heirman (GIM)
The work on this schema plugin was funded by and carried out in close collaboration with Informatie Vlaanderen (AIV).