This project is not covered by Drupal’s security advisory policy.

Purpose of this module

DCAT is becoming the de facto metadata format to describe "datasets" (e.g. easily processable CSV, XML, XLS files) on open data portals.
However, copying the (meta)data from "normal" websites to open data portals is often a manual process, which is both time-consuming and error-prone.

The purpose of this module is to extract metadata (title, description, links) from published nodes and generate a "DCAT" file that can be processed by open data portals and crawlers, without the need for specific content types or "heavy" modules.

How it works

When cron runs, the module checks for content types with a File field and queries the database for published nodes with publicly visible files.
If the file extension indicates that the file is "machine-friendly", a link to this file is added to the dcat.n3 export file, along with some basic information (title, body...) of the node.
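
The filtering and export step described above can be sketched roughly as follows. This is only an illustration in Python, not the module's actual code (which runs inside Drupal): the extension list, the node dictionary layout and the function names are all assumptions made for this example, and the choice of dct:title, dct:description and dcat:downloadURL is one plausible mapping rather than the module's confirmed output.

```python
import posixpath
from urllib.parse import urlparse

# Assumed list of "machine-friendly" extensions; the module's real list may differ.
MACHINE_FRIENDLY = {"csv", "xml", "xls"}

PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
)


def is_machine_friendly(file_url):
    """Return True if the file extension in the URL path is in the assumed list."""
    path = urlparse(file_url).path
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    return ext in MACHINE_FRIENDLY


def escape(text):
    """Escape backslashes, double quotes and newlines for an N3 string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def node_to_n3(node):
    """Describe one node as a dcat:Dataset, or return "" if it has no usable file.

    `node` is a hypothetical dict with the keys url, title, body and files.
    """
    files = [f for f in node["files"] if is_machine_friendly(f)]
    if not files:
        return ""
    distributions = " ,\n        ".join(
        f"[ a dcat:Distribution ; dcat:downloadURL <{f}> ]" for f in files
    )
    return (
        f'<{node["url"]}> a dcat:Dataset ;\n'
        f'    dct:title "{escape(node["title"])}" ;\n'
        f'    dct:description "{escape(node["body"])}" ;\n'
        f"    dcat:distribution {distributions} .\n"
    )


def export_catalog(nodes):
    """Build the text of an export file from a list of published nodes."""
    entries = [node_to_n3(n) for n in nodes]
    return PREFIXES + "\n" + "\n".join(e for e in entries if e)
```

In this sketch a node whose only attachment is, say, a PDF produces no entry at all, which mirrors the rule that only "machine-friendly" files are linked from the export.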

Restrictions

  • not ready for production (yet)
  • i18n is currently not supported
  • the module only works for nodes with a File field (i.e. the module does not scan the body of nodes to extract links... yet)

Roadmap

Version 1.0:

  • Basic export of DCAT Catalog

Version 1.1:

  • Support i18n / multilingual nodes with File field/module

Version 1.2:

  • Performance enhancements (i.e. updating the export when a node is created / updated / deleted)