This converter translates the Geonames vocabulary to a stripped-down RDF representation.

Source files

Parent directory for source files in the AnnoCultor source repository.

Data

Geonames describes millions of objects, with an RDF file of more than 6 GB. The data is available as a single text dump file. In this file each record describes a place (called a Feature) with a dozen or two properties. Each record is already represented in RDF, as a small RDF file. These small files are merged into the single large text dump, in which each record is marked off by a line containing its identifier.

Download this dump and save it to annocultor/demos/converters/getty/input_source/.

The class GeonamesDumpToRdf splits the dump back into records and groups them per continent and per country. Run it with script TODO. It should produce a number of files in annocultor/demos/converters/getty/input_source/
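The splitting step can be illustrated with a minimal sketch. This is not the code of GeonamesDumpToRdf. It assumes that an identifier line is a bare URI starting with http://sws.geonames.org/ and that the country of a record appears in a gn:countryCode element; both are assumptions about the dump layout, and the class and method names are hypothetical. Grouping per continent is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of AnnoCultor.
final class DumpSplitSketch {

    // Assumption: a record starts with a line holding only its identifier URI.
    private static final String ID_PREFIX = "http://sws.geonames.org/";

    // Assumption: the country is given as a two-letter code in gn:countryCode.
    private static final Pattern COUNTRY =
            Pattern.compile("<gn:countryCode>([A-Z]{2})</gn:countryCode>");

    /** Groups the RDF text of each record by country code ("??" when none is found). */
    static Map<String, List<String>> groupByCountry(Iterable<String> lines) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        StringBuilder record = null;
        for (String line : lines) {
            if (line.startsWith(ID_PREFIX)) {
                // An identifier line closes the previous record and opens a new one.
                flush(record, groups);
                record = new StringBuilder();
            } else if (record != null) {
                record.append(line).append('\n');
            }
        }
        flush(record, groups);
        return groups;
    }

    // Stores a finished record under its country code; empty records are skipped.
    private static void flush(StringBuilder record, Map<String, List<String>> groups) {
        if (record == null || record.length() == 0) {
            return;
        }
        String rdf = record.toString();
        Matcher m = COUNTRY.matcher(rdf);
        String country = m.find() ? m.group(1) : "??";
        groups.computeIfAbsent(country, k -> new ArrayList<>()).add(rdf);
    }
}
```

The sketch collects records in memory to stay short. With a dump of several gigabytes, a real run would stream the lines from the file and write each record to its per-country output as soon as it is complete.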

These files are valid RDF files and can be loaded into a triple store as-is. However, many applications only use a fraction of the fields provided by Geonames.

Converter

The Geonames converter filters out several fields, drastically reducing the size of the RDF.

This converter shows a peculiar way of using AnnoCultor. Typically, AnnoCultor is used to convert XML files or SQL databases to RDF. In this converter we convert RDF to RDF. In fact, we process the source RDF document at the XML level and perform the usual conversion.
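To illustrate what treating RDF at the XML level means, here is a small sketch that uses only the standard Java DOM API, not the AnnoCultor API. It parses one RDF/XML record as ordinary XML and keeps only the child elements of the Feature element whose local names are on a whitelist. The class name, the method name, and the whitelist are hypothetical; the fields the actual converter keeps are defined in its own conversion parameters.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of AnnoCultor.
final class XmlLevelFilterSketch {

    /** Returns "localName=text" for every kept property of the first Feature element. */
    static List<String> keptProperties(String rdfXml, Set<String> keep) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getLocalName() to work
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rdfXml)));

            List<String> result = new ArrayList<>();
            // "*" matches a Feature element in any namespace.
            NodeList features = doc.getElementsByTagNameNS("*", "Feature");
            if (features.getLength() == 0) {
                return result;
            }
            NodeList children = features.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Properties not on the whitelist are silently dropped.
                if (child.getNodeType() == Node.ELEMENT_NODE
                        && keep.contains(child.getLocalName())) {
                    result.add(child.getLocalName() + "=" + child.getTextContent().trim());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse RDF/XML record", e);
        }
    }
}
```

Because the record is read as plain XML, each property is just an element that can be selected by name, which is the same view a converter has of any other XML source.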

The conversion parameters are kept in a separate file, as usual.

This converter illustrates the following AnnoCultor features:

  • translating RDF to RDF
  • multiple source files
  • property rename
  • sequence of rules
  • qualified identifiers.

Acknowledgments

This converter was developed with support from the Europeana project.