[Proposal] The transformation of different distributed datasets to Linked Open Data with TransforMap


Hi all,

I’ve sketched a proposed workflow for converting data from existing datasets (which are distributed and do not know of each other) into linked datasets with distributed attributes.

I’ve sketched the manual way, where the user merges each POI by hand in a proposed TransforMap Editor software. Automatic merges are still a long way off and very complicated, so they are postponed for the moment.

The goal is to transform 1-3★ datasets into full 5★ Linked Open Data!

So how should the system work?

Before the editor can work, we have to collect existing open datasets (OSM, GreenMap, and other ones from our collection) and create inter-linked taxonomies between them (stored in our Taxonomy DB). This is needed because we want to compare the attributes of the different datasets and merge them.
For each dataset used as input, we also have to specify an API for fetching spatial data from its distributed database.

The workflow, described in words:

  • The user wants to add a POI and searches for it either by name or by clicking on a map.
  • The editor queries all existing databases for POIs near the found/specified coordinates.
  • A list of matching records from the different databases is presented to the user, who ticks all that correspond to the POI they want to add to TransforMap.
  • The editor merges the attributes from the different databases (looking up corresponding fields in our Taxonomy DB) and lets the user choose between attributes where they differ, and/or add new ones.
  • When the user is finished, the editor uploads the attributes to the databases they belong to. It also uploads, to each database, links to the same object in the other databases, creating a linked dataset.
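To make the steps above a bit more concrete, here is a minimal Python sketch of the manual merge. Everything in it is an assumption for illustration only: the `TAXONOMY` mapping, the record layout, the sample records and the function names are hypothetical, and in-memory lists stand in for the real databases and the Taxonomy DB.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical stand-in for the Taxonomy DB:
# (source, source field) -> common field name.
TAXONOMY = {
    ("osm", "name"): "name",
    ("osm", "opening_hours"): "opening_hours",
    ("greenmap", "title"): "name",
    ("greenmap", "hours"): "opening_hours",
}


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))


def find_candidates(records, lat, lon, radius_m=50):
    """Step 2: all records within radius_m of the clicked coordinate."""
    return [r for r in records if distance_m(lat, lon, r["lat"], r["lon"]) <= radius_m]


def merge_attributes(selected):
    """Step 4: map fields to common names; split into agreed values and conflicts."""
    values = {}
    for rec in selected:
        for field, value in rec["attrs"].items():
            common = TAXONOMY.get((rec["source"], field))
            if common is not None:
                values.setdefault(common, {})[rec["source"]] = value
    agreed = {f: next(iter(v.values())) for f, v in values.items() if len(set(v.values())) == 1}
    conflicts = {f: v for f, v in values.items() if len(set(v.values())) > 1}
    return agreed, conflicts


def cross_links(selected):
    """Step 5: for each source, the IDs of the same POI in the other sources."""
    return {
        rec["source"]: {o["source"]: o["id"] for o in selected if o is not rec}
        for rec in selected
    }


# Made-up sample records, not real data.
records = [
    {"source": "osm", "id": "node/1", "lat": 47.0727, "lon": 15.4010,
     "attrs": {"name": "Book Exchange", "opening_hours": "24/7"}},
    {"source": "greenmap", "id": "gm-42", "lat": 47.0728, "lon": 15.4011,
     "attrs": {"title": "Book Exchange", "hours": "Mo-Fr 09:00-17:00"}},
    {"source": "osm", "id": "node/2", "lat": 47.0800, "lon": 15.4100,
     "attrs": {"name": "Far Away Cafe"}},
]

candidates = find_candidates(records, 47.0727, 15.4010)
agreed, conflicts = merge_attributes(candidates)
links = cross_links(candidates)
```

In this sketch the editor would show `conflicts` to the user for a decision, write `agreed` plus the chosen values back to the databases they belong to, and store `links` in each database so the records point at each other.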

The source files for this graph can be found on GitHub, called “transformation_to_linked_data”.

“The editor” is the key part in merging attributes. Of course it should be possible to add completely new POIs too. Its proposed operating modes are described in another Discourse post.

As always, please comment!
Especially looking at @almereyda, @josefkreitmayer :smile:


You are getting better and better with the graphics. Excellent work. I do not fully understand it yet, but would be happy to get an explanation.

I am curious, how does one specify that?

This is the step whose risks concern me most: would the partner accept our imports and cooperate? How can we make them a proposal to cooperate with us?

that is really nice

I wonder whether it would make sense to contact the people who started GeoHack in the context of Wikipedia. It is much broader in topic, but it also links one point to many geo-databases.

Here is a link to an example.
Here is the link to my original post in Discourse.

Hi Michael,
I really like the idea of the TransforMap editor.

One of the things we are really interested in at Transition Network is how we can use structured markup within online tools.

I would be really interested in understanding how you define ‘Linked Data’.

Our understanding is that Linked Data isn’t defined at source, but by using an internationally agreed set of standards.
For example, Schema.org has a schema known as ‘Thing’ (link), which has the subtype ‘Place’ that could be used to define POIs (this is also a schema agreed by Google, Yahoo and Microsoft). Is this something that is being considered in the creation of your editor tool, in terms of how the generated code is presented?

Many thanks


Ade Stuart
Web Manager Transitionnetwork.org
e: adestuart@transitionnetwork.org

Developer answer: We write a document where other developers can look up the specifications.
Manager answer: We have to work with the partners providing databases on getting them to open an API, or on importing their data into a temporary database of ours, for which we have specified the API ourselves.

It should be noted that we currently do not have the staff resources to build the editor. First, all our databases, like the TransforMap Tag DB, the Taxonomy DB and the Media DB, must be up and running.
When this system is working, it will be filled by the SSEDAS imports. We then have a working system to present to the partners, which is already filled with some interlinked data.

External databases mentioned so far:

  • OpenStreetMap: Either we can use a general TransforMap user (as wheelmap does) - but this is to be discussed with the community! Or each user of our editor has to have an account at OpenStreetMap too.
  • Wikidata/DBpedia: This is a (?) because I don’t know yet if POI data is accepted there. But Wikidata allows anonymous edits, or we can use a TransforMap user.

If we have a working system to present, negotiation will be much easier!

I think geohack is pretty dead by now - the main purpose seems to be to link a coordinate to many different mapping systems.
But if we get Wikipedia on board this way, go for it :slight_smile:

We definitely want to stick to standards like schema.org whenever possible!
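As a rough idea of what sticking to schema.org could look like, here is a small Python sketch that writes a POI as a schema.org `Place` in JSON-LD. `Place`, `GeoCoordinates`, `geo` and `sameAs` are real schema.org terms; the helper name `poi_to_jsonld`, the sample name and the Wikidata ID (the placeholder Q123456) are assumptions for illustration, and nothing here is a decided TransforMap format.

```python
import json


def poi_to_jsonld(name, lat, lon, same_as=()):
    """Describe a POI as a schema.org Place (hypothetical helper, JSON-LD output)."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Place",
        "name": name,
        "geo": {"@type": "GeoCoordinates", "latitude": lat, "longitude": lon},
    }
    if same_as:
        # sameAs carries the links to the same object in other databases.
        doc["sameAs"] = list(same_as)
    return doc


place = poi_to_jsonld(
    "Book Exchange",  # made-up sample name
    47.0727,
    15.401,
    same_as=["https://www.wikidata.org/wiki/Q123456"],  # placeholder ID
)
print(json.dumps(place, indent=2))
```

The `sameAs` list is where the cross-database links from the workflow would end up if this vocabulary were chosen.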

So far I haven’t written any specific standards into the sketch, because the protocols are open for debate!
This is also because I am not a specialist for LOD; we have people like @almereyda for that :smile:

I am the OpenStreetMap guy, I can only say how linked data is done there:

  • linking from OSM to other sources is easy: You specify e.g. “wikidata=Q123456” or “image=http://domain.tld/path/image.png” on an object. All values in OSM are freetext, so it would be possible to link to RDF resources too.
  • linking to OSM is not so easy: Although every OSM feature has a link, e.g. way 90346253, these IDs are not guaranteed to be static. E.g. if a user deletes a feature and another one re-adds it, the ID changes. Therefore, Overpass Permanent IDs are proposed. In short: You link to a feature type (e.g. amenity=public_bookcase) at coordinates: 47.0727,15.401 ± 50m.
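The "feature type at coordinates ± 50m" idea from the last point can be sketched as an Overpass QL query built in Python. This only illustrates the principle using the standard `around` filter; it is not the exact format of the Permanent ID proposal, and the function name `permanent_id_query` is made up for this example.

```python
def permanent_id_query(key, value, lat, lon, radius_m=50):
    """Build an Overpass QL query for features with key=value near a coordinate."""
    return (
        "[out:json];"
        # nwr = nodes, ways and relations; around takes radius in metres, then lat, lon.
        f'nwr["{key}"="{value}"](around:{radius_m},{lat},{lon});'
        "out center;"
    )


query = permanent_id_query("amenity", "public_bookcase", 47.0727, 15.401)
print(query)
```

A link built this way keeps working when the feature is deleted and re-added with a new ID, as long as the tag and the position stay roughly the same.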