Using the Overpass-API and Overpass-Turbo to query TransforMap Layers.

This might be a bit overkill, but why not as a temporary solution: if Semantic MediaWiki data can be exported to an OSM XML file, an osm3s server can use that data:
https://wiki.openstreetmap.org/wiki/Overpass_API/install

There is a quite easy way to get from any database to OSM:

  • export as csv
  • open it in JOSM using the “opendata” plugin and change the column headers (keys) to match OSM tagging conventions
  • save as osm xml :smile:
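
If we later want to automate that conversion without JOSM, it could look roughly like the following minimal Python sketch, assuming a CSV with hypothetical columns lat, lon, name and category (the column-to-key mapping is made up and would have to follow the real export):

    import csv
    import xml.etree.ElementTree as ET

    # Hypothetical mapping from our CSV headers to OSM keys - adjust to the real export.
    COLUMN_TO_OSM_KEY = {"name": "name", "category": "shop"}

    root = ET.Element("osm", version="0.6", generator="csv2osm-sketch")

    with open("export.csv", newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.DictReader(f), start=1):
            # Negative IDs mark objects that do not exist in OSM yet (JOSM convention).
            node = ET.SubElement(root, "node", id=str(-i), visible="true",
                                 lat=row["lat"], lon=row["lon"])
            for column, osm_key in COLUMN_TO_OSM_KEY.items():
                if row.get(column):
                    ET.SubElement(node, "tag", k=osm_key, v=row[column])

    ET.ElementTree(root).write("export.osm", encoding="UTF-8", xml_declaration=True)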

But we have to adapt the web map a bit to handle non-OSM data…

  • adapt the links for each object to point to its source if it is not from OSM
  • add an “export” functionality for these data
  • add attribution and explanatory texts.
  • big TODO: handling duplicates^^

Great. I guess this can be automated to run periodically; OSM XML is simple, e.g.

 <node id="1831881213" version="1" changeset="12370172" lat="54.0900666" lon="12.2539381" user="lafkor" uid="75625" visible="true" timestamp="2012-07-20T09:43:19Z">
  <tag k="name" v="Neu Broderstorf"/>
  <tag k="traffic_sign" v="city_limit"/>
 </node>

and init_osm3s.sh (update_database) just updates the osm3s DB on the filesystem, it seems.
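
That update step could be scripted along these lines, assuming the update_database binary from the install page above (the paths here are placeholders for a real install):

    import subprocess

    # Pipe an OSM XML file into the osm3s database - roughly the step
    # init_osm3s.sh performs internally; paths are placeholders.
    with open("export.osm", "rb") as osm_xml:
        subprocess.run(
            ["/opt/osm3s/bin/update_database", "--db-dir=/opt/osm3s/db"],
            stdin=osm_xml,
            check=True,
        )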

obviously, duplicate detection is more complex…

I only know of one automated periodic import: Danish house numbers. But this is a very simple case, because there is a convention to only use nodes, and there is a very limited set of tags.

Automating this is not as easy as it sounds; some manual work is required every time, as there will be conflicts that need to be resolved by hand. (Half-)automated updates are possible, yes, but they will need (human) resources for each run.

I was thinking of this more as a periodic, fully automated run that pushes a separate data set from Semantic MediaWiki to a separate osm3s instance (not into official OSM, only using the same DB server software, osm3s), which could then be used as an alternative to, or to augment, the OSM Overpass API instance. That way, existing maps could gradually merge into OSM, and could also augment OSM-based POI sets with their own data.

E.g. periodically: Semantic MediaWiki XML API -> XSLT transform into an OSM XML file -> import into the osm3s server at api.transformap.co;
and the map fetches the POIs either from overpass-api.de/api/ or from api.transformap.co, or from both and combines them.
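
The combining step on the map side could be as simple as this sketch, assuming both servers expose the standard /api/interpreter endpoint (the query and the transformap endpoint path are made up):

    import requests

    # Made-up example query: all nodes tagged as food coops in a Berlin-sized bbox.
    QUERY = '[out:json];node["cooperative"="food"](52.3,13.0,52.7,13.8);out;'
    ENDPOINTS = [
        "https://overpass-api.de/api/interpreter",
        "https://api.transformap.co/api/interpreter",  # assumed path
    ]

    elements = {}
    for url in ENDPOINTS:
        response = requests.post(url, data={"data": QUERY}, timeout=60)
        response.raise_for_status()
        for element in response.json()["elements"]:
            # Last source wins on (type, id) clashes - real duplicate
            # handling is the big TODO mentioned above.
            elements[(element["type"], element["id"])] = element

    print(len(elements), "combined POIs")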

But I might be completely off-base, and/or other options might be much simpler…

Setting up an Overpass server (osm3s) ourselves for items that don’t fit into OSM is indeed on our TODO list :slight_smile:
I imagine one instance for events, one for people (e.g. community connectors), or any other interesting stuff.

But in my opinion, every piece of data that has a place in OSM should end up there. I don’t think anyone wants to keep two databases of geodata up to date… or thousands, which is the current state of the maps :wink:

I like the way you think @stefan, you would be very welcome in our monkey circle! May I invite you to introduce yourself on the Who-Is-Who board?


I independently re-invented the same idea while thinking about an aggregation layer and reading up on how the Overpass API works.

Since we use only a small subset of the data stored in OSM, we could have exports running from OSM the same way as from all other possible databases.

We will have to find out how the Overpass API handles IDs. We have a problem if they are used as numbers; but if they can be arbitrary strings, we could use them as UUIDs, aggregating all other data into key-value pairs.

@dev I started to build a docker image for Overpass-API, which seems to compile fine:

@Scrum, @stefan I managed to install overpass-api and overpass-turbo, which we could feed with our own data layers: http://overpass.transformaps.net/

(the current data set is an actual dump of the Berlin region from planet.osm)

The ID type is a 64-bit integer, so we have to invent a way to generate IDs for all the layers.
Suggestions:

  • (a) a 16-bit prefix for the layer (upper bits) and the copied ID from the other service in the lower 48 bits; if their IDs are too big, we can run a hash function on them
  • (b) a hash function fed with “layer name” + “foreign id”

For easy querying I suggest mapping all ways and relations to nodes (which would favour ID generation (b)?) with center coordinates, and keeping the original type (node, way, relation) inside a key (osm:id=“wayNNN” or something similar).
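
A sketch of both schemes (layer numbers and names are made up; note that OSM IDs are signed 64-bit integers, so the sketch keeps the sign bit clear as a precaution):

    import hashlib

    def hash48(text: str) -> int:
        # Lowest 48 bits of an MD5 digest over the text.
        return int.from_bytes(hashlib.md5(text.encode()).digest()[-6:], "big")

    def id_scheme_a(layer: int, foreign_id: int) -> int:
        # (a): layer prefix in the upper bits, foreign ID in the lower 48;
        # oversized foreign IDs get hashed down, as suggested above.
        if foreign_id >= 1 << 48:
            foreign_id = hash48(str(foreign_id))
        # 15 prefix bits here instead of 16, to keep the sign bit clear.
        return (layer & 0x7FFF) << 48 | foreign_id

    def id_scheme_b(layer_name: str, foreign_id: str) -> int:
        # (b): hash "layer name" + "foreign id" into 63 bits.
        digest = hashlib.md5((layer_name + foreign_id).encode()).digest()
        return int.from_bytes(digest[:8], "big") & 0x7FFFFFFFFFFFFFFF

    print(hex(id_scheme_a(3, 1831881213)))        # layer 3, an existing OSM node ID
    print(hex(id_scheme_b("ess-wiki", "Q4242")))  # made-up layer name and foreign ID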

Hi Michael,

sorry for completely ‘dropping off the planet’ - I figured I couldn’t commit enough time on a sustained basis to this project in the past months to make a meaningful contribution. Will do when that’s possible, though!

Thomas, nice to see things are going forward, and this idea revived!

Re identifiers: I don’t see a requirement to obfuscate the source layer of a point; on the contrary, it would be nice to be able to tell from the ID where a point comes from, which speaks for (a).

So, is 48 bits of ID space enough? The probability of a collision when drawing k random values from N possibilities (the birthday problem) is roughly k^2/(2N); if we have 10,000 points in the map, this works out to:

P = 10000^2 / (2 · 2^48) = 10^8 / 2^49 ≈ 10^8 / (5.6295 × 10^14) ≈ 1 / 5,629,499

so a collision is, after all, roughly two and a half times as likely as winning the 6/49 lottery. Lucky us!

What I’d suggest, though, is to add a salt (a random string, fixed per source layer) to the source ID, as source ID spaces may well overlap. So the final ID would be something like
ID_new = (layer_id & 0xFFFF) << 48 | (MD5(“layer salt” + “source ID”) & 0xFFFFFFFFFFFF)
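
In code, with parentheses making the intended precedence explicit (layer numbers and salts are invented placeholders):

    import hashlib

    LAYER_PREFIX = {"ess-wiki": 1, "osm": 2}                 # made-up layer numbers
    LAYER_SALT = {"ess-wiki": "fixed-random-string-1",
                  "osm": "fixed-random-string-2"}            # one fixed salt per source layer

    def make_id(layer: str, source_id: str) -> int:
        prefix = LAYER_PREFIX[layer] & 0xFFFF
        digest = hashlib.md5((LAYER_SALT[layer] + source_id).encode()).digest()
        low48 = int.from_bytes(digest[:6], "big")            # 48 bits of the hash
        return (prefix << 48) | low48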

We should add properties to the data which describe where it comes from. That way we are not dependent on reading anything out of the ID value.