evaluation of editors from projects in the mapping of mappings collection

(Josef Kreitmayer) #1

We want to see which of the projects in the “mapping of mappings” have interesting editors, in order to

  • possibly build on one of them
  • get into a discourse about the technical infrastructure they use
  • possibly even collaborate
  • other purposes we cannot think of yet

@species @toka what do you see as important questions that are easy to evaluate?
It would be great if a non-technician would like to take on that task.

What to search for:

  • OSM-based tiles and geocoder (recorded in the collection)
  • good looking (personal impression)
  • active project (personal impression)
  • substantial number of POIs (personal impression)
  • has a public editor (research on the site)

Anything else that a non-technician can easily evaluate?

(Michael Maier) #2

I’ve reduced the list of the mapping of the mappings (owncloud:/TransforMap/Communities/mapping-of-the-mappings(sem-mw-export).csv) to far fewer entries. I’ve removed all non-geographic mapping entries as well as all Google-based and dysfunctional ones.
There are now ~100 entries in the file “mapping-of-the-mappings-to-evaluate.csv” in the same folder. It has a new column “have public editor”, which I have started to fill in by manually checking each website for the presence of an editor.

I’ve gone through the first 23, and would be happy if someone could continue the work (wink @josefkreitmayer).
I marked some entries with “??”: for these there was only a non-English version of the website, which I could not check due to my lack of language skills.

(Y'a la merde!) #3

This is an interesting posting I would have loved to be notified about. It explains to me the recent wave of mapping platform postings in this Discourse. Additionally, it allows me to conclude how mappings.csv came into existence.

These are exactly the workflows that we should document properly and build collaborative web interfaces for.

(Michael Maier) #4

Thanks for the reminder to document how the mapping-of-the-mappings CSV was generated.

Because the Semantic MediaWiki was used in a slightly unfortunate way to structure the data (a template instead of categories), I had to use a complicated, only partly automated procedure to convert the XML export into a CSV file.
The problem with templates is that the information we need (the template field entries) is not exported as XML nodes but sits in one big text field with “|” as separator, so automatic XML-to-CSV converters failed.
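
For comparison, here is a minimal Python sketch of how the “|”-separated template fields could be pulled out of the export without going through sed. It is not what was actually done. It assumes each page's text looks like `{{TemplateName|key=value|…}}` followed by free text, which is treated as the description. The sample export, the template name “Mapping” and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened stand-in for a MediaWiki XML export.
SAMPLE_EXPORT = """<mediawiki>
  <page>
    <title>Example Map</title>
    <revision>
      <text xml:space="preserve">{{Mapping
|url=http://example.org
|coverage=worldwide
}}
A hypothetical free-text description.</text>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://...}page'."""
    return tag.rsplit("}", 1)[-1]


def parse_template(wikitext):
    """Split '{{Name|key=value|...}}rest' into a dict of fields.

    Assumption: field values contain no '|' themselves (no piped wiki links).
    """
    head, _, description = wikitext.partition("}}")
    fields = {}
    for part in head.split("|")[1:]:  # element 0 is '{{TemplateName'
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    fields["description"] = description.strip()
    return fields


def pages_to_records(xml_text):
    """Return one dict per <page>, with the title stored under 'name'."""
    records = []
    for page in ET.fromstring(xml_text).iter():
        if local_name(page.tag) != "page":
            continue
        record = {}
        for node in page.iter():
            name = local_name(node.tag)
            if name == "title":
                record["name"] = (node.text or "").strip()
            elif name == "text":
                record.update(parse_template(node.text or ""))
        records.append(record)
    return records
```

Calling `pages_to_records(SAMPLE_EXPORT)` yields a list of dictionaries, one per wiki page, which can then be written to any tabular format.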

So my approach was an ugly hack:
I used a lot of search/replace with the Unix command “sed” to convert the MediaWiki XML export into an OpenStreetMap XML file, turning the “|”-separated “data columns” into OSM tags, from which they can finally be exported as CSV.

A minor problem was how to add coordinates where only location names exist; this was done manually in LibreOffice Calc later.

step-by-step workflow:

  • open the XML in an editor (I used vim; the “:%s …” lines are vim’s search/replace commands)
  • remove head and tail of the XML manually
  • convert all tabs (\t) to four blanks
  • :%s/}}/|description=/g
  • :%s/<\/text>//g
  • :%s/<\/title>//g
  • :%s/<\/comment>//g
  • :%s/ <comment>/|comment=/g

remove all unneeded xml lines:

  • grep -v -E '[<]/?(ns|id|revision|parentid|timestamp|username|contributor|minor/|sha1|model|format)?[>]' file.xml | grep -v -E '}}</text>|<text xml:space=' > outfile.xml

convert all enters:

  • sed '{:q;N;s/\n/\n/g;t q}' better.ml > noenter.ml
  • sed 's/ <title>/\nname=/g' noenter.ml > splitintolines.ml
  • sed '{:q;N;s/=\n/=/g;t q}' splitintolines.ml > desc-umgebrochen.ml
  • sed '{:q;N;s/\n|/|/g;t q}' desc-umgebrochen.ml > enterforweg.ml
  • sed '{:q;N;s/\n\n/\n/g;t q}' enterforweg.ml > enterendweg.ml
  • sed '{:q;N;s/|/\n/g;t q}' enterendweg.ml > backtocolperline.ml
  • sed -E 's/^name=(.*)/<\/node>\n<node>\n<tag k="name" v="\1" \/>/g' backtocolperline.ml > osmfirst.ml
  • sed -E 's/^([a-zA-Z0-9 ]*)=(.*)$/<tag k="\1" v="\2" \/>/g' osmfirst.ml > osmbody.xml

We now have, roughly, an OSM file with a lot of nodes that carry the attributes as their tags (key=value).

In vi, complete the XML structure manually:

  • move the first “</node>” to the bottom
  • add “<?xml version='1.0' encoding='UTF-8'?><osm version='0.6' generator='species'>” on top
  • add “</osm>” to the bottom
  • manually correct the non-XML leftovers where the search/replace regexes failed (~10 cases, IIRC)

Add fake coordinates so that the file is correct OSM XML:

  • sed -E "s/<node>/<node lat='0' lon='0'>/g" osmbody.osm.xml > osmcoord.osm.xml
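
The OSM file built by hand above, including the placeholder coordinates, could also be generated with Python's standard `xml.etree.ElementTree`, which takes care of XML escaping. This is a hedged sketch, not the procedure actually used. The function name is hypothetical, and the negative node ids are an added assumption (JOSM's convention for not-yet-uploaded objects); the sed version does not set ids.

```python
import xml.etree.ElementTree as ET


def records_to_osm(records):
    """Build an OSM XML string with one node per record.

    Every node gets lat/lon 0 as a placeholder, as in the sed step.
    The negative ids are an assumption, not part of the original workflow.
    """
    osm = ET.Element("osm", version="0.6", generator="species")
    for index, record in enumerate(records, start=1):
        node = ET.SubElement(osm, "node", id=str(-index), lat="0", lon="0")
        for key, value in record.items():
            # Each field of the record becomes one <tag k="..." v="..."/>.
            ET.SubElement(node, "tag", k=key, v=value)
    return ET.tostring(osm, encoding="unicode")
```

Because the tags are written through the XML library, characters such as “&” or quotes in a value do not need the manual corrections mentioned above.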

open in JOSM, save as geojson

open in QGIS, save as csv

open in libreoffice, now we at least have a table…
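
If the records are already available as Python dictionaries, the detour via JOSM and QGIS could be skipped and the table written directly with the standard `csv` module. A minimal sketch under that assumption; the function name and the example keys are hypothetical.

```python
import csv
import io


def records_to_csv(records):
    """Write a list of dicts to CSV text; missing fields become empty cells."""
    # Collect all keys in first-seen order so every record fits the header.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

The returned text can be saved to a file and opened in LibreOffice like the CSV exported from QGIS.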

  • remove non-working/obsolete entries, as well as mediawiki
  • add “coverage” from the website where missing; use “meta” for non-geographic “maps”
  • merge similar entries (worldwide -> global)
  • sort by column “coverage”
  • manually add coordinates, either from JOSM (set a node, CTRL-SHIFT-C copies its coordinates) or from osm.org’s search; I used 0/0 for all “meta” entries and a point somewhere in the Atlantic for “global” coverage
  • paste them into column 1 (the value is pasted comma-separated and gets surrounded by double quotes on save)
  • save as csv

In vi, remove the leading quote (^") and use s/",0// to remove the second column; then rename the first column to lat and the second to lon.
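
The last clean-up steps (merging coverage names and splitting the pasted “lat,lon” value into two columns) can be sketched in Python as well. The column name “coordinates” and the alias table are assumptions for illustration; only the worldwide -> global merge is taken from the list above.

```python
# Assumed alias table; only this one merge is mentioned in the workflow.
COVERAGE_ALIASES = {"worldwide": "global"}


def clean_rows(rows, source_column="coordinates"):
    """Split a 'lat,lon' cell into two columns and normalise coverage names.

    'coordinates' is a hypothetical name for the column that holds the
    pasted, comma-separated value.
    """
    cleaned = []
    for row in rows:
        row = dict(row)  # work on a copy, leave the input untouched
        lat, _, lon = row.pop(source_column, "").partition(",")
        row["lat"] = lat.strip()
        row["lon"] = lon.strip()
        coverage = row.get("coverage", "").strip()
        row["coverage"] = COVERAGE_ALIASES.get(coverage.lower(), coverage)
        cleaned.append(row)
    return cleaned
```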

If @toka reads what I have done with the Semantic MediaWiki export, he will surely shake his head and present a much better solution.

If we want to do exports regularly, we should certainly find a better solution. I’ve done it this way because I thought we would need it only once.
