Data Standardization + Linked Open Data - a comment by Jack Townsend


(Jon Richter) #1

Everything below is pulled from a message forwarded by @alabaeye to our Mailing List. And yes, @pmackay, you are mentioned, too :wink:

Please feel invited to discuss those words here.


For me, what’s most exciting about what you’re doing, and what makes by far the biggest contribution, is creating an effective data standard for location-based sustainability resources, with buy-in from lots of people and eventually uptake and provision of data in that standard by many providers. That is an amazing idea, really powerful. This will enable many more providers to publish information and enable lots of other people to do important things with it, things that we couldn’t even imagine = serendipitous reuse. This standard can build upon existing POI standards by adding sustainability-related fields. Here’s a recommendation from Paul Mackay (CCed), who was also in Berlin, is a London Cleanwebber, and should be in a good position to advise.

http://popoloproject.com/

I think the process that Popolo have used for defining standards is a good pragmatic one, doing the minimum to be a very usable spec but reusing as much as possible and not having a hugely formal standards defining process.
Seems like a good direction to consider.


Whilst creating a map of maps is a noble end in itself, its main value, for me, is as a catalyst of the data standardisation process.

Data standardisation is a difficult social process. Technically it’s straightforward. You are creating shared rules, akin to the ones that govern an information commons. The main contribution of 14mmm can be to find some agreement on these difficult questions: which fields are in, which are out? What options are available? What do the terms mean? Are they intuitive and international, and can people without training use them?

Linked data, or some aspect of it, may well be the best medium for this, but I don’t want to push that too hard: it is still quite early days for linked data technologies. The important thing is the standardisation and the data model, which is independent of the technology and could equally be expressed in JSON or XML. I would, though, recommend creating a data model in OWL, the linked data ontology language, which is straightforward to use with the Protégé software and doesn’t commit you to using the rest of the linked data suite. Perhaps the best reason to use linked data would be all the reference data that you would get for free. Having great reference data that is maintained and updated is vital.

Have just had lunch here in London with linked data expert Amy (CCed) who came to the Berlin workshop, and she has kindly offered to play with some of the data sources to knock together a simple data model and prototype.

Thinking on, here are some ideas for ordered deliverables

  • Create a rough draft data model for location-specific sustainability resources.
  • Produce a first draft centralised map with data from a few different sustainability networks, extracted from APIs, RSS feeds etc.
  • Hold workshops to develop and agree data standards 1.0
  • Match data against leading reference datasets, to improve quality and linkability.
  • Offer gathered data to world for reuse as (linked) open data, XML, JSON etc.
  • Produce a centralised map with data from many different sustainability networks
  • Enrich the data by pulling in lots of related data from across linked data world (e.g. Geonames location descriptions, Wikipedia explanations of key concepts)
  • Persuade different providers to open up their data and provide it in standardised format
  • Demonstrate that linked open data is an important resource for progressing sustainability (the W3C recommended way to connect diverse open datasets to provide the requisite interdisciplinarity to progress sustainability.)

Happy to chat about any of this for more details.


(Jon Richter) #2

1 Preliminary notes

Before going into too much detail on the message above, I find it necessary to provide a little additional context about what happened in the last months with respect to the Mapping of Mappings.

Feel free to skip these sections.

1.1 Subjective Process History

  • Back in January, before the MMM process really took off, Klaus Präter already mentioned the importance of Semantic Web Technologies / Linked Data for our work by passing around a handout of an RDF/XML serialization :slight_smile: , which made no sense at all to anyone back then.
  • OuiShare Labs and Fest had a great influence on the overall process. Firstly, Labs connected us to an international network of developers, especially regarding SENSORICA’s work on Open Value Networks (OVN) and Linked Data aficionado Elf Pavlik. Secondly, the mapping workshop at the Fest then bootstrapped the international community with its global@ mailing list and connections to Shareable, Digital Social Innovation research and the Network of Networks (only to Facebook).
  • The MMM Conclave at the fringe of Open Knowledge Festival brought in a diverse audience with valuable discussions on philosophical, legal and technical implications of our work. We collected arguments for linking data and explored ways to build on the existing work.
  • Now, after Degrowth, we see many anticipations becoming real and are designing the working processes for the different trajectories.

1.2 Organizing Networks

By self-organizing our communication infrastructure, we also came across Loomio, who, together with people from the OVN initiative, work on an Open App Ecosystem that intends to federate vocabularies between interoperable applications. It somewhat resembles Poplus, who also had a representative at the conclave.

Through the discussions on Linked Open Data and Corporate biases in Linked Open Data standards I came across OVN’s existing work on

Further on, a link led to an image of a classification of classification systems (via What is Big Structure?, where you also find an article series about the history and possible future of Linked Data and the Semantic Web), which nicely visualizes how the development leads from taxonomies over ontologies to knowledge graphs.


2 Response to Jack

Since the beginning of MMM, the initial idea has been to pursue interoperability between maps of alternative economies and social organisation. By Mapping All Alternatives we intend to keep the redundancy of creating new taxonomies and mapping efforts to a bare minimum, as many activists from the field felt they were already duplicating work.

Surely a standard needs adoption and is rather a process than a product (via), but modelling the diverse taxonomies would already be considered valuable work to understand collective economies better.

To prevent misunderstanding, the Popolo standard linked above is not a full-fledged POI standard, even though it features an area property.
POI Standards would then be, as far as I can tell,

Related working and interest groups are

at the OGC and some W3C community groups like

@pmackay Were you also down in the boat house?

:+1: for reusing existing specifications. Unfortunately the MMM community is still a little allergic to technical work, which is why we need people like @amy, @pmackay and you to help us define what we need to do, so we can mediate between those who know about the field and those who design the vocabulary. Ideally, both groups align.

This work has already been started with the OpenStreetMap Taxonomy draft, yet it still suffers from the constraints that derive from OSM’s key:value properties and best practices. I suppose further iterations on this will produce first answers to your questions as well as adding new ones.

This is also where I see the strengths of the Popolo process, as it includes a survey of existing standards and also covers niche questions like Multilingualization, which is important for us.

This is where I start to see problems arising. Protégé is not a user-friendly application, even though it offers all the features needed. Collaborating on Markdown documents via git is also totally out of scope.

I wonder if we could use existing work, like Elf’s Portable Linked Profiles Editor and build upon this to create easy to use, collaborative ontology editors. Everything I’ve seen so far disappoints me. With JSON-LD we can also serialize OWL; if we split the @context from the file like Popolo does, we end up with easy to read JSON.
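A rough sketch of what splitting the @context from the data could look like, here in Python. The context URL, the field names and both helper functions are assumptions made up for illustration; `expand_keys` only mimics the key-to-IRI mapping a JSON-LD processor performs and is not the real expansion algorithm.

```python
import json

# Hypothetical location where the shared context document would be published.
CONTEXT_URL = "https://example.org/contexts/poi.jsonld"

# The context document itself: maps short, readable keys to vocabulary IRIs.
CONTEXT_DOC = {
    "@context": {
        "name": "http://schema.org/name",
        "lat": "http://www.w3.org/2003/01/geo/wgs84_pos#lat",
        "long": "http://www.w3.org/2003/01/geo/wgs84_pos#long",
    }
}


def with_context(record: dict) -> dict:
    """Return the plain record plus a reference to the external context."""
    return {"@context": CONTEXT_URL, **record}


def expand_keys(record: dict, context_doc: dict) -> dict:
    """Naively replace short keys by their IRIs (illustration only)."""
    terms = context_doc["@context"]
    return {terms[key]: value for key, value in record.items() if key in terms}


# A made-up record: apart from the single @context line it stays ordinary JSON.
record = {"name": "Repair Café", "lat": 52.52, "long": 13.40}
print(json.dumps(with_context(record), indent=2, ensure_ascii=False))
```

The point of the split is that people editing the data only ever see the short keys, while the mapping to vocabularies lives in one separately maintained file.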

@pmackay How did you organize the community process around Popolo, which tools were in use? Can we imagine a simple bottom-up approach that doesn’t imply using specialized software without big buttons?

When thinking about mapping different taxonomies onto each other, I always also have to think of AKSW’s d:swarm application which aims to be a data integration and data modelling tool.

@amy, please note that we already took early steps on flattening some of the JSONs we could grasp with R, trying to force them into a comparative relational table structure. Little documentation exists, now at https://github.com/14mmm/mmm-schema-examples, but there’s even an intern of @dreusser dedicated to this.
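To illustrate the flattening idea, here is a minimal sketch in Python rather than R. The sample records and the function names are made up for this example and are not taken from the repository; nested lists are deliberately left untouched.

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Turn nested dicts into one flat dict with dotted column names."""
    row = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten(value, path))
        else:
            row[path] = value
    return row


def to_table(records: list) -> tuple:
    """Flatten all records and align them on the union of their columns."""
    flat = [flatten(record) for record in records]
    columns = sorted({column for row in flat for column in row})
    rows = [[row.get(column) for column in columns] for row in flat]
    return columns, rows


# Two hypothetical sources that describe similar things with different schemas.
source_a = {"name": "Garden", "geo": {"lat": 52.5, "lon": 13.4}}
source_b = {"title": "Food coop", "geo": {"lat": 48.2}}

columns, rows = to_table([source_a, source_b])
print(columns)
for row in rows:
    print(row)
```

Columns that appear in only one source end up as empty cells (`None`), which makes the differences between schemas visible side by side.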

I think it would make sense to synchronize efforts in this respect. I will add anyone to the GitHub organization, also sharing its administrative efforts, until then: Pull Requests.

  1. Create a rough draft data model for location-specific sustainability resources.
  2. Produce a first draft centralised map with data from a few different sustainability networks, extracted from APIs, RSS feeds etc.
  3. Hold workshops to develop and agree data standards 1.0
  4. Match data against leading reference datasets, to improve quality and linkability.
  5. Offer gathered data to world for reuse as (linked) open data, XML, JSON etc.
  6. Produce a centralised map with data from many different sustainability networks
  7. Enrich the data by pulling in lots of related data from across linked data world (e.g. Geonames location descriptions, Wikipedia explanations of key concepts)
  8. Persuade different providers to open up their data and provide it in standardised format
  9. Demonstrate that linked open data is an important resource for progressing sustainability (the W3C recommended way to connect diverse open datasets to provide the requisite interdisciplinarity to progress sustainability.)

  1. Done. See the OpenStreetMap Taxonomy
  2. :+1:
  3. :+1:
  4. Which are the leading reference datasets? The LOD cloud?
  5. :+1:
  6. :+1:
  7. Next level.
  8. I think this needs to be done earlier. Especially regarding correct licensing of available datasets, so they can be reused.
  9. I believe we could maintain profiles on the Digital Social Innovation platform so we have the network of participants in this process already available as linked data and can build from there.


(Paul Mackay) #3

Hi Jon,

To respond to various bits:

Do you think the OVN and Open App Ecosystem projects are doing things that cover anything this project is directly trying to do, or is it that they are related and we should try to coordinate all the conversations?

I was in Berlin but unfortunately didn’t make the MMM meeting.

Agree Popolo is not a POI standard; it’s much more a recommendation about the process and structure than a suggestion that any of the actual Popolo standards would apply here (though maybe some of the Organisation stuff is applicable).

One problem with the OSM taxonomy draft is that it ties it to the OSM format, as you say. I think it’s a useful exercise to try defining an abstract model, then consider how it maps to concrete representations such as OSM, JSON-LD, XML, CSV, etc.
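As a very rough illustration of that separation, sketched in Python: the `Resource` fields, the `example:category` tag key and the function names are all placeholders I made up, not an agreed model. The point is only that one abstract record can be projected onto several concrete formats.

```python
import csv
import io
from dataclasses import asdict, dataclass


@dataclass
class Resource:
    """Hypothetical abstract model; the fields are placeholders."""
    name: str
    category: str
    lat: float
    lon: float


def to_osm(resource: Resource) -> dict:
    # OSM nodes carry their own coordinates and describe the feature
    # through flat key=value tags; "example:category" is an invented key.
    return {
        "lat": resource.lat,
        "lon": resource.lon,
        "tags": {"name": resource.name, "example:category": resource.category},
    }


def to_json(resource: Resource) -> dict:
    # Plain JSON keeps the abstract field names as they are.
    return asdict(resource)


def to_csv(resources: list) -> str:
    # CSV needs a fixed column order, so it is spelled out here.
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "category", "lat", "lon"])
    for r in resources:
        writer.writerow([r.name, r.category, r.lat, r.lon])
    return buffer.getvalue()


cafe = Resource(name="Repair Cafe", category="repair", lat=52.5, lon=13.4)
print(to_osm(cafe))
print(to_json(cafe))
print(to_csv([cafe]))
```

Changing the abstract model then means touching one class and its mappings, rather than renegotiating each format separately.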

An OWL data model could come later maybe, if developing that is a barrier to just getting on with it? What’s the minimum needed to start?

I’ve not been directly involved in Popolo, I just think it’s a damn good approach, for a number of reasons, to defining this kind of stuff. I am however working with localgov folks in the UK to define Localo, a similar spec for local government services, with lots taken from Popolo for inspiration.

For the steps:

  1. can we try doing this as an abstract model? Happy to try to start an example.
  2. what would be an example of what you’d like to create here?
  4. yes I think it’s the LOD cloud.

(Adrien Labaeye) #4

I’m missing a lot here, but learning a lot too!!
Thanks for this thoughtful piece @almereyda!
I’m adding an entry on Linked Data in our Glossary so that 14MMM/transformap participants can get an idea of what it is.


(Jon Richter) #5

As this process describes itself as an elk, we’re working at different paces here. Some throw-ins:

  1. I would ask you to wait a little, as @dreusser’s intern just sent me, on her last day, the results of her work and test transformations, which I have to review and publish. This is to minimize redundancy.
    Also, there’s been another request on the German mailing list to include another taxonomy from the Commons Abundance Network. Still, it even mentions Spiritual connection, which makes it a little edgy from my POV.
    But, well, in fact, @pmackay feel free to draft anything you like, as everything in this process is still in constant change and needs any meaningful input.

  2. This is planned to be the Map of Maps. So we need different self-hosted source maps that expose their data set through an API or allow some other access to it, so we could mirror it in a CKAN instance, plus the stuff from OSM via their Overpass API. We ( @Michael + @species ) haven’t figured out yet where this will go, but www.onyourway.at is a first prototype.
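For the OSM part, a minimal sketch in Python of how an Overpass API request might be put together. It only builds the query and the URL and sends nothing. The bounding box and the choice of `amenity=recycling` are just examples, and the function names are made up here; the endpoint shown is one public instance among several.

```python
from urllib.parse import urlencode

# One public Overpass API instance; other instances exist.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_query(key: str, value: str, bbox: tuple) -> str:
    """Build an Overpass QL query for nodes tagged key=value inside bbox.

    bbox is (south, west, north, east) in decimal degrees.
    """
    south, west, north, east = bbox
    return (
        "[out:json];"
        f'node["{key}"="{value}"]({south},{west},{north},{east});'
        "out;"
    )


def request_url(query: str) -> str:
    """Return a GET URL that passes the query in the `data` parameter."""
    return OVERPASS_URL + "?" + urlencode({"data": query})


# Example: recycling points in a rough bounding box around Berlin.
query = build_query("amenity", "recycling", (52.4, 13.2, 52.6, 13.6))
print(query)
print(request_url(query))
```

The JSON that comes back could then be mirrored or transformed like any other source, which keeps the OSM data on the same path as the self-hosted maps.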


(Jon Richter) #6

Well, that wasn’t too much at all:

Additionally I’ve met with Kei Kreutler and Elf today at OKLab Berlin and we were able to exchange little pieces of information.

She especially pointed me to the Humanitarian OSM Tags as a good example for an OpenStreetMap Taxonomy extension.

It also shows us how tedious the work on such taxonomies and different iterations/versions can be, which finally end up in a specification.

To be frank, I don’t exactly know where to go from here.
My wishlists, implying a threefold organizational scheme of this process:

  • 14mmm.org MMM : Mapping the Mappings : research and development : +Networking the Networks.

Mapping the Mappings : MMM

  • 10% : hand-crafted Linked Data version of the mapping the mappings directory.
  • 50% : CKAN up and running for migrating away from Google Docs.
  • 70% : clearance of licences of datasets by a licence inventory and if not applicable, personal contact.
  • 100% : Automated federation of participating MMM maps into the TransforMap.

Taxonomies / Ontologies / Vocabularies

  • 10% : collection of existing taxonomy specifications and abstracting away.
  • 20% : collection of improvised taxonomies in use and manual transformation to abstracted draft.
  • 50% : CKAN > d:swarm > MMM automated chaining and transformation of sources and schemas.
  • 70% : Impulse for an OSM initiative to provide LD @contexts to its data.
  • 100% : collaborative vocabulary interface that supports versioned dynamic, changing schemas even between datums within a given data bucket (i.e. schemaless schemas).

TransforMap itself

  • 10% : First Overpass-API-based onYOURway interface from @Michael.
  • 50% : Visualizing several layers coming from the MMM initiative.
  • 100% : Dynamic live syndication of independent mapping infrastructures into the aggregation UI.

As you see I distinguish clearly between the metamapping, the vocabulary discussion and the resulting map, as I imagine these processes will walk at different speeds and have to be resynced from time to time, which means plain patient waiting, especially regarding the proposed milestones v0.5 and v1.0.

We could kind of iterate on these proposals in the currently non-existing Mapping the Mappings wiki page. Until Tuesday we’re occupied with the CHEST application, but should find time afterwards to go back to documentation.

Interesting to see how process and topic merge into one, as de- and reconstructing the ideas is already the process. Weird.


(Jon Richter) #7

As stated above and in many other places, the technical production of a standard is a social process. To incorporate current ideas like Organized Networks and advanced licensing schemes into the organizational and communicative flow design, I found it quite helpful to have a look at Civic Patterns, too.


(Jon Richter) #10

Could you try to express what is lacking?


(Adrien Labaeye) #11

Nothing is lacking, there is just a lot I don't understand... :) but that's all right!

On Sep 28, 2014 3:01 AM, "almereyda" <foyer@14mmm.org> wrote:

(Paul Mackay) #12

Is there a list of the taxonomies you have discovered in this project? As distinct from the large number of mapping projects? Is it worth comparing several of the most significant to consider differences and what a possible harmonized version might look like?


(Jon Richter) #13

Not yet. We ( @Michael and I ) only recently found out we also need a Mapping of Taxonomies. He did something in this direction already a while ago, see our Taxonomy discussion. Hope he’ll provide the FreeMind file soon.

I started to compile the rough work of @dreusser’s intern, who just left, into the GitHub repo mentioned above. He also selected those seven first, so let’s start with them.


(Jon Richter) #14

Maybe this post about Open food interoperability: entities, unique IDs, and semantic equivalence and the book Linked Data Patterns can help you more to get started. At least I will have to work through the latter.


(Adrien Labaeye) #15

another contribution from Jack:

IATI is a similar process of standardisation for doing good with the open data. It’s a standard for transparency to international aid projects and spending. http://www.aidtransparency.net/

Tim’s a really interesting person, also a Soton PhD. Always very very busy with various projects, but if you need some very specific advice on the joys of getting well-meaning communities to agree common data standards, he’s a veteran.

Will respond to the responses to my post soon, great that lots of interest! :slight_smile:

@Tim, Adrien, is part of a group trying to create a standard and map for sustainability resources by location, such as recycling points, bicycle hire etc. http://tranformap.co

tim’s email: tim [a] practicalparticipation.co.uk


(Jon Richter) #16

@bob-haugen mentioned something very similar over at Loomio’s:

P.S. a conversation about how to evolve TransforMap into the direction of value networks might be useful. It’s going in that direction anyway, with fulfills_needs now, and Offers, Requests. and Processes on the roadmap.

I have been asked to prepare a dossier regarding OVNs and NRP in relation to TransforMap, so I will definitely have to answer those questions.


Edit, December 4th: @amy, I recently discovered there’s also another list of possible sources (Google Docs) for the mmm-schema-examples by @josefkreitmayer. We should be able to translate it into English and later incorporate it in the 20% step above.


(Bhaugen) #17

Jon says this is the place for Lynn and me to plug in and try to help to converge TransforMap and the emerging OVN vocabulary. I suggest starting with the OVN vocab and where it might most easily intersect with the TransforMap vocab rather than the NRP software, which is huge and growing. And even the OVN vocab might seem too big to start with. I understand that TransforMap already has the concept fulfills_needs now, so you are already tippy-toeing into value network territory.

I think we can make this easier by going one baby step at a time, starting from where you are, and building on what you have already done. If you have suggestions of more starting places in your current set of concepts, please let us know.

Assuming people do actually want to discuss this set of topics, should we add to this thread, or start another? Or what?

More to come, if and when you want it.


(Benjamin Brownell) #18

Signing on to this thread - it’s a long and interesting one!

I’d be happy to branch off to a fresh one from here, and approach more of the particulars of standardization, ontology mapping, and application frameworks in the context of Transformap and related initiatives.

For reference, I have been working quite a bit with the Metamaps.cc platform recently to explore metadata systems. More and more, I am imagining that kind of an interface for live ‘ontological programming’ within information systems and dashboards / maps. Something like a “pattern language operating system”…?

As challenging as it may be, I am in favor of reaching for high levels of abstraction from data points and tags in order to understand the fundamental cultural, ethical, and cognitive processes that are at work when we apply these systems of meaning as a scaffold for communication and computation. Ultimately, I think that’s what makes for a truly transformative tool and process.

-Ben
browsearth.org
villagelab.info


(Thomas Kalka) #19

Interesting and a lot to read.
I’m always dreaming of a way of condensing threads step by step while reading them.
Is there any support for this in Discourse?


(Jon Richter) #20

Now that I see this message again, I am very thankful for the very early and precise demarcations a dissection of our Mapping Aim in Linked Data terms would create a lot of work for us. Any civic patterns around to scale that Organized Network well?

While we are learning from different languages, such as the sociocratic or commonifying dialects, the main confusion may most of the time be caused by mere misunderstandings. This is why we ultimately focus on the establishment of shared vocabularies with analogue semantics.


(Paul Mackay) #21

Can you explain this a bit more please? Not fully understanding it (ironic perhaps)…


(Jon Richter) #22

What happened in the last two years is a very thorough investigation of the overall expectations towards TransforMap. We often feel we can neither deliver on nor keep up with these images. Then, the diversity of approaches in linking data is plentiful, yet incomplete. The area we are working in is uncharted land and our resources and experiences are finite.

We now understand the primacy of the social process over its technical partner. While longing for the stars is only human, an improved understanding of the ubiquitous incompleteness of self-hosted automated systems means alternative approaches to link data are in no way to be considered of lesser quality than pure Linked Data as the W3C understands.