Work on SSE data by the Institute of Solidarity Economics


(Matt Wallis) #1

The Institute of Solidarity Economics have published the ISE Strategy for Data. We’d love some feedback here from @almereyda, @species, @mariana, @toka, @bhaugen (and … ) and hope to use the TransforMap discourse as a primary place for discussion about what we’re doing.


(Tim DAVIES) #2

This is great to see developing. A few quick reflections:

Standards + scraping

In the strategy you say “For example, Google is able to provide the opening hours of shops because the websites for those shops use a standard vocabulary for describing their opening hours.”

As I understand, that is only partially true:

(a) For some opening hours data I strongly suspect they are using machine learning to parse data, even when it has not been marked up appropriately. I believe they also provide tools to help webmasters correct information they have got in their systems (I would need to check what webmaster tools currently do around this to be sure of the exact mechanics).

(b) Google are able to get people to use markup correctly because of its sheer market power. If you put the wrong information in your markup, you’ll find out quickly enough from customers complaining about incorrect results on Google etc.

These are important things to have in mind when learning from the way in which major information brokers have ‘disciplined’ websites to provide them with the information they want.

Take a pragmatic approach to Linked Data

As you have already noted, much of the technology around Linked Data isn’t as mature, or well maintained, as one might hope.

There is a tricky balance to strike between embracing the positive politics of ideal linked data (the distributed AAA - ‘Anyone, can say Anything, about Anything’ approach) and working with the messy reality, where network effects kick in, and a few centrally defined vocabularies dominate in a more-or-less lowest-common-denominator sort of way.

There is also a tricky balance to strike between using the full linked data stack (RDF, Triple Stores, SPARQL, reasoning etc), and just encouraging greater use of shared standards, and then providing tooling to help people work with data in whatever format is most familiar to them (usually flattened CSV type formats, or simple JSON).

Although solving data integration problems via linked data often offers an elegant solution, experience suggests to me that it often leads new projects towards premature optimisation for working at a global scale, that limits their ability to take small steps towards practically available data, and integrating across a reasonable number of starting sources.

Trusted registries

I would agree these are useful. I would suggest it is worth starting simple here. Just a list of links will often be enough to get things started, without needing rich meta-data registries until there is user demand.

User/community need

Can you make the ‘user side’ of the strategy more prominent? Right now there is a reasonable emphasis on the supply side - but thinking more in the strategy about the community processes for identifying and supporting use and user needs might be relevant.


(Bhaugen) #3

I love

<skos:Concept rdf:ID="A21">
<skos:altLabel>U</skos:altLabel>
<skos:altLabel>Activities of extraterritorial organizations and bodies</skos:altLabel>

But, as @timgdavies wrote, more on the user side please. What are the plans for using the vocabulary and collected data?
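
For readers who have not met SKOS before, here is a minimal sketch of how the labels in a fragment like the one quoted above could be read with Python's standard library. The wrapping rdf:RDF element, the namespace declarations and the closing tag are assumptions added to make the sample well-formed; they are not copied from the published file, and `alt_labels` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for SKOS and RDF.
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed wrapper around the quoted fragment, so that it parses as XML.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:skos="{SKOS_NS}">
  <skos:Concept rdf:ID="A21">
    <skos:altLabel>U</skos:altLabel>
    <skos:altLabel>Activities of extraterritorial organizations and bodies</skos:altLabel>
  </skos:Concept>
</rdf:RDF>"""


def alt_labels(xml_text):
    """Return a dict mapping each concept's rdf:ID to its list of altLabel strings."""
    root = ET.fromstring(xml_text)
    labels = {}
    for concept in root.iter(f"{{{SKOS_NS}}}Concept"):
        concept_id = concept.get(f"{{{RDF_NS}}}ID")
        labels[concept_id] = [
            el.text for el in concept.findall(f"{{{SKOS_NS}}}altLabel")
        ]
    return labels


if __name__ == "__main__":
    for concept_id, names in alt_labels(SAMPLE).items():
        print(concept_id, names)
```

This only reads the human-readable strings; it does not attempt any RDF reasoning.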


(Matt Wallis) #4

@timgdavies, @bhaugen, thank you for your comments.

@almereyda, thanks for the notes about URLs that you posted to the WordPress discussion - I hope I have now incorporated them.

Good point. I will ponder how much detail I want to put in this overview post, but I agree that what I say at the moment is an over-simplification.

I agree that there will always be data around that does not conform to the standard ideal stack. I think the way to deal with this is to provide adapters to convert non-ideal data into ideal data. Of course, as usual, the devil is in the detail here, and things have to be taken on a case by case basis.
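
To make the adapter idea a little more concrete, here is a rough sketch of one possible shape for it, assuming the non-ideal source is a CSV export with its own column names. The column names, the target field names and the `adapt_csv` function are all hypothetical; a real adapter would map onto whatever vocabulary is eventually agreed, and each source would need its own mapping.

```python
import csv
import io

# Hypothetical mapping from one source's column names to shared field names.
COLUMN_MAP = {
    "Organisation": "name",
    "Web": "homepage",
    "Lat": "latitude",
    "Lng": "longitude",
}


def adapt_csv(csv_text, column_map=COLUMN_MAP):
    """Convert rows of a source-specific CSV into dicts keyed by shared field names."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {}
        for source_col, target in column_map.items():
            value = (row.get(source_col) or "").strip()
            if value:  # skip empty or missing cells
                record[target] = value
        # Coordinates arrive as text; convert them to numbers when present.
        for key in ("latitude", "longitude"):
            if key in record:
                record[key] = float(record[key])
        records.append(record)
    return records


if __name__ == "__main__":
    sample = "Organisation,Web,Lat,Lng\nExample Co-op,http://example.org,51.5,-0.1\n"
    print(adapt_csv(sample))
```

The per-source mapping table is where the case-by-case detail would live; the conversion loop itself stays the same.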

Yes, indeed. This does concern me. I am very much in favour of taking small steps. Hopefully, this can be achieved by carefully planning to do just enough, sprint by sprint, and by taking some short-cuts that favour pragmatism over elegance. But given the description of our main user story below, I think we need to grasp some linked data nettles quite early. I’d be very happy to hear an alternative opinion! I do not have your experience in this area. If there’s any chance, @timgdavies, of meeting to go over the details of this approach, I would be extremely grateful - I think we could cover a lot of ground in a couple of hours.

I agree that this needs better explanation in the strategy document. Thanks for drawing attention to it. First, I’ll attempt an explanation here, and something will grow from this into the document on our website:

The main user story that drives this work concerns ourselves, ISE, as the user. Let me explain.

ISE wants to know who is acting within the SSE. We want a list. We want a map. But we also want to act in a way that benefits the SSE as a whole, and we want to do everything in an open way. We do not want to set ourselves up as a “gatekeeper” arbitrating over who is in and who is out of the SSE! This has led to several conclusions:

  • We do not want to create a centralized database, in which the data is “owned” by ourselves.
  • We want initiatives to have the choice to be responsible for their own data, or to delegate that task.
  • We hope that apex organizations will want to make their data available in a standard way, and we want to make it as easy as possible for them to do so. We are working with Co-ops UK, for example.
  • So the data must be (permitted to be) distributed.
  • If we are to go through the pain of figuring out how to do this, then we want others to benefit from the pain we have gone through! That means providing documentation, tools, reference implementations, etc.
  • The data we want to collect initially should be as limited as possible - for example geo-location (we want to be able to put things on a map), and a link to a website. Realistically, we know we will need more than that, but the principle is to start as small as possible, and grow from there.
  • We want a system that supports the concept of trust (and therefore mistrust too!) - if you want to know where to go to buy a sandwich, then you probably also want to have confidence that you are not being lured to a MucDonalds on the grounds that it sometimes uses organic lettuce! It should be up to the user to decide who they trust - a decision similar to working out what to believe on the web in general.

Now most of this is about establishing a way of sharing distributed data whose provenance is evident. It is independent of the content of the data. It is about establishing infrastructure to support the data. Is that approach too technology-led? Is there an alternative that satisfies the requirements in the bullet points above, or provides a step-by-small-step transition towards satisfying those requirements?

The content of the data determines which applications can be supported - if we have geo-location then mapping applications can be supported, and so on. This can be developed in parallel with the work on “infrastructure”.
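
As a purely illustrative sketch of the requirements above (distributed sources, a minimal record of location plus website, visible provenance, and trust decided by the user), something like the following could work. The `Record` type, its field names and the example URLs are assumptions made for illustration, not part of any agreed design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A minimal initiative record: just enough to put it on a map."""
    name: str
    homepage: str
    latitude: float
    longitude: float
    source: str  # provenance: where this record was published (hypothetical field)


def trusted_records(records, trusted_sources):
    """Keep only the records whose publisher is in the user's own trust list."""
    trusted = set(trusted_sources)
    return [r for r in records if r.source in trusted]


if __name__ == "__main__":
    # Records gathered from two independent (made-up) publishers.
    gathered = [
        Record("Example Co-op", "http://coop.example.org", 51.5, -0.1,
               source="http://apex.example.org/data"),
        Record("Burger Chain", "http://burgers.example.com", 51.6, -0.2,
               source="http://ads.example.com/data"),
    ]
    # The user, not the aggregator, decides which publishers to believe.
    for record in trusted_records(gathered, ["http://apex.example.org/data"]):
        print(record.name, record.latitude, record.longitude)
```

Nothing here is centralized: the aggregation step only collects what others publish, and the filter is applied with a trust list supplied by whoever is consuming the data.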

What? :confused: Not sure I follow, but I like the quote!


(Bhaugen) #5

Activities of extraterritorial organizations and bodies

@mattw said: What? :confused: Not sure I follow, but I like the quote!

From http://www.maltas.org/ess/standard/activities.skos


(Mariana Curado Malta) #6

Please don’t use maltas.org link but the purl.org permanent link http://purl.org/essglobal/standard/activities


(Mariana Curado Malta) #7

This is indeed GREAT!

Don’t mix up what Google and others are doing (using their own vocabularies, e.g. Schema.org or the Facebook vocab) with Linked Open Data. They have in fact two different goals.

What Google does is for human end-users. Linked Open Data is a large open space of available, organised data that can be used by machines in so many different ways (the sky is the limit).

Where exactly do you want to be? You can do both, but be sure where and how.

You could also talk about our RDF vocabulary http://purl.org/essglobal/vocab/ - and in our Vocabulary Encoding Schemes:

We are working hard to have an HTML file that presents all this information in a more structured way. We are facing issues with purl.org since its backoffice service has been down for many months and we still need to finish some configurations of these resources.

I would like you to call the Application Profile that the ESSGlobal task-force has developed “DCAP-SSE” (meaning Dublin Core Application Profile for the Social and Solidarity Economy - see [1]) or the metadata application profile for the SSE. It is not to be confused with the ESSGlobal RDF Vocabulary; they are two different things.

Regards, Mariana

[1] http://dcevents.dublincore.org/IntConf/dc-2015/paper/view/372/361


(Bhaugen) #8

@mariana said: Please don’t use maltas.org link but the purl.org permanent link http://purl.org/essglobal/standard/activities

Thanks for the correction. I clicked on a link from http://solidarityeconomics.org/2016/02/16/ise-strategy-for-data/ and maltas.org is where I was redirected to. I should have copied the URL from the purl link. Will do so, should I have another similar occasion.


(Matt Wallis) #9

Many thanks for your comments @mariana.

You are not the first person to comment on my mention of Google. I think I need to change it! I mentioned Google as an example that most people will have encountered that demonstrates the connection between distributed open data and something useful being presented to a user. I don’t want to assume much technical know-how among the readers of the ISE Strategy for Data - but it would seem that the Google example is ringing alarm bells with readers who have deep technical knowledge!

If anyone can provide a different example that avoids the problems with the Google example, I’ll replace it in the strategy article.

We want the data to be consumed by machines (which might then present it to humans, on maps, for example).

Yes. We’ll do that in a different article - one that gets into more depth. The ISE Strategy for Data is very much an overview of what we’re doing. The detail will follow, once we get there. I have made a few changes to the section headed “Standard Vocabulary” - please let me know, @mariana, if you’re happy with it.

That sounds like a major pain for you! I’m new to implementing linked data systems - will this affect us when we start using DCAP-SSE?


(Josef Kreitmayer) #10

@mariana,

could you share a link to the latest human readable ESS global categories?

I am not so much aware of the LOD / RDF vocabularies, but am currently building the human-readable categories / taxonomy for the SSEDAS project.

It is a category system for the SUSY-map of SSE initiatives in 23 European countries, based on the work with 26 partners in these countries.

Also @jnardi is in that conversation (not in the latest part, as the first draft of the taxonomy is due this week, and I was unwell for several days, so we have a slight delay)

I think the thread here started by @mattw and the work of the Institute of Solidarity Economics is highly interesting, even though I do not fully understand the technical aspects.

Please have a look at the conversation here, and mark it as “watching”, so you get updates on the conversation: combining, comparing, merging, developing onward our existing vocabularies


(Matt Wallis) #11

There are several human-readable docs on http://purl.org/essglobal/wiki.

The presentation of the vocabulary (machine or human readable, or both, for certain humans!) is a separate issue from its content (as I’m sure you already know!). I would vote for using the content of DCAP-SSE and the ESSGlobal vocabulary as a starting point for the SSEDAS project, even if there is not currently a plan to publish it in a machine-readable way. That way, we are verifying or improving what is already there.

I am pleased :slight_smile:


(Mariana Curado Malta) #12

I am sorry for the late reply @mattw. I am literally chasing my work, never seems to be done!

@mattw said: If anyone can provide a different example that avoids the problems with
the Google example, I’ll replace it in the strategy article.

Give the example of Europeana, a very good example!

@mattw said: I have made a few changes to the section headed “Standard Vocabulary” - please let me know,

I think that now it is perfect. Thank you.

@mattw said: That sounds like a major pain for you! I’m new to implementing linked
data systems - will this affect us when we start using DCAP-SSE?

It is indeed. They are sorting it out. The system works as it is, but we can’t log in to configure new things. I am currently working on a file where I list, in a very organised way, the human-readable ESS global vocab categories/properties (as @josefkreitmayer asks), and I can’t then configure it in purl.org so that a common browser can read it (for a machine it is OK, we have it in RDF/Turtle or RDF/XML, but for a human it is still not implemented). I am also finishing coding the PRODUCT VES, and I will also need to configure it in purl.org.


(Mariana Curado Malta) #13

Again I am sorry for the late reply.
(@jnardi please see this post.)

You can find all the documentation in [1] as @mattw said. I am working on an HTML document (similar to [2]) in order to simplify reading and have everything organised. But purl.org is undermining my work! Let’s hope it comes back soon with its administrator section.

@josefkreitmayer:
I am not so much aware of the LOD / RDF vocabularies, but am currently
building the human-readable categories / taxonomy for the SSEDAS
project.

You should be aware! That is what LOD is all about: interoperability! The more you use LOD vocabs, the more interoperable you will be. If you are really in the SSE context you should consider having a serious look at [1], since it is very serious work that we (RIPESS) have done over 5 years, and are still doing. It would be very important for the SSE community to re-use most of this work (DCAP-SSE, ESSGlobal vocab and VES’) since you would be interoperable with RIPESS and the capitalist world - that is also very important, don’t you think?

Regards, Mariana
[1] http://purl.org/essglobal/wiki
[2] http://dublincore.org/documents/dcmi-terms/


(Matt Wallis) #14

Thanks, @mariana, great video. I have now included it in our ISE Strategy for Data posting.

I agree absolutely. @josefkreitmayer, LOD vocabularies contain human-readable strings - they are important even if you are not dealing directly with the syntax for making these strings machine-readable. Re-using a vocab means that we all use the same words to mean the same things :slight_smile: . @jnardi, I seem to remember that ESSGlobal and SSEDAS were brought up in the same breath at the Panorama meeting in Florence. I can’t remember by whom, but I thought it was someone directly involved in SSEDAS who said that ESSGlobal was being used in that project. Maybe I am mistaken?


(Josef Kreitmayer) #15

@mariana,

thank you for taking the time, especially as you mention that this resource seems to be scarce for you at the moment.

As I fully agree with you on the importance of enabling interoperability, I tried really hard to understand, and therefore took about 2 hours to sort out the English human-readable aspects of the wiki files. Just finished before the weekend. I produced it in an Excel file and marked all those I think could be connected with SSEDAS taxonomy items.

Attached is the respective Excel file.
I also copied it into a Google spreadsheet to have it directly online (there the marked items are yellow).

In conversation with @almereyda we found out that it is important to get an understanding of the different levels at which the various ontologies look at the world. He has an excellent book about geodata with a great article about geo-semantics. The different ontologies have very different perspectives on the world, with different purposes. @almereyda, I will sometimes fall back into calling it taxonomy to not produce a breach of nominators.

  • Most of the ESS Global items, especially “product and services” and “activities”, serve a macroeconomic perspective, and are in themselves already aggregators more than tags. This perspective is on macroeconomic clusters of economic activities.

  • The SSEDAS taxonomy draft is more focused on the individual initiative, with denominators specific to SSE initiatives (co-housing, community gardens, give away shop, …) that describe the activity itself on the ground. Some of them already integrate SSE qualifiers (community gardens, for example); others, such as shops, need to be combined with qualifiers.

  • I could integrate into the SSEDAS taxonomy draft the following aspects of the SSE vocabulary:

      • some of the qualifiers
        (many qualifiers came from the SSEDAS partners, and I tried my best to integrate them)

      • some of the labour aspects
        (to be honest, at that point I had only got the files from http://www.maltas.org/wiki-essglobal/doku.php and was not aware of the following item, which I think is a different animal: http://vowl.visualdataweb.org/webvowl/#iri=http://purl.org/essglobal/vocab/ )

      • form of organization
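
The kind of marking-up described above (noting which local taxonomy items could be connected to an existing shared vocabulary) can also be kept as a small crosswalk table. The following is only a sketch: the identifiers on the right-hand side are invented placeholders, not real ESSGlobal or SSEDAS terms, and `map_labels` is a hypothetical helper.

```python
# Hypothetical crosswalk: local taxonomy label -> shared concept identifier.
# The identifiers are placeholders, not terms from any published vocabulary.
CROSSWALK = {
    "community garden": "shared:concept-1",
    "give away shop": "shared:concept-2",
}


def map_labels(local_labels, crosswalk=CROSSWALK):
    """Split local labels into those with a shared concept and those without."""
    mapped = {}
    unmapped = []
    for label in local_labels:
        key = label.strip().lower()  # tolerate case and stray whitespace
        if key in crosswalk:
            mapped[label] = crosswalk[key]
        else:
            unmapped.append(label)
    return mapped, unmapped


if __name__ == "__main__":
    mapped, unmapped = map_labels(["Community garden", "co-housing"])
    print("mapped:", mapped)
    print("still to discuss:", unmapped)
```

The unmapped list is arguably the useful output here, since it shows where the two vocabularies still need to be compared by people.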


(Mariana Curado Malta) #16

The main “animal” is our DCAP-SSE - http://purl.org/essglobal/wiki

Once we had defined this DCAP we realised that we needed to define terms in order to describe the model. That is how the ESSGlobal RDF Vocab and the controlled vocabularies (Products & Services, Qualifiers, etc.) appeared.

Please have a look at the HTML human-readable document here [1] - it has it all. It is a draft; I am working on the Products & Services (many items!) - this Products VES still needs to be coded in RDF/XML. Does this help you?

Mariana

[1] https://www.dropbox.com/sh/w16zcw6x34fsi7x/AADrhABAfXj1itpqGGrRazvLa?dl=0