Geodata storage evaluation


#22

The article mentioned in the issue:
Switching user database on a running system - labs.spotify.com


(Jon Richter) #23

Please be aware CartoDB is a web mapping platform that uses PostgreSQL with the PostGIS extension. It is nowhere near being a database itself.


(Michael Maier) #24

I would strongly oppose that. We are not in Spotify’s situation, with 75M users who are extremely sensitive to outages (as they are paying for the service). OSM has 2.3M users, and its DB configuration is very similar to Spotify’s: one write DB, many read DBs. If there is an outage in the write DB, then you can’t edit, that’s true. But all other services remain unaffected. Guaranteeing 100% edit availability is a problem so far in the future that, in our situation with very limited funds, I would not worry about it at the moment.
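To make the "one write DB, many read DBs" argument concrete, here is a minimal in-memory sketch. It is not how OSM or Spotify actually implement replication; the `Node` and `Cluster` classes are hypothetical, and replication is modelled as an immediate copy purely for illustration. It only shows the behaviour described above: when the write DB is down, edits fail, but reads continue to be served from the replicas.

```python
import itertools


class Node:
    """A hypothetical database node holding key/value rows in memory."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.rows = {}


class Cluster:
    """One primary (write) node plus several read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self._round_robin = itertools.cycle(replicas)

    def write(self, key, value):
        # All edits go through the single primary.
        if not self.primary.up:
            raise RuntimeError("primary is down: edits unavailable")
        self.primary.rows[key] = value
        # Assumption: replication is an immediate copy to every live replica.
        for replica in self.replicas:
            if replica.up:
                replica.rows[key] = value

    def read(self, key):
        # Reads are spread over the replicas and never touch the primary.
        for _ in range(len(self.replicas)):
            replica = next(self._round_robin)
            if replica.up:
                return replica.rows.get(key)
        raise RuntimeError("no replica available")


cluster = Cluster(Node("primary"), [Node("replica-1"), Node("replica-2")])
cluster.write("poi:1", "community garden")

cluster.primary.up = False          # simulate an outage of the write DB
still_readable = cluster.read("poi:1")
```

In this sketch a primary outage makes `write` raise while `read` keeps working, which is the trade-off being accepted in the post above.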


(Josef Kreitmayer) #25

So basically CartoDB is an application that uses PostgreSQL as its database?

In very simple terms, would that mean CartoDB = (building on) PostgreSQL, or would there also be other uses of PostgreSQL that would not involve CartoDB?


@almereyda,
what do you think about the PostgreSQL write/read setup?


personally from an absolute non-technician standpoint, I share:


(Michael Maier) #26

First of all, as much as I appreciate your insights, it would have been very helpful if you had attended the meeting, especially as you signed up in the framadata. /rant off.

If we have to implement them ourselves, then I agree. If a solution provides it out-of-the-box, we can take it now.

Could you specify this in detail?

Good point - I will add it to the requirement criteria matrix.

+1 for step-by-step.


Thanks for the reminder, I’ve documented it in this thread.


I’ve started the requirement matrix here in this ethercalc (Just accept the expired certificate).


2016 02 08—21 | Hackathon | Witzenhausen
(Michael Maier) #27

Most high-end FOSS GIS solutions build upon PostgreSQL/PostGIS. For example, the OpenStreetMap stack (the so-called “rails port”, which I would suggest as POI DB) uses PostGIS both for the main DB and for the rendering servers.


(Matt Wallis) #28

I was one of those people, but I am uneasy about the value (and the interpretation here) of my contribution, because I do not properly understand the use to which this database will be put. I seem to have entered the discussion at a point where all that needs to be done is to refine the implementation details, and where the “higher” level questions about use cases and why we are doing what we are doing have already been answered.

For example, in this thread (or links from it), I have found the following:

@species in https://tree.taiga.io/project/transformap/us/71

“Scales for at least 100 million POIs”

This sounds like a big centralized database. I may be wrong.

@gandhiano in https://tree.taiga.io/project/transformap/task/206:

“For me having a decision on the geodatabase to use is necessary until the next hackathon begins on Feb 8th. It is a precondition for allowing a fruitful work with maxlath and to allow us to move forward in shaping and implementing the TM stack.”

The two comments above cannot both be describing the same instance of a geodatabase, as the second requires a decision by Feb 8th.

In the absence of proper understanding, I have come up with some hypotheses:

  • Guess 1: Perhaps my lack of understanding is down to the fact that I am quite new here. If I keep digging through discourse/taiga/mediawiki/1Xmmm/github issues, then I will find the answers.

  • Guess 2: There are two paths being followed: The first is the long term view as expressed by @almereyda. The second is to extend the OSM demo further, partly to work around the limitations of OSM. If this guess is correct, then it would be really useful to be explicit about which path is being followed when the two paths cause conflicting conclusions.

  • Guess 3: We need to get experience of geodatabases in order to realize our long term goals, so let’s start playing with them now in order to get that experience. This will eventually feed back into the long-term vision.

Are any/all of the guesses right?

Is there an external force (e.g. commercial/funding/deliverables etc) that is driving this decision? If so, it would be really useful to acknowledge it because these things usually have an impact on the technology under construction, and it gets really confusing if the existence of such external forces is not made explicit.

On the subject of funding, my participation here is funded by the Institute for Solidarity Economics. One of our goals is to empower initiatives within the Social and Solidarity Economy to get themselves “on the map” (in the widest sense) by providing data about themselves. We aim to do so by eventually providing tools and reference implementations that can drive the creation of distributed Linked Open Data. We do not want to create a centralized database. We want to use the standards and recommendations as described in Linked Data: Evolving the Web into a Global Data Space. We will use (and adapt where necessary) the application profile described by ESSglobal. We hope that in time geographic mapping will be but one example of many applications that use this open data. We have a lot of work to do in order to meet our goal, starting with a more detailed look at ESSglobal, and experimenting with publishing linked data on the web.

So, our goals are closely aligned with the view expressed here:


Pointers to methodic work practices in form of questions
(Michael Maier) #29

Hi @mattw,

sorry for the long delay, the Witzenhausen Hackathon was quite intensive, and we all needed a little rest.

Each of the “forces” in the TransforMap community has a slightly different opinion about “our” database: what it should be used for and what should be stored in it. There was only one common requirement: “we need a database”. The primary use would be to store geodata, but we also need a graph database where we can store our (and others’) taxonomies and interlink them.
The meeting you attended (the goal was to define requirements for a DB) brought us a big step forward in aligning our requirements inside the TransforMap community.

Because “what TransforMap is” is such a big topic, there are several things that TransforMap is expected to provide:

  • Provide a Point of Interest (or places, as we called them during the hackathon) storage for communities that don’t have, or don’t want to develop and take care of, a mapping system themselves.
  • Provide an “extra tag storage” where one can annotate POIs stored elsewhere with one’s own tags (this is one of my personal requirements inside TransforMap) - see my proposal here.
  • Provide a big “cache”, where the POI data of other sites and mapping systems is cached, as backup and for performance reasons. This “cache” or ETL (Extract-Transform-Load) hub, as @almereyda calls it, should also provide a common API for different data sources.
  • Create a machine-readable and interlinked taxonomy system, where different ontologies are stored and linked together (this system was not discussed in the last meeting).
  • And surely something else that others expect from TransforMap and that I forgot :wink:
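As a rough illustration of the “cache”/ETL hub and the “extra tag storage” items in the list above, the following sketch normalises records from two differently shaped sources into GeoJSON-style features and then overlays extra tags without touching the source data. All function names, the identifier scheme (`osm:node/...`, `partner:...`) and the input record layouts are assumptions made up for this example; they do not describe an existing TransforMap API.

```python
def from_osm_node(node):
    """Transform a hypothetical OSM-style node dict into a GeoJSON Feature."""
    return {
        "type": "Feature",
        "id": f"osm:node/{node['id']}",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [node["lon"], node["lat"]]},
        "properties": dict(node.get("tags", {})),
    }


def from_partner_row(row):
    """Transform a hypothetical partner spreadsheet row into the same shape."""
    return {
        "type": "Feature",
        "id": f"partner:{row['uid']}",
        "geometry": {"type": "Point", "coordinates": [row["x"], row["y"]]},
        "properties": {"name": row["title"]},
    }


class TagOverlay:
    """Extra tags kept separately, keyed by the id of a POI stored elsewhere."""

    def __init__(self):
        self._tags = {}

    def annotate(self, poi_id, key, value):
        self._tags.setdefault(poi_id, {})[key] = value

    def apply(self, feature):
        # Return a copy with overlay tags merged in; the source stays unchanged.
        merged = dict(feature)
        merged["properties"] = {
            **feature["properties"],
            **self._tags.get(feature["id"], {}),
        }
        return merged


# The "cache": every source ends up behind one common representation.
cache = [
    from_osm_node({"id": 42, "lat": 51.34, "lon": 9.86, "tags": {"name": "Repair Cafe"}}),
    from_partner_row({"uid": "a7", "x": 16.37, "y": 48.21, "title": "Food Coop"}),
]

overlay = TagOverlay()
overlay.annotate("osm:node/42", "example_tag", "yes")
annotated = [overlay.apply(feature) for feature in cache]
```

Keeping the overlay separate from the cached features means the external source remains the authority for its own data, while additional tags can be added or dropped independently.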

As I am used to working with worldwide datasets at this scale (OpenStreetMap), I only wanted to include future-proof, scalable solutions here. This is not meant as a jab at decentralisation.

I would say it is more the lack of documentation of our two-year-long process that makes it difficult to follow :wink:

Actually, these two should merge in the long term. The approach of the demo maps is currently to use OSM as the primary data source (but in theory they can query our recently set up API too).

We need at least some playground for our API to work on, that is also true.


Yes, there are actually some deadlines, some from the CHEST project, and some for the SUSY (or SSEDAS) project, where some members of TransforMap are active to provide the mapping part for SSEDAS with TransforMap technology.

Especially for SSEDAS we need to provide a POI storage, because not every one of the 26 partners involved has or can manage their own database of POIs.


Pointers to methodic work practices in form of questions
(Jon Richter) #30

During

I had also been made aware of the possibility of Wikibase as a geodata store by a geo visualisation of the query interface.