The ETL (extract, transform, load) hub is relevant when it comes to aggregating data from various sources.
ETL describes the tool chain and process of extracting mapping data from various partners' databases, transforming it into a format that another application can process, and loading it into the aggregation system.
There are basically three possibilities (that we have found so far)
to build an ETL serving our purpose of map aggregation:
1.1. extract by scraping based on heuristics + manually written adapters customized to the partners' database structures.
- Heuristics analyse the content and scrape the relevant data; manually written adapters fit the database structure of the partner commoning the data.
- The data then gets copied into a queue and uploaded at regular (e.g. daily) intervals.
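The steps of option 1.1 can be sketched roughly as follows. This is a minimal illustration, not an existing implementation; all names (`PartnerAAdapter`, `stage`, `upload_batch`) and the heuristic used are assumptions:

```python
import queue

# Sketch of option 1.1: a hand-written adapter per partner, plus a queue
# that is drained at fixed (e.g. daily) intervals. Names and the
# "has coordinates" heuristic are illustrative assumptions.

class PartnerAAdapter:
    """Adapter customized to one partner's database structure."""

    def extract(self, raw_rows):
        # Heuristic: keep only rows that look like map entries
        # (here: anything carrying coordinates).
        for row in raw_rows:
            if "lat" in row and "lon" in row:
                yield {
                    "name": row.get("title", "unknown"),
                    "lat": row["lat"],
                    "lon": row["lon"],
                }

upload_queue = queue.Queue()

def stage(adapter, raw_rows):
    """Copy extracted records into the queue for the next upload run."""
    for record in adapter.extract(raw_rows):
        upload_queue.put(record)

def upload_batch():
    """Drain the queue; would run at a fixed (e.g. daily) interval."""
    batch = []
    while not upload_queue.empty():
        batch.append(upload_queue.get())
    return batch  # in reality: send this to the aggregation system

stage(PartnerAAdapter(), [
    {"title": "Repair Café", "lat": 52.52, "lon": 13.40},
    {"title": "no coordinates here"},
])
result = upload_batch()
print(result)  # → [{'name': 'Repair Café', 'lat': 52.52, 'lon': 13.4}]
```

The real heuristics and upload transport would of course be far more involved; the point is only that each partner needs its own adapter code.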
1.2. extract by scraping based on heuristics + a standard adapter that partners can structure their database for.
- Heuristics analyse the content and scrape the relevant data; we build one (or several?) standard adapter(s).
- Partners find their own way to build, structure, or restructure their database to match.
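Option 1.2 could look roughly like this: one shared adapter with an agreed target schema that partners restructure their databases toward. The field names and the schema check are assumptions for illustration only:

```python
# Sketch of option 1.2: a single standard adapter. Partners adapt their
# exports to an agreed schema; the adapter accepts matching rows and
# reports the rest back. Field names are illustrative assumptions.

REQUIRED_FIELDS = {"name", "lat", "lon", "category"}

def standard_adapter(rows):
    """Accept any partner export that already matches the agreed schema."""
    accepted, rejected = [], []
    for row in rows:
        if REQUIRED_FIELDS <= row.keys():
            accepted.append({k: row[k] for k in REQUIRED_FIELDS})
        else:
            rejected.append(row)  # partner still needs to restructure these
    return accepted, rejected

accepted, rejected = standard_adapter([
    {"name": "Food Coop", "lat": 48.13, "lon": 11.58, "category": "food"},
    {"title": "legacy entry", "x": 1, "y": 2},
])
print(len(accepted), len(rejected))  # → 1 1
```

The design trade-off versus option 1.1: we maintain one adapter instead of one per partner, but the restructuring effort shifts to the partners.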
2. we convince partners to store their data in an interoperable format (GeoJSON) from the start.
- As the data is stored in a standard format in the first place, it does not need to be transformed.
- The data is right away available for aggregation and exchange.
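To illustrate why option 2 removes the transform step entirely: if every partner publishes GeoJSON, aggregation reduces to merging FeatureCollections. The partner data below is invented for illustration; note that GeoJSON coordinates are ordered [longitude, latitude]:

```python
import json

# Sketch of option 2: partners store data as GeoJSON from the start,
# so aggregation is just merging FeatureCollections. Example features
# are invented for illustration.

partner_a = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [13.40, 52.52]},
        "properties": {"name": "Repair Café"},
    }],
}
partner_b = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [11.58, 48.13]},
        "properties": {"name": "Food Coop"},
    }],
}

def aggregate(*collections):
    """Merge GeoJSON FeatureCollections directly; no transform needed."""
    return {
        "type": "FeatureCollection",
        "features": [f for c in collections for f in c["features"]],
    }

merged = aggregate(partner_a, partner_b)
print(len(merged["features"]))  # → 2
```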