The analysis shows: TF/IDF based ranking
yields more accurate results, especially for queries with unordered
address elements or only partially specified addresses.
The paper focused mainly on how the results differ when both are fed unordered addresses. If the problem of unordered addresses does not exist (as is the case in our Excel template), Nominatim and Elasticsearch do not differ much.
And, more importantly, they only tested with addresses that are present in OSM. My main conclusion from the paper for us is that Elasticsearch doesn't perform any better if the queried address isn't present in OSM.
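To illustrate why term-based ranking is insensitive to the order of address elements, here is a minimal sketch of a TF/IDF-style scorer in plain Python. It is not the scoring used by Elasticsearch or by the paper; the function names (`tokenize`, `rank_addresses`) and the sample addresses are made up for illustration, and the IDF formula is just one common variant.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; commas are treated as separators.
    return text.lower().replace(",", " ").split()


def rank_addresses(query, addresses):
    """Return (address, score) pairs sorted by a simple TF/IDF score.

    Hypothetical helper: each address is treated as a bag of terms,
    so the order of terms in the query cannot affect the score.
    """
    docs = [Counter(tokenize(a)) for a in addresses]
    n = len(docs)
    # Document frequency: in how many addresses does each term occur?
    df = Counter(term for doc in docs for term in doc)

    # Sorting the unique query terms makes the summation order deterministic.
    query_terms = sorted(set(tokenize(query)))

    def score(doc):
        total = 0.0
        for term in query_terms:
            if term in doc:
                idf = math.log(1 + n / df[term])  # rarer terms weigh more
                total += doc[term] * idf
        return total

    scored = [(a, score(d)) for a, d in zip(addresses, docs)]
    # sorted() is stable, so ties keep their original order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample data, not taken from OSM or the paper.
    sample = [
        "Hauptstrasse 5, 10115 Berlin",
        "Hauptstrasse 5, 80331 Muenchen",
        "Bahnhofstrasse 12, 10115 Berlin",
    ]
    for address, s in rank_addresses("Berlin 10115 5 Hauptstrasse", sample):
        print(f"{s:.3f}  {address}")
```

A partially specified query such as "Muenchen Hauptstrasse" still ranks the matching address first here, because the rare term "muenchen" gets a higher weight than the common "hauptstrasse". Note that none of this helps if the address is not in the index at all.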
Btw.: I hate videos. You cannot search them; you have to invest THE WHOLE 30 min to find out that there is nothing important in it for you :-/
So what was in the video:
They use the following sources for geocoding:
- OpenAddresses, which we also plan to use (PD license)
- Quattroshapes, only neighborhood data (which is CC-BY, so we cannot use it because OSM requires PD)
- Geonames, mainly places like villages (which is also CC-BY 3.0)