Ahhhhhh! I’ve been blogging for over a week now on mostly non-technical stuff. Need to get my hands dirty a bit to keep it a fun exercise. In the last post we identified a great opportunity for a mini-app to search supermarkets near a certain location. Let’s see how we could actually make that happen. Not that I’ll be able to dive into code right away, but we’ll at least do some conceptual technical work.
Wouldn’t it be great to have an API on the Programmable Web that would return me the 10 supermarkets closest to a specific location? Yeah right, dream on! At best we might find a service that does something like that for the US or UK. Google then? I did a quick check on my home location, and Google returns a decent number of supermarkets in the vicinity. But 3 of the results are not actual supermarkets you can visit to buy things, but rather administrative addresses of businesses that are probably related to supermarkets. And the results are missing a big new shop that opened half a year ago near my city. The quality of the data returned is not adequate by my standards.
Actually, the only place I can find a clear reference to the new shop location is the website of the supermarket chain itself. A single page on that website lists all the store locations for the chain, and I couldn’t resist the temptation to look at the HTML source of the page. This page contains all the data I need for this specific chain, including latitude and longitude, wow! Creating a little page visitor script that periodically pulls down the page and inspects it for changes in the list would be a great way to keep up to date automatically. And the owners would not mind, right? I’m doing them a favor by distributing their shop locations more openly and keeping them up to date. Well, when we get there I will let them know anyway, and see how they react.
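To make the page visitor idea a bit more concrete, here is a minimal sketch in Python. It assumes a hypothetical markup in which every store is an element with `data-lat` and `data-lng` attributes and the store name as its text; the real page of any chain will look different, so the parser is an illustration only. The names `StoreListParser`, `parse_stores` and `diff_stores` are made up for this example.

```python
from html.parser import HTMLParser


class StoreListParser(HTMLParser):
    """Collects stores from elements with data-lat / data-lng attributes.

    The attribute names are an assumption; adapt them to the real page.
    """

    def __init__(self):
        super().__init__()
        self.stores = {}     # store name -> (latitude, longitude)
        self._coords = None  # coordinates of the element being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "data-lat" in attributes and "data-lng" in attributes:
            self._coords = (
                float(attributes["data-lat"]),
                float(attributes["data-lng"]),
            )

    def handle_data(self, data):
        name = data.strip()
        if self._coords is not None and name:
            self.stores[name] = self._coords
            self._coords = None


def parse_stores(html):
    """Parse one snapshot of the store page into a name -> coordinates dict."""
    parser = StoreListParser()
    parser.feed(html)
    return parser.stores


def diff_stores(old, new):
    """Return (added, removed, moved) store names between two snapshots."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    moved = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return added, removed, moved
```

A scheduled job could download the page (for example with `urllib.request.urlopen`), call `parse_stores` on the result, and compare it to the previously stored snapshot with `diff_stores`. Only when one of the three lists is non-empty does the shop data need updating.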
Hmmmm, but if we want to repeat this way of working for the other brand chains we need to at least scope the list of brands we will perform store location indexing on. Looking at our information model, the supermarket brand is also an entity that has no further dependencies, thus is a good place to start.
Supermarket brands in scope
How to get a list of active supermarket brands in the Netherlands? That was quite simple. Googling ‘List of Supermarkets’ quickly pointed me to Wikipedia’s List of supermarket chains in the Netherlands. Since the list is quite small and static, it does not make sense to put a lot of effort into automating the extraction of such a list from the Wikipedia page, or from any other data source for that matter. I would rather just copy the list of brands, which directly gives me a nice means to control the scope of brands the app is going to support. After improving the quality of the Wikipedia list by cross-referencing some other sites with similar lists of Dutch brands, this is going to be my initial brand scope:
- Albert Heijn
- Bas van der Heijden
- Dirk van den Broek
- Jan Linders
During this exercise it has also become apparent that some of these brands have different so-called formulas. Albert Heijn has the AH XL formula, Coop has Compact and Super formulas. This might influence the information model if the formula has implications for the price of products, because that would invalidate our assumption that a brand has one price for a product. We need to find this out at a later stage. If the model does need to change, we can easily handle it by renaming ‘brand’ to ‘brand formula’ and coupling each price to a formula instead of to the brand directly.
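As a rough sketch of what that model change could look like, the Python below couples a price to a brand formula rather than to a brand. Everything here is an assumption for illustration: the class and function names are made up, the fallback from a specific formula to the brand's default formula is just one possible design, and no real prices are involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrandFormula:
    brand: str
    formula: str = ""  # empty string stands for the brand's default formula


@dataclass(frozen=True)
class Price:
    formula: BrandFormula
    product: str
    cents: int


def price_for(prices, brand, formula, product):
    """Look up a price for a formula, falling back to the brand default.

    Returns None when neither the formula nor the brand has a price.
    """
    index = {(p.formula, p.product): p.cents for p in prices}
    for key in (BrandFormula(brand, formula), BrandFormula(brand)):
        if (key, product) in index:
            return index[(key, product)]
    return None
```

If it turns out that formulas never affect prices, every price simply hangs off the default formula and the model behaves exactly like the original ‘brand has a price for a product’ assumption.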
It also became clear that the brands are owned by a number of bigger holdings, and that these holdings buy their products collectively through about four big purchasing conglomerates. But this is typically invisible from a consumer perspective, and thus not relevant for our app. So we can fill in the 25 brands in a table. And we can start to make a visitor per brand that will retrieve the store locations for us. This is estimated to return about 4000 store locations in total within our scope, which confirms that the shop entity needs to be kept up to date mechanically. You don’t want to maintain that amount of data by hand 🙂
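One way to organize ‘a visitor per brand’ is a small registry that maps each brand in scope to the function that knows how to read that brand's store page. The Python sketch below is only an outline under my own assumptions: `Shop`, `visitor`, `VISITORS` and `collect_shops` are hypothetical names, and the Albert Heijn visitor is an empty placeholder rather than a working scraper.

```python
from typing import Callable, Dict, List, NamedTuple


class Shop(NamedTuple):
    brand: str
    name: str
    lat: float
    lng: float


Visitor = Callable[[], List[Shop]]
VISITORS: Dict[str, Visitor] = {}  # brand name -> visitor function


def visitor(brand):
    """Decorator that registers a function as the visitor for one brand."""
    def register(func):
        VISITORS[brand] = func
        return func
    return register


@visitor("Albert Heijn")
def visit_albert_heijn():
    # Placeholder: a real visitor would download and parse the store page.
    return []


def collect_shops(brands_in_scope):
    """Run the visitor of every brand in scope.

    Returns the collected shops plus the brands that have no visitor yet.
    """
    shops, missing = [], []
    for brand in brands_in_scope:
        visit = VISITORS.get(brand)
        if visit is None:
            missing.append(brand)
            continue
        shops.extend(visit())
    return shops, missing
```

The list of brands without a visitor doubles as a to-do list: the brand table defines the scope, and `collect_shops` shows which brands still need their own visitor.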
With that information available, we’re going to find a way to determine the shops closest to the user’s location … in the next post!