Visual Crossing Blog

Using distance-based heat maps and hotspots for improved data analytics

Heat maps offer users analytical capabilities

Heatmaps and hotspots can be of great value to business users, but they often get pushed aside in the GIS world as being statistically insignificant. The complaints about heatmaps relate to smoothing algorithms and to the interpolation of data points, neither of which is based on real data. This is true in many cases, but here we discuss the Visual Crossing implementation of heatmaps, why it is different, and how it enables our customers to gain significant business value.

Terms

First, a note about hotspots vs. heatmaps. These terms are often used interchangeably, which is incorrect. Those who know the difference tend to side with hotspots because of their statistical significance over the guesswork and estimation associated with heatmaps. From a Visual Crossing perspective, here are the definitions of the two visualizations:

Hotspots: A fixed-grid algorithm that breaks the entire landscape into a grid of cells of a set size and shape (squares, hexagons, etc.). The data enclosed in each cell is then aggregated to show the exact value of that cell. In some cases, you can choose to include value from neighboring cells to improve the overall value of any given cell.

Heatmaps: Point-based ranges of influence that overlap and aggregate value to give the user the potential worth of a specific location. The value of every pixel is calculated from the overlapping influence of all other points.

Real-world Need

Let's take a simple example of a retailer that is looking for the best location for a new store. To do this intelligently, they would like to base their decision on concrete data. There are a couple of ways to do this. One is to use demographics for a well-known geopolitical boundary set, such as income data by Zip Code or, for more granularity, by Census Block. The other is to use point-level data, which may come from internal sources collected over time.

NOTE: As you consider which solution is best, keep an eye on the teardrop pins that show prospective locations. Which location would you choose for your store?

Geopolitical Boundaries Solution

Let's start with Zip Code data, whether it is an aggregate of point-level data or a generalized metric that comes with a demographic set. In either case, the final metric value that thematically colors the Zip Code effectively declares that all store locations within that Zip Code are the same. This may be fine for a general analysis, but is it the best we can do?

Hotspots Solution

Hotspots are similar in nature to geopolitical boundaries, with a couple of extras. First, you can set the granularity to counter the overgeneralization of each shape. This is great, but the more granular you go, the more the visualization loses its effect. Who wants to put a store where the hot market is 10 meters x 10 meters? As you look around your preferred cell, you may see a mix of colors and performance in the neighboring cells. Your mind cannot easily aggregate those to give an exact number.

So now we can use a second function of hotspots and ask it to interpolate value from the surrounding areas. The more surrounding area we ask it to include, the more our cells start to look like Zip Codes in size and value. It is certainly an improvement over Zip Codes, but it doesn't quite get to the heart of our analysis.
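To make the two hotspot functions concrete, here is a minimal Python sketch. It is an illustration only and not the Visual Crossing implementation: the function name grid_hotspots, the neighbor_rings parameter, the square cells, and the planar (x, y) coordinates are all assumptions made for this example.

```python
from collections import defaultdict


def grid_hotspots(points, cell_size, neighbor_rings=0):
    """Aggregate (x, y, value) points into square cells of side cell_size.

    Hypothetical helper for illustration. With neighbor_rings=0 each cell
    holds the exact total of the points inside it. A larger value also adds
    the totals of every cell up to that many rings away, which is a simple,
    unweighted way of borrowing value from neighboring cells.
    """
    totals = defaultdict(float)
    for x, y, value in points:
        cell = (int(x // cell_size), int(y // cell_size))
        totals[cell] += value

    if neighbor_rings == 0:
        return dict(totals)

    # Only cells that contain at least one point are reported.
    widened = {}
    for cx, cy in totals:
        total = 0.0
        for dx in range(-neighbor_rings, neighbor_rings + 1):
            for dy in range(-neighbor_rings, neighbor_rings + 1):
                total += totals.get((cx + dx, cy + dy), 0.0)
        widened[(cx, cy)] = total
    return widened


# Made-up customers: (x, y, purchase value)
customers = [(0.5, 0.5, 100.0), (0.7, 0.2, 50.0), (1.5, 0.5, 30.0)]
exact = grid_hotspots(customers, cell_size=1.0)
widened = grid_hotspots(customers, cell_size=1.0, neighbor_rings=1)
```

With neighbor_rings set to 1, the two adjacent cells end up with the same combined total, which is the "cells start to look like Zip Codes" effect described above.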

Visual Crossing Heatmap Solution

To explain how we create our heatmaps, let's start this exercise with a single point of data: a customer who has purchased $XXX from our other locations in the past. Using other Visual Crossing tools, we can see that our customers are willing to travel 7 miles to get to a location. For this single point, I can draw a 7-mile radius around the data point and state with some certainty that if I were to place a store within that radius, the customer would come to our location.

Now picture two customers whose influence radii overlap. If you can place a store in a location that could attract both customers, this would be your ideal location. This location is identified by the intersection of the two circles that represent the customers' influence. That intersection needs to be drawn so that it shows the aggregated value of both customers, and you can see from the map image that it is.

As we add more customers to influence the location, the visualization gets more complex.

What if we had thousands of points? We would need an algorithm that overlaps all the areas of influence, aggregates the values, and thresholds the results by color. When we see a nice blue anywhere in our area, we know that it represents the highest amount of customer value that can reach a location from within 7 miles. Unlike a dark blue Zip Code or an interpolated hotspot, every place within the blue area offers the same value.
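The idea of overlapping areas of influence can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Visual Crossing algorithm: the names location_value and heatmap_grid are hypothetical, coordinates are planar and measured in miles, every customer counts fully inside the radius and not at all outside it, and the color thresholding step is left out.

```python
import math


def location_value(x, y, customers, radius=7.0):
    """Total value of all customers whose travel radius reaches (x, y).

    customers is a list of (x, y, value) tuples in planar miles (assumed).
    """
    return sum(
        value
        for cx, cy, value in customers
        if math.hypot(cx - x, cy - y) <= radius
    )


def heatmap_grid(customers, x_min, x_max, y_min, y_max, step, radius=7.0):
    """Evaluate location_value on a regular grid of "pixels".

    Returns a list of rows, each row a list of aggregated values.
    """
    nx = int((x_max - x_min) / step) + 1
    ny = int((y_max - y_min) / step) + 1
    return [
        [
            location_value(x_min + i * step, y_min + j * step, customers, radius)
            for i in range(nx)
        ]
        for j in range(ny)
    ]


# Two made-up customers 10 miles apart, worth 100 and 200.
customers = [(0, 0, 100), (10, 0, 200)]
row = heatmap_grid(customers, 0, 10, 0, 0, step=5)[0]
# row holds the values at x = 0, 5 and 10; the midpoint is within
# 7 miles of both customers, so it receives the combined value.
```

A brute-force loop like this checks every customer for every pixel, so a real implementation with thousands of points would likely need spatial indexing or a similar optimization.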

The result is a pixel-by-pixel look at the value of any location on the map that takes into account all other locations surrounding it. This is statistically significant. There is no guessing and no interpolation of data. It is not possible to get a more granular analysis. Now that we know the value of every pixel on the map, which location would you choose?

For this particular analysis, we can see that the yellow (moderate cost) pin representing a prospective site within the blue hotspot is the clear winner, as it has a high potential for bringing in value from within 7 miles.

Influence Types

Regarding the knock on other heatmaps that they are not based on statistical influence: we offer two styles of influence, distance-based and pixel-based. The latter is not statistically significant and is there simply to produce a good-looking heatmap. Sometimes the customer prefers a pretty picture, and sometimes it serves as a good estimate to narrow down the area before the real distance-based analysis. Depending on your need, you may opt for one or the other, or for a hybrid approach.

Discussion on Metric Values

One thing that many heatmap implementations don't take into account is the value of a point. Most generic heatmaps treat every point as equal, so you are mostly counting points or measuring access to them. Visual Crossing allows you to use any metric value in your heatmap influence. Not all customers are the same to a business. You bring the metric of your choice and we will use those values in the computation of pixel area values.
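The difference between counting points and weighting them by a metric can be shown with a small Python sketch. The influence function and its use_metric flag are hypothetical names for this example, the data is made up, and coordinates are assumed to be planar miles.

```python
import math


def influence(x, y, points, radius=7.0, use_metric=True):
    """Aggregate the points within radius of (x, y).

    With use_metric=False every point counts as 1 (a generic heatmap).
    With use_metric=True each point contributes its own metric value.
    """
    total = 0.0
    for px, py, metric in points:
        if math.hypot(px - x, py - y) <= radius:
            total += metric if use_metric else 1.0
    return total


# Three small customers near site A (0, 0), one large customer near site B (20, 0).
points = [(1, 0, 50.0), (0, 1, 50.0), (-1, 0, 50.0), (21, 0, 1000.0)]

count_a = influence(0, 0, points, use_metric=False)   # 3 points in range
count_b = influence(20, 0, points, use_metric=False)  # 1 point in range
value_a = influence(0, 0, points)                      # 150.0 in purchases
value_b = influence(20, 0, points)                     # 1000.0 in purchases
```

By count, site A looks three times better than site B. By purchase value, site B is the stronger location, which is the distinction a metric-aware heatmap is meant to capture.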

Availability of Locations

One nice feature of having pixel-by-pixel knowledge is that you can overlay all available real estate locations onto your heatmap. This allows you to compare the cost of a property against the value of its area. This comparison can get lost in boundary analysis because you are never quite sure of the value of a specific location. With our heatmaps, if you find a bargain piece of real estate in a hot location, that may be your best bet.
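One possible way to compare cost against value is to score each candidate property by the customer value it can reach per unit of cost. The Python sketch below is an assumption-laden illustration: rank_sites and value_at are hypothetical names, value per cost is just one of many possible scoring rules, and the sites and customers are made up.

```python
import math


def value_at(x, y, customers, radius=7.0):
    """Total value of customers within radius of (x, y), in planar miles."""
    return sum(
        value
        for cx, cy, value in customers
        if math.hypot(cx - x, cy - y) <= radius
    )


def rank_sites(sites, customers, radius=7.0):
    """Rank candidate sites by reachable value per unit of cost.

    sites is a list of (name, x, y, cost) tuples. Returns a list of
    (name, value, value_per_cost) tuples, best ratio first.
    """
    scored = []
    for name, x, y, cost in sites:
        value = value_at(x, y, customers, radius)
        scored.append((name, value, value / cost))
    return sorted(scored, key=lambda s: s[2], reverse=True)


customers = [(0, 0, 100), (3, 0, 200), (20, 0, 50)]
sites = [("A", 1, 0, 1000), ("B", 1, 1, 500), ("C", 20, 1, 100)]
ranking = rank_sites(sites, customers)
```

In this made-up data, sites A and B reach the same customers, but B costs half as much, so it ranks first. That is the "bargain in a hot location" case.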

Summary

Just to be clear, we are not saying that heatmaps are the best solution for every scenario. In fact, our hotspot visualizations provide great value as well. But we want to make two things clear: 1) heatmaps are statistically significant, and 2) not all heatmap implementations are the same. At Visual Crossing we try to give you options and leave the choice up to the analyst, who knows their data best. Whether you are working in MicroStrategy, Microsoft Excel, SAP Lumira, or SAP Lumira Design Studio, our heatmaps provide all of our users with the same analysis. So bring us your lists of points and we will help you make sense of the data.