Using Weather Data in Business Intelligence – Part 1

How to find weather data that works well with business intelligence analysis

Business intelligence (BI) is the process of analyzing business data to find patterns and trends that can help organizations make better, more intelligent decisions. The goal is to find insights that explain past performance and can be used to drive more efficient and profitable actions going forward. Some of the most valuable business insight comes when external data is correlated with business warehouse data. External data such as historical weather records can shed light on previously inexplicable business trends and patterns, leading to insight that can be applied immediately to business activities. Weather forecast data can then be used to drive future activities and decisions, turning that insight into business actions that lead directly to return on investment (ROI) for the BI system.

In this series of three articles, we will first discuss how to find weather data that is compatible with both your existing business data and your existing business intelligence systems. Then, in part two, we will discuss strategies for analyzing historical weather data alongside your own data in your BI tools. Finally, part three will discuss how to apply the insights learned from the historical analysis to make more intelligent decisions using business patterns and weather forecast data.

Finding weather data that is compatible with business intelligence analysis

Many standard sources of weather data are difficult to use for business intelligence analysis. There are several reasons for this. One common reason is that the data is difficult to query in bulk. For business intelligence, you typically need to query thousands of historical weather observations for hundreds of business locations. This query process needs to be automated via an API, and ideally that API can be integrated directly into the business intelligence platform. Another reason is that weather data output formats can be complex, proprietary, and hard to use. You also need a data source whose time range and geographic range match those of your business. And finally, you need a data source that can be easily joined with business data.

The need for automated queries

Many common and free web-based weather services suffer from a manual query process or a query process that is very difficult to automate. For example, in the US, the National Weather Service will supply weather data directly. However, using their web query page, the process is slow and manual. You must manually define your query, add that query as an “order” to your “shopping cart,” supply an email address, submit an “order” for the data, and then wait minutes or longer for an email to arrive containing your results. Alternatively, you can download and parse their source data files directly. However, to take this route, you must be ready to download gigabytes of data for every day that you need, write code to process their custom format, extract the required records, normalize the data, and so on.

Another common provider of weather data is consumer-level websites. These sites typically allow a user to enter a single location and date. Then, with a series of clicks and a proliferation of ads, they show the weather data in some custom format. Using this type of provider for business intelligence would involve making the queries one by one and then screen-scraping the results into some usable format. And as soon as the website changes its format, the entire scraping and data extraction process needs to be rewritten.

Instead, for business intelligence use, you need a commercial-quality weather data provider that offers a fully automated API. This API needs to deliver results for hundreds of thousands of locations and dates in seconds. In addition, this API needs to be well documented and stable so that it can be used reliably within an enterprise system.
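To make the idea of an automated bulk query concrete, here is a minimal Python sketch that generates one request URL per business location for a date range. The endpoint, the parameter names, and the store list are all hypothetical placeholders chosen for illustration; they are not the API of any particular provider, so check your provider's documentation for the real ones.

```python
from datetime import date
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://weather.example.com/api/history"


def build_queries(locations, start, end, api_key):
    """Return a list of (location_id, url) pairs, one per business location."""
    queries = []
    for loc_id, (lat, lon) in locations.items():
        params = {
            "lat": f"{lat:.4f}",
            "lon": f"{lon:.4f}",
            "start": start.isoformat(),
            "end": end.isoformat(),
            "key": api_key,
        }
        queries.append((loc_id, f"{BASE_URL}?{urlencode(params)}"))
    return queries


# Made-up store coordinates.
stores = {
    "store-001": (34.0522, -118.2437),
    "store-002": (40.7128, -74.0060),
}
queries = build_queries(stores, date(2019, 1, 1), date(2019, 1, 31), "YOUR_KEY")
for loc_id, url in queries:
    print(loc_id, url)
```

In a real pipeline, each URL would be fetched by a scheduled job and the response loaded into the warehouse. The point is that the whole loop runs without a person clicking through a web form.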

A compatible output data format is essential

Another common reason that weather data sources are incompatible with business intelligence use is their output format. We’ve already mentioned data services that provide gigabytes of raw data files in custom formats. We also mentioned the issues with trying to screen-scrape data from sites that are not designed for automation. However, even among commercial weather data solutions, some are much easier to consume in BI than others. Ideally, you want to find a data service that supplies data directly in a format that your BI tool can consume without manual or risky conversion steps.

For this reason, Visual Crossing Weather makes data available in various formats to best meet every customer’s specific needs. For users of Microsoft products such as Excel and others that easily consume data in Excel format, Visual Crossing Weather provides data in the native XLSX format. For tools that prefer to work on raw, tabular data or import the data into a database, data is available in CSV format. For users of BI tools such as SAP or TIBCO that prefer a structured output format, the data results are provided in OData format. And finally, for those who prefer to write a custom wrapper to import the data into a custom app or data gateway, you can choose JSON, a format that can be easily parsed in all modern scripting and coding languages.
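As a rough illustration of why a standard format matters, the Python sketch below loads the same observation from CSV and from JSON using only the standard library and arrives at identical records. The sample payloads and field names (`datetime`, `temp`, `precip`) are made up for this example and are not the actual schema of any provider.

```python
import csv
import io
import json

# Made-up sample payloads; the field names are assumptions, not a real schema.
CSV_TEXT = (
    "datetime,temp,precip\n"
    "2019-01-01T08:00,54.2,0.0\n"
    "2019-01-01T09:00,56.1,0.1\n"
)
JSON_TEXT = '[{"datetime": "2019-01-01T08:00", "temp": 54.2, "precip": 0.0}]'


def normalize(record):
    """Coerce one raw record into a dict with typed numeric fields."""
    return {
        "datetime": record["datetime"],
        "temp": float(record["temp"]),
        "precip": float(record["precip"]),
    }


def rows_from_csv(text):
    """Parse CSV text; every value arrives as a string and is converted."""
    return [normalize(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_json(text):
    """Parse a JSON array of observation objects."""
    return [normalize(row) for row in json.loads(text)]


csv_rows = rows_from_csv(CSV_TEXT)
json_rows = rows_from_json(JSON_TEXT)
print(csv_rows[0] == json_rows[0])
```

With a standard format, the import step is a few lines of library calls. A proprietary or scraped format would need a custom parser that has to be maintained whenever the source changes.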

Data coverage over both geography and time

A third factor that limits weather data use with BI is coverage. To be useful, the weather data provider needs to be able to supply data that fully matches the business data being used for analysis. When considering coverage, one must consider both geographic coverage and temporal coverage. The depth of coverage in each dimension is important as well.

At a minimum, the data provider must be able to supply data that covers the geographic territory in which the business operates. This is fairly obvious. A global business must be able to get quality data for locations all around the world, not just US cities. However, business data often has very specific locations, and getting data for those exact locations may be critical for understanding the underlying business metrics. Consider the City of Los Angeles, for example. The city comprises a large area (over 500 square miles!) and different locations within it often have different weather conditions. A good BI weather data provider should be able to take the exact location of an LA store, down to the street address or even latitude and longitude, and provide data from the closest weather station. Better still, it should interpolate between the best nearby stations to find the most accurate possible weather conditions. The same holds true for customer locations, travel routes, etc.
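The following Python sketch shows one simple way the "closest station" and "interpolate nearby stations" ideas could work: great-circle (haversine) distance to pick the nearest station, and inverse distance weighting to blend readings from several stations. This is a generic textbook approach under stated assumptions, not a description of how any provider actually interpolates, and the station names, coordinates, and temperatures are invented.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_station(lat, lon, stations):
    """Return the station closest to the given point."""
    return min(stations, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))


def idw_temperature(lat, lon, stations, power=2):
    """Blend station temperatures, weighting each by 1 / distance**power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for s in stations:
        d = haversine_km(lat, lon, s["lat"], s["lon"])
        if d == 0:
            return s["temp"]  # the point sits exactly on a station
        w = 1.0 / d ** power
        weighted_sum += w * s["temp"]
        weight_total += w
    return weighted_sum / weight_total


# Invented stations around Los Angeles with made-up temperatures (degrees F).
stations = [
    {"id": "downtown", "lat": 34.05, "lon": -118.25, "temp": 75.0},
    {"id": "coast", "lat": 34.01, "lon": -118.49, "temp": 66.0},
    {"id": "valley", "lat": 34.20, "lon": -118.45, "temp": 84.0},
]
store_lat, store_lon = 34.06, -118.30

closest = nearest_station(store_lat, store_lon, stations)
estimate = idw_temperature(store_lat, store_lon, stations)
print(closest["id"], round(estimate, 1))
```

The interpolated estimate leans heavily toward the nearest station but is nudged by the others, which is the behavior you want for a store that sits between a cool coastal station and a hot inland one.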

Coverage over time is also important for business intelligence analysis. Valuable business data often goes back years or, in some cases, decades. It is not useful to settle for a weather data service that only provides weeks or months of history. A long history window ensures that you can get the maximum value from your existing data and BI analysis.

Depth is also important in the time dimension. Some data services only report weather conditions at a daily or hourly resolution. However, BI transaction data often records an exact minute for each record. The weather conditions at 8am are often very different from those at 6pm. Even the conditions at 8:01 can be different from those at 8:59. For this reason, hourly weather data is typically the minimum for useful business intelligence analysis, and sub-hourly data is even better. Sub-hourly data allows weather conditions to be reported as frequently as every 5 minutes. This high-resolution data can help you distinguish cases such as whether customers are coming in simply to shelter from a rain shower or because the shower has passed and they are ready to get out and shop again.
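To show how minute-level transactions can be matched against sub-hourly weather, here is a small Python sketch that finds the observation closest in time to each transaction using a binary search. The 5-minute observations and the transaction times are invented, and the `nearest_observation` helper is a hypothetical name for illustration.

```python
from bisect import bisect_left
from datetime import datetime


def nearest_observation(observations, when):
    """Return the (time, conditions) pair closest in time to `when`.

    `observations` must be a non-empty list sorted by time.
    """
    if not observations:
        raise ValueError("no observations to match against")
    times = [t for t, _ in observations]
    i = bisect_left(times, when)
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = observations[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda obs: abs(obs[0] - when))


# Invented 5-minute observations around a brief shower.
observations = [
    (datetime(2019, 1, 1, 8, 0), "dry"),
    (datetime(2019, 1, 1, 8, 5), "rain"),
    (datetime(2019, 1, 1, 8, 10), "rain"),
    (datetime(2019, 1, 1, 8, 15), "dry"),
]

during_shower = nearest_observation(observations, datetime(2019, 1, 1, 8, 7))
after_shower = nearest_observation(observations, datetime(2019, 1, 1, 8, 14))
print(during_shower[1], after_shower[1])
```

With only an hourly observation, both of these transactions would be tagged with the same conditions. The sketch rebuilds the list of times on every call for clarity; for large volumes you would build it once and reuse it.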

Visual Crossing Weather solves both the geographic and the temporal coverage problem without additional hassle or complexity. The standard weather data API provides weather reports from more than 100,000 reporting stations and can interpolate conditions down to a specific point anywhere in the world. It also offers sub-hourly weather reports that can provide exact conditions from many reporting stations as frequently as every 5 minutes. This helps ensure that the BI user gets the best possible match between their business records and weather conditions.

Data that can be joined easily with business data

If a weather data service simply gives you a dump of raw weather records, joining these records back to the business data can be a difficult and error-prone process. A BI-savvy weather data service provider should have a simple way to use an ID from the business data to tag each returned weather record. That makes the matching process simple and ensures that weather conditions are matched with the proper business records.

Visual Crossing Weather solves this problem by allowing an ID field to be passed as part of every weather query both in the web query GUI and the API. In this field, you can pass a unique ID for the business location or the exact business record being queried. This ID will then be passed back in the results and can be used to easily match results to business records.
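Once each weather record carries the business ID, the join becomes a simple key lookup. The Python sketch below illustrates this with plain dictionaries. The field names (`record_id`, `temp`, `conditions`) and the sample rows are assumptions made for this example, not the actual response fields of the service.

```python
def join_weather(business_rows, weather_rows, id_field="record_id"):
    """Left-join weather fields onto business rows using a shared ID."""
    weather_by_id = {w[id_field]: w for w in weather_rows}
    joined = []
    for row in business_rows:
        weather = weather_by_id.get(row[id_field], {})
        extra = {k: v for k, v in weather.items() if k != id_field}
        # Business rows with no matching weather record are kept unchanged.
        joined.append({**row, **extra})
    return joined


# Invented business transactions and tagged weather results.
business_rows = [
    {"record_id": "T-1001", "store": "LA-01", "sales": 250.0},
    {"record_id": "T-1002", "store": "LA-02", "sales": 90.0},
]
weather_rows = [
    {"record_id": "T-1001", "temp": 71.5, "conditions": "Rain"},
]

joined = join_weather(business_rows, weather_rows)
for row in joined:
    print(row)
```

Because the match is on an exact ID rather than on fuzzy location or timestamp comparisons, there is no risk of attaching weather to the wrong record, and any record that received no weather data is easy to spot.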

Next in Part Two

Now that we have identified the features that make a good BI weather data provider, in the second part of our three-part series, we’ll discuss how to analyze our historical business data and historical weather data together to find useful patterns and correlations.