Weather and Flood Data Services | CSC ClimatEdge
CSC's ClimatEdge offering provides probabilities about climate and extreme weather using public government data sets. This gives customers a distinct advantage because they can base strategic business decisions on well-honed climate probabilities. This is of paramount interest to industries such as insurance, which seek weather risk management to maintain profitability.
This paper, A Big Data Approach to ClimatEdge, compares the tornado and flood data service offerings of ClimatEdge against the big data attributes of velocity, volume and variety, concluding that ClimatEdge requires big data technology and expertise to handle flood data (tornado data does not meet big data criteria). A big data reference framework is presented that allows flood, tornado and future ClimatEdge offerings to scale in processing large and complex data sets, thus enabling a wider variety of business cases to be developed into offerings. The framework examines a variety of components for infrastructure, data acquisition, storage, indexing, analytics and presentation for both structured and unstructured data.
CSC's ClimatEdge offering is focused on providing probabilistic weather and climate analytics from public government data sets. By using various algorithms, both proprietary and public, CSC plans to offer its customers a distinct advantage by allowing strategic business decisions to be made based on probabilistic outlooks in climate and extreme weather. Possible offerings for the general insurance industry include the probability of future tornado, hail, or flood occurrences in a particular geographic grid cell, as well as hail activity in the recent past.
This paper focuses on contrasting the tornado and flood analytics with respect to the attributes associated with big data: velocity, volume, and variety. A hail offering is likely similar enough in variety to tornado, and in volume to flood, that analyzing it separately does not add further insights or value. The computation of tornado probabilities involves running a published algorithm against a relatively small amount of structured data on modest hardware. In short, it does not meet the criteria for big data. Flood probabilities, on the other hand, require substantially larger and more diverse data such as newspaper reports, hydrologic data from stream gauges, and parcel information in addition to rainfall and other climate data. In contrast to the tornado analytic, it does meet several criteria for big data. The minimal platform capable of delivering the tornado probabilities is not capable of calculating flood probabilities. Without an investment in big data technology and expertise, the range of ClimatEdge products will be limited.
Next, a reference framework implementation is presented that will allow future ClimatEdge offerings to scale in processing both large and complex data sets, thus enabling a wider variety of business cases to be developed into products. The most important framework characteristics are flexibility in implementation technologies and a loose coupling between the layers. There does not exist a single application or technology capable of addressing every business scenario; therefore, a collection of specific technologies that can be integrated together is presented. As technologies evolve, the goal is to be able to replace any one piece of technology without overly affecting the other layers.
Across the globe, the big data movement is changing how meteorologists store and analyze data in response to global events. For example, the Korea Meteorological Administration is working to improve its ability to predict weather patterns and the severity of weather events across South Korea. IBM is engaged in similar work in Rio de Janeiro in preparation for the 2016 Summer Olympics, with the goal of accurately predicting short-term weather. While the focus on near-term weather forecasting is expected to remain strong, there has been growing interest in how climate change affects future weather events. What if the vast repositories of weather and climate data could be collected, stored, and analyzed to produce probabilities of catastrophic events months, not weeks, in advance?
With the encouragement of the National Aeronautics and Space Administration (NASA), CSC’s ClimatEdge™ service was developed by its National Public Sector (NPS) division in 2012 to explore the commercial potential of publicly held climate and weather data. The initial offering focused on forward-looking climate reports for commodities markets. The reports were written using a cursory qualitative analysis of the NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA) data with commentary by subject matter experts in climate sciences. The monthly reports included Global Agriculture, Global Energy, Sugar and Soft Commodities, Grain and Oilseeds, and Energy/Natural Gas. Interviews conducted with individuals associated with the original ClimatEdge offering described a number of lessons learned and decisions that led to a strategy shift to a quantitative product for the next version [personal communication, 2013]. Potential customers were less interested in commentary on qualitative analysis than originally thought. As part of its move to the Big Data and Analytics incubator, the ClimatEdge team made the decision to focus on quantitative analysis utilizing existing CSC commercial sales channels.
During the 2012 calendar year, the United States had 11 separate weather events where losses totaled more than $1 billion each, making it the third highest loss year due to natural catastrophes since 1980 [CSC communication from the National Oceanic and Atmospheric Administration (NOAA), 2013]. With premiums increasingly unable to cover losses incurred from extreme weather events, the overall profitability of the insurance industry is at risk. With products such as POINT IN and Exceed, CSC has significant sales inroads with the general insurance industry. With established channels, the general insurance industry became a prime target for the second version of ClimatEdge.
To understand the weather and flood data offerings for general insurance, the business model must first be explored. The following equation can be applied to the general insurance industry:
Risk = Impact · Probability
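As a rough illustration of the equation above, the following Python sketch multiplies an assumed impact by an assumed event probability for two geographic grid cells and sums the results. The `GridCellExposure` type, the cell names, and all dollar and probability values are hypothetical and chosen only to make the arithmetic easy to follow; they do not come from ClimatEdge or from any real data set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCellExposure:
    """Hypothetical exposure record for one geographic grid cell."""
    cell_id: str
    impact_usd: float         # assumed insured loss if the event occurs
    event_probability: float  # assumed probability of the event in the period


def risk(cell: GridCellExposure) -> float:
    """Risk = Impact * Probability, expressed in dollars."""
    if not 0.0 <= cell.event_probability <= 1.0:
        raise ValueError("event_probability must lie in [0, 1]")
    return cell.impact_usd * cell.event_probability


# Illustrative values only.
cells = [
    GridCellExposure("cell-A", 2_000_000.0, 0.25),
    GridCellExposure("cell-B", 8_000_000.0, 0.125),
]

# Summing per-cell risk gives a simple portfolio-level figure.
portfolio_risk = sum(risk(c) for c in cells)
```

In this sketch the cell with the lower probability carries the larger risk because its assumed impact is four times higher, which is why both terms of the equation are needed before risk can be compared across cells.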
CSC recognized that without a substantial quantitative update, ClimatEdge could not address the probability of events occurring; thus the overall risk could not be established. Given the interest of federal agencies with a wealth of publicly available climate data, such as NOAA and NASA, CSC realized a business opportunity existed. A ClimatEdge offering based on quantitative analysis of publicly available data targeted towards minimizing risk for the general insurance industry would be a natural fit, as it would serve the desire of federal agencies while creating commercial business opportunities for CSC. The next step was to develop a technical approach to retrieving, storing, and analyzing the wealth of available data.
Without investing in and developing a scalable solution for big data storage and analytics, ClimatEdge offerings will be limited to markets served by either qualitative analysis or quantitative analysis against small, well-structured data sets. In order to analyze the larger and more structurally complex climate data sets, a framework of scalable technologies will need to be developed that can address both simple and complex offerings of any size. With the proper framework in place, CSC can offer an ever-increasing number of industries solutions derived from analytics on data having one or more of the following big data characteristics: velocity, volume, or variety.
Two potential and contrasting ClimatEdge offerings will be explored that illustrate the different requirements necessary to store and analyze the requisite climate data. The first is shown not to be representative of a big data problem and is solvable by simple technology. The second has all the characteristics of a problem in need of a big data solution. A flexible framework is then proposed that illustrates how to store and process larger and more complex data sets, such as those seen in the second offering.
The large number of severe weather events in 2012, such as tornados, floods, and other natural catastrophes, caused $160 billion worth of damage in the United States. This is near the total of the previous 10 years combined. Throughout that same 10-year period, insured losses totaled over $65 billion. Tornados, while a worldwide phenomenon, occur with greater frequency in the United States due to the meeting of cold dry Canadian air with warm moist air from the Gulf of Mexico. Losses from these severe storms alone have accounted for more than half of all insured catastrophe losses since 1990. Several presentations at the Extreme Weather Congress in January 2013 focused on flooding as a prime source of damage. In the United States, floods are the most common natural disaster and average about $6 billion in damages each year. While the National Flood Insurance Program provides limited coverage to homeowners and businesses that qualify, a sizable secondary insurance market exists. Globally, floods rank only behind earthquakes as the world’s costliest natural disasters. The Data Services group, an organization within the Big Data and Analytics incubator, in conjunction with industry analysts and other internal groups, has identified several opportunities in which an updated ClimatEdge offering can benefit the general insurance industry [personal communication, 2013]:
- Probability of future tornado occurrence
- Probability of future hail occurrence and recent forensics*
- Probability of future global and domestic flood occurrence*
*not part of ClimatEdge
This paper examines the tornado and flood probability analytics because their requirements sit at opposite ends of the scale for the attributes associated with big data. The hail analytic is an offshoot of the tornado algorithm, and the data sets it requires are similar in complexity to the tornado data sets. A detailed examination of the hail analytic would, therefore, not yield significant additional insight beyond tornado and flood.
Download the full paper A Big Data Approach to ClimatEdge (PDF) to continue reading.