Building a Black Box for Your Business
October 25, 2013 | By: Brian Gilmore, Splunk Inc.
As organizations become more data-aware and data-driven, the infrastructure-generated data that drives an enterprise becomes as important as any other business data. The challenge for the enterprise is to create a mechanism to effectively capture and analyze this data to deliver thoughtful insights.
To put this into perspective, consider airplane crashes. When an airplane goes down, search teams go to great lengths to find and recover the black box—a single device that can tell experts everything they need to know about the sequence of events that led to the critical moment of failure. From the data stored in the flight data recorder and the cockpit voice recorder, events and sensor readings can be correlated to build data models that provide insight into the disaster. Intelligence from that insight often provides the foundation of a playbook that aims to prevent similar disasters from happening again.
Disaster recovery and failure forensics are no different in industrial operations. The critical data may not be as hard to find, but it is still siloed in difficult-to-access systems. Too often, these archaic systems also fail to provide compelling analytic and visualization capabilities; as a result, correlating sensor readings with events is frequently impossible. Until recently, no single platform existed to access, analyze, and, most importantly, understand the diverse data generated by networked industrial systems and sensors.
As these platforms emerge, however, businesses are demanding more real-time access and analysis, and with it higher sampling rates, greater throughput, and richer aggregation and presentation.
In the past, sampling rate was a compromise between data needs, network bandwidth, and disk space. When networks were slow and storage was measured in the thousands of dollars per gigabyte, engineers and system integrators rightfully designed systems to be as conservative with sample storage as possible.
Despite improvements in technology, numeric values from sensors are still often sampled and stored at 15-minute or one-hour intervals, and only the most critical sensors or pre-calculated aggregations (minimum, maximum, or average readings across a number of sensors) are passed over the network to be stored on disk. While these approaches mitigate the impact of sampling on network bandwidth and storage capacity, they also hamper efforts to gain insight into the operation of key devices and systems.
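A minimal Python sketch, with made-up one-minute pressure readings, illustrates how a 15-minute average can erase exactly the kind of transient that matters in failure forensics:

```python
# Hypothetical one-minute readings (arbitrary units) over a 15-minute window,
# with a single one-minute spike at minute 8.
readings = [100] * 7 + [300] + [100] * 7

# Storing only the pre-calculated 15-minute average hides the spike...
average = sum(readings) / len(readings)
print(round(average, 1))  # 113.3 -- barely above the baseline of 100

# ...while the raw samples preserve it.
print(max(readings))  # 300
```

The aggregate looks unremarkable; only the raw, high-rate samples reveal the event.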
In addition to this lack of precision and granularity, businesses often store sensor data in relational database tables, one sensor per table. Charts and other visualizations are sometimes primitive and require printouts and transparencies for correlation. Because of the difficulty in accessing and aggregating data, stored data is often unavailable or worthless when a company tries to build a high-definition picture of processes and operations.
Improving the Value of Data
Fortunately, enterprises can take several small steps to drastically improve the operational and business value of sensor data. First, strive for change-of-value recording or faster sampling. Operators and engineers would never settle for recording only those system events that occur at regular intervals, so they should not have to do so with numeric or Boolean values either. Modern serial gateways now incorporate processors that can handle much more frequent polls of downstream devices and sensors. And because disk space is so inexpensive, saving valuable sensor data to commodity disk is like turning coal into diamonds.
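Change-of-value recording can be sketched as a simple deadband filter: poll fast, but persist a sample only when it differs meaningfully from the last recorded value. The function name, deadband width, and readings below are illustrative assumptions, not a fixed standard:

```python
def cov_filter(samples, deadband=0.5):
    """Yield only (timestamp, value) samples that differ from the last
    recorded value by more than `deadband` (change-of-value recording)."""
    last = None
    for ts, value in samples:
        if last is None or abs(value - last) > deadband:
            last = value
            yield ts, value

# Hypothetical per-second temperature polls: mostly flat, one excursion.
raw = [(0, 20.0), (1, 20.1), (2, 20.2), (3, 23.0), (4, 23.1), (5, 20.0)]
recorded = list(cov_filter(raw))
print(recorded)  # [(0, 20.0), (3, 23.0), (5, 20.0)]
```

Near-duplicate readings are discarded, so you get high temporal resolution around real changes without paying for it during steady state.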
Second, log everything, timestamped and recorded in a human-readable format using semantic logging best practices. Knowing exactly when a sample occurred is important, so strive for at least millisecond accuracy in your timestamps. Sample time is as important as sample value; without both, neither is particularly useful.
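One way to apply this in Python is to emit each sample as a single human-readable line with a UTC, millisecond-precision ISO 8601 timestamp and key=value fields. The field names here are illustrative, not a prescribed schema:

```python
from datetime import datetime, timezone

def log_sample(sensor, value, units):
    """Emit one semantically structured, human-readable sample event.
    Field names (sensor, value, units) are illustrative assumptions."""
    # ISO 8601 with millisecond precision, always in UTC.
    ts = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return f'{ts} sensor="{sensor}" value={value} units="{units}"'

line = log_sample("pump_3_discharge_pressure", 87.2, "psi")
print(line)
# e.g. 2013-10-25T14:03:07.123+00:00 sensor="pump_3_discharge_pressure" value=87.2 units="psi"
```

Because both the time and the value travel together in one self-describing event, downstream tools can parse, search, and correlate the data without a separate schema.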
Third, adopt strict naming conventions, but allow for flexible metadata tags. Geolocation, system name, device and point name, computerized maintenance management system identifiers, and serial numbers are excellent metadata fields to include with the raw sample events. This allows you to easily recall, evaluate, and aggregate the data using common terms.
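A sample event carrying that metadata might look like the following sketch; the hierarchical naming convention (site.system.device.point) and every field name and value here are illustrative assumptions:

```python
# One sample event carrying both the reading and flexible metadata tags.
event = {
    "timestamp": "2013-10-25T14:03:07.123Z",
    "point": "houston_plant.compressor_station_2.comp_a.discharge_temp",
    "value": 188.4,
    "units": "degF",
    "geo": {"lat": 29.76, "lon": -95.37},      # geolocation
    "cmms_id": "EQ-004512",                    # maintenance-system identifier
    "serial_number": "CMP-A-99817",
}

# A strict naming convention makes aggregation by common terms trivial:
# every point on one device shares the same prefix.
device_prefix = "houston_plant.compressor_station_2.comp_a"
print(event["point"].startswith(device_prefix))  # True
```

The strict part (the point name) gives you predictable grouping; the flexible part (the extra tags) lets you enrich events later without redesigning a schema.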
Finally, adopt a big data strategy for your sensor data. Companies that have this strategy in place realize incredible insight and tremendous value from their data. For example, Splunk customer New York Air Brake's (NYAB's) Train Dynamic Systems Division has harnessed operational intelligence by collecting large volumes of data from freight train locomotives and braking systems. The system calculates and analyzes the forces occurring on each of the couplings between freight cars in real time to present an optimized driving strategy to the engineer. Using Splunk for real-time data analysis allows them to notify key personnel about system and driver anomalies, present customized analytics dashboards that give operational insight, and continuously improve optimization algorithms. NYAB is providing insight that eliminates inefficient train-handling dynamics and has the potential to save customers a billion dollars or more in fuel costs a year.
Planning for the Future
Always keep in mind the future analytics value of the data. Design and build sensor networks as part of a big data strategy if you can, but don't underestimate the high-value modifications that you can make to existing sensors and process control systems to integrate them with big data systems. Use semantic logging, and store your data in NoSQL-style systems that will allow random access and flexible analytics, aggregations, and visualizations. Finally, integrate these applications, your data sets, and the insight generated from them into your daily operations and take another step toward being a truly data-driven organization.
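The payoff of schemaless, tagged events is ad hoc analytics: you can aggregate across any metadata field after the fact. A minimal sketch, with invented devices, metrics, and values, of the kind of flexible aggregation such a store enables:

```python
from collections import defaultdict

# A handful of schemaless sample events (all names and values are made up).
events = [
    {"device": "pump_1", "metric": "vibration", "value": 0.21},
    {"device": "pump_1", "metric": "vibration", "value": 0.35},
    {"device": "pump_2", "metric": "vibration", "value": 0.12},
    {"device": "pump_2", "metric": "temp",      "value": 141.0},
]

# Ad hoc aggregation over a metadata field: peak vibration per device,
# with no fixed table schema required.
peaks = defaultdict(float)
for e in events:
    if e["metric"] == "vibration":
        peaks[e["device"]] = max(peaks[e["device"]], e["value"])

print(dict(peaks))  # {'pump_1': 0.35, 'pump_2': 0.12}
```

No table redesign was needed to add the `temp` metric alongside `vibration`; new event shapes simply flow in and remain queryable.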