Knowledge Discovery from Sensor Data

March 1, 2006. By Pang-Ning Tan, Michigan State University. Sensors


Extracting useful knowledge from raw sensor data is not a trivial task. Conventional data analysis tools might not be suitable for handling the massive quantity, high dimensionality, and distributed nature of the data. Recent years have seen growing interest in applying data mining techniques to efficiently process large volumes of sensor data [1].



Data mining, a critical component of the knowledge discovery process (Figure 1), consists of a collection of automated and semi-automated techniques for modeling relationships and uncovering hidden patterns in large data repositories [2, 3]. It draws upon ideas from diverse disciplines such as statistics, machine learning, pattern recognition, database systems, information theory, and artificial intelligence.

Figure 1. The overall process of knowledge discovery from data (KDD) includes data preprocessing, data mining, and postprocessing of the data mining results

Data mining has been successfully used on sensor data in applications as diverse as human activity monitoring [4], vehicle monitoring [5], and vibration analysis [6]. This article presents an overview of such techniques and describes some of the technical challenges that must be overcome when mining sensor data.

Preprocessing Steps

Before data mining techniques can be applied, the raw data must undergo a series of preprocessing steps to convert them into an appropriate format for subsequent processing. The typical preprocessing steps include:

  1. Feature extraction: to identify relevant attributes for a data mining task using techniques such as event detection, feature selection, and feature transformation (including normalization and application of Fourier or wavelet transforms)
  2. Data cleaning: to resolve data quality issues such as noise, outliers, missing values, and miscalibration errors
  3. Data reduction: to improve the processing time or reduce the variability in data by means of techniques such as statistical sampling and data aggregation
  4. Dimension reduction: to reduce the number of features presented to a data mining algorithm; examples include principal component analysis (PCA), which is a linear technique, and ISOMAP and locally linear embedding (LLE), which are nonlinear techniques
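As a rough illustration of how some of these steps might be chained together, the following Python sketch cleans a short series of readings, aggregates it, and normalizes the result. It is only a sketch under stated assumptions: the readings, the function names, the median-based outlier rule, and its threshold of 5 are all hypothetical choices made for this example, not methods prescribed by the article. Dimension reduction is left out because it is usually delegated to a numerical library.

```python
from statistics import mean, median, stdev

def clean(readings, threshold=5.0):
    """Data cleaning: replace missing values (None) and outliers with the median.

    The rule used here (distance from the median greater than `threshold`
    times the median absolute deviation) is an assumption for illustration.
    """
    present = [r for r in readings if r is not None]
    med = median(present)
    mad = median([abs(r - med) for r in present])
    # If the deviation estimate is zero, skip outlier replacement entirely.
    limit = threshold * mad if mad > 0 else float("inf")
    return [med if r is None or abs(r - med) > limit else r for r in readings]

def aggregate(readings, window):
    """Data reduction: average consecutive, non-overlapping windows."""
    return [mean(readings[i:i + window]) for i in range(0, len(readings), window)]

def normalize(values):
    """Feature transformation: z-score normalization (zero mean, unit deviation)."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical temperature readings with one missing value and one spike.
raw = [20.1, 20.3, None, 20.2, 95.0, 20.4, 20.6, 20.5]

cleaned = clean(raw)               # None and 95.0 are replaced by the median
reduced = aggregate(cleaned, 2)    # 8 readings become 4 window averages
normalized = normalize(reduced)    # rescaled for a downstream algorithm

print(cleaned)
print(reduced)
print(normalized)
```

The order of the steps is itself a design choice. Cleaning before aggregation keeps a single spike from distorting a whole window average, which is why the sketch runs them in that order.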

Data mining can be broadly classified into four distinct tasks, which will be discussed in the next four sections.

Predictive Modeling

The goal of predictive modeling is to build a model that can be used to predict—based on known examples collected in the past—future values of a target attribute. There are many predictive modeling methods available, including tree-based, rule-based, nearest neighbor, logistic regression, artificial neural networks, graphical methods, and support vector machines [2]. These methods are designed to solve two types of predictive modeling tasks: classification and regression. Classification deals with discrete-valued target attributes; regression, with continuous-valued target attributes. For example, detecting whether a product on the assembly line is good or defective is considered a classification task, while predicting the amount of rainfall in summer is a regression task.
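The difference between the two tasks can be sketched with the nearest neighbor method mentioned above. The same neighbor search serves both tasks: classification takes a majority vote over the neighbors' labels, and regression averages their numeric targets. The data sets, feature choices, and function names below are hypothetical and were invented for this example.

```python
from collections import Counter
from math import dist
from statistics import mean

def nearest_targets(train, query, k):
    """Return the targets of the k training examples closest to `query`."""
    ranked = sorted(train, key=lambda example: dist(example[0], query))
    return [target for _, target in ranked[:k]]

def knn_classify(train, query, k=3):
    """Classification: majority vote over discrete labels."""
    return Counter(nearest_targets(train, query, k)).most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Regression: average of continuous target values."""
    return mean(nearest_targets(train, query, k))

# Hypothetical assembly-line data: (vibration amplitude, temperature) -> label.
inspections = [
    ((0.20, 40.0), "good"), ((0.30, 42.0), "good"), ((0.25, 41.0), "good"),
    ((0.90, 60.0), "defective"), ((1.00, 63.0), "defective"),
    ((0.85, 58.0), "defective"),
]

# Hypothetical weather data: (humidity %, temperature C) -> rainfall in mm.
weather = [
    ((60.0, 25.0), 80.0), ((65.0, 24.0), 95.0), ((70.0, 23.0), 110.0),
    ((40.0, 30.0), 20.0), ((45.0, 29.0), 35.0), ((35.0, 31.0), 10.0),
]

label = knn_classify(inspections, (0.80, 59.0))   # a discrete class
rainfall = knn_regress(weather, (66.0, 24.0))     # a continuous value

print(label)
print(rainfall)
```

The features here are used unscaled, so the attribute with the larger numeric range dominates the distance. In practice, the normalization described under preprocessing would normally be applied first.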
