In the first series of posts, we presented established maintenance models. Before diving into predictive maintenance, we want to show you methods for handling data that can be used for it. The next four posts will therefore explain the concepts of Anomaly Detection, Fault Classification, Fault Prognosis and Digital Twins. Once these concepts are clear, we will go a step further and show how we design our analytics and shape them into useful products.

In order to explain Anomaly Detection, let us first define an anomaly, which oddly enough is not an easy task. Anomalies are abnormalities, outliers, or data points from a different distribution. Consider the measured data in the figure above: most of the points lie in two areas, whereas very few lie outside them. Those few points are most probably anomalies, while the two dense areas represent normal operating regions.
In the case of an asset, anomalies refer to outlying sensor values or unusual system behavior. One might therefore think that anomalies are easy to detect thanks to their uniqueness and sparsity. Unfortunately, the problem is often harder because of multiple operating points and working conditions.

Take for example the battery pack of an electric vehicle. It consists of many individual cells connected in a specific arrangement to supply the vehicle with energy in a certain voltage and current range. A modern battery management system (BMS) monitors the voltage of each cell as well as the current during charging and discharging.
The data from the BMS can be stored using cloud services (such as AWS or Azure) and can then be used to find anomalies. As more battery systems are operated, more data is gathered, which makes it easier to distinguish normal operating points from abnormal behavior. Most battery packs will show well-functioning cells, with the voltage, current and temperature measurements all staying within a certain operating range.
Now suppose the measurements of a cell start to deviate from the normal operating ranges; the cell has most likely experienced a malfunction that needs further attention. The implemented Anomaly Detection algorithm will then trigger and flag that battery pack so that appropriate actions can be taken.
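In its simplest form, such flagging is just a range check per cell. The sketch below illustrates the idea in plain Python; the voltage and temperature limits are hypothetical values for a generic Li-ion cell, not figures from a real BMS.

```python
# Hypothetical operating ranges for a generic Li-ion cell (illustrative only).
V_RANGE = (3.0, 4.2)      # volts
T_RANGE = (-20.0, 60.0)   # degrees Celsius

def flag_cell(voltage, temperature):
    """Return True if a cell measurement falls outside its normal operating range."""
    return not (V_RANGE[0] <= voltage <= V_RANGE[1]
                and T_RANGE[0] <= temperature <= T_RANGE[1])

# A healthy cell stays within range; a deeply discharged cell is flagged.
flag_cell(3.7, 25.0)   # -> False (normal)
flag_cell(2.4, 25.0)   # -> True (undervoltage anomaly)
```

Fixed thresholds like these only work when normal operation is well understood; the statistical techniques discussed next learn the normal region from data instead.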

There are quite a few mathematical techniques for identifying these abnormalities. Commonly used ones include the one-class Support Vector Machine (SVM) and the Isolation Forest, as well as density-based approaches such as the Local Outlier Factor (LOF), K-Nearest Neighbors (KNN) and Density-Based Spatial Clustering of Applications with Noise (DBSCAN).
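The intuition shared by the density-based methods is that a point whose nearest neighbors are far away sits in a sparse region and is therefore suspicious. A minimal sketch of this idea, using a simple average k-nearest-neighbor distance as the outlier score (the data set and choice of k are illustrative):

```python
import math

def knn_outlier_score(points, x, k=3):
    """Average distance from x to its k nearest neighbours among points.

    Higher scores indicate sparser surroundings, i.e. likelier outliers.
    """
    dists = sorted(math.dist(x, p) for p in points if p != x)
    return sum(dists[:k]) / k

# Four clustered points plus one isolated candidate outlier.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]

# The isolated point (10, 10) scores far higher than any cluster point.
scores = {p: knn_outlier_score(data, p) for p in data}
```

A plain distance score like this struggles when clusters have different densities, which is exactly the weakness the Local Outlier Factor addresses by comparing each point's density to that of its neighbors.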

To give some insight into Anomaly Detection, we go into the details of the Local Outlier Factor model. This model decides whether a data point is an outlier based on a specific 'distance' metric. First, the algorithm finds the 'k' data points closest to the test sample. Second, it calculates the `reachability distance` to each of these 'k' nearest neighbors and averages them. Third, it compares the average `reachability distance` of the test sample to the average `reachability distances` of its 'k' nearest neighbors. If the test sample's average `reachability distance` is higher than that of its neighbors, the sample lies in a sparser region than they do and is labelled an outlier.
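The three steps above can be sketched from scratch in plain Python. This is a minimal, unoptimized illustration; the data set and the choice of k are our own, and the outlier threshold of 1.5 on the LOF score is a common rule of thumb, not part of the algorithm itself.

```python
import math

def k_neighbours(points, x, k):
    """The k points closest to x (excluding x itself; assumes unique points)."""
    return sorted((p for p in points if p != x),
                  key=lambda p: math.dist(x, p))[:k]

def reach_dist(points, x, p, k):
    """Reachability distance of x from p: at least p's own k-distance."""
    k_distance_p = math.dist(p, k_neighbours(points, p, k)[-1])
    return max(k_distance_p, math.dist(x, p))

def lrd(points, x, k):
    """Local reachability density: inverse of the mean reachability distance."""
    neigh = k_neighbours(points, x, k)
    return k / sum(reach_dist(points, x, p, k) for p in neigh)

def lof(points, x, k=3):
    """Local Outlier Factor: neighbours' mean density over x's own density.

    Scores near 1 mean x is as dense as its neighbours; scores well
    above 1 mean x sits in a sparser region, i.e. an outlier.
    """
    neigh = k_neighbours(points, x, k)
    return sum(lrd(points, p, k) for p in neigh) / (k * lrd(points, x, k))

# Four clustered points plus one isolated point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
# lof(data, (0, 0)) is close to 1, lof(data, (10, 10)) is far above 1.
```

In practice one would reach for a library implementation such as scikit-learn's `LocalOutlierFactor` rather than this sketch, but the computation is the same three-step comparison described above.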