Distributed fault detection for large-scale dynamic systems

Distributed sensing and data processing may enable in-flight monitoring of aerospace vehicles, revealing damage quickly and accurately.

15 December 2008

Qi Cheng and Pramod Varshney

As the number of aircraft and satellites grows, continuous tracking has become increasingly critical. Monitoring these vehicles in flight could prevent catastrophic failures and reduce maintenance and downtime costs. One possible solution is to use lightweight, low-power, and rugged sensors to check thousands, or even millions, of measurement points.¹ Such detection might reveal failures early enough to prevent disasters, and could provide truly predictive awareness data on the state of a vehicle.

Increasing demands for robustness, cost-efficiency, and reliability have fueled interest in fault detection for dynamic systems.²,³ Most of the existing strategies are centralized, which means that all sensing data is collected and processed at one unit. This not only causes excessive communication and computational burdens, but also creates a single point of failure. To get around these problems, we have developed large-scale distributed detection and data fusion methods. Our approach provides localized information processing at the sensor level, reducing network bandwidth, validating sensor data and integrity, and standardizing formatting and reporting.

We considered the distributed fault detection problem, in which multiple sensors monitor the dynamic state of the system (see Figure 1). Normal and faulty behaviors can be modeled as two hypotheses. Due to communication constraints, our model assumes that sensors can only send binary data to the fusion center. We designed local detector and decision fusion rules to minimize the probabilities of missed fault detection and false alarms. We chose a quantitative analytical modeling approach because it is more amenable to performance analysis. With the dynamic system state-space model, we can predict the state under both normal and faulty hypotheses using knowledge of past observations. For linear and Gaussian systems, the conventional Kalman filter (KF) is optimal for prediction. Models capturing observational noise and the evolution of the system, however, may have complex nonlinearity and non-Gaussian distributions, precluding analytical solutions. A class of Monte Carlo-based methods called particle filters (PFs) can solve this type of problem. Particles can provide a complete representation of the states' posterior probability density function.
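As an illustration of how a fusion center might combine binary local decisions, here is a minimal Python sketch. The article does not spell out the fusion rule, so the sketch uses the standard log-likelihood-ratio rule for conditionally independent binary decisions, which minimizes the probability of error when the independence assumption holds. The function name `fuse_decisions` and its parameters (per-sensor detection and false-alarm probabilities and a prior fault probability) are assumptions made for this example.

```python
import math


def fuse_decisions(decisions, p_detect, p_false_alarm, prior_fault=0.5):
    """Fuse binary local decisions into a global decision (1 = fault, 0 = normal).

    Assumes the local decisions are conditionally independent given the
    hypothesis, and that every probability lies strictly between 0 and 1.
    All names here are hypothetical and chosen for illustration.
    """
    llr = 0.0
    for u, pd, pf in zip(decisions, p_detect, p_false_alarm):
        if u == 1:
            # A sensor that reports "fault" adds log(Pd / Pf).
            llr += math.log(pd / pf)
        else:
            # A sensor that reports "normal" adds log((1 - Pd) / (1 - Pf)).
            llr += math.log((1.0 - pd) / (1.0 - pf))
    # Minimum-probability-of-error threshold: log(P(normal) / P(fault)).
    threshold = math.log((1.0 - prior_fault) / prior_fault)
    return 1 if llr > threshold else 0
```

With identical sensors and equal priors, this rule reduces to a majority vote. With unequal sensors, a single reliable sensor can outweigh several unreliable ones.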

Figure 1. System diagram for distributed fault detection. x_t: System dynamic state at time t. y_t^M: Local observation at time t and sensor M. u_t^M: Local decision at time t and sensor M. u_t: Global decision at time t.

Figure 2. Probability of error versus the number of sensors, M, for a distributed fault detection algorithm that assumes independent sensor observations. As the correlation between observations increases (higher Q₀/R), fewer sensors are needed for good performance. Q₀: Process noise variance under normal conditions. R: Measurement noise variance. P_e bound: Lower bound of the probability of error.

Under the assumption of independent and identically distributed observations, we developed a simple and efficient distributed fault detection algorithm. The method is based on state estimation via particle filtering for a general state-space model.⁴ In experiments, the PF-based algorithm performed better than the KF-based approach for non-Gaussian observational noise. A moderate number of particles is sufficient for our fault detection problem. Local decisions are correlated because the sensors share common state process noise. Nevertheless, a system design that assumes independence between sensor observations achieves good detection performance over a wide range of possible correlations (see Figure 2). For small correlations, a large number of sensors can improve performance substantially. For larger correlations, a relatively small number of sensors can achieve near-optimal performance.
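To make the particle-filtering idea concrete, the following Python sketch shows one time step of a bootstrap particle filter used as a local detector. It is not the algorithm from reference 4. It assumes a hypothetical scalar model, x_t = a·x_{t-1} + w_t and y_t = x_t + v_t, and it assumes that a fault shows up as a larger process noise variance (q1 instead of q0). The measurement density is Gaussian here only for brevity; `gaussian_pdf` could be replaced by any non-Gaussian density. All function and parameter names are illustrative.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a Gaussian with the given mean and variance, evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predict_and_weight(particles, y, a, q, r, rng):
    """Propagate particles through x_t = a*x_{t-1} + w_t, w_t ~ N(0, q),
    then weight each one by the likelihood of y under y_t = x_t + v_t, v_t ~ N(0, r)."""
    predicted = [a * x + rng.gauss(0.0, math.sqrt(q)) for x in particles]
    weights = [gaussian_pdf(y, xp, r) for xp in predicted]
    return predicted, weights


def local_decision(particles, y, a, q0, q1, r, rng):
    """Return (u, new_particles), where u = 1 means "fault" and u = 0 means "normal".

    The average weight is a Monte Carlo estimate of the predictive likelihood
    of y given past observations under each hypothesis (assumed fault model:
    process noise variance q1 instead of q0).
    """
    pred0, w0 = predict_and_weight(particles, y, a, q0, r, rng)
    pred1, w1 = predict_and_weight(particles, y, a, q1, r, rng)
    like0 = sum(w0) / len(w0)
    like1 = sum(w1) / len(w1)
    u = 1 if like1 > like0 else 0
    # Continue with the particle set of the selected hypothesis.
    pred, w = (pred1, w1) if u == 1 else (pred0, w0)
    if sum(w) <= 0.0:
        # All weights underflowed to zero: skip resampling for this step.
        return u, pred
    # Multinomial resampling in proportion to the likelihood weights.
    return u, rng.choices(pred, weights=w, k=len(pred))
```

For example, starting from `particles = [0.0] * 1000` with `rng = random.Random(0)`, calling `local_decision(particles, y, a=1.0, q0=0.01, q1=4.0, r=0.25, rng=rng)` yields a binary decision that a sensor could send to the fusion center, along with the resampled particles for the next time step.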

Next, we exploited the correlation between sensor measurements⁵ to improve fault detection (see Figure 3). When measurements are highly correlated, designs with the independence assumption may perform even worse than a single-sensor system. However, multiple sensors perform better when we consider the association between observations. When exact associations cannot be obtained in a nonlinear, non-Gaussian system, we use tractable design methods based on two correlation models. These methods perform well when the error in the approximated correlation is small. As an upper bound, we use a model that assumes full correlation between states conditioned on local past observations, while the lower bound uses the assumption of complete independence.

Figure 3. Performance comparison of exact correlation, independence, correlation model 1 (CM1), 2 (CM2), and single-sensor designs. P_e: Probability of error.

We foresee a time when a large number of distributed sensing and processing nodes will be employed in vehicles like airplanes. Realization of this vision hinges on the system's ability to quickly handle massive data sets and detect deviations from normal behavior. So far, we have formulated distributed fault detection for dynamic systems as a hypothesis-testing problem at each time step. By making certain assumptions, we can obtain effective suboptimal solutions with manageable complexity. These solutions can also help monitor critical infrastructure like bridges and transportation systems. However, the current algorithms use observations from only a single snapshot in time. To build on this work, we will consider using accumulated observations from several time steps.

Qi Cheng

Electrical and Computer Engineering

Oklahoma State University

Stillwater, OK

http://wsnl.ecen.okstate.edu/

Qi Cheng received a bachelor's degree in electrical engineering from Shanghai Jiao Tong University in 1999. She received her MS and PhD degrees in electrical engineering from Syracuse University in 2003 and 2006, respectively. From 1999 to 2000, she worked as a systems engineer at Guoxin Lucent Technologies Network Technologies Co. Ltd., in Shanghai, China. Since August 2006, she has been an assistant professor of electrical and computer engineering at Oklahoma State University.

Pramod Varshney

Electrical Engineering and Computer Science

Syracuse University

Syracuse, NY

http://www.ecs.syr.edu/research/SensorFusionLab/People/varshney/

Pramod Varshney received a BS in electrical engineering and computer science and an MS and PhD in electrical engineering from the University of Illinois at Urbana-Champaign in 1972, 1974, and 1976, respectively. Since then he has been with Syracuse University, where he is a distinguished professor of electrical engineering and computer science and the research director of the New York State Center for Advanced Technology in Computer Applications and Software Engineering. He is a fellow of the IEEE and has received numerous awards. He serves as a distinguished lecturer for the IEEE Aerospace and Electronic Systems Society, and was the 2001 president of the International Society of Information Fusion.

References:

1. S. Ofsthun, Integrated vehicle health management for aerospace platforms, IEEE Instrument. Meas. Mag., pp. 21-24, 2002.

2. A. Willsky, A survey of design methods for failure detection in dynamic systems, Automatica 12, pp. 601-611, 1976.

3. M. Basseville, Detecting changes in signals and systems -- a survey, Automatica 24, no. 3, pp. 309-326, 1988.

4. Q. Cheng, P. Varshney, J. Michels, C. Belcastro, Distributed fault detection in dynamic systems, IEEE Trans. Aero. Electron. Sys. 44, no. 1, pp. 227-242, 2008.

5. Q. Cheng, P. Varshney, J. Michels, C. Belcastro, Distributed fault detection with correlated decision fusion, IEEE Trans. Aero. Electron. Sys., accepted.