A common goal of the papers in this thesis is to propose, formalize and exemplify the use of Bayesian networks as a modelling tool in reliability analysis. The papers span work in which Bayesian networks are merely used as a modelling tool (Paper I), work where models are specially designed to utilize the inference algorithms of Bayesian networks (Paper II and Paper III), and work where the focus has been on extending the applicability of Bayesian networks to very large domains (Paper IV and Paper V).
Paper I is in this respect an application paper, where model building, estimation and inference in a complex time-evolving model are simplified by focusing on the conditional independence statements embedded in the model; it is written with the reliability data analyst in mind. We investigate the mathematical modelling of maintenance and repair of components that can fail due to a variety of failure mechanisms. Our motivation is to build a model that can be used to unveil aspects of the “quality” of the maintenance performed. This “quality” is measured by two groups of model parameters: the first measures “eagerness”, the maintenance crew’s ability to perform maintenance at the right time to try to stop an evolving failure; the second measures “thoroughness”, the crew’s ability to actually stop the failure development. The model we propose is motivated by the imperfect repair model of Brown and Proschan (1983), but extended to model preventive maintenance as one of several competing risks (David and Moeschberger 1978). The competing risk model we use is based on random signs censoring (Cooke 1996). The explicit maintenance model helps us to avoid the identifiability problems of imperfect repair models previously reported by Whitaker and Samaniego (1989). The main contribution of this paper is a simple yet flexible reliability model for components that are subject to several failure mechanisms, and which are not always given perfect repair. Reliability models that involve repairable systems with non-perfect repair and a variety of failure mechanisms often become very complex, and they may be difficult to build using traditional reliability modelling techniques. The analyses are typically performed to optimize the maintenance regime, and the complexity problems can, in the worst case, lead to sub-optimal decisions regarding maintenance strategies.
Our model is represented by a Bayesian network, and we use the conditional independence relations encoded in the network structure in the calculation scheme employed to generate parameter estimates.
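To make the starting point of Paper I concrete, the following is a minimal sketch of the basic Brown and Proschan (1983) imperfect repair mechanism only: after each failure the repair is perfect with some probability and minimal otherwise. It does not reproduce the competing-risk extension or the estimation scheme of Paper I. The Weibull lifetime law, the function name and all parameter names are assumptions made for illustration.

```python
import math
import random


def simulate_brown_proschan(p_perfect, shape, scale, n_failures, seed=0):
    """Simulate failure times under Brown-Proschan imperfect repair.

    After each failure the repair is perfect (age reset to 0) with
    probability p_perfect, and minimal (age unchanged) otherwise.
    Lifetimes follow an assumed Weibull(shape, scale) law.
    """
    rng = random.Random(seed)
    clock, age = 0.0, 0.0
    history = []  # (calendar time of failure, type of repair that followed)
    for _ in range(n_failures):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        # Invert the conditional Weibull survival function given the current age.
        age_at_failure = scale * ((age / scale) ** shape - math.log(u)) ** (1.0 / shape)
        clock += age_at_failure - age
        perfect = rng.random() < p_perfect
        history.append((clock, "perfect" if perfect else "minimal"))
        age = 0.0 if perfect else age_at_failure
    return history


# Hypothetical component: wear-out behaviour (shape > 1), 60% perfect repairs.
for time, repair in simulate_brown_proschan(0.6, 2.0, 100.0, 5, seed=1):
    print(f"{time:8.1f}  {repair}")
```

Setting `p_perfect` to 1 gives a renewal process and setting it to 0 gives pure minimal repair, which are the two extremes between which the imperfect repair model interpolates.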
In Paper II we target the problem of fault diagnosis, i.e., to efficiently generate an inspection strategy to detect and repair faults in a complex system. Troubleshooting has a long tradition in reliability analysis, see e.g. (Vesely 1970; Zhang and Mei 1987; Xiaozhong and Cooke 1992; Norstrøm et al. 1999). However, traditional troubleshooting systems are built using a very restrictive representation language: one typically assumes that all attempts to inspect or repair components are successful, that a repair action is related to one component only, and that the user cannot supply any information to the troubleshooting system except for the outcome of repair actions and inspections. A recent trend in fault diagnosis is to use Bayesian networks to represent the troubleshooting domain (Breese and Heckerman 1996; Jensen et al. 2001). This allows a more flexible representation, where we can, e.g., model non-perfect repair actions and questions. Questions are troubleshooting steps that do not aim at repairing the device, but are merely performed to capture information about the failed equipment, and thereby ease the identification and repair of the fault. Breese and Heckerman (1996) and Jensen et al. (2001) focus on fault finding in serial systems. In Paper II we relax this assumption and extend the results to any coherent system (Barlow and Proschan 1975). General troubleshooting is NP-hard (Sochorová and Vomlel 2000); we therefore focus on giving an approximate algorithm which generates a “good” troubleshooting strategy, and discuss how to incorporate questions into this strategy. Finally, we utilize certain properties of the domain to propose a fast calculation scheme.
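For intuition about what a troubleshooting strategy optimizes, consider the most restricted setting: exactly one component is faulty, each repair action addresses one component and always succeeds, and there are no questions. In that setting the expected cost of repair is minimized by trying actions in decreasing order of fault probability divided by cost. The sketch below checks this against exhaustive search on a small example. It is not the algorithm of Paper II, which handles coherent systems and questions; the function names and all numbers are hypothetical.

```python
from itertools import permutations


def expected_cost_of_repair(order, fault_prob, cost):
    """Expected cost when actions are tried in `order` until the fault is fixed."""
    ecr, prob_still_faulty = 0.0, 1.0
    for i in order:
        ecr += cost[i] * prob_still_faulty  # action i is paid for only if still faulty
        prob_still_faulty -= fault_prob[i]
    return ecr


def efficiency_order(fault_prob, cost):
    """Sort actions by decreasing efficiency p_i / c_i."""
    return sorted(range(len(cost)), key=lambda i: fault_prob[i] / cost[i], reverse=True)


# Hypothetical three-component example; fault probabilities sum to one.
fault_prob = [0.5, 0.3, 0.2]
cost = [10.0, 3.0, 1.0]

greedy = efficiency_order(fault_prob, cost)
best = min(permutations(range(3)),
           key=lambda o: expected_cost_of_repair(o, fault_prob, cost))
print(greedy, expected_cost_of_repair(greedy, fault_prob, cost))
print(list(best), expected_cost_of_repair(best, fault_prob, cost))
```

The exhaustive search over permutations is only feasible for very small systems, which is one reason approximate strategies are of interest once the simplifying assumptions are dropped.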
Classification is the task of predicting the class of an instance from a set of attributes describing it, i.e., to apply a mapping from the attribute space to a predefined set of classes. In the context of this thesis one may for instance decide whether a component requires thorough maintenance or not based on its usage pattern and environmental conditions. Classifier learning, which is the theme of Paper III, is to automatically generate such a mapping based on a database of labelled instances. Classifier learning has a rich literature in statistics under the name of supervised pattern recognition, see e.g. (McLachlan 1992; Ripley 1996). Classifier learning can be seen as a model selection process, where the task is to find the model with the highest classification accuracy from a class of models. With this perspective it is obvious that the model class we select the classifier from is crucial for classification accuracy. We use the class of Hierarchical Naïve Bayes (HNB) models (Zhang 2002) to generate a classifier from data. HNBs constitute a relatively new model class which extends the modelling flexibility of Naïve Bayes (NB) models (Duda and Hart 1973). The NB models form a class of particularly simple classifier models, which has been shown to offer very good classification accuracy as measured by the 0/1-loss. However, NB models assume that all attributes are conditionally independent given the class, and this assumption is clearly violated in many real world problems. In such situations overlapping information is counted twice by the classifier. To resolve this problem, finding methods for handling the conditional dependence between the attributes has become a lively research area; these methods are typically grouped into three categories: feature selection, feature grouping, and correlation modelling. HNB classifiers fall in the last category, as HNB models are made by introducing latent variables to relax the independence statements encoded in an NB model.
The main contribution of this paper is a fast algorithm to generate HNB classifiers. We give a set of experimental results which show that the HNB classifiers can significantly improve the classification accuracy of the NB models, and also outperform other often-used classification systems.
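The double counting that motivates the HNB models can be seen in a few lines. The sketch below computes the NB posterior for a binary attribute, first on its own and then with an exact copy of the attribute added. Because NB treats the copy as independent evidence, the posterior becomes more extreme even though no new information was supplied. The class labels and probabilities are hypothetical, and the code illustrates plain NB only, not the HNB learning algorithm of Paper III.

```python
def naive_bayes_posterior(prior, likelihoods, observation):
    """Posterior over classes under the Naive Bayes assumption.

    prior:        {class: P(class)}
    likelihoods:  one dict per attribute, {class: P(attribute = 1 | class)}
    observation:  one 0/1 value per attribute
    """
    scores = {}
    for c, p_c in prior.items():
        score = p_c
        for lik, value in zip(likelihoods, observation):
            p_one = lik[c]
            score *= p_one if value == 1 else 1.0 - p_one
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical numbers: one attribute that is weak evidence for thorough maintenance.
prior = {"thorough": 0.5, "routine": 0.5}
attribute = {"thorough": 0.8, "routine": 0.4}

single = naive_bayes_posterior(prior, [attribute], [1])
duplicated = naive_bayes_posterior(prior, [attribute, attribute], [1, 1])
print(single["thorough"], duplicated["thorough"])
```

A latent variable placed between the class and the two copies, as in an HNB model, would let the model treat them as one source of evidence.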
In Paper IV and Paper V we work with a framework for modelling large domains. Using small and “easy-to-read” pieces as building blocks to create a complex model is an often-applied technique when constructing large Bayesian networks. For instance, Pradhan et al. (1994) introduce the concept of sub-networks which can be viewed and edited separately, and frameworks for modelling object oriented domains have been proposed in, e.g., (Koller and Pfeffer 1997; Bangsø and Wuillemin 2000). In domains that can appropriately be described using an object oriented language (Mahoney and Laskey 1996) we typically find repetitive substructures or substructures that can naturally be ordered in a superclass/subclass hierarchy. For such domains, the expert is usually able to provide information about these properties. The basic building blocks available from domain experts examining such domains consist of information about random variables that are grouped into substructures with high internal coupling and low external coupling. These substructures naturally correspond to instantiations in an object-oriented BN (OOBN). For instance, an instantiation may correspond to a physical object or it may describe a set of entities that occur at the same instant of time (a dynamic Bayesian network (Kjærulff 1992) is a special case of an OOBN). Moreover, analogously to the grouping of similar substructures into categories, instantiations of the same type are grouped into classes. As an example, several variables describing a specific pump may be said to make up an instantiation. All instantiations describing the same type of pump are said to be instantiations of the same class. OOBNs offer an easy way of defining BNs in such object-oriented domains, such that the object-oriented properties of the domain are taken advantage of during model building, and also explicitly encoded in the model.
Although these object oriented frameworks relieve some of the problems when modelling large domains, it may still prove difficult to elicit the parameters and the structure of the model. In Paper IV and Paper V we work with learning of parameters and specifying the structure in the OOBN definition of Bangsø and Wuillemin (2000).
Paper IV describes a method for parameter learning in OOBNs. The contributions in this paper are three-fold: Firstly, we propose a method for learning parameters in OOBNs based on the EM-algorithm (Dempster et al. 1977), and prove that maintaining the object orientation imposed by the prior model will increase the learning speed in object oriented domains. Secondly, we propose a method to efficiently estimate the probability parameters in domains that are not strictly object oriented. More specifically, we show how Bayesian model averaging (Hoeting et al. 1999) offers a well-founded tradeoff between model complexity and model fit in this setting. Finally, we attack the situation where the domain expert is unable to classify an instantiation to a given class or a set of instantiations to classes (Pfeffer (2000) calls this type uncertainty; a case of model uncertainty typical of object oriented domains). We show how our algorithm can be extended to work with OOBNs that are only partly specified.
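The idea that object orientation helps parameter learning can be illustrated by parameter tying: all instantiations of a class share one set of parameters, so their data can be pooled. The sketch below is a simplification under stated assumptions, namely complete data, a single variable without parents per instantiation, and a uniform Dirichlet pseudo-count `alpha`. With incomplete data, an EM-based method would pool expected counts in the same way. The function name and the data are hypothetical, and this is not the algorithm of Paper IV.

```python
from collections import Counter, defaultdict


def estimate_tied(observations, class_of, alpha=1.0):
    """Estimate P(state) per class by pooling counts over its instantiations.

    observations: {instantiation name: list of observed states}
    class_of:     {instantiation name: class name}
    alpha:        Dirichlet pseudo-count per state (an assumed smoothing choice)
    """
    pooled = defaultdict(Counter)
    for inst, states in observations.items():
        pooled[class_of[inst]].update(states)  # in EM these would be expected counts
    states_seen = sorted({s for counts in pooled.values() for s in counts})
    estimates = {}
    for cls, counts in pooled.items():
        total = sum(counts.values()) + alpha * len(states_seen)
        estimates[cls] = {s: (counts[s] + alpha) / total for s in states_seen}
    return estimates


# Hypothetical data: two pumps of the same class and one valve.
observations = {
    "pump_1": ["ok", "ok", "failed"],
    "pump_2": ["ok", "failed", "ok", "ok", "ok"],
    "valve_1": ["ok", "ok"],
}
class_of = {"pump_1": "Pump", "pump_2": "Pump", "valve_1": "Valve"}
estimates = estimate_tied(observations, class_of)
print(estimates["Pump"])
```

Each pump alone contributes only a handful of cases, whereas the pooled estimate for the class uses all eight, which is the intuition behind learning faster when the class structure is respected.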
In Paper V we estimate the OOBN structure. When constructing a Bayesian network, it can be advantageous to employ structural learning algorithms (Cooper and Herskovits 1992; Heckerman et al. 1995) to combine knowledge captured in databases with prior information provided by domain experts. Unfortunately, conventional learning algorithms do not easily incorporate prior information if this information is too vague to be encoded as properties that are local to families of variables (this is for instance the case for prior information about repetitive structures). The main contribution of Paper V is a method for doing structural learning in object oriented domains. We argue that the method supports a natural approach for expressing and incorporating prior information provided by domain experts and show how this type of prior information can be exploited during structural learning. Our method is built on the Structural EM-algorithm (Friedman 1998), and we prove our algorithm to be asymptotically consistent. Empirical results demonstrate that the proposed learning algorithm is more efficient than conventional learning algorithms in object oriented domains. We also consider structural learning under type uncertainty, and use a discrete optimization technique to find a candidate OOBN structure that describes the data well.