A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs
Author/s
Madsen, Anders L.; Jensen, Frank; Salmerón Cerdán, Antonio; Karlsen, Martin; Langseth, Helge; [et al.]
Date
2014
Abstract
The framework of Bayesian networks is a widely used formalism for
performing belief update under uncertainty. Structure-restricted
Bayesian network models such as the Naive Bayes model and the
Tree-Augmented Naive Bayes (TAN) model have shown impressive
performance on classification tasks. However, if the number of
variables or the amount of data is large, then learning a TAN model
from data can be a time-consuming task. In this paper, we introduce a
new method for parallel learning of a TAN model from large data sets.
The method is based on computing, in parallel, the mutual information
scores between pairs of variables given the class variable. The
computations are organised in parallel using balanced incomplete block
designs. The results of a preliminary empirical evaluation of the
proposed method on large data sets show that a significant performance
improvement is possible through parallelisation using the method
presented in this paper.
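
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes discrete attributes and uses the (7, 3, 1) design (the Fano plane) as one small example of a balanced incomplete block design, in which every pair of attributes falls in exactly one block, so each block can be scored independently without duplicating work. The function names (`conditional_mutual_information`, `cyclic_design`, `score_block`, `score_all_pairs`) and the toy data are hypothetical. The abstract does not say how the paper assigns blocks to workers, so the thread pool here is only a stand-in.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from math import log


def conditional_mutual_information(x, y, c):
    """Empirical I(X; Y | C) in nats for equal-length discrete sequences."""
    n = len(c)
    n_xyc = Counter(zip(x, y, c))
    n_xc = Counter(zip(x, c))
    n_yc = Counter(zip(y, c))
    n_c = Counter(c)
    total = 0.0
    for (xv, yv, cv), count in n_xyc.items():
        # p(x,y|c) / (p(x|c) p(y|c)) written with raw counts.
        ratio = count * n_c[cv] / (n_xc[(xv, cv)] * n_yc[(yv, cv)])
        total += (count / n) * log(ratio)
    return total


def cyclic_design(v=7, base=(0, 1, 3)):
    """Shift a base block modulo v.

    With the defaults this yields the (7, 3, 1) design (Fano plane):
    7 blocks of size 3, every pair of points in exactly one block.
    """
    return [tuple(sorted((b + s) % v for b in base)) for s in range(v)]


def score_block(block, columns, c):
    """Score every attribute pair inside one block, given the class c."""
    return {
        (i, j): conditional_mutual_information(columns[i], columns[j], c)
        for i, j in combinations(block, 2)
    }


def score_all_pairs(columns, c, blocks, max_workers=4):
    """Hand each block to a worker and merge the partial score tables.

    Assumption: threads are used only for illustration; CPU-bound scoring
    in CPython would need processes or separate machines for real speedup.
    """
    scores = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(lambda b: score_block(b, columns, c), blocks):
            scores.update(partial)
    return scores


# Toy data (hypothetical): 7 binary attributes and a binary class, 8 rows.
columns = [[(r >> (i % 3)) & 1 for r in range(8)] for i in range(7)]
c = [r % 2 for r in range(8)]
scores = score_all_pairs(columns, c, cyclic_design())
# scores holds one entry per unordered pair of the 7 attributes (21 in total).
```

In TAN learning, pairwise scores of this kind are typically used as edge weights for a maximum-weight spanning tree over the attributes. That step is sequential and is left out of the sketch.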