Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block Designs
Author/s: Salmerón Cerdán, Antonio; Madsen, Anders L.; Jensen, Frank; Langseth, Helge; Nielsen, Thomas D.; [et al.]
In this paper we propose a method for scaling up filter-based feature selection in classification problems. We use conditional mutual information as the filter measure and show how the required statistics can be computed in parallel while avoiding unnecessary calculations. The distribution of the calculations among the available computing units is determined using balanced incomplete block designs, a strategy first developed within the area of statistical design of experiments. We demonstrate the scalability of our method through a series of experiments on synthetic and real-world datasets.
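The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the required statistics involve pairs of features, uses the well-known (7, 3, 1) design (the Fano plane) as the balanced incomplete block design, and treats each block as the set of features held by one computing unit. The function names `cyclic_bibd`, `assign_pairs`, and `conditional_mutual_information` are hypothetical, and the conditional mutual information is a plain empirical estimate for discrete data.

```python
from collections import Counter
from itertools import combinations
import math


def cyclic_bibd(v, base_block):
    """Develop a base block modulo v into v blocks (assumed: one block per unit)."""
    return [sorted((x + shift) % v for x in base_block) for shift in range(v)]


def assign_pairs(blocks):
    """Map each unordered feature pair to the units whose block contains both features."""
    assignment = {}
    for unit, block in enumerate(blocks):
        for pair in combinations(block, 2):
            assignment.setdefault(pair, []).append(unit)
    return assignment


def conditional_mutual_information(x, y, c):
    """Empirical I(X; Y | C) in nats for equally long discrete sequences."""
    n = len(c)
    n_xyc = Counter(zip(x, y, c))
    n_xc = Counter(zip(x, c))
    n_yc = Counter(zip(y, c))
    n_c = Counter(c)
    # p(x,y,c) * log( p(c) p(x,y,c) / (p(x,c) p(y,c)) ), written with raw counts.
    return sum(
        (cnt / n) * math.log(cnt * n_c[cv] / (n_xc[(xv, cv)] * n_yc[(yv, cv)]))
        for (xv, yv, cv), cnt in n_xyc.items()
    )


# Illustrative setup: 7 features, 7 units, 3 features per unit.
blocks = cyclic_bibd(7, (0, 1, 3))
assignment = assign_pairs(blocks)
```

In this toy design every pair of the 7 features appears together in exactly one block, so each pairwise statistic would be computed by exactly one unit and no calculation is repeated, while each unit only needs the data for 3 features.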
Design of experiments
Balanced incomplete block design
Conditional mutual information
Statistical design of experiments