Report Number: CS-TR-97-1587
Institution: Stanford University, Department of Computer Science
Title: Ensembles for Supervised Classification Learning
Author: Matan, Ofer
Date: March 1997
Abstract: This dissertation studies the use of multiple classifiers
(ensembles or committees) in learning tasks. Both theoretical
and practical aspects of combining classifiers are studied.
First we analyze the representational ability of voting
ensembles. A voting ensemble may perform either better or
worse than each of its individual members. We give tight
upper and lower bounds on the classification performance of a
voting ensemble as a function of the classification
performances of its individual members.
Boosting is a method of combining multiple "weak"
classifiers to form a "strong" classifier. Several issues
concerning boosting are studied in this thesis. We study SBA,
a hierarchical boosting algorithm proposed by Schapire, in
terms of its representation and its search. We present a
rejection boosting algorithm that trades off exploration and
exploitation: it requires fewer pattern labels at the expense
of lower boosting ability.
Ensembles may be useful in gaining information. We study
their use to minimize labeling costs of data and to enable
improvements in performance over time. For that purpose, a
model for on-site learning is presented. The system learns by
querying "hard" patterns while classifying "easy" ones.
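
Illustration (not part of the original abstract): the claim that a voting
ensemble may perform either better or worse than each of its members can be
checked on a toy example. The sketch below is a minimal construction under
stated assumptions: three hypothetical binary classifiers, each correct on 4
of 6 patterns, combined by simple majority vote. The helper names (member,
majority_vote, accuracy) and the data are invented for illustration. The
example does not reproduce the bounds derived in the dissertation.

```python
from collections import Counter


def accuracy(preds, labels):
    """Fraction of patterns on which preds agrees with labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def majority_vote(member_preds):
    """Per-pattern majority label; three binary members never tie."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*member_preds)]


def member(labels, correct_on):
    """Hypothetical member: correct on the given indices, and it flips
    the binary label everywhere else (an assumption for illustration)."""
    return [y if i in correct_on else 1 - y for i, y in enumerate(labels)]


labels = [0, 1, 0, 1, 0, 1]

# Errors are spread out: every pattern is classified correctly by exactly two members.
helpful = [
    member(labels, {0, 1, 2, 3}),
    member(labels, {2, 3, 4, 5}),
    member(labels, {0, 1, 4, 5}),
]

# Errors pile up: members agree on patterns 0-2 and are mostly wrong on 3-5.
harmful = [
    member(labels, {0, 1, 2, 3}),
    member(labels, {0, 1, 2, 4}),
    member(labels, {0, 1, 2, 5}),
]

for name, ensemble in (("helpful", helpful), ("harmful", harmful)):
    member_acc = [round(accuracy(m, labels), 2) for m in ensemble]
    vote_acc = round(accuracy(majority_vote(ensemble), labels), 2)
    print(name, "members:", member_acc, "vote:", vote_acc)
```

In both ensembles every member is correct on 4 of 6 patterns. The vote is
correct on all 6 patterns in the first ensemble and on only 3 in the second,
so the outcome depends on how the members' errors overlap.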