Abstract:
Many of today's best classification results are obtained by
combining the responses of a set of base classifiers to produce an answer for
the query.
This paper explores a novel "query-specific" combination rule:
After learning a set of simple belief network classifiers,
we produce an answer to each query by combining their individual responses,
weighting each base classifier's response inversely by its variance.
These variances are based on the uncertainty of the network parameters,
which in turn depends on the training data sample.
In essence, each variance quantifies the corresponding
base classifier's confidence in its response to the query.
Our experimental results show that
these "mixture-using-variance belief net classifiers"   MUVs
work effectively,
especially when the base classifiers are learned
using balanced bootstrap samples
and when their results are combined using James-Stein shrinkage.
We also found that our variance-based combination rule
performed better than both
bagging and AdaBoost,
even on the set of base classifiers produced by AdaBoost itself.
Finally, this framework is extremely efficient,
as both the learning and the classification components require only
straight-line code.
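For concreteness, here is a minimal sketch (in Python) of the inverse-variance combination rule described above. It is not the authors' implementation: the function and variable names are hypothetical, and the balanced-bootstrap training and James-Stein shrinkage steps are omitted.

    # Hypothetical sketch of the query-specific, variance-weighted combination;
    # not the authors' code.
    import numpy as np

    def combine_by_inverse_variance(responses, variances, eps=1e-12):
        """Combine the base classifiers' responses to a single query.

        responses : length-k array, each base classifier's estimate of P(class | query)
        variances : length-k array, each classifier's variance around its response,
                    derived from the uncertainty of its network parameters
        Returns the variance-weighted estimate of P(class | query).
        """
        responses = np.asarray(responses, dtype=float)
        variances = np.asarray(variances, dtype=float)
        weights = 1.0 / (variances + eps)   # low variance => high confidence => large weight
        weights /= weights.sum()            # normalize the weights to sum to one
        return float(np.dot(weights, responses))

    # Example: three base classifiers answer the same query.
    p_hat = combine_by_inverse_variance(responses=[0.9, 0.6, 0.8],
                                        variances=[0.01, 0.20, 0.05])
    print(p_hat)   # dominated by the confident first and third classifiers

Because the variances are computed for the query at hand, two different queries will generally combine the same base classifiers with different weights.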
Keywords:
Belief nets, Graphical Models, Machine Learning
- Chi-Hoon Lee, Russ Greiner and Shaojun Wang,
Using Query-Specific Variance Estimates
to Combine Bayesian Classifiers.
International Conference on Machine Learning (ICML'06),
Pittsburgh, June 2006.
(Warning: we recently found that some of the empirical results reported
are problematic.)
- Extended version (pdf)
(Erratum: Eqn 11 on p. 4 should involve ... [1/θ_{+e|-c} - 1])
- Extra information
- Proof: Derivation of the James-Stein shrinkage estimator
(see Section 3.1, pages 2-3, of the extended version)
- Tables omitted from the conference paper
- Table 1 (Section 4.1):
- Comparing MUV(X, <boot,bal>, mle)
vs MUV(X, <boot,bal>, js)
- Table 2 (Section 4.1):
- Comparing MUV(X, <disj,bal>, mle)
vs MUV(X, <disj,bal>, js)
- Table 3 (Section 4.2)
- Comparing Bootstrap and Disjoint sampling
for kNBs with JS and Bal
- Table 4 (Section 4.2)
- Comparing Bootstrap and Disjoint sampling
for kTANs with JS and Bal
- Table 5 (Section 4.3)
- Comparing kNBs and kTANs when data samples
are either Balanced or Skewed.
- Table 6 (Section 4.4)
- Comparing MUV(NB, Ada, js) and AdaBoost(NB)
- Quantifying the Uncertainty of a Belief Net Response (webpage)