Mixture Using Variance

Some of today's best classification results are obtained by combining base classifiers to produce a single label for an instance. This paper considers a number of ways of learning a set of simple belief networks, each either naïve Bayes or tree-augmented naïve Bayes, then combining their responses to produce a single answer to a query, using estimates of their respective variances. Our results on UCI datasets show that these ``mixture-using-variance'' Bayesian classifiers (MUVs) work effectively, especially when the different classifiers are learned using balanced bootstrap samples and their results are combined using James-Stein shrinkage. We also show further improvement by learning a set of these composite MUVs, and then, for each query, returning the answer produced by the single MUV that has the smallest total variance -- a form of instance-specific model selection.

Budgeted Learning
Aloak Kapoor; Dan Lizotte; Omid Madani

There is often a cost associated with acquiring training data. We consider the situation where the learner, with a fixed budget, can only see the data that it has `purchased' during training. Our learner is allowed to sequentially specify which particular test to run on which specific individual, until exhausting the budget. One obvious option is the simple "round robin" approach: run all tests on each individual, for as many individuals as possible. (So if the budget is $200 and there are 10 tests, each costing $5, this method would run all 10 tests on 4 individuals.) Alternatively, the learner could decide which test to run on which individual sequentially, based on the results of the prior tests --- eg, first run test#1 on patient#1 then, if that returns "+", run test#2 on patient#8 (otherwise run test#7 on patient#5), and so forth, until spending all $200. In either case, after collecting $200-worth of data, the learner uses the accumulated data to produce a classifier.
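The round-robin allocation above can be sketched in a few lines; the function name and the uniform cost model are ours, for illustration only:

```python
def round_robin(budget, test_costs):
    """Run every test on each successive individual until the budget runs out.

    Returns the number of individuals fully tested and the budget left over.
    """
    cost_per_individual = sum(test_costs)          # cost of running all tests once
    individuals = budget // cost_per_individual    # complete individuals affordable
    remaining = budget - individuals * cost_per_individual
    return individuals, remaining

# With a $200 budget and 10 tests at $5 each, as in the example above:
print(round_robin(200, [5] * 10))  # -> (4, 0): all 10 tests on 4 individuals
```

The sequential alternatives (including biased-robin) differ precisely in that the next (test, individual) purchase can depend on the outcomes already observed, which this fixed schedule cannot express.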
Our preliminary results confirm that the standard "round robin" approach performs significantly worse than many alternative algorithms, in that these alternatives lead to better classifiers; our novel "biased-robin" algorithm appears the best. We plan to continue investigating this budgeted learning task, both theoretically (to better understand the challenges of this task, towards determining which algorithms are likely to be effective) and empirically, exploring diverse application domains, from medical- and bio-informatics to software engineering (deciding which tests to run to debug a class of software systems).

Learning Belief Net Classifiers
Wei Zhou; Yuhong Guo; Jie Cheng; Bin Shen; Xiaoyuan Su

While (Bayesian) belief networks (BNs) are generative models, capable of modeling a joint probability distribution over a set of variables, they are typically used discriminatively for some classification task --- eg, to predict the probability of some disease, given some evidence about the patient. We are currently exploring ways to learn an effective BN-based classifier from a data sample. In general, learning a BN involves first finding an appropriate structure (which identifies which variables depend on which others), then finding the parameters for this structure. Our existing results deal with each of these separately: the ELR algorithm, which uses gradient descent to find the parameters that optimize conditional likelihood, and a "Bias^2 + Variance" approach that identifies a good structure. We plan to explore ways to combine these approaches, as well as experiment with other approaches, such as maximizing margins.

Collaborative Filtering, using Hierarchical Probabilistic Relational Models

Personalized recommender systems, which recommend specific products (eg books, movies) to individuals, have become very prevalent --- witness the success of widely used systems like Amazon.com's book recommender and Yahoo!'s LAUNCHcast music recommender.
The challenge faced by these systems is predicting what each individual will want. Many of them use collaborative filtering (CF): if person P1 likes many of the same things that person P2 likes, and P2 liked object O7 (say "StarWars"), then we might guess that P1 will also like O7. A pure CF system is unable to use information about P1 (eg, "teenage male") or about O7 (eg, "SciFi, Action"). By contrast, a content-based recommender could use such information, which would allow it to find rules of the form "teenage males typically like SciFi movies"; however, such a system would be unable to explicitly use information about similar users. We are developing a system that can use both types of
information --- collaborative and content --- to make predictions about a new user.
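The pure-CF heuristic just described (use P2's opinion of O7 in proportion to how often P2 agrees with P1) can be sketched as follows. This is only an illustration of the baseline idea, not our hPRM model; all names and the toy ratings are hypothetical.

```python
def predict_like(target, item, ratings):
    """Pure collaborative filtering: weight each other user's opinion of `item`
    by how often they agreed with `target` on items both have rated."""
    score, weight = 0.0, 0.0
    for user, likes in ratings.items():
        if user == target or item not in likes:
            continue  # skip the target and users who never rated `item`
        shared = (set(likes) & set(ratings[target])) - {item}
        if not shared:
            continue  # no common history, so no basis for agreement
        agreement = sum(likes[o] == ratings[target][o] for o in shared) / len(shared)
        score += agreement * (1 if likes[item] else -1)
        weight += agreement
    return score / weight > 0 if weight else None

ratings = {
    "P1": {"O1": True, "O2": True, "O3": False},
    "P2": {"O1": True, "O2": True, "O3": False, "O7": True},  # agrees with P1
    "P3": {"O1": False, "O2": False, "O7": False},            # disagrees with P1
}
print(predict_like("P1", "O7", ratings))  # -> True: P2's tastes match P1's
```

Note that nothing here can exploit attributes of P1 or O7, which is exactly the limitation the combined collaborative-plus-content approach is meant to address.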
Our preliminary results show that standard PRMs are competitive with
the best recommenders, and that hPRMs strictly dominate PRMs. We plan to
continue developing ideas that extend our current hPRM, towards obtaining
world-class performance for this task, then to apply these ideas to other
tasks, such as learning patterns in microarrays.
Robert Price; Gerald Haeubl; Paul Messinger
Customer interfaces, whether physical retail outlets or online
storefronts, help shoppers to search the assortment of products sold by
the store. The design of an effective interface requires careful
tradeoffs amongst both products and navigation structures. Today's
customer interfaces typically make these tradeoffs with a view to creating
a single interface that is identical for all shoppers, guided by what is
deemed to be most suitable for the average customer. Naturally, such an
interface tends to be suboptimal for almost all shoppers.
Electronic shopping environments, however, offer the opportunity to create
personalized customer interfaces with a unique store layout for each
individual customer, tailored to his or her needs, preferences, and
interests. We show how such personalized interfaces can be constructed
automatically and in real time using a model of the shopper-interface
interaction and a record of the customer's past buying behaviour. This
technology will tend to be most useful in the presence of (1) a large
product assortment, (2) multi-item purchases, and (3) a substantial amount
of preference heterogeneity among shoppers.
The full paper introduces a sufficient set of features for representing a
personalized customer interface, explains how the interface can be viewed
as a dynamic language created between the user and the system, and shows
how formal decision-theoretic techniques can be used to optimize this
language. We illustrate the proposed technology in a typical repeated
multi-item purchase setting (grocery shopping), and demonstrate how our
approach can be used to optimize the online grocery shopping experience
for consumers through minimizing, on a user-by-user basis, the effort each
requires to fill his or her weekly shopping basket.
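One way to make the effort-minimization concrete is the following toy sketch (ours, not the paper's actual decision-theoretic formulation): given a customer's estimated purchase probabilities, choose the k items to surface on the front page so as to minimize the expected number of clicks. All names and numbers are hypothetical.

```python
def personalize_front_page(purchase_prob, k, shortcut_cost=1, deep_cost=4):
    """Choose the k items whose promotion most reduces expected clicks.

    Expected effort = sum over items of P(buy item) * clicks to reach it;
    promoting an item cuts its cost from `deep_cost` to `shortcut_cost`, so
    with uniform costs the optimal choice is the k highest-probability items.
    """
    promoted = set(sorted(purchase_prob, key=purchase_prob.get, reverse=True)[:k])
    effort = sum(p * (shortcut_cost if item in promoted else deep_cost)
                 for item, p in purchase_prob.items())
    return promoted, effort

# A hypothetical weekly-shopper profile: milk and bread are near-certain buys.
probs = {"milk": 0.9, "bread": 0.8, "caviar": 0.05, "tofu": 0.3}
promoted, effort = personalize_front_page(probs, k=2)
print(promoted)  # -> {'milk', 'bread'}
```

Because the purchase probabilities differ from shopper to shopper, the resulting front page differs too, which is the personalization the abstract describes.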