Learning tasks typically begin with a data sample --- eg, symptoms and test results for a set of patients, together with their clinical outcomes. By contrast, many real-world studies begin with no actual data, but instead with a budget --- funds that can be used to collect the relevant information. For example, one study has allocated $2 million to develop a system to diagnose cancer, based on a battery of patient tests, each with its own (known) costs and (unknown) discriminative powers. Given our goal of identifying the most accurate classifier, what is the best way to spend the $2 million? Should we indiscriminately run every test on every patient, until exhausting the budget? ... or selectively, and dynamically, determining which tests to run on which patients? We call this problem budgeted learning.
This page overviews our explorations on this theme.