Computing Science 466/551
Introduction to Machine Learning
Information about Research Projects


For your project, you should investigate some interesting aspect of machine learning.  This should include

As the examples above show, the project can involve either an "application pull" -- seeking ways to solve some specific problem (Ex1); or a "technology push" -- exploring ways of coping with some specific technical challenge (Ex2).

Note that this investigation may begin by reading two or more recent, related papers from conferences/journals on artificial intelligence. The papers may be related because they tackle the same problem using different approaches; because they employ similar techniques to solve different problems; because one is a follow-up to the other; because they hold opposing points of view on some problem; etc.

Evaluation Criteria.

Each project will involve

Your project will be evaluated based on

Rough Guidelines

75%   Content of Written Report      Understanding of basic idea
                                     Evaluation of ideas
15%   Form of Written Report         Clarity of presentation, ...
10%   Verbal Presentations, I + II   Conciseness, preparation, appropriate content

Required Components of WriteUp.

It should answer the questions:

Try to make your report EASY to read.

  • Be sure to include an overview at the beginning, which outlines what the paper will describe, in a section-by-section fashion.
  • Include simple examples (or better, a single simple example throughout), to help illustrate the ideas.
  • A picture is worth (at least) a thousand words. Use figures, flow-charts, graphs, whenever appropriate.
  • The material should be structured, and flow. It should NOT be a core-dump of everything you happened to read when you were looking at things related to X. Readers (read "the people who will assign your grade!") get annoyed by having to wade through irrelevant material.
  • If you are giving a high-level description of an algorithm, be sure to explicitly state its input and output.
  • Many algorithms have a flow of information, from one subroutine to another. Provide one or more figures, to make the ideas clear.
  • Also, proof-read your report. As a grader, I find it very irritating to read a report that has pages of easy-to-fix typos, illegible figures, missing citations, etc. And you really don't want to irritate the person who is assigning your grade...
  • If you are describing a precise algorithm, you should give the actual formulas, using terms that are well-defined, in the report.
  • Your report should be self-contained. You are allowed to copy figures from other sources (if they are properly credited). But if you do, be sure to define the terms that appear in that figure!
  • Save trees -- hand in a 2-sided version. And use section numbers, and page numbers!
  • Format/Style

    In a nutshell, your write-up tells a story, in a clear fashion. Your paper should work to establish some explicitly-stated "falsifiable conclusion" (which should, of course, be related to learning...). Every section, paragraph, figure, table, ... should contribute to establishing this specific claim. Towards enforcing this, your first section should include an overview, outlining the contents of the paper. You may also want to begin each section with an overview, indicating both what will be included there, and also connecting this to the central theme.

    As an example, suppose you are claiming that algX works effectively at taskY (eg, algX=="Support Vector Machines", taskY=="detecting patterns in heart rhythms"). Here, it makes sense to describe algX, and perhaps its precursors, and to contrast algX with other related algorithms. (Note this contrast is typically of the form "algQ does BLAH; our algX differs by doing the subBLAH differently; our report proves that this is an improvement"; etc.) You should also discuss the effects of changing the settings for various parameters. Similarly, you should precisely define taskY, and perhaps contrast it with other related tasks. You should then provide evidence to establish the claim -- either empirical or theoretical.

    If your report contrasts algQ with algR, you should explain why that is relevant. Or if it digresses to consider some taskW, again explain why this is included. (If you simply want to include such analysis -- perhaps to indicate that you had read an article -- you may include it in an appendix, possibly labeled as "not completely irrelevant asides" :-) )

    Your report should contain precisely-defined terms; do not be afraid of using mathematical notation! Similarly, if you use comparisons, be sure to specify the details; eg, state "algX is an improvement over algZ", rather than just "algX is an improvement".
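    As one illustration of what "specifying the details" of a comparison might look like, the claim "algX is an improvement over algZ" could be written as follows. This is only a sketch of one possible formalization: the symbols D (a distribution over labeled examples), S (a training sample of size m), and err are assumptions introduced here, not notation required by the course.

```latex
% Assumed notation: D is a distribution over labeled examples (x, y);
% alg(S) is the hypothesis that algorithm alg returns on training sample S.
\[
  \mathrm{err}_D(h) \;=\; \Pr_{(x,y)\sim D}\bigl[\, h(x) \neq y \,\bigr]
\]
% "algX is an improvement over algZ at training-set size m" then reads:
\[
  \mathbb{E}_{S \sim D^m}\bigl[\mathrm{err}_D(\mathrm{algX}(S))\bigr]
  \;<\;
  \mathbb{E}_{S \sim D^m}\bigl[\mathrm{err}_D(\mathrm{algZ}(S))\bigr]
\]
```

    Written this way, the reader can see which algorithm is the baseline, which measure is being compared, and at what training-set size the claim is made.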

    You should include simple illustrative examples: ones that convey the basic ideas and help the reader understand the various points.

    Be sure to re-read your report! Imagine this topic was new to you... would you understand the material presented? You may assume your reader knows only material presented in 466/551; if you use any other terms, be sure they are defined. You should also explain why you are including that term -- ie, how does it relate to the overall theme of the paper.
    Don't make your reader guess at your meanings!

    Be sure to label figures/tables. (Eg, if you write "10%", is this 10% error, or 10% accuracy?)



    Timing

    To hand in your reports...
  • Hand me a hard-copy of your write-up (or put it in my mailbox, or under my door, or ...)
  • Create a webpage containing pointers to
  • Note that "PLAIN-TEXT" means just regular text, which is NOT *.doc, NOT *.rtf, ... Also: this plain-text should be emailed to me; I do not want just a hard-copy.
    wrt Empirical Studies

    Many people are considering empirical studies -- eg, "application pulls". Here, the learning challenge is

    how to use some "experiences" to improve "performance" on some "performance task".
    Your proposal should therefore include the following information:

    The Learning Task

    Notice the LearningTask is independent of "implementation details". This is intentional, as it means you can compare different learning algorithms over the same task. The other parts of the proposal should

    NOTE: Just building a single learner for a learning task is typically not interesting; I am much more interested in claims of the form "LearningAlgorithm A1 did better than A2 at some task, in that A1's learning curve is steeper, or converges to a better result, or ...".
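    A minimal sketch of how two learners might be compared by their learning curves on the same task. Everything here is an assumption made for illustration: the synthetic 1-D task, the two toy learners (a majority-class baseline and a 1-nearest-neighbour rule), and all function names are hypothetical stand-ins for your own A1, A2 and task.

```python
import random

def make_data(n, rng):
    """Synthetic 1-D task (assumed for illustration): label is 1 iff x > 0.5."""
    xs = [rng.random() for _ in range(n)]
    return [(x, int(x > 0.5)) for x in xs]

def train_majority(train):
    """Hypothetical baseline: always predict the most common training label."""
    ones = sum(y for _, y in train)
    label = int(2 * ones >= len(train))
    return lambda x: label

def train_1nn(train):
    """Hypothetical learner: predict the label of the nearest training point."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(predict, test):
    """Fraction of test examples that the predictor labels correctly."""
    return sum(predict(x) == y for x, y in test) / len(test)

def learning_curve(trainer, sizes, trials=5, seed=0):
    """Mean test accuracy at each training-set size, averaged over `trials`.

    Using the same seed for every learner means each one sees the same
    training and test samples, so the curves are directly comparable.
    """
    rng = random.Random(seed)
    curve = []
    for m in sizes:
        accs = []
        for _ in range(trials):
            train = make_data(m, rng)
            test = make_data(200, rng)
            accs.append(accuracy(trainer(train), test))
        curve.append((m, sum(accs) / trials))
    return curve

sizes = [5, 20, 80, 200]
curve_majority = learning_curve(train_majority, sizes)
curve_1nn = learning_curve(train_1nn, sizes)
for (m, a), (_, b) in zip(curve_majority, curve_1nn):
    print(f"m={m:4d}  majority accuracy={a:.3f}  1-NN accuracy={b:.3f}")
```

    The printed table gives one accuracy per training-set size for each learner; plotting these points, with the axes labeled as accuracy versus number of training examples, gives the learning curves on which a claim such as "A1 converges to a better result than A2" can rest.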

    Other comments

    Info for Coaches