CMPUT 690: KDD Principles


©1999 Osmar R. Zaïane
(zaiane@cs.ualberta.ca)

                 

Activities


Assignments
There are two assignments scheduled. The first is group work on data mining tool evaluation: one tool from a set of data mining tools is studied, evaluated, and then presented in class. The second assignment, a set of questions and exercises to answer, is NOT group work.

Important Dates

Item                                    Due Date
Homework 1 (report)                     October 20th, 1999
Homework 1 (presentations) [schedule]   October 20th and 22nd, 1999
Homework 2                              October 29th, 1999

Homework counts for 10% of the overall grade.

Projects
Each student should either
  attempt to implement an algorithm for one of the problems discussed in class or taken from a research paper related to data mining and knowledge discovery from large databases, or
  write a survey on a particular Data Mining topic.

Implementation Projects

It is the student's responsibility to choose or come up with an implementation project. The project can be the implementation of a new algorithm, the adaptation of an existing algorithm or a combination of a few existing algorithms to solve a given problem, a data visualization solution, etc.

Students should write a project proposal (1 or 2 pages) explaining their project: topic, implementation choices, approach, and schedule. All projects will be demonstrated to the class at the end of the semester. Implementations may be written in C/C++ or Java, on Linux, Windows NT/98, or other systems.

A project report should be submitted at the end of the project, before the demo. The report should include (i) a description of the project, (ii) a brief overview of the design and structure chosen, (iii) the algorithms used, (iv) a list of limitations and known bugs, (v) the program listings (preferably on a floppy disk), and (vi) a discussion of the potential uses of the program and proposals for future improvements.

I will suggest some example projects in the following list.

Survey Papers

Survey papers should be between 20 and 30 pages and should be presented in class by the author.
Survey papers should summarize previous research and report on recent research issues and advances in the chosen topic. The papers should be well written and organized, and should provide a thorough summarization of the selected data mining research area. A list of references (bibliography) must be included.
The paper will be evaluated on its comprehensiveness and organization.

Students may also opt for a research paper. A research paper should present a new idea or method to solve a given data mining problem. The approach presented should be a novel and original contribution. A research paper could be a good start for Master's or Ph.D. research.

Here are some example research and survey topics:

  1. Web usage mining (knowledge extraction from Web access logs)
  2. Knowledge discovery from unstructured or semi-structured data on the WWW (query languages for unstructured data, Web-content mining, etc.)
  3. Text mining (data mining from text repositories and documents)
  4. Data Mining from non-traditional databases such as OODB and deductive databases
  5. Spatial Data Mining (data mining from spatial databases and GIS systems)
  6. Multimedia Data Mining (data mining from image or video repositories)
  7. Clustering Mining
  8. Classification Mining
  9. Association rule mining
  10. Data cube construction
  11. Data warehousing
This list is not exhaustive. Students can suggest other survey topics.

Here is an example of a survey written by a graduate student in the summer of 1995.

Important Dates

Item                               Due Date
Project or Survey Paper Proposal   October 4th, 1999
Project Report or Survey Paper     December 3rd, 1999
Demo                               See demo schedule

Projects (or survey papers) count for 35% of the overall grade.

Readings
Students will have to read recent or classic research papers on data mining or related fields and present them in class. The papers will be selected from conference proceedings such as SIGMOD, SIGKDD, VLDB, ICDE, etc., or from journals and books. A list of suggested papers will be put on-line soon.

After each student presentation, attending students should fill in the presentation evaluation form and are asked to submit a written evaluation to the instructor (zaiane@cs.ualberta.ca) answering the following questions:

  • What did you like in this presentation?
  • Was the topic clearly presented?
  • Were the slides understandable?
  • What could be done to improve the presentation?
  • Do you have any suggestions for the speaker?

The evaluations are anonymous in the sense that only the instructor will see your name on your evaluation. The evaluations and comments will be summarized and sent to the presenter.

Important Dates

Item              Due Date
Paper Selection   September 22nd, 1999 (Sept. 24th at the latest)
Presentations     See presentation schedule

Readings and class presentations count for 25% of the overall grade.



Last updated: September 9th, 1999