©1999 Osmar R. Zaïane
Two assignments are scheduled. The first is a group project on
data mining tool evaluation: one tool from a set of data mining tools
is studied and evaluated, then presented in class. The second
assignment, a set of questions and exercises to answer, is NOT group work.
Homework will count for 10% of the overall grade.
Each student should either
- attempt to implement an algorithm for one of the problems discussed in class or from a research paper related to data mining and knowledge discovery from large databases, or
- write a survey on a particular data mining topic.
It is the student's responsibility to choose or come up with an implementation
project. The project can be the implementation of a new algorithm, the
adaptation of an existing algorithm or a combination of a few existing
algorithms to solve a given problem, a data visualization solution,
Students should write a project proposal (1 or 2 pages) explaining their project:
topic, implementation choices, approach and schedule. All projects will be demoed to the
class at the end of the semester. Implementations can be written in
C/C++ or Java, on Linux, Windows NT/98, or other systems.
A project report should be submitted at the end of the project, before the
demo. The report should include (i) a description of the project, (ii)
a brief overview of the design and structure chosen, (iii) the
algorithms used, (iv) a list of limitations and known bugs, (v)
the program listings (preferably on a floppy disk), and (vi) a
discussion on the potential use of the program and proposal of its future possible
I will suggest in the following list some examples of projects.
Survey papers should be between 20 and 30 pages and should be
presented in class by the author.
Survey papers should summarize previous research and report on recent research issues and
advances in the chosen topic. The papers should be well written and
organized, and should provide a thorough summarization of the selected
data mining research area. A list of references (bibliography) must be
The evaluation of the paper will be based on the comprehensiveness
and organization of the paper.
Students may also opt for a research paper. A research paper should
present a new idea or method to solve a given data mining problem. The
approach presented should be a novel and original contribution.
A research paper could be a good start for Master's or Ph.D. research.
Here are some research survey topic examples:
This list is not exhaustive. Students can suggest other survey topics.
Here is an example of a survey written by a grad student in summer 1995.
- Web usage mining (knowledge extraction from Web access logs)
- Knowledge discovery from unstructured or semi-structured data on the WWW (query languages for unstructured data, Web-content mining, etc.)
- Text mining (data mining from text repositories and documents)
- Data mining from non-traditional databases such as OODB and deductive databases
- Spatial data mining (data mining from spatial databases and GIS)
- Multimedia data mining (data mining from image or video repositories)
- Clustering
- Classification
- Association rule mining
- Datacube construction
| |Due Date|
|Project or Survey Paper Proposal|October|
|Project Report or Survey Paper|December|
|Demo|See demo schedule|
Projects (or survey papers) will count for 35% of the overall grade.
Students will have to read recent or classical research papers on data
mining or related fields, and present the papers in class. The papers
will be selected from conference proceedings such as SIGMOD,
SIGKDD, VLDB, ICDE, etc., or from journals and books. A list of suggested
papers will be put on-line soon.
After each student presentation, attending students should fill in the
presentation evaluation form and are asked to
submit a written evaluation to the instructor (firstname.lastname@example.org)
answering the following questions:
What did you like in this presentation?
Was the topic clearly presented?
Were the slides understandable?
What could be done to improve the presentation?
Do you have any suggestions for the speaker?
The evaluations are anonymous in the sense that only the instructor will
see your name on your evaluation.
The evaluations and comments will be summarized
and sent to the presenter.
Readings and class presentations will count for 25% of the overall grade.