MCS314 (Diwali Semester 2010) Special Topics in Data Mining
Advanced Classification: Ensemble Techniques (4-credit course: 3-0-2)
This is an advanced course in Data Mining; the prerequisite is MCS104. Students should also have completed a course on Algorithms and have a fair understanding of statistics and linear algebra. Strong programming skills and dexterity with complex data structures will be an advantage.
Tentative Weekly schedule
Week | Topics | Suggested Lab Tasks |
Week 1 - 3 | Recap of Classification, Bayesian Decision Theory, Taxonomy of Classification Methods, Decision Functions and Notation building | Learn LaTeX and gnuplot; revise a scripting language; check Assignment 1 |
Week 4 - 5 | Evaluation of Classifiers, Occam's Razor, No Free Lunch theorem | |
Week 6 - 16 | Ensemble Techniques | Check Assignment 2 |
Text Books:
1. Data Mining with Decision Trees: Theory and Applications, Lior Rokach and Oded Maimon, (2008), World Scientific.
2. Pattern Classification (Second Edition), (2001) Duda, Hart and Stork, John Wiley
Supporting Texts
3. Neural Networks (Second Edition), (1999) Simon Haykin, PHI
4. Data Mining: Concepts and Techniques, Han and Kamber (Morgan Kaufmann, 2006)
5. Principles of Data Mining, David J. Hand, Heikki Mannila and Padhraic Smyth (PHI)
Research Papers
1. Introduction to ROC Analysis (A detailed version available here)
Internal Assessment
Programming assignments (20 marks)
Minors (30 marks)
Syllabus for Minor 1: Portion of the syllabus covered up to week 6.
Syllabus for Minor 2: Portion of the syllabus covered between weeks 7 and 11.
Assignment 1 (Submission date 5 Sept 2009)
Implement a Naive Bayes classifier. For five data sets in Weka, run two classification algorithms and note their performance. Observe the performance of the NB classifier on the same data sets. Plot the observations and write a report.
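As a starting point for the implementation, here is a minimal sketch of one possible Naive Bayes variant in Python. It assumes categorical features and add-one (Laplace) smoothing; the assignment does not prescribe either choice, and numeric Weka attributes would need discretization or a Gaussian likelihood instead. The class name, the `alpha` parameter and the toy weather-style data are illustrative assumptions, not part of the assignment.

```python
import math
from collections import Counter


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features (assumed variant) with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; 1.0 gives Laplace smoothing

    def fit(self, X, y):
        n_features = len(X[0])
        self.classes_ = sorted(set(y))
        self.class_counts_ = Counter(y)
        self.n_samples_ = len(y)
        # value_counts_[c][j][v] = number of class-c rows whose feature j equals v
        self.value_counts_ = {
            c: [Counter() for _ in range(n_features)] for c in self.classes_
        }
        # distinct values observed for each feature (used in the smoothing denominator)
        self.feature_values_ = [set() for _ in range(n_features)]
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.value_counts_[label][j][v] += 1
                self.feature_values_[j].add(v)
        return self

    def log_posteriors(self, row):
        """Return unnormalized log P(c) + sum_j log P(x_j | c) for every class c."""
        scores = {}
        for c in self.classes_:
            log_p = math.log(self.class_counts_[c] / self.n_samples_)
            for j, v in enumerate(row):
                count = self.value_counts_[c][j][v]  # 0 for unseen values
                denom = self.class_counts_[c] + self.alpha * len(self.feature_values_[j])
                log_p += math.log((count + self.alpha) / denom)
            scores[c] = log_p
        return scores

    def predict(self, X):
        # Pick the class with the highest log-posterior score for each row
        return [max(self.log_posteriors(row).items(), key=lambda kv: kv[1])[0]
                for row in X]


# Toy data (hypothetical): features are (outlook, temperature), label is "play?"
X_train = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
           ("rainy", "cool"), ("overcast", "hot"), ("overcast", "cool")]
y_train = ["no", "no", "yes", "yes", "yes", "yes"]

model = CategoricalNaiveBayes(alpha=1.0).fit(X_train, y_train)
predictions = model.predict([("sunny", "hot"), ("rainy", "cool")])
print(predictions)
```

Working in log space avoids numerical underflow when many feature likelihoods are multiplied, and the smoothing keeps a single unseen feature value from forcing a class probability to zero.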
Assignment 2: As announced in class; submit by Oct 10; evaluation schedule to be announced.