Goal
Each group (maximum 2 members) must select a different classification dataset, compare three data mining models already covered, and explain why one works better.
The methodology is as follows:
Part A Dataset selection (Telco Customer Churn dataset)
Each group must pick a dataset that:
is classification (binary or multi-class)
is not used by any other group (for my group: the Telco Customer Churn dataset)
comes from a different sector: healthcare, marketing, finance, education, cybersecurity, etc. (my case: marketing)
Use Kaggle to pick your dataset (uploaded)
Part B Models to run (must include a baseline)
Each group must run three models:
0R (ZeroR) or 1R (OneR) as the baseline
Naive Bayes
Decision Tree (e.g., J48/C4.5 in Weka)
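To make the baseline concrete, here is a minimal sketch of what 0R and 1R do, written in plain Python rather than Weka. The function names (zero_r, one_r), the feature names, and the five toy rows are assumptions made up for illustration; they are not taken from the Telco dataset or from any tool's API.

```python
from collections import Counter, defaultdict

def zero_r(labels):
    """0R: always predict the most frequent class in the training labels."""
    return Counter(labels).most_common(1)[0][0]

def one_r(rows, labels):
    """1R: pick the single feature whose one-level rule makes the fewest training errors."""
    best = None
    for feature in rows[0]:
        # Count class labels for each value of this feature.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[feature]][label] += 1
        # The rule maps each feature value to its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[feature]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, feature, rule)
    return best[1], best[2]

# Toy, made-up rows for illustration only (hypothetical feature names).
rows = [
    {"contract": "monthly", "paperless": "yes"},
    {"contract": "monthly", "paperless": "no"},
    {"contract": "yearly", "paperless": "yes"},
    {"contract": "yearly", "paperless": "no"},
    {"contract": "yearly", "paperless": "yes"},
]
labels = ["churn", "churn", "stay", "stay", "stay"]

baseline_class = zero_r(labels)
best_feature, best_rule = one_r(rows, labels)
print(baseline_class, best_feature, best_rule)
```

Any model that cannot beat these two rules on the chosen metrics is not adding much, which is why the baseline is required.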
Part C What to report (beyond accuracy)
Cover page
Data Description and understanding
o The size of the data: number of records, number of features, the target class, and how balanced the classes are.
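Class balance can be reported as a count and a share per class. The sketch below assumes the target column has already been read into a Python list; the helper name class_balance and the ten labels are made up for illustration and do not reflect the real dataset's proportions.

```python
from collections import Counter

def class_balance(labels):
    """Return {class: (count, share of all records)} for the target column."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: (n, n / total) for cls, n in counts.items()}

# Made-up labels for illustration only.
labels = ["No"] * 7 + ["Yes"] * 3

balance = class_balance(labels)
for cls, (n, share) in balance.items():
    print(f"{cls}: {n} records ({share:.0%})")
```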
Preprocessing
o Handle missing values (remove or impute; you must state which).
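The two options can be sketched as follows. This assumes missing values appear as empty strings or None in a numeric column; the column name total_charges, the helper names, and the four rows are hypothetical. Median imputation is only one possible choice and should be named as such in the report.

```python
from statistics import median

MISSING = ("", None)

def drop_missing(rows, column):
    """Option 1 (remove): keep only rows where the column has a value."""
    return [r for r in rows if r[column] not in MISSING]

def impute_median(rows, column):
    """Option 2 (impute): replace missing values with the median of the observed ones."""
    observed = [float(r[column]) for r in rows if r[column] not in MISSING]
    fill = median(observed)
    return [
        {**r, column: float(r[column]) if r[column] not in MISSING else fill}
        for r in rows
    ]

# Made-up rows for illustration only (hypothetical column name).
rows = [
    {"id": 1, "total_charges": "100.0"},
    {"id": 2, "total_charges": ""},
    {"id": 3, "total_charges": "300.0"},
    {"id": 4, "total_charges": "200.0"},
]

removed = drop_missing(rows, "total_charges")
imputed = impute_median(rows, "total_charges")
print(len(removed), imputed[1]["total_charges"])
```

Removing rows is simpler but shrinks the data; imputing keeps every record at the cost of inventing a value, so the report should say which was used and how many records were affected.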
Evaluation
o How did you split the data?
o For each model, report: accuracy, confusion matrix, precision, recall, and F1.
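The metrics listed above all follow from the four cells of a binary confusion matrix. The sketch below computes them by hand, assuming "Yes" (churn) is treated as the positive class; the helper names and the ten true/predicted labels are made up for illustration. It also shows why accuracy alone can mislead: a 0R-style model that always predicts "No" still scores well on accuracy while finding no churners.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN, treating `positive` as the class of interest."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn

def report(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1; ratios with a zero denominator are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
    }

# Made-up labels for illustration only.
y_true = ["Yes", "Yes", "Yes", "No", "No", "No", "No", "No", "No", "No"]
y_model = ["Yes", "Yes", "No", "Yes", "No", "No", "No", "No", "No", "No"]
y_zero_r = ["No"] * 10  # what a majority-class baseline would predict here

model = report(y_true, y_model, positive="Yes")
baseline = report(y_true, y_zero_r, positive="Yes")
print(model)
print(baseline)
```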
Feature importance / interpretability
o For Decision Tree:
provide the top splitting features (root + next level) and
include a screenshot of the tree or rules
o For Naive Bayes:
list the top 5 most informative features (or top conditional probabilities per class, depending on tool)
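If the tool does not list informative features directly, the per-class conditional probabilities can be estimated by counting. The sketch below assumes categorical features and Laplace smoothing with alpha = 1 (an assumed choice, not something the brief specifies); the helper names, feature names, and toy rows are made up for illustration.

```python
from collections import Counter, defaultdict

def conditional_probabilities(rows, labels, alpha=1.0):
    """Estimate P(feature = value | class) with Laplace smoothing."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature, class) -> counts of each value
    feature_values = defaultdict(set)     # feature -> set of values seen
    for row, label in zip(rows, labels):
        for feature, value in row.items():
            value_counts[(feature, label)][value] += 1
            feature_values[feature].add(value)
    probs = {}
    for feature, values in feature_values.items():
        for cls, n_cls in class_counts.items():
            for value in values:
                count = value_counts[(feature, cls)][value]
                probs[(feature, value, cls)] = (count + alpha) / (n_cls + alpha * len(values))
    return probs

def top_for_class(probs, cls, k=5):
    """Return the k largest conditional probabilities for one class."""
    items = [(f, v, p) for (f, v, c), p in probs.items() if c == cls]
    return sorted(items, key=lambda item: item[2], reverse=True)[:k]

# Toy, made-up rows for illustration only (hypothetical feature names).
rows = [
    {"contract": "monthly", "paperless": "yes"},
    {"contract": "monthly", "paperless": "no"},
    {"contract": "yearly", "paperless": "yes"},
    {"contract": "yearly", "paperless": "no"},
    {"contract": "yearly", "paperless": "yes"},
]
labels = ["churn", "churn", "stay", "stay", "stay"]

probs = conditional_probabilities(rows, labels)
top_churn = top_for_class(probs, "churn", k=1)
print(top_churn)
```

A value whose probability differs strongly between the classes is the kind of evidence worth listing as "informative" in the report.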
Error analysis
o Provide some error analysis, such as which class is most often misclassified, and give 2-3 possible reasons (data, imbalance, noise, overlap).
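One simple way to find the most often misclassified class is to compute, for each true class, the share of its records that the model got wrong. The sketch below assumes the true and predicted labels are available as two lists; the helper name and the labels are made up for illustration.

```python
from collections import Counter

def per_class_error(y_true, y_pred):
    """For each true class, the fraction of its records that were predicted as something else."""
    totals = Counter(y_true)
    wrong = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {cls: wrong[cls] / totals[cls] for cls in totals}

# Made-up labels for illustration only.
y_true = ["Yes", "Yes", "Yes", "No", "No", "No", "No", "No", "No", "No"]
y_pred = ["Yes", "Yes", "No", "Yes", "No", "No", "No", "No", "No", "No"]

errors = per_class_error(y_true, y_pred)
worst_class = max(errors, key=errors.get)
print(errors, worst_class)
```

In this toy case both classes have one wrong prediction, but the minority class has the higher error rate, which is the pattern that class imbalance tends to produce.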
Conclusion
o Which model is best and why (refer to F1/precision/recall, not only accuracy)
o One limitation + one improvement idea
Deliverables
A 3-6 page report (template headings you can enforce)
Results table comparing the three models
Screenshots/export from the tool (confusion matrix + tree)
Dataset reference (source + brief description)
Data Sources:
UCI Machine Learning Repository:
Kaggle Dataset:
Hint: if you use Excel, below is an important video
Attached Files (PDF/DOCX): Syllabus.docx, CASE STUDY 1.docx
