Goal
Each group (maximum 2) must select a different classification dataset, compare three data mining models already covered in the course, and explain why one works better.
Software used:
Excel and Weka.
The methodology is as follows:
Part A Dataset selection (I chose the Telco Customer Churn dataset)
Each group must pick a dataset that:
is classification (binary or multi-class)
is not used by any other group (my group uses the Telco Customer Churn dataset)
comes from a different sector: healthcare, marketing, finance, education, cybersecurity, etc. (my case: marketing)
Use Kaggle to pick your data (uploaded).
Part B Models to run (must include a baseline)
Each group must run three models:
0R (ZeroR) or 1R (OneR) as the baseline
Naive Bayes
Decision Tree (e.g., J48/C4.5 in Weka)
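To make the role of the baseline concrete, here is a minimal Python sketch of the ideas behind 0R and 1R. It is not Weka's implementation: it assumes every attribute is nominal (Weka's OneR also discretizes numeric attributes), and the toy rows and attribute names are made up for illustration rather than taken from the real dataset.

```python
from collections import Counter, defaultdict


def zero_r(labels):
    """0R: always predict the most frequent class in the training labels."""
    return Counter(labels).most_common(1)[0][0]


def one_r(rows, labels):
    """1R: choose the single attribute whose 'value -> majority class' rule
    makes the fewest errors on the training data (nominal attributes only)."""
    best = None
    for attr in rows[0]:
        # Count the class labels seen for each value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        # The rule maps each attribute value to its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[attr]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical toy data, for illustration only.
rows = [
    {"Contract": "Month-to-month", "PaperlessBilling": "Yes"},
    {"Contract": "Month-to-month", "PaperlessBilling": "No"},
    {"Contract": "Two year", "PaperlessBilling": "Yes"},
    {"Contract": "Two year", "PaperlessBilling": "No"},
    {"Contract": "One year", "PaperlessBilling": "Yes"},
]
labels = ["Yes", "Yes", "No", "No", "No"]

baseline_class = zero_r(labels)
best_attr, best_rule, best_errors = one_r(rows, labels)
print("0R always predicts:", baseline_class)
print("1R splits on:", best_attr, "with", best_errors, "training errors")
```

Any model that cannot beat the 0R accuracy (the share of the majority class) has learned nothing useful, which is why the baseline is mandatory.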
Part C What to report (beyond accuracy)
Cover page
Data Description and understanding
o The size of the data: number of records, number of features, the target class, and how balanced the classes are.
Preprocessing
o Handle missing values (remove or impute; you must state which).
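The two options for missing values can be sketched as follows. This is only an illustration in plain Python, assuming the data has been read into a list of dictionaries and that a blank or whitespace-only cell counts as missing. The column name and values are hypothetical.

```python
from statistics import mean


def is_missing(value):
    """Treat None and blank or whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""


def drop_missing(rows, column):
    """Option 1 (remove): keep only rows where the column has a value."""
    return [row for row in rows if not is_missing(row[column])]


def impute_mean(rows, column):
    """Option 2 (impute): replace missing numeric values with the column mean."""
    fill = mean(float(row[column]) for row in rows if not is_missing(row[column]))
    return [
        {**row, column: fill if is_missing(row[column]) else float(row[column])}
        for row in rows
    ]


# Hypothetical rows, for illustration only.
rows = [
    {"id": "a", "TotalCharges": "100"},
    {"id": "b", "TotalCharges": " "},
    {"id": "c", "TotalCharges": "300"},
]

removed = drop_missing(rows, "TotalCharges")
imputed = impute_mean(rows, "TotalCharges")
print(len(removed), "rows kept after removal")
print("imputed value for row b:", imputed[1]["TotalCharges"])
```

Whichever option is used, the report should state it along with the number of affected records.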
Evaluation
o How did you split the data?
o For each model, report: accuracy, confusion matrix, precision, recall, and F1.
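As a check on the numbers reported by the tool, the four metrics can be recomputed from the confusion matrix by hand. The sketch below assumes a binary problem with one class treated as positive (for example, churn = Yes). The counts are made up for illustration and are not results from any real run.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp: positives predicted positive    fn: positives predicted negative
    fp: negatives predicted positive    tn: negatives predicted negative
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts, for illustration only.
metrics = binary_metrics(tp=30, fp=10, fn=20, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes, accuracy can look good while recall on the minority class stays low, which is why the report asks for all four.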
Feature importance / interpretability
o For Decision Tree:
provide the top splitting features (root + next level) and
include a screenshot of the tree or rules
o For Naive Bayes:
list the top 5 most informative features (or top conditional probabilities per class, depending on tool)
Error analysis
o Provide some error analysis: which class is most often misclassified? Give 2-3 possible reasons (data, imbalance, noise, overlap).
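One way to answer "which class is most often misclassified" is to compute the error rate per actual class from the confusion matrix. The sketch below assumes rows are actual classes and columns are predicted classes, which is how Weka prints its confusion matrix. The counts are invented for illustration.

```python
def per_class_error(matrix, class_names):
    """Return the share of each actual class that was misclassified.

    matrix[i][j] = number of instances of actual class i predicted as class j.
    """
    rates = {}
    for i, name in enumerate(class_names):
        row_total = sum(matrix[i])
        wrong = row_total - matrix[i][i]  # everything off the diagonal
        rates[name] = wrong / row_total if row_total else 0.0
    return rates


# Invented counts, for illustration only.
class_names = ["No", "Yes"]
matrix = [
    [40, 10],  # actual No:  40 predicted No, 10 predicted Yes
    [20, 30],  # actual Yes: 20 predicted No, 30 predicted Yes
]

rates = per_class_error(matrix, class_names)
worst = max(rates, key=rates.get)
print("error rate per class:", rates)
print("most often misclassified:", worst)
```

The same function works for multi-class problems, since it only relies on the diagonal and the row totals.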
Conclusion
o Which model is best and why (refer to F1/precision/recall, not only accuracy)
o One limitation + one improvement idea
Deliverables
A 3-6 page report (template headings you can enforce)
Results table comparing the three models
Screenshots/export from the tool (confusion matrix + tree)
Dataset reference (source + brief description)
Data Sources:
UCI Machine Learning Repository:
Kaggle Dataset:
Hint: if you use Excel, below is an important video
Attached Files (PDF/DOCX): CASE STUDY 1.docx, Syllabus.docx