Customer Churn Prediction Using Classification Models

Goal

Each group (maximum 2 members) must select a different classification dataset, compare three data mining models already covered in the course, and explain why one works better than the others.

Software used:

Excel and Weka.

The methodology is as follows:

Part A: Dataset selection (my choice: Telco Customer Churn dataset)

Each group must pick a dataset that:

is classification (binary or multi-class)

is not used by any other group (my group's dataset: Telco Customer Churn)

comes from a different sector: healthcare, marketing, finance, education, cybersecurity, etc. (my case: marketing)

Use the Kaggle repository to pick your data (uploaded)

Part B: Models to run (must include a baseline)

Each group must run three models:

0R or 1R (baseline; ZeroR or OneR in Weka)

Naive Bayes

Decision Tree (e.g., J48/C4.5 in Weka)
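
To make the role of the baseline concrete: 0R ignores every feature and always predicts the majority class, so its accuracy is the floor the other two models must beat. The following is a minimal Python sketch of that idea, not Weka's implementation; the function names and the toy labels are assumptions made for illustration only.

```python
from collections import Counter

def zero_r_fit(labels):
    """Return the most frequent class label, which is all 0R learns."""
    return Counter(labels).most_common(1)[0][0]

def zero_r_accuracy(train_labels, test_labels):
    """Accuracy obtained by predicting the training majority class everywhere."""
    majority = zero_r_fit(train_labels)
    correct = sum(1 for y in test_labels if y == majority)
    return correct / len(test_labels)

# Toy labels (hypothetical, not taken from the Telco data)
train = ["No"] * 7 + ["Yes"] * 3
test = ["No", "No", "Yes", "No", "Yes"]
baseline_acc = zero_r_accuracy(train, test)
print(f"0R predicts '{zero_r_fit(train)}' with accuracy {baseline_acc:.2f}")
```

On an imbalanced dataset this baseline can look strong on accuracy alone while never detecting the minority class, which is why the report asks for more than accuracy.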

Part C: What to report (beyond accuracy)

Cover page

Data Description and understanding

o The size of the data: number of records, number of features, the target class, and how balanced the classes are.
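
The description items above can be computed directly from the CSV before loading it into Weka. The sketch below is one possible way to do it in plain Python; the inline sample, its values, and the `describe` helper are assumptions for illustration and stand in for the real file.

```python
import csv
import io
from collections import Counter

# Tiny inline sample standing in for the real CSV (all values are made up).
SAMPLE = """tenure,Contract,MonthlyCharges,Churn
1,Month-to-month,29.85,No
34,One year,56.95,No
2,Month-to-month,53.85,Yes
45,One year,42.30,No
"""

def describe(csv_text, target):
    """Return record count, feature count, class counts, and class shares."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    n_records = len(rows)
    n_features = len(rows[0]) - 1  # every column except the target
    counts = Counter(row[target] for row in rows)
    shares = {cls: n / n_records for cls, n in counts.items()}
    return n_records, n_features, counts, shares

n_records, n_features, counts, shares = describe(SAMPLE, "Churn")
print(n_records, "records,", n_features, "features")
for cls, share in shares.items():
    print(f"  class {cls}: {counts[cls]} rows ({share:.0%})")
```

For the real dataset, the same function could be fed the file contents read from disk; identifier columns, if any, should be excluded from the feature count.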

Preprocessing

o Handle missing values (remove or impute; you must state which)
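
The two options differ in what they do to the dataset: removal shrinks it, imputation keeps every record but introduces estimated values. The sketch below contrasts them on toy rows; the column name, the values, and both helper functions are assumptions for illustration, and mean imputation is only one of several possible imputation choices.

```python
from statistics import mean

def drop_missing(rows, column):
    """Removal: keep only the rows where the column has a value."""
    return [row for row in rows if row[column] is not None]

def impute_mean(rows, column):
    """Imputation: fill missing values with the mean of the observed ones."""
    fill = mean(row[column] for row in rows if row[column] is not None)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]

# Toy rows (hypothetical); None marks a missing value.
rows = [
    {"tenure": 1, "TotalCharges": 29.85},
    {"tenure": 0, "TotalCharges": None},
    {"tenure": 2, "TotalCharges": 108.15},
]

dropped = drop_missing(rows, "TotalCharges")
imputed = impute_mean(rows, "TotalCharges")
print(len(rows), "rows before,", len(dropped), "after removal")
print("imputed value:", imputed[1]["TotalCharges"])
```

Whichever option is used, the report should state it together with how many records were affected.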

Evaluation

o How did you split the data (e.g., percentage split or cross-validation)?

o For each model, report: accuracy, confusion matrix, precision, recall, and F1.
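
All four scores follow from the confusion matrix, so they can be cross-checked by hand or in a few lines of code. The sketch below assumes a binary problem in which the churn class is treated as the positive class; the counts are invented for illustration and are not results from any model.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp: positives predicted positive    fp: negatives predicted positive
    fn: positives predicted negative    tn: negatives predicted negative
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts, chosen only to show the calculation.
metrics = binary_metrics(tp=30, fp=20, fn=20, tn=130)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.8 while precision, recall, and F1 for the positive class are all 0.6, which illustrates how accuracy can overstate performance on the minority class.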

Feature importance / interpretability

o For Decision Tree:

provide the top splitting features (root + next level) and

include a screenshot of the tree or rules

o For Naive Bayes:

list the top 5 most informative features (or top conditional probabilities per class, depending on tool)
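
One way to sanity-check which features a tree is likely to split on first, or to rank features by how informative they are, is to compute information gain per feature. The sketch below does this for categorical features on toy data. It is an illustration only: the feature names and values are assumptions, and J48/C4.5 actually selects splits by gain ratio, a normalised variant of the quantity computed here.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction obtained by splitting the labels on a categorical feature."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy data (hypothetical): 'contract' separates the classes, 'gender' does not.
churn = ["Yes", "Yes", "No", "No"]
features = {
    "contract": ["Monthly", "Monthly", "Yearly", "Yearly"],
    "gender": ["F", "M", "F", "M"],
}

gains = {name: information_gain(vals, churn) for name, vals in features.items()}
ranking = sorted(gains, key=gains.get, reverse=True)
for name in ranking:
    print(f"{name}: {gains[name]:.3f} bits")
```

The ranking produced this way should be compared with the root and second-level splits shown in the Weka tree output rather than reported in place of them.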

Error analysis

o Provide some error analysis: which class is most often misclassified? Give 2-3 possible reasons (data, imbalance, noise, overlap).
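
The question of which class is most often misclassified can be answered from the confusion matrix by computing an error rate per actual class. The sketch below shows one way to do that; the matrix layout (actual class, then predicted class) and the counts are assumptions for illustration, not model output.

```python
def per_class_error(confusion):
    """Share of each actual class that was predicted as some other class.

    confusion maps actual class -> {predicted class -> count}.
    """
    rates = {}
    for actual, row in confusion.items():
        total = sum(row.values())
        wrong = total - row.get(actual, 0)
        rates[actual] = wrong / total if total else 0.0
    return rates

# Hypothetical confusion matrix, chosen only to show the calculation.
confusion = {
    "No": {"No": 130, "Yes": 20},
    "Yes": {"No": 20, "Yes": 30},
}

rates = per_class_error(confusion)
worst = max(rates, key=rates.get)
for cls, rate in rates.items():
    print(f"actual {cls}: {rate:.1%} misclassified")
print("most often misclassified:", worst)
```

In this made-up example both classes have the same number of errors, but the smaller class has a much higher error rate, which is the pattern to look for when imbalance is a suspected cause.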

Conclusion

o Which model is best and why (refer to F1/precision/recall, not only accuracy)

o One limitation + one improvement idea

Deliverables

3-6 page report (template headings you can enforce)

Results table comparing the three models

Screenshots/export from the tool (confusion matrix + tree)

Dataset reference (source + brief description)

Data Sources:

UCI Machine Learning Repository:

Kaggle Dataset:

Hint: if you use Excel, below is an important video

Attached Files (PDF/DOCX): CASE STUDY 1.docx, Syllabus.docx

