Decision tree model for Titanic survival prediction

You are tasked with building and optimizing a decision tree model to predict survival on the Titanic dataset. Using the embedded titanic dataset, you will tune the model's hyperparameters to maximize performance.

1. Dataset: Use the titanic dataset from Python's seaborn library.

   Python code:

       import seaborn as sns
       titanic = sns.load_dataset("titanic")

   Relevant features:
   * Input variables: pclass, sex, age, sibsp, and fare.
   * Target variable: survived (1 = survived, 0 = did not survive).

2. Steps:
   * Data Preprocessing:
     * Handle missing values (e.g., fill missing age values with the median).
     * Convert categorical variables (e.g., sex) into numeric format using one-hot encoding.
     * Split the data into training and test sets.
   * Baseline Model:
     * Train a basic decision tree model using default hyperparameters.
     * Evaluate the model using metrics such as accuracy, precision, recall, and F1 score.
   * Hyperparameter Tuning:
     * Use GridSearchCV from scikit-learn to optimize hyperparameters, such as:
       * max_depth: Maximum depth of the tree.
       * min_samples_split: Minimum number of samples required to split an internal node.
       * min_samples_leaf: Minimum number of samples required to be at a leaf node.
     * Evaluate the optimized model on the test set.
   * Visualization:
     * Plot the decision tree using graphviz or plot_tree from scikit-learn.
     * Compare performance metrics (e.g., accuracy, precision, recall) between the baseline and optimized models.

3. Analysis:
   * Summarize the results of the optimization process and discuss whether, and how, the tuned hyperparameters improved the model.
   * Reflect on any trade-offs between complexity and performance (e.g., overfitting vs. generalization).

Submit a Word document containing your assignment, together with the following:
* A Python Jupyter Notebook containing:
  * Code for data preprocessing, model training, and hyperparameter tuning.
  * Visualizations of the decision tree and performance metrics.
* A written summary comparing baseline and optimized model results (1-2 pages).
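
The preprocessing, baseline, and tuning steps above could be sketched roughly as follows. This is a minimal illustration rather than a reference solution. The helper names (preprocess, evaluate, run_experiment), the 80/20 stratified split, the seed, the 5-fold cross-validation, and the grid values are all assumptions chosen for the example; the assignment does not prescribe them.

```python
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["pclass", "sex", "age", "sibsp", "fare"]
TARGET = "survived"


def preprocess(df):
    """Return (X, y) with missing ages filled and sex one-hot encoded."""
    X = df[FEATURES].copy()
    # Median fill as suggested in the assignment. Computing the median on the
    # full data before splitting is a simplification; a stricter setup would
    # compute it on the training split only.
    X["age"] = X["age"].fillna(X["age"].median())
    # drop_first keeps a single 0/1 column (sex_male) instead of two redundant ones.
    X = pd.get_dummies(X, columns=["sex"], drop_first=True, dtype=int)
    return X, df[TARGET]


def evaluate(model, X_test, y_test):
    """Compute the four metrics named in the assignment."""
    pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    }


def run_experiment(df, seed=42):
    """Train baseline and tuned trees; return test metrics and the best parameters."""
    X, y = preprocess(df)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )

    # Baseline: default hyperparameters (seed fixed only for reproducibility).
    baseline = DecisionTreeClassifier(random_state=seed).fit(X_train, y_train)

    # Illustrative grid; these ranges are assumptions, not prescribed values.
    param_grid = {
        "max_depth": [3, 5, 7, None],
        "min_samples_split": [2, 10, 20],
        "min_samples_leaf": [1, 5, 10],
    }
    search = GridSearchCV(
        DecisionTreeClassifier(random_state=seed), param_grid, cv=5, scoring="accuracy"
    )
    search.fit(X_train, y_train)  # cross-validation uses the training split only

    return {
        "baseline": evaluate(baseline, X_test, y_test),
        "tuned": evaluate(search.best_estimator_, X_test, y_test),
        "best_params": search.best_params_,
    }
```

With the titanic DataFrame loaded as shown in the Dataset step, run_experiment(titanic) returns the baseline metrics, the tuned metrics, and the selected hyperparameters side by side, which is the comparison the written summary asks for. The tuned tree is not guaranteed to beat the baseline on the test set, so the numbers should be reported as observed.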
