Statistical Learning I

Finish all questions.

# STA142A Homework 5 – Lasso & Ridge

### Dr. Lingfei Cui

Please complete the missing parts of this notebook. ~~Problem 1 will not be graded. However, you are strongly encouraged to complete it, as these theoretical exercises will deepen your understanding of the underlying methods.~~

- If the answer cell is a markdown cell, you are expected to type in your answer without any Python code.
- If the answer cell is a code cell, you are expected to write Python code to answer the problem.

If the problem involves mathematical derivations, you need to write LaTeX code to answer it. For help, see https://stackoverflow.com/questions/13208286/how-to-write-latex-in-ipython-notebook

#### Submission Instructions
You can finish this homework either locally or on Colab. After you have finished all the problems:

**Local Clean Re-run**
- Click `Kernel`->`Restart & Run All` to get a clean output.
- Click `File`->`Download as`->`PDF` to get a PDF copy of your finished homework.

**Colab Clean Re-run**

- Click `Runtime`->`Restart Session and Run All` to get a clean output.
- Click `File`->`Print`->`save as PDF` to get a PDF copy of your finished homework.

**Upload to Canvas**

- Submit the PDF file online before the deadline.

### Problem 1 (Lasso, Ridge and Their Bayesian Connection)

We will now derive the Bayesian connection to the lasso and ridge regression.

(a) Suppose that $y_i=\beta_0+\sum_{j=1}^p x_{ij} \beta_j+\epsilon_i$, where $\epsilon_1, \ldots, \epsilon_n$ are independent and identically distributed from a $N\left(0, \sigma^2\right)$ distribution. Write out the likelihood for the data.

Answer:

(b) Assume the following prior for $\beta$: $\beta_1, \ldots, \beta_p$ are independent and identically distributed according to a double-exponential distribution with mean 0 and common scale parameter $b$, i.e., $p(\beta)=\frac{1}{2b} \exp(-|\beta| / b)$. Write out the posterior for $\beta$ in this setting.

Answer:

(c) Argue that the lasso estimate is the mode for $\beta$ under this posterior distribution.

Answer:

(d) Now assume the following prior for $\beta$: $\beta_1, \ldots, \beta_p$ are independent and identically distributed according to a normal distribution with mean zero and variance $c$. Write out the posterior for $\beta$ in this setting.

Answer:

(e) Argue that the ridge regression estimate is both the mode and the mean for $\beta$ under this posterior distribution.

Answer:

### Problem 2 (Lasso, Ridge and OLS)

In this exercise, we will predict the number of applications received using the other variables in the `College` data set.

(a) Split the data set into a training set and a test set.

[Code cell]

(b) Fit a linear model using least squares on the training set, and report the test error obtained.

[Code cell]

(c) Fit a ridge regression model on the training set, with $\lambda$ chosen by cross-validation. Report the test error obtained.

[Code cell]

(d) Fit a lasso model on the training set, with $\lambda$ chosen by cross-validation. Report the test error obtained, along with the number of non-zero coefficient estimates.

[Code cell]

(e) Comment on the results obtained. How accurately can we predict the number of college applications received? Is there much difference among the test errors resulting from these three approaches?

[Code cell]
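The workflow that Problem 2 (a) through (d) asks for can be sketched as follows. This is a minimal illustration, not a reference solution. It assumes scikit-learn is available and uses a synthetic data set as a stand-in for `College`, with arbitrary sizes and coefficients, because how the real data is loaded is not specified here. The penalty grid `alphas`, the 70/30 split, and the 5-fold cross-validation are also assumptions. Note that scikit-learn calls the penalty $\lambda$ `alpha`.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression, RidgeCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the College data (assumption): 17 predictors,
# of which only the first 5 truly affect the response.
rng = np.random.default_rng(0)
n, p = 777, 17
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, 1.0, -0.5]
y = 10.0 + X @ beta + rng.normal(scale=1.0, size=n)

# (a) Hold out 30% of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# (b)-(d) Standardize inside each pipeline so the penalty treats all
# predictors on the same scale; the scaler is fit on training data only.
alphas = np.logspace(-3, 3, 50)  # hypothetical grid of penalty values
models = {
    "ols": make_pipeline(StandardScaler(), LinearRegression()),
    "ridge": make_pipeline(StandardScaler(), RidgeCV(alphas=alphas, cv=5)),
    "lasso": make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0)),
}

test_mse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_mse[name] = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: test MSE = {test_mse[name]:.3f}")

# (d) Count the lasso coefficients that were not shrunk exactly to zero.
n_nonzero = int(np.count_nonzero(models["lasso"][-1].coef_))
print("non-zero lasso coefficients:", n_nonzero)
```

With the real data, `X` and `y` would be replaced by the predictors and the applications column of `College`, with any categorical column encoded numerically first. Comparing the entries of `test_mse` is one way to approach part (e).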

