WWCC Project Data

What determines a diamond's price? Do larger diamonds cost more? Does color affect price?

For this assignment, you will explore the various characteristics of diamonds to uncover how certain features relate to the price. You will be using the Diamonds data set with the following variables:

Price: price in U.S. dollars

Carat: weight of the diamond

Cut: quality of the cut (Fair, Good, Very Good, Premium, Ideal)

Color: from J (worst) to D (best)

  • Using StatCrunch, explore the quantitative variables carat and price. Calculate summary statistics and construct a histogram for each variable. Use your findings to describe each distribution (carat, price), making sure you include information about the shape, center, and spread, as well as any outliers that may be present. Be sure to consider what the shape suggests in the context of the variable. For both price and carat, decide whether the mean or the median is the better measure of center, and explain why.
  • Using StatCrunch, find the proportion of diamonds of each color in this data set. Which color is the most common, and which is the least common?
  • Using StatCrunch, construct side-by-side boxplots for price across levels of cut. Compare shape, center, and spread across the groups. Identify which group has the highest variability and whether there are notable differences in medians or outliers. Interpret your findings in context.
  • Repeat the process above for price across levels of color.
  • Summarize and reflect on your findings from this analysis. Include things like the typical size and cost of a diamond and make comparisons between groups, noting any relationships or patterns that emerged. What did you find most interesting or surprising?
  • Based on your analysis, what insights would guide your decision if you were shopping for a diamond? What factors would you prioritize, and why?
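
StatCrunch reports every statistic asked for above, so no code is required for the assignment. For anyone who wants to see what those numbers mean, the steps can be sketched in Python using only the standard library. The function names (`describe`, `proportions`, `describe_by_group`) and the sample values are hypothetical and are not taken from the Diamonds data set. The sketch flags outliers with the common 1.5 × IQR rule, and it assumes the "inclusive" quartile method; StatCrunch may compute quartiles differently, so its values can differ slightly.

```python
from collections import Counter
from statistics import mean, median, quantiles


def describe(values):
    """Center, spread, and 1.5 * IQR outliers for a list of numbers."""
    # Assumption: "inclusive" quartiles; other tools may use another method.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


def proportions(labels):
    """Share of observations in each category (for example, each color)."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


def describe_by_group(values, groups):
    """Apply describe() within each group (for example, price within each cut)."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    return {group: describe(vals) for group, vals in by_group.items()}


# Hypothetical sample values, for illustration only (not real diamond data).
sample_prices = [300, 400, 500, 600, 5000]
summary = describe(sample_prices)
print(summary["mean"], summary["median"], summary["outliers"])
```

In this made-up sample, the single large value pulls the mean (1360) well above the median (500) and is flagged as an outlier. That pattern suggests a right-skewed distribution, which is when the median is usually the better measure of center.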

Your final submission should be typed in a Word document and include the following components presented in paragraph form. StatCrunch graphs and tables that support your findings should be cut and pasted into the document.

  • Describe the distributions of both weight (carat) and price of the diamonds in the set, including shape, center, spread, and any extreme values. Discuss the shape in the context of the variable, state whether the mean or the median is the better measure of center for each variable, and explain how you made that determination.
  • Give the most and least common colors and respective proportions.
  • Provide the side-by-side boxplots for price across levels of cut and report your findings. Be sure to interpret them in context, using plain language.
  • Provide the side-by-side boxplots for price across levels of color and report your findings. Be sure to interpret them in context, using plain language.
  • Summary and reflection: Include things like the typical size and cost of a diamond and make comparisons between groups, noting any relationships or patterns that emerged. What did you find most interesting or surprising?
  • Analysis: What insights would guide your decision if you were shopping for a diamond? What factors would you prioritize, and why?

You would need my StatCrunch login to complete the assignment.

WRITE MY PAPER