Visualizing and interpreting data – answers questions; creat…

You will use the skills you learned in weeks 1 and 2 to evaluate a set of statistics. But keep this in mind: learning statistics is more than learning techniques and tools. It is also about learning how to think critically. That is the essential skill to develop in this course. Associated with critical thinking is the need to present reasoning that explains how you arrived at the conclusions you drew, and to provide evidence that supports your reasoning and conclusions! You must practice this in all of your posts and responses for every assignment. Assertions or conclusions that you cannot support with evidence and valid reasoning will, sadly, result in a loss of points.

A) Determine Descriptive Statistics of Sample Dataset

Physical variables can be correlated with each other. An Excel spreadsheet is attached that includes 30 data sets illustrating the height and weight of a population segment. The first column is the height in inches and the second is the weight in pounds. Height and weight are the two variables in each data set, and each variable has a number of observations (values) associated with it. Choose one data set that has not already been selected (include the number of the data set in your title to make this easier to do; see the Notes section for an example).

i) First provide the definitions of:

  • Mean
  • Median
  • Standard deviation

ii) Next, calculate the values of the mean, median, and standard deviation of a) the heights and b) the weights in the data set. You can do this by hand, or watch the video linked at the end, which shows how to do it using the tools in Excel. Then discuss with each other the questions posed later in this assignment.
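
If you would like to check your hand or Excel results, the calculation in step ii can be sketched in a few lines of Python. This is only an illustration: the heights and weights below are made-up values, not taken from the attached spreadsheet, and the names `heights_in`, `weights_lb`, and `describe` are placeholders. The sketch assumes the sample standard deviation (dividing by n - 1), which is what Excel's STDEV.S function computes; if you treat your data set as a whole population, use `statistics.pstdev` (Excel's STDEV.P) instead.

```python
import statistics

# Hypothetical values for illustration only; NOT from the attached spreadsheet.
heights_in = [62, 65, 67, 68, 70, 72]        # heights in inches
weights_lb = [120, 140, 150, 155, 170, 190]  # weights in pounds


def describe(values):
    """Return the mean, median, and sample standard deviation of a list."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation (n - 1 in the denominator).
        "stdev": statistics.stdev(values),
    }


height_stats = describe(heights_in)
weight_stats = describe(weights_lb)

print("Heights:", height_stats)
print("Weights:", weight_stats)
```

Replace the two lists with the columns of the data set you chose, and state in your post which version of the standard deviation you used.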

iii) Do the values of the mean, median, and standard deviation tell you anything useful about a dataset? If so, what is that? If you wanted to convey to someone a sense of what the data means, which statistic would you use (mean, median, or standard deviation)? Explain the reasons for your choice.

B) Create Plot of Data

Now, construct a scatter plot. You will use this plot to discuss some of the questions in Part C.

  • Select the Height and Weight columns in a dataset.
  • From the Insert menu, select the “Scatter” plot (the option with data points only and no connecting lines).
  • Right-click on one of the data points in the chart. From the pop-up menu, click on “Add Trendline”. This brings up another menu. A Linear trendline is the default option, but you can pick another option as well.
  • Try fitting several of the options. Do you see much of a difference between the different trend lines? Which one do you think gives you a visually better fit? Which one would you choose to use for this plot? Explain your reasoning.
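
To see what a linear trendline represents, it may help to compute one yourself. The sketch below fits a straight line by ordinary least squares, which is the method Excel uses for its Linear trendline. The function name `linear_trendline` and the data values are illustrative assumptions, not part of the assignment materials, and the other trendline options (polynomial, exponential, and so on) are not covered here.

```python
def linear_trendline(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values for illustration only; NOT from the attached spreadsheet.
heights_in = [62, 65, 67, 68, 70, 72]
weights_lb = [120, 140, 150, 155, 170, 190]

slope, intercept = linear_trendline(heights_in, weights_lb)
print(f"weight = {slope:.2f} * height + {intercept:.2f}")
```

With your own data, the slope and intercept should match the equation Excel shows if you tick “Display Equation on chart” in the trendline options. A close match between the line and the points is still a visual judgment that you need to explain in your post.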

C) Analyze and Discuss

Discuss any two of the numbered items below in your main post. Write the response for each item in a separate paragraph and provide the number of the item to which you are responding.

Note: You must provide logical reasoning for the inferences that you make. If you make an assertion or claim, you must provide reasoning and links to published, reliable evidence to support your assertion/claim.

  1. Observe the scatter plot you built. Does there appear to be a correlation between the height and weight of the subjects? Is it a strong or weak correlation? What reasoning did you use to decide if the correlation is strong or weak?
  2. Right-click on the numbers lining the y-axis. From the menu that pops up, select “Format Axis”. Change the upper and lower bounds for the y-axis. What does this do to the shape of the scatter plot? Does it change your interpretation of the strength of the correlation between the variables? Explain your reasoning. Paste your charts directly in your post.
  3. Explain what correlation is and what causation is. Are the variables in the provided dataset correlated or causally connected? Explain your reasoning.
  4. In creating scatter plots, there is an interpretation bias that gets built in. Identify the bias and discuss the difficulty it causes. How would you avoid this bias?
  5. Can you infer anything about the nature of the subject population from the data provided? Explain your reasoning and provide references to reliable published evidence that supports your reasoning.
  6. If you were told that the variables in the dataset were the speed of a car (in miles per hour) and the systolic blood pressure of the driver (in mm of mercury), which variable would you plot on which axis? What conclusion would you draw?
  7. In light of the above questions, what do you think of the notion of “data-driven decision making”? Is data (numbers) sufficient by itself to draw conclusions, i.e. to draw conclusions without knowing what physical things are represented by the data?
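
For items 1 and 2, one common way to put a number on the strength of a linear relationship is the Pearson correlation coefficient r, which ranges from -1 to 1. The sketch below is a minimal illustration under stated assumptions: the function name `pearson_r` and the data are made up, and r is only one possible measure that captures linear association only. The last lines rescale the weights to show that r depends on the data, not on the units or on how the axes are drawn.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only; NOT from the attached spreadsheet.
heights_in = [62, 65, 67, 68, 70, 72]
weights_lb = [120, 140, 150, 155, 170, 190]

r = pearson_r(heights_in, weights_lb)

# Converting pounds to kilograms rescales the y-values but leaves r unchanged.
weights_kg = [w * 0.45359237 for w in weights_lb]
r_rescaled = pearson_r(heights_in, weights_kg)

print(f"r = {r:.3f}, r after rescaling = {r_rescaled:.3f}")
```

Whatever value you obtain for your data set, you still need to explain in your post why you consider it strong or weak, and remember that a high r by itself says nothing about causation.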

D) Self-Reflection and Responses

Self-reflection: Describe what you have learned from this exercise about using descriptive statistics and data visualization to understand the information contained in a dataset, and about communicating that information to others.

Response Posts: Respond to at least two of your classmates by elaborating upon or critiquing the points they make.

Notes

If you have a different version of Excel, you may need to search for how to add a trendline. These instructions work with Office 365 and Office 2010.

  • Present your work neatly so that it is easy to read. Separate text into paragraphs. Highlight/bold important points or numbers. Attach your Excel worksheet with your scatter plot and paste/insert plots directly into your post. (5 points are associated with the presentation of your posts.)
  • If you make an assertion or claim, provide supporting evidence (citation to published work) and explain your reasoning. (This is very important. Up to 15 points are associated with this).

Requirements: Questions answered 250 words +
