Descriptive Statistics and Data Visualization

INSTRUCTIONS

  • The purpose of this assignment is to familiarize you with the foundations of data mining by using descriptive statistics and data visualization in R.
  • In this assignment, we will use R functions to get hands-on experience calculating correlations with graphical and numerical methods.
  • We will create heatmap correlation plots to observe the distribution of correlations among the variables, and calculate descriptive statistics for those correlation coefficients.
  • Please refer to Chapter 3 and the R code for creating Figures 3 and 7 in Data Mining for Business Analytics: Concepts, Techniques, and Applications in R (in this week’s Reading & Resources). You may also use the open resources on the publisher's website to recreate the figures in Chapter 3 and become familiar with its contents.
  • Then use the attached dataset NewYorkHousing.csv to answer the following questions. Please also open the second attached file for the sample R code, which you can easily revise to generate Figures 3.1 to 3.4 and 3.5 to 3.8 required in this assignment:
  • Create a heatmap with values (running the sample R code will produce it).
  • Calculate the minimum, maximum, median, and standard deviation of ALL the correlations, excluding the correlations equal to 1 in the diagonal cells of the heatmap. (Hints: use R functions instead of reading the values off the heatmap visually. summary(cor.mat) reports the min, max, and median for each column, and sd() gives the standard deviation.)
  • Create a scatterplot matrix (hint: use ggpairs in R) of MEDV with these predictors: INDUS, CHAS, NOX, RM, AGE, DIS, and TAX. State which predictor has the strongest correlation with MEDV.
  • Please copy and paste screen images of your work in R into a Word document for submission. Be sure to provide a narrative with your answers (i.e., do not just copy and paste your output without explaining what you did or what you found).
  • Please make sure you run install.packages("????") before you call library(????); otherwise you will get errors.
  • Please include an Introduction, R code with outputs, and Figures with explanations, along with cover and reference pages. A good conclusion to wrap up the assignment is also expected.
  • Please follow APA format.
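
The numerical steps above (statistics of the off-diagonal correlations, and the predictor most strongly correlated with MEDV) might be sketched as follows. This is only an illustration under stated assumptions, not the sample code from the attached file: the data frame below is random stand-in data, not the real NewYorkHousing.csv, and the names housing.df, off.diag, cor.stats, medv.cor, and strongest are made up for this sketch. It uses base R only, so no install.packages() call is needed to run it.

```r
# Stand-in data: random numbers, NOT the real NewYorkHousing.csv.
# With the real file, replace this block with something like:
#   housing.df <- read.csv("NewYorkHousing.csv")
set.seed(1)
n <- 100
rm.col <- rnorm(n, mean = 6, sd = 0.7)
housing.df <- data.frame(
  INDUS = runif(n, 0, 28),
  CHAS  = rbinom(n, 1, 0.3),
  NOX   = runif(n, 0.4, 0.9),
  RM    = rm.col,
  AGE   = runif(n, 0, 100),
  DIS   = runif(n, 1, 12),
  TAX   = runif(n, 180, 720),
  MEDV  = 9 * rm.col - 30 + rnorm(n, sd = 3)  # tied to RM on purpose
)

# Correlation matrix of all numeric columns.
cor.mat <- cor(housing.df)

# upper.tri() selects the cells above the diagonal, so the 1s on the
# diagonal are dropped and each pair of variables is counted once.
off.diag <- cor.mat[upper.tri(cor.mat)]

# Descriptive statistics over the off-diagonal correlations.
cor.stats <- c(
  min    = min(off.diag),
  max    = max(off.diag),
  median = median(off.diag),
  sd     = sd(off.diag)
)
print(round(cor.stats, 3))

# Correlation of each listed predictor with MEDV, ranked by absolute value
# so that strong negative correlations are not overlooked.
predictors <- c("INDUS", "CHAS", "NOX", "RM", "AGE", "DIS", "TAX")
medv.cor <- cor.mat[predictors, "MEDV"]
print(round(sort(abs(medv.cor), decreasing = TRUE), 3))

strongest <- names(which.max(abs(medv.cor)))
print(strongest)
```

Two design choices are worth noting. Taking only the upper triangle counts each pair once; using both triangles would leave the minimum, maximum, and median unchanged but shift the standard deviation slightly, because every value would appear twice. The ranking uses absolute values because "strongest" refers to magnitude, not sign. For the scatterplot matrix itself, a call such as GGally::ggpairs(housing.df[, c("MEDV", predictors)]) should work once the GGally package is installed.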

References:

The R Guide (http://cran.fhcrc.org/doc/contrib/Owen-TheRGuide.pdf)
