Download The R Studio Tidyverse And Associated Packages

Question

Use the R package tidyverse (see chapter 4 of Introduction to Data Science: Data Analysis and Prediction Algorithms with R by Rafael A. Irizarry). You need to go through chapter 4 before attempting the following questions. Using dplyr functions (i.e., filter, mutate, select, summarise, group_by, etc.) and the "murders" dataset (available in the dslabs R package), write appropriate R syntax to answer the following:

a. Calculate the regional total murders excluding OH, AL, and AZ (Hint: filter(!abb %in% x), where x is the exclusion vector)
b. Display the regional population and regional murder numbers.
c. How many states are there in each region? (Hint: n())
d. What is Ohio's murder rank in the North Central region? (Hint: use rank(), row_number())
e. How many states have a murder number greater than their regional average? (Hint: nrow())
f. Display the 2 least populated states in each region (Hint: slice_min())

Use the pipe operator %>% for all the queries. Show all the output results.

Dr. Jack HW Helper · Accepted Answer

Introduction

The tidyverse collection of R packages offers an efficient way to analyze datasets. The murders dataset from the dslabs package holds state-level population and murder counts for the United States, with each state assigned to a region. This answer shows how to use dplyr functions to filter, summarize, and count that data by region and state.

Data Context and Preparation

The murders dataset contains the state name (state), state abbreviation (abb), region (region), population (population), and murder count (total). Before running the analysis, install and load dplyr (part of the tidyverse) and dslabs, which ships the murders dataset. The steps below follow the data manipulation techniques in chapter 4 of Irizarry's text.

Analysis Tasks and R Syntax

a. Calculating regional total murders excluding certain states

Use filter() to drop Ohio (OH), Alabama (AL), and Arizona (AZ) by abbreviation, then sum the murder counts within each region:

    library(dplyr)
    library(dslabs)

    data(murders)

    # States to exclude, matched against the abb column
    exclusion_states <- c("OH", "AL", "AZ")

    total_murders_excluding <- murders %>%
      filter(!abb %in% exclusion_states) %>%
      group_by(region) %>%
      summarise(total_murder = sum(total))

    print(total_murders_excluding)

b. Displaying regional population and murder numbers

Summarize the total population and total murders for each region:

    regional_stats <- murders %>%
      group_by(region) %>%
      summarise(total_population = sum(population),
                total_murder = sum(total))

    print(regional_stats)

c. Counting states per region

Use n() inside summarise() after group_by() to count the rows (states) in each region:

    states_per_region <- murders %>%
      group_by(region) %>%
      summarise(state_count = n())

    print(states_per_region)



d. Ohio's murder rank in the North Central Region
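The original answer gives no code for this part. One possible sketch follows, assuming "murder rank" means rank by total murder count with 1 as the highest count, and that the region is labelled "North Central" in the data. The names north_central_ranks and ohio_rank are illustrative, not from the source.

```r
library(dplyr)
library(dslabs)

data(murders)

# Rank the North Central states by murder count (1 = most murders).
# Assumption: ranking uses the raw count in `total`, not a per-capita rate.
north_central_ranks <- murders %>%
  filter(region == "North Central") %>%
  mutate(murder_rank = rank(desc(total), ties.method = "min")) %>%
  arrange(murder_rank) %>%
  select(state, abb, total, murder_rank)

# Keep only Ohio's row to read off its rank.
ohio_rank <- north_central_ranks %>%
  filter(abb == "OH")

print(north_central_ranks)
print(ohio_rank)
```

Replacing rank(desc(total), ties.method = "min") with row_number(desc(total)) would also work; the two differ only in how tied counts are numbered.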

e. Counting states with murder numbers exceeding regional averages
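No code for this part survives in the original answer. A minimal sketch, assuming the regional average is the plain mean of the total column within each region, might look like this. The names regional_avg and above_regional_avg are illustrative.

```r
library(dplyr)
library(dslabs)

data(murders)

# Attach each region's mean murder count to its states,
# then keep the states whose count exceeds that mean.
above_regional_avg <- murders %>%
  group_by(region) %>%
  mutate(regional_avg = mean(total)) %>%
  ungroup() %>%
  filter(total > regional_avg)

# Show the qualifying states.
above_regional_avg %>%
  select(state, region, total, regional_avg) %>%
  print()

# Count them, as the nrow() hint suggests.
above_regional_avg %>%
  nrow() %>%
  print()
```

Using mutate() instead of summarise() keeps one row per state, so each state can be compared against its own region's average in a single pipeline.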

f. Displaying two least populated states in each region
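The original answer is cut off before this part. The following sketch uses slice_min() as the question hints, assuming dplyr 1.0.0 or later (where slice_min() is available). The name least_populated is illustrative.

```r
library(dplyr)
library(dslabs)

data(murders)

# Within each region, keep the two rows with the smallest population.
# with_ties = FALSE guarantees exactly two rows per region.
least_populated <- murders %>%
  group_by(region) %>%
  slice_min(order_by = population, n = 2, with_ties = FALSE) %>%
  ungroup() %>%
  select(region, state, population)

print(least_populated)
```

With the default with_ties = TRUE, a region could return more than two rows if several states shared the same population.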
