Introduction To Data Analytics 1

The core assignment prompts are as follows:

Part 1: Variables, Hypothesis, Designs

  • Answer questions about an abstract discussing offshore outsourcing's impact on the U.S. economy, focusing on hypotheses, variables, research design, measurement error, validity, causality, and data collection.

Part 2: Data Analysis

  • Use the provided dataset (03_lab.csv) to perform data analysis tasks: create frequency tables and histograms, analyze distribution characteristics, calculate z-scores, and identify extreme scores and specific business IDs based on cost savings.

Sample Paper for the Above Instructions

Introduction

Offshore outsourcing has become a pivotal strategy for many businesses seeking to reduce operational costs while maintaining or enhancing service quality. The practice involves relocating certain business functions, notably information technology (IT) and customer service roles, to low-wage countries like India and China. This paper examines the multifaceted implications of offshore outsourcing, particularly its economic, employment, and security impacts on the United States, highlighting potential research hypotheses, variables, and methodological considerations for studying this phenomenon.

Potential Hypotheses and Variables in Offshore Outsourcing Research

One viable hypothesis in this context is: "Offshore outsourcing leads to a significant reduction in costs for U.S. companies." This hypothesis posits that companies that outsource a higher percentage of their call center jobs will experience greater cost savings. In this framework, one independent variable is the percentage of jobs outsourced (Jobs). This variable is quantitative and continuous, allowing for the measurement of outsourcing extent across different firms.

Correspondingly, the dependent variable could be the amount of cost savings (Cost or Cost2). These are continuous variables representing the financial outcomes of outsourcing. The independent variable (Jobs) may itself be influenced by factors such as company size, industry type, or market pressure, while the dependent variable (Cost or Cost2) measures the financial benefit achieved.

Measurement errors might arise from inaccuracies in data collection, such as underreporting of outsourced jobs or imprecise calculation of cost savings. Variability in how companies report costs and the difficulty of isolating the effect of outsourcing from other influences on cost also contribute to measurement error. If data collection relies on self-reporting, biases or misreporting may further distort results.

Research Design and Validity

This study employs a correlational research design, as it examines associations between outsourcing extent (independent variable) and cost savings (dependent variable). Such a design is appropriate when manipulating variables is impractical or unethical, but it limits the ability to make causal inferences.
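As a minimal sketch of what a correlational analysis looks like in practice, the R code below tests the association between two variables with cor.test(). The vectors jobs and cost are simulated, hypothetical stand-ins for the Jobs and Cost columns; they are not values from 03_lab.csv, and the positive relationship is built into the simulation.

```r
# Simulated, hypothetical data standing in for Jobs and Cost
set.seed(7)
jobs <- runif(100, min = 0, max = 100)      # percent of jobs outsourced
cost <- 2 * jobs + rnorm(100, sd = 40)      # cost savings with random noise

# Pearson correlation test (the default method of cor.test)
result <- cor.test(jobs, cost)

print(result$estimate)   # sample correlation coefficient r
print(result$p.value)    # p-value for H0: true correlation is 0
```

Even a strong, statistically significant correlation here would describe an association only; it would not show that outsourcing causes the savings.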

To measure the reliability of the dependent variable (cost savings), one might assess consistency over multiple measurements or across different data sources. For example, repeating the cost calculations using alternative methods or data subsets can verify stability.

Ecological validity—the extent to which findings apply in real-world settings—appears high, given the naturalistic data collection from real companies' cost reports. However, the dataset may not capture all contextual factors affecting outsourcing outcomes, limiting generalizability.

Regarding causality, this observational study cannot definitively establish cause-and-effect relationships because it lacks experimental manipulation and control over extraneous variables. While correlations may be observed, establishing causality would require experimental or longitudinal designs controlling for confounders.

The data collection method involves analyzing existing datasets, likely through archival analysis of company reports. This secondary data analysis approach precludes direct experimental intervention but allows for extensive observational analysis.

Part 2: Data Analysis Tasks

Utilizing the dataset (03_lab.csv), the following analyses are performed:

1) Frequency table of the percent of outsourced jobs

In R, a frequency table is created by loading the data and applying the table() function to the Jobs variable. For example:

dataset <- read.csv("03_lab.csv")  # load the lab data into a data frame
table(dataset$Jobs)
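Counts are often easier to interpret alongside percentages. The sketch below adds relative frequencies with prop.table() and makes missing values visible. The small data frame is a hypothetical stand-in for 03_lab.csv, used only so the example runs on its own.

```r
# Hypothetical stand-in for the lab data; the real file would be loaded with
# dataset <- read.csv("03_lab.csv")
dataset <- data.frame(Jobs = c(10, 20, 20, 30, 30, 30, NA))

# Absolute counts; useNA = "ifany" adds a category for missing values
freq <- table(dataset$Jobs, useNA = "ifany")

# Relative frequencies as percentages of all rows
rel_freq <- round(100 * prop.table(freq), 1)

print(freq)
print(rel_freq)
```

If Jobs takes many distinct values, grouping it into intervals with cut() before calling table() may give a more readable table.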

2) Histograms of cost savings

Histograms are generated with specified breaks to examine distribution shapes:

hist(dataset$Cost, breaks=15, main="Histogram of Cost Savings 1", xlab="Cost Savings 1")
hist(dataset$Cost2, breaks=15, main="Histogram of Cost Savings 2", xlab="Cost Savings 2")

3) Analysis of histograms

  • a) The cost savings data that appears most normal is likely the one with a symmetric, bell-shaped distribution.
  • b) The multimodal dataset shows multiple peaks, indicating heterogeneity or clusters within the data.
  • c) The most skewed dataset exhibits a long tail in one direction, either positive or negative skewness.
  • d) The kurtotic dataset demonstrates heavy tails or peakedness compared to a normal distribution.
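Visual judgments about skew and kurtosis can be cross-checked numerically. The sketch below defines moment-based skewness and excess kurtosis in base R; the helper names skewness and excess_kurtosis are introduced here for illustration, and the two vectors are simulated rather than taken from 03_lab.csv. Packages such as moments or e1071 provide similar functions, possibly with different small-sample corrections.

```r
# Moment-based shape statistics (population-moment versions, no correction)
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3   # 0 for a normal distribution
}

# Simulated, hypothetical data; not values from 03_lab.csv
set.seed(42)
symmetric <- rnorm(1000)     # roughly bell-shaped
right_skewed <- rexp(1000)   # long right tail

shape <- data.frame(
  variable = c("symmetric", "right_skewed"),
  skewness = c(skewness(symmetric), skewness(right_skewed)),
  excess_kurtosis = c(excess_kurtosis(symmetric), excess_kurtosis(right_skewed))
)
print(shape)
```

Skewness near zero suggests symmetry, a positive value a longer right tail, and a negative value a longer left tail; positive excess kurtosis suggests heavier tails than a normal distribution.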

4) Calculate Z-scores for each cost savings

Z-scores are obtained by:

dataset$z_Cost <- (dataset$Cost - mean(dataset$Cost)) / sd(dataset$Cost)
dataset$z_Cost2 <- (dataset$Cost2 - mean(dataset$Cost2)) / sd(dataset$Cost2)
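The manual formula can be checked against R's built-in scale() function, which centers and scales a vector the same way. The values below are hypothetical and serve only to make the sketch self-contained.

```r
# Hypothetical cost savings; not values from 03_lab.csv
dataset <- data.frame(Cost = c(120, 95, 143, 88, 110))

# Manual z-scores: subtract the mean, divide by the sample standard deviation
z_manual <- (dataset$Cost - mean(dataset$Cost)) / sd(dataset$Cost)

# scale() returns a one-column matrix, so convert it back to a plain vector
z_scale <- as.vector(scale(dataset$Cost))

# Both approaches should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(z_manual, z_scale)))

# By construction, z-scores have mean 0 and standard deviation 1
print(round(mean(z_manual), 10))
print(sd(z_manual))
```

If a column contains missing values, mean() and sd() return NA unless na.rm = TRUE is supplied.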

5) Count scores more extreme than 95%

This involves counting z-scores with |z| > 1.96, which corresponds to p < .05 (two-tailed) under a normal distribution:

sum(abs(dataset$z_Cost) > 1.96)
sum(abs(dataset$z_Cost2) > 1.96)
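The 1.96 cutoff need not be memorized; it can be derived from the normal quantile function. The sketch below computes it with qnorm() and reports both the count and the proportion of extreme scores. The z-scores are simulated, hypothetical values rather than results from 03_lab.csv.

```r
# Two-tailed cutoff leaving 2.5% in each tail of a standard normal
cutoff <- qnorm(0.975)   # about 1.96

# Simulated, hypothetical z-scores
set.seed(1)
z <- as.vector(scale(rnorm(500)))

n_extreme <- sum(abs(z) > cutoff)       # number of extreme scores
prop_extreme <- mean(abs(z) > cutoff)   # proportion of extreme scores

print(cutoff)
print(n_extreme)
print(prop_extreme)
```

For approximately normal data the proportion should fall near 5%; a markedly different share is one more hint that a variable departs from normality.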

6) Identify businesses with highest and lowest cost savings and z-scores

Finding IDs with maximum and minimum values:

max_cost_id <- dataset$ID[which.max(dataset$Cost)]
min_cost_id <- dataset$ID[which.min(dataset$Cost)]
max_z_id <- dataset$ID[which.max(dataset$z_Cost)]
min_z_id <- dataset$ID[which.min(dataset$z_Cost)]
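Two details are worth noting. First, which.max() and which.min() return only the first matching position, so tied businesses would be missed. Second, because a z-score is an increasing linear transformation of the raw score, the business with the highest cost savings on a variable also has the highest z-score on that variable. The sketch below handles ties explicitly; the IDs and values are hypothetical.

```r
# Hypothetical data with a tie for the highest cost savings
dataset <- data.frame(
  ID = c("A", "B", "C", "D"),
  Cost = c(50, 90, 90, 20),
  stringsAsFactors = FALSE
)

# Return every ID that shares the maximum or minimum value
max_ids <- dataset$ID[which(dataset$Cost == max(dataset$Cost, na.rm = TRUE))]
min_ids <- dataset$ID[which(dataset$Cost == min(dataset$Cost, na.rm = TRUE))]

print(max_ids)
print(min_ids)
```

The same pattern can be repeated for Cost2 to cover both cost savings variables.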

In conclusion, analyzing offshore outsourcing’s impacts, both theoretically and through dataset analysis, provides valuable insights into economic costs, distribution characteristics, and job market effects. Recognizing the limitations in measurement and causal inference guides the interpretation of findings and informs subsequent research efforts.
