Validity and Utility of Selection Procedures: Plot the Data

Determine the proportion of correct decisions that will be made if the company decides to use the test in the future. Assume the company decides to hire anybody whose test score is 5 or above. Assume $5k in sales after three months on the job is the minimum acceptable for satisfactory performance.

Data for the 20 applicants (test score and sales after three months on the job):

Applicant   Test score   Sales after three months
1           2            $4k
2           6            $7k
3           9            $9k
4           5            $5k
5           3            $2k
6           7            $3k
7           1            $4k
8           9            $7k
9           5            $4k
10          4            $4k
11          8            $8k
12          2            $4k
13          4            $5k
14          4            $4k
15          4            $3k
16          6            $6k
17          3            $2k
18          4            $3k
19          7            $6k
20          8            $7k

Plot the data into four quadrants using a test cutoff score of ≥5 (hire if ≥5) and a job-success cutoff of sales ≥ $5k (satisfactory). Then address the following:

  • Identify which quadrants represent true positives, true negatives, false positives, and false negatives. Of the four quadrants depicted in the plot (Q1, Q2, Q3, Q4), which one represents "false positives"? Which one represents "false negatives"?
  • Compare the number of applicants in Q2 and Q3 against the number in Q1 and Q4 and explain what this reveals about the validity of the test in this application.
  • Propose methods to evaluate the reliability of this test and justify them.
  • Explain how you would evaluate the extent to which this test's validity study suffers from restriction of range.
  • Propose additional evidence of validity to gather and how to gather it.
  • Propose how to evaluate the ROI or utility of this test, and explain the impact of larger differences between top and bottom performers on utility.
  • Recommend whether to combine this test with other procedures in the selection process; specify which procedures, their sequencing (first, last), and whether to use a compensatory selection model, with justification. State what evidence you would gather to decide on the merits of a compensatory model.
  • Propose how to evaluate whether the test has adverse impact on females and minority groups.

Paper For Above Instructions

Summary and quadrant mapping

Define axes before interpreting quadrants: X-axis = test score (left = <5, right = ≥5 → hire); Y-axis = job outcome (bottom = sales <$5k = unsatisfactory, top = sales ≥$5k = satisfactory). Using that orientation, quadrants map as follows:

  • Q1 (top-right): Hired and satisfactory → True Positives (TP)
  • Q2 (top-left): Not hired but satisfactory → False Negatives (FN)
  • Q3 (bottom-left): Not hired and unsatisfactory → True Negatives (TN)
  • Q4 (bottom-right): Hired but unsatisfactory → False Positives (FP)

Applying the cutoff rules to the 20 applicants yields counts: TP = 8 (applicants 2, 3, 4, 8, 11, 16, 19, 20), TN = 9 (1, 5, 7, 10, 12, 14, 15, 17, 18), FP = 2 (6, 9), FN = 1 (13). Thus Q4 represents false positives and Q2 represents false negatives in this plot orientation.

Proportion of correct decisions and interpretation

Proportion correct = (TP + TN) / total = (8 + 9) / 20 = 17 / 20 = 0.85 (85%). This high overall hit rate indicates the test, with cutoff ≥5, classifies applicants correctly most of the time in this sample. However, raw accuracy must be contextualized by base rates and costs associated with FP and FN decisions (Schmidt & Hunter, 1998).
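The classification and the hit-rate arithmetic can be reproduced with a short script; the sketch below is a minimal illustration that assumes nothing beyond the cutoffs and the 20 data points given above.

```python
# Minimal sketch: classify each applicant into a quadrant and compute the hit rate.
# Data are the (test score, sales in $k) pairs from the table above.
applicants = {
    1: (2, 4),  2: (6, 7),  3: (9, 9),  4: (5, 5),  5: (3, 2),
    6: (7, 3),  7: (1, 4),  8: (9, 7),  9: (5, 4), 10: (4, 4),
    11: (8, 8), 12: (2, 4), 13: (4, 5), 14: (4, 4), 15: (4, 3),
    16: (6, 6), 17: (3, 2), 18: (4, 3), 19: (7, 6), 20: (8, 7),
}
TEST_CUTOFF, SALES_CUTOFF = 5, 5  # hire if score >= 5; satisfactory if sales >= $5k

counts = {"TP": 0, "FN": 0, "TN": 0, "FP": 0}
for score, sales in applicants.values():
    hired, satisfactory = score >= TEST_CUTOFF, sales >= SALES_CUTOFF
    if hired and satisfactory:
        counts["TP"] += 1      # Q1: hired and satisfactory
    elif not hired and satisfactory:
        counts["FN"] += 1      # Q2: rejected but would have been satisfactory
    elif not hired:
        counts["TN"] += 1      # Q3: rejected and unsatisfactory
    else:
        counts["FP"] += 1      # Q4: hired but unsatisfactory

hit_rate = (counts["TP"] + counts["TN"]) / len(applicants)
print(counts, hit_rate)  # {'TP': 8, 'FN': 1, 'TN': 9, 'FP': 2} 0.85
```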

Comparison of quadrants and implications for validity

Q2 + Q3 (the non-hired group) = 1 + 9 = 10 applicants; Q1 + Q4 (the hired group) = 8 + 2 = 10 applicants, so the cutoff splits this sample evenly (10 hired, 10 not hired). More importantly, errors are few: 3 misclassifications against 17 correct decisions, which suggests reasonable criterion-related validity in this sample. The asymmetry, with more false positives (2) than false negatives (1), points to a slightly higher risk of hiring marginal performers than of rejecting good ones, though the counts are too small to weigh heavily. Validity should also be quantified, for example with a phi coefficient (or correlation) between pass/fail status and success status, and predictive accuracy should be cross-validated in an independent sample (Hunter & Hunter, 1984; Schmidt & Hunter, 1998).
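As an illustration, the phi coefficient can be computed directly from the quadrant counts above; the short sketch below uses only those counts.

```python
from math import sqrt

# Phi coefficient from the 2x2 decision table (TP, TN, FP, FN counts above).
TP, TN, FP, FN = 8, 9, 2, 1
phi = (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
print(round(phi, 2))  # ~0.70: a substantial predictor-criterion association in this small sample
```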

Evaluating reliability

Recommended reliability checks:

  • Internal consistency (Cronbach’s alpha) if the test is multi-item; this provides an estimate of item homogeneity (Nunnally & Bernstein, 1994); a computational sketch appears below.
  • Test–retest reliability (time-separated administration) to evaluate temporal stability if the construct is stable.
  • Inter-rater reliability if scoring involves raters (use ICC or kappa).

Justification: Reliable scores are a prerequisite for validity; measurement error attenuates correlations with performance and reduces selection utility (AERA/APA/NCME Standards, 2014).
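As a concrete illustration of the first option, the sketch below computes Cronbach's alpha from a respondents-by-items score matrix. It assumes item-level responses are available (the data above contain only total test scores), and the example response matrix is purely hypothetical.

```python
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) score matrix."""
    k = item_scores.shape[1]                               # number of items
    item_variances = item_scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = item_scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical item-level responses for 6 respondents on a 4-item test.
responses = np.array([
    [2, 3, 2, 3],
    [4, 5, 4, 4],
    [3, 3, 3, 2],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
    [4, 4, 5, 4],
])
print(round(cronbach_alpha(responses), 2))  # high here because the hypothetical items covary strongly
```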

Assessing restriction of range

Restriction of range occurs when selection truncates predictor or criterion variance (Thorndike, 1949). To evaluate this:

  • Compare the variance of test scores and performance in the applicant sample (before selection) to the variance in the hired sample. Reduced variance among incumbents suggests restriction.
  • Use statistical corrections (e.g., the Thorndike or Schmidt & Hunter corrections) to estimate unrestricted validity (Sackett & Yang, 2000); a computational sketch follows this list.
  • Where feasible, collect criterion data for applicants hired despite low scores, or pool data across multiple hiring contexts, to estimate the predictor–criterion relationship over the full range of scores.
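The correction mentioned in the second bullet can be sketched as follows. This is Thorndike's Case II formula for direct restriction on the predictor, and the numeric inputs in the example are hypothetical rather than values estimated from this dataset.

```python
from math import sqrt

def correct_range_restriction(r_restricted: float, sd_applicants: float, sd_incumbents: float) -> float:
    """Thorndike Case II correction for direct range restriction on the predictor."""
    u = sd_applicants / sd_incumbents  # ratio of unrestricted (applicant) to restricted (incumbent) SD
    return (r_restricted * u) / sqrt(1 - r_restricted ** 2 + (r_restricted * u) ** 2)

# Hypothetical inputs: r = .30 among incumbents, applicant SD = 2.4, incumbent SD = 1.5.
print(round(correct_range_restriction(0.30, 2.4, 1.5), 2))  # ~0.45 after correction
```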

Additional validity evidence to gather

Gather convergent and criterion-related evidence:

  • Criterion-related validity (concurrent and predictive): track hires’ longer-term sales, retention, and supervisor ratings.
  • Content validity: documentation showing the job-relatedness of the test content.
  • Construct validity: factor analysis and relations of the test with other constructs.
  • Incremental validity: whether the test adds prediction beyond the current predictors (AERA/APA/NCME, 2014; Schmidt & Hunter, 1998).

Evaluating ROI / utility

Estimate utility using decision-theoretic models (Taylor-Russell tables or the Schmidt-Hunter utility equations): monetary gain = (mean performance of selected hires − mean performance expected under random hiring) × value per unit of performance × number hired − testing costs (Schmidt & Hunter, 1998). In this dataset, mean sales for hires ≈ $6.2k versus ≈ $4.85k for the applicant pool as a whole (and ≈ $3.5k for non-hires), so test-based selection gains roughly $1.35k per hire over random selection, with a $2.7k difference between hires and non-hires. Multiply the per-hire gain by the number hired and subtract test administration costs to estimate ROI. Larger differences between top and bottom performers increase utility because a larger criterion standard deviation means each unit of improved prediction translates into a larger monetary gain (Hunter & Schmidt, 2004).
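A minimal sketch of this utility arithmetic, using the observed sales figures; the testing-cost figure is a hypothetical assumption included only to show where costs enter the calculation.

```python
# Utility arithmetic: gain of test-based selection over random selection, net of testing costs.
hired_sales = [7, 9, 5, 3, 7, 4, 8, 6, 6, 7]              # sales ($k) of the 10 hires
all_sales = hired_sales + [4, 2, 4, 4, 4, 5, 4, 3, 2, 3]  # all 20 applicants

mean_hired = sum(hired_sales) / len(hired_sales)          # ~6.2
mean_random = sum(all_sales) / len(all_sales)             # ~4.85, expected value under random hiring

cost_per_applicant = 0.05   # hypothetical testing cost: $50 per applicant, expressed in $k
gain = (mean_hired - mean_random) * len(hired_sales) - cost_per_applicant * len(all_sales)
print(round(gain, 2))  # ~12.5 ($k gained over random selection, net of testing costs)
```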

Combining this test with other procedures and sequencing

Recommendation: use a layered approach in which cheap, broad-scope predictors come early (a brief cognitive ability or situational judgment test) and more expensive, high-fidelity predictors (work samples, structured behavioral interviews) come later. Work samples and structured interviews typically show high validity and practical relevance (McDaniel et al., 1994; Hunter & Hunter, 1984). Use a compensatory composite (a weighted linear combination) when predictors contribute unique variance and compensation is theoretically justifiable; use a noncompensatory approach (cutoffs or multiple hurdles) when minimum thresholds on certain dimensions are required (e.g., legal or safety standards). Evaluate whether the test adds incremental validity beyond the other measures before deciding how to weight it in a composite (Aguinis, 2013).
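To make the contrast concrete, the sketch below implements both decision rules; the structured-interview predictor, weights, and cutoffs are hypothetical assumptions, not features of the current selection system.

```python
# Compensatory composite versus multiple-hurdles rule (hypothetical second predictor and weights).
def compensatory_score(test: float, interview: float, w_test: float = 0.6, w_interview: float = 0.4) -> float:
    """Weighted linear composite: a strong interview can offset a weaker test score."""
    return w_test * test + w_interview * interview

def passes_hurdles(test: float, interview: float, test_cut: float = 5, interview_cut: float = 3) -> bool:
    """Noncompensatory rule: every minimum must be met; strengths cannot offset deficits."""
    return test >= test_cut and interview >= interview_cut

print(compensatory_score(4, 9))  # 6.0: below the test cutoff but rescued by a strong interview
print(passes_hurdles(4, 9))      # False: fails the test hurdle regardless of the interview
```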

Evidence to decide on compensatory model

Gather incremental validity statistics (semi-partial correlations or regression ΔR²), cross-validation results, utility simulations with different weighting schemes, and subgroup analyses. If the test adds meaningful unique variance and improves utility, a compensatory composite is warranted; if it primarily duplicates other measures, consider eliminating it or assigning it a lower weight (Schmidt & Hunter, 1998).
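A minimal sketch of an incremental-validity (ΔR²) check using ordinary least squares; the test scores and sales come from the table above, while the interview ratings are randomly generated placeholders standing in for a real second predictor.

```python
import numpy as np

def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """R^2 from an ordinary least-squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    residuals = y - X1 @ beta
    return 1 - residuals.var() / y.var()

test = np.array([2, 6, 9, 5, 3, 7, 1, 9, 5, 4, 8, 2, 4, 4, 4, 6, 3, 4, 7, 8], dtype=float)
sales = np.array([4, 7, 9, 5, 2, 3, 4, 7, 4, 4, 8, 4, 5, 4, 3, 6, 2, 3, 6, 7], dtype=float)
interview = np.random.default_rng(0).integers(1, 6, size=20).astype(float)  # hypothetical placeholder

# Unique variance the test adds beyond the interview alone.
delta_r2 = r_squared(np.column_stack([interview, test]), sales) - r_squared(interview.reshape(-1, 1), sales)
print(round(delta_r2, 3))
```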

Evaluating adverse impact

Collect demographic data (sex, race/ethnicity) for applicants and calculate pass/hire rates by group. Use the four-fifths rule as an initial screen, then follow up with chi-square tests, effect-size measures (Cohen’s d), and logistic regression controlling for relevant covariates to detect differential prediction (Uniform Guidelines, 1978). If adverse impact exists, investigate whether the differences reflect job-related constructs (content validity) or bias, and consider alternative predictors or adjustments (AERA/APA/NCME, 2014).
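A minimal sketch of the four-fifths screen; the group counts are hypothetical placeholders, since no demographic data accompany this dataset.

```python
# Four-fifths (80%) rule screen with hypothetical applicant-flow counts.
def selection_ratio(hired: int, applied: int) -> float:
    return hired / applied

def four_fifths_flag(ratio_focal: float, ratio_reference: float) -> bool:
    """Flag potential adverse impact if the focal group's rate is below 80% of the reference rate."""
    return ratio_focal / ratio_reference < 0.80

female_rate = selection_ratio(hired=12, applied=40)  # 0.30 (hypothetical)
male_rate = selection_ratio(hired=30, applied=60)    # 0.50 (hypothetical)
print(four_fifths_flag(female_rate, male_rate))      # True: 0.30 / 0.50 = 0.60 < 0.80
```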

Conclusion

The test in this sample yields an 85% correct classification rate and produces a meaningful mean sales difference between hires and non-hires, supporting useful predictive validity. Recommended next steps are formal reliability analysis, corrections for range restriction, collection of longer-term criterion data, incremental validity testing against additional selection tools (structured interviews, work samples), utility modeling, and adverse-impact analyses before full operational deployment (Schmidt & Hunter, 1998; AERA/APA/NCME, 2014).

References

  • Aguinis, H. (2013). Performance management (3rd ed.). Upper Saddle River, NJ: Pearson.
  • American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: AERA.
  • Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96(1), 72–98.
  • Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA: Sage.
  • McDaniel, M. A., Whetzel, D. L., Schmidt, F. L., & Maurer, S. D. (1994). The validity of employment interviews: A comprehensive review and meta-analysis. Journal of Applied Psychology, 79(4), 599–616.
  • Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York, NY: McGraw-Hill.
  • Sackett, P. R., & Yang, H. (2000). Correction for range restriction: An expanded typology. Journal of Applied Psychology, 85(1), 112–118.
  • Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262–274.
  • Thorndike, R. L. (1949). Personnel selection: Test and measurement techniques. New York, NY: Wiley.
  • Uniform Guidelines on Employee Selection Procedures. (1978). Federal Register, 43(166), 38290–38315.