A work sample asks the candidate to do a representative slice of the actual job: write the code, draft the memo, fix the part, run the till. A job-knowledge test asks what they know about the work. A cognitive-ability (GMA) test measures general reasoning and learning speed. All three are useful, but the 2022 reanalysis by Sackett et al. changed how they rank.

The numbers (mean validity estimates, i.e. correlations with job performance; Sackett et al., 2022).

  • Job-knowledge tests — .40
  • Work-sample tests — .33
  • Cognitive-ability / GMA tests — .31

So a work sample and a job-knowledge test now predict performance at or above a cognitive test. Flag the revision honestly: work samples were the headline casualty of the recalibration. Schmidt & Hunter (1998) reported .54; Roth, Bobko & McFarland (2005) — the meta-analysis Sackett et al. relied on — found a mean observed validity of .26 rising to .33 after correcting only for criterion unreliability, “approximately one third less than previously thought.” Anyone quoting .54 for work samples is using the old figure.
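
The step from .26 to .33 is the standard correction for attenuation: the observed correlation is divided by the square root of the criterion reliability. A minimal Python sketch follows. The function names are illustrative, and the reliability value is backed out from the two rounded figures quoted above rather than taken from Roth et al., so treat it as an implied, approximate number.

```python
import math


def correct_for_criterion_unreliability(r_observed: float,
                                        criterion_reliability: float) -> float:
    """Disattenuate an observed validity for unreliability in the
    performance measure only (no range-restriction correction)."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion_reliability must be in (0, 1]")
    return r_observed / math.sqrt(criterion_reliability)


def implied_criterion_reliability(r_observed: float, r_corrected: float) -> float:
    """Back out the reliability that would turn r_observed into r_corrected."""
    return (r_observed / r_corrected) ** 2


# Figures quoted in the text (Roth, Bobko & McFarland, 2005).
R_OBSERVED = 0.26
R_CORRECTED = 0.33

# Assumption: derived from the two rounded figures above,
# not a reliability value reported in this note.
implied_ryy = implied_criterion_reliability(R_OBSERVED, R_CORRECTED)
round_trip = correct_for_criterion_unreliability(R_OBSERVED, implied_ryy)

if __name__ == "__main__":
    print(f"implied criterion reliability: {implied_ryy:.2f}")
    print(f"corrected validity (round trip): {round_trip:.2f}")
```

The point of the sketch is that the correction only adjusts for noise in the performance ratings; it does not inflate the estimate for range restriction, which is one reason the revised figure sits well below .54.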

When to use each.

  • Work samples and job-knowledge tests only work for candidates who already have the skill. You cannot give a work sample to someone you intend to train from scratch — there is nothing yet to sample. They shine for experienced hires and skilled trades.
  • Cognitive-ability tests work for inexperienced candidates because they predict how fast someone will learn the job, and their validity rises with job complexity.

The practical tradeoffs.

  • Cost and fidelity. A high-fidelity work sample (realistic task, realistic conditions) is more predictive but expensive to build and score; a low-fidelity simulation is cheaper but weaker. Job-knowledge and cognitive tests are cheap to administer at scale.
  • Adverse impact. This is the decisive practical difference. Cognitive-ability tests produce the largest subgroup score gaps. Roth, BeVier, Bobko, Switzer & Tyler (2001) put the Black-White standardized mean difference at d ≈ 1.0 for tests of general ability among job applicants in corporate settings, which translates into the greatest risk of disproportionately screening out protected groups. Gaps on work samples are usually smaller, but work samples are not the panacea the textbooks suggest: Roth, Bobko, McFarland & Buster (2008) found work-sample Black-White differences “markedly larger for samples of job applicants (d = .73)” than the long-quoted meta-analytic value of about d = .38 drawn from incumbents. The diversity tradeoff is taken up in the adverse-impact note.
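
To see why the size of d matters in practice, the gap can be turned into pass rates. The sketch below assumes scores in both groups are normally distributed with equal spread and that a single cutoff is applied to everyone; real applicant pools will deviate from this. The function name and the 50% reference pass rate are illustrative choices, not values from the studies cited.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1


def pass_rates(d: float, reference_pass_rate: float) -> tuple[float, float]:
    """Return (lower-scoring group pass rate, ratio to the reference rate).

    Assumes normal scores with equal SDs, group means d SDs apart, and one
    cutoff set so the reference group passes at reference_pass_rate.
    """
    if not 0.0 < reference_pass_rate < 1.0:
        raise ValueError("reference_pass_rate must be strictly between 0 and 1")
    # Cutoff expressed in reference-group z units.
    cutoff = STANDARD_NORMAL.inv_cdf(1.0 - reference_pass_rate)
    # The other group's mean sits d SDs lower, so the same cutoff
    # is d SDs further above that group's mean.
    lower_rate = 1.0 - STANDARD_NORMAL.cdf(cutoff + d)
    return lower_rate, lower_rate / reference_pass_rate


if __name__ == "__main__":
    # d values quoted in the text; the 50% reference pass rate is an assumption.
    for label, d in [("cognitive ability", 1.0),
                     ("work sample, applicants", 0.73),
                     ("work sample, incumbents", 0.38)]:
        rate, ratio = pass_rates(d, reference_pass_rate=0.5)
        print(f"{label:<26} d={d:.2f}  pass rate={rate:.1%}  ratio={ratio:.2f}")
```

Under these assumptions a larger d shrinks the lower-scoring group's pass rate at any fixed cutoff, and the ratio between the two pass rates falls further as the cutoff becomes more selective.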

For a K-W SMB hiring into a skilled role, a structured job-relevant work sample is often the most persuasive and defensible single test you can run — candidates accept “show me you can do the work” more readily than an abstract aptitude test, and it ties directly to the job. Reserve cognitive testing for roles where you are hiring for learning potential, and get advice on adverse-impact and accommodation duties (Compliance cluster) before you deploy any standardized test.