Evidence for Formal reviews and self-testing

Every substantive claim on the Formal reviews and self-testing page is checked against current research. Here is each claim, how well today’s evidence supports it, and the sources. The full, de-duplicated source list lives on the references page.

Supported · strong evidence — A formal test does not merely measure existing knowledge; the act of retrieving information under test conditions itself strengthens later memory (the testing effect), so testing produces learning, not just an assessment of it.

The testing (retrieval-practice) effect, in which retrieving information yields more durable retention than restudying it, is one of the most robust and replicated findings in the science of learning. It was demonstrated in Roediger & Karpicke’s experiments and confirmed in subsequent meta-analyses. This directly supports reframing exams as learning events rather than pure measurement.

Sources: Roediger & Karpicke (2006), Test-Enhanced Learning: Taking Memory Tests Improves Long-Term Retention, Psychological Science — https://doi.org/10.1111/j.1467-9280.2006.01693.x

Supported · strong evidence — Students who test themselves on material after studying it remember substantially more after a delay of about a week than students who reread the same material for the same amount of time.

Roediger & Karpicke’s classic experiments showed a large delayed-retention advantage for testing over repeated study, a result reproduced extensively across materials and populations and treated as a cornerstone finding.

Supported · strong evidence — Rereading can look as good as or better than self-testing on an immediate test, but after a delay the testing advantage emerges and grows, so the immediate ease of rereading is misleading.

The crossover whereby restudy looks better immediately while testing wins at a delay is a well-documented feature of the testing-effect literature and a paradigm case of Soderstrom & Bjork’s learning-versus-performance distinction.

Sources: Soderstrom & Bjork (2015), Learning Versus Performance: An Integrative Review, Perspectives on Psychological Science — https://doi.org/10.1177/1745691615569000 · Roediger & Karpicke (2006), Test-Enhanced Learning: Taking Memory Tests Improves Long-Term Retention, Psychological Science — https://doi.org/10.1111/j.1467-9280.2006.01693.x

Supported · strong evidence — The benefit of practice testing over restudying is robust and generalises across subject matter, age groups and test formats, as shown by meta-analysis of many comparisons.

Adesope, Trevisan & Sundararajan’s meta-analysis of hundreds of effects found a moderate-to-large, broadly generalisable testing benefit, consistent with Rowland’s independent meta-analytic review.

Sources: Adesope, Trevisan & Sundararajan (2017), Rethinking the Use of Tests: A Meta-Analysis of Practice Testing, Review of Educational Research — https://doi.org/10.3102/0034654316689306 · Rowland (2014), The Effect of Testing Versus Restudy on Retention: A Meta-Analytic Review, Psychological Bulletin — https://doi.org/10.1037/a0037559

Supported · strong evidence — Effortful retrieval feels less productive than smooth rereading even though it produces more durable learning, so how a study activity feels in the moment is an unreliable guide to how much it teaches.

Soderstrom & Bjork’s distinction between current performance and durable learning is foundational and widely accepted; retrieval practice is a canonical desirable difficulty that lowers in-session ease yet raises long-term retention, and learners systematically misjudge fluent study as more effective.

Sources: Soderstrom & Bjork (2015), Learning Versus Performance: An Integrative Review, Perspectives on Psychological Science — https://doi.org/10.1177/1745691615569000 · Dunlosky, Rawson, Marsh, Nathan & Willingham (2013), Improving Students’ Learning With Effective Learning Techniques, Psychological Science in the Public Interest — https://doi.org/10.1177/1529100612453266

Supported · strong evidence — Practice testing is among the highest-utility study techniques, whereas rereading is rated low-utility, so preparing for an exam by self-testing is more effective than preparing by rereading.

Dunlosky et al.’s comprehensive review rated practice testing (and distributed practice) as high-utility and rereading, highlighting and summarisation as low-utility; this ranking remains the standard reference and is widely endorsed.

Sources: Dunlosky, Rawson, Marsh, Nathan & Willingham (2013), Improving Students’ Learning With Effective Learning Techniques, Psychological Science in the Public Interest — https://doi.org/10.1177/1529100612453266

Supported · moderate evidence — Attempting to recall material and failing, then being shown the correct answer, improves subsequent learning more than not attempting retrieval at all, so a question fluffed in an exam and then understood is not wasted.

Benefits of attempting retrieval before feedback, including from unsuccessful attempts, are supported by Kornell and colleagues and the broader pretesting/errorful-generation literature, with the consistent caveat that corrective feedback after the attempt is important.

Sources: Kornell, Hays & Bjork (2009), Unsuccessful Retrieval Attempts Enhance Subsequent Learning, Journal of Experimental Psychology: Learning, Memory, and Cognition — https://doi.org/10.1037/a0015729 · Kornell & Bjork (2008), Learning Concepts and Categories: Is Spacing the ‘Enemy of Induction’?, Psychological Science — https://doi.org/10.1111/j.1467-9280.2008.02127.x

Supported · moderate evidence — The less someone knows about a topic, the more they tend to overestimate their competence, so a beginner’s confidence is least trustworthy exactly when it feels most solid, and a cold test should be trusted over that confidence.

Kruger & Dunning’s core observation that low performers lack the metacognitive insight to recognise their deficits is robust and replicated; the magnitude and statistical interpretation (regression to the mean, better-than-average effects) are debated, but the practical lesson that beginners are poorly calibrated holds.

Sources: Kruger & Dunning (1999), Unskilled and Unaware of It, Journal of Personality and Social Psychology — https://doi.org/10.1037/0022-3514.77.6.1121
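
The regression-to-the-mean reading of the Kruger & Dunning pattern can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not a reanalysis of their results. It assumes, purely for illustration, that each person's self-estimate equals their true skill plus unbiased random noise; the function name `simulate_quartile_gaps` and its parameters are hypothetical. Even with no overconfidence built in, the lowest quartile appears to overestimate its standing and the highest to underestimate it, which is one reason the size of the effect is debated.

```python
import random
import statistics


def simulate_quartile_gaps(n=10_000, noise_sd=1.0, seed=0):
    """Mean (perceived percentile - actual percentile) for each actual-skill quartile.

    Illustrative assumption: self-estimate = true skill + unbiased Gaussian noise.
    Nobody in this synthetic population is systematically overconfident.
    """
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n)]
    estimate = [s + rng.gauss(0, noise_sd) for s in skill]

    def percentiles(values):
        # Rank every value and rescale the rank to a 0-100 percentile.
        order = sorted(range(len(values)), key=values.__getitem__)
        pct = [0.0] * len(values)
        for rank, idx in enumerate(order):
            pct[idx] = 100.0 * rank / (len(values) - 1)
        return pct

    actual = percentiles(skill)
    perceived = percentiles(estimate)

    # Group people by their ACTUAL quartile, as in the classic plots.
    gaps = [[], [], [], []]
    for a, p in zip(actual, perceived):
        quartile = min(int(a // 25), 3)
        gaps[quartile].append(p - a)
    return [statistics.mean(g) for g in gaps]


gaps = simulate_quartile_gaps()
for label, gap in zip(["bottom", "second", "third", "top"], gaps):
    print(f"{label:>6} quartile: perceived minus actual percentile = {gap:+.1f}")
```

Under these assumptions the bottom-quartile gap is positive and the top-quartile gap is negative. The pattern alone therefore does not show how much of the observed miscalibration is metacognitive, but it is consistent with the practical lesson above: a beginner's confidence is a poor guide, and a cold test is the better check.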

Supported · moderate evidence — Pairing a retrieval attempt with corrective feedback (e.g. looking up the right answer after an exam) improves learning, partly by correcting errors so wrong answers are not simply rehearsed.

Feedback reliably augments the testing effect — particularly by correcting errors and protecting against retention of wrong responses — and meta-analytic evidence shows tests with feedback generally outperform tests without it.