Evidence for Spaced-repetition and study apps that actually help

Every substantive claim on the Spaced-repetition and study apps that actually help page is checked against current research. Here is each claim, how well today’s evidence supports it, and the sources. The full, de-duplicated source list lives on the references page.

Supported · strong evidence — Spaced-repetition software schedules reviews to distribute practice over time, and distributing the same study across separate sessions produces better long-term retention than cramming it into one session.

The spacing effect is among the most robustly replicated findings in learning science. Cepeda et al.’s quantitative synthesis of 100+ years of verbal-recall studies found that distributed practice reliably outperformed massed practice, and the later review by Dunlosky et al. (2013) rates distributed practice as a high-utility technique. Spaced-repetition software automates this scheduling.

Sources: Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006), Distributed practice in verbal recall tasks: A review and quantitative synthesis, Psychological Bulletin 132(3), 354-380 · Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013), Improving students’ learning with effective learning techniques, Psychological Science in the Public Interest 14(1), 4-58 · full reference ›

Supported · strong evidence — The advantage of spaced over crammed review grows the longer the delay before you need to recall the material.

Cepeda et al. (2006) found that the spacing advantage increased with the retention interval, and a later large-scale study (Cepeda et al., 2008) mapped how the optimal gap between study sessions scales with how long the material must be retained.

Sources: Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006), Distributed practice in verbal recall tasks: A review and quantitative synthesis, Psychological Bulletin 132(3), 354-380 · Cepeda, N. J., Vul, E., Rohrer, D., Wixted, J. T., & Pashler, H. (2008), Spacing effects in learning: A temporal ridgeline of optimal retention, Psychological Science 19(11), 1095-1102 · full reference ›

Supported · moderate evidence — Spaced-repetition software such as Anki lengthens the interval for cards recalled easily and shortens it for cards recalled poorly, concentrating review time on weaker material.

This describes the documented operation of widely used spaced-repetition schedulers; the adaptive-interval logic implements well-supported spacing and difficulty-based scheduling principles. Strength is moderate because the design rests on memory theory rather than head-to-head outcome trials of each algorithm.

Sources: Anki Manual — Deck Options / FSRS scheduling (2024), https://docs.ankiweb.net/ · Open Spaced Repetition, FSRS (Free Spaced Repetition Scheduler) algorithm documentation (2024), https://github.com/open-spaced-repetition · full reference ›
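
The adaptive-interval behaviour described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the actual scheduler used by Anki or FSRS: the `Card` class, the `review` function, and every constant (the starting ease of 2.5, the floor of 1.3, the 0.2 and 0.15 adjustments, the reset to one day) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """Hypothetical card state for this sketch only."""
    interval_days: float = 1.0  # days until the next review
    ease: float = 2.5           # assumed multiplier applied after a successful recall


def review(card: Card, recalled: bool, easy: bool = False) -> Card:
    """Return the card's next state after one review (illustrative rule only)."""
    if not recalled:
        # Forgotten: bring the card back soon and make future growth slower.
        return Card(interval_days=1.0, ease=max(1.3, card.ease - 0.2))
    # Recalled: grow the interval; an "easy" rating makes it grow faster.
    ease = card.ease + (0.15 if easy else 0.0)
    return Card(interval_days=card.interval_days * ease, ease=ease)


card = Card()
card = review(card, recalled=True)    # interval grows
card = review(card, recalled=True)    # interval grows again
lapsed = review(card, recalled=False)  # interval resets, ease drops
```

Under these assumed values, two successful reviews stretch a one-day interval to 2.5 and then 6.25 days, while a lapse resets it to one day and lowers the ease, so weaker cards come back sooner.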

Supported · moderate evidence — FSRS fits a model of an individual learner’s forgetting to their own review history and schedules each card to hit a chosen retention probability (e.g. 90%).

FSRS is built on the DSR (difficulty-stability-retrievability) memory model and optimises review timing against a user-set desired-retention target, as described in its open documentation and the Anki manual; it operationalises established spacing and forgetting-curve principles.

Sources: Open Spaced Repetition, FSRS algorithm and memory model documentation (2024), https://github.com/open-spaced-repetition · Anki Manual — FSRS (2024), https://docs.ankiweb.net/ · full reference ›
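
How a desired-retention target translates into a review interval can be illustrated with a toy forgetting curve. The sketch below assumes a plain exponential curve in which `stability_days` is the delay at which recall probability falls to 90%. That formula and the function names `retrievability` and `next_interval` are assumptions for illustration; they are not FSRS's published equations or its fitted parameters.

```python
import math


def retrievability(days_elapsed: float, stability_days: float) -> float:
    """Assumed recall probability after a delay (toy exponential curve)."""
    return 0.9 ** (days_elapsed / stability_days)


def next_interval(stability_days: float, desired_retention: float) -> float:
    """Delay at which the assumed curve falls to the desired retention."""
    if not 0.0 < desired_retention < 1.0:
        raise ValueError("desired_retention must be strictly between 0 and 1")
    # Solve 0.9 ** (t / S) = r for t.
    return stability_days * math.log(desired_retention) / math.log(0.9)


interval_at_90 = next_interval(stability_days=10.0, desired_retention=0.90)
interval_at_80 = next_interval(stability_days=10.0, desired_retention=0.80)
interval_at_95 = next_interval(stability_days=10.0, desired_retention=0.95)
```

With these assumptions, a target of 0.9 returns the stability itself as the interval, a lower target lengthens the interval, and a higher target shortens it. That is the trade-off a desired-retention setting exposes: fewer reviews against a higher chance of forgetting.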

Supported · strong evidence — A flashcard only confers a learning benefit if the learner attempts to recall the answer before revealing it; merely viewing both sides (recognition without an attempted recall) is far less effective.

The testing/retrieval-practice effect is strongly established: actively retrieving an answer produces substantially better retention than restudying or re-reading it. Glancing at both sides of a flashcard without first attempting recall amounts to re-reading.

Sources: Roediger, H. L., & Karpicke, J. D. (2006), Test-enhanced learning: Taking memory tests improves long-term retention, Psychological Science 17(3), 249-255 · Adesope, O. O., Trevisan, D. A., & Sundararajan, N. (2017), Rethinking the use of tests: A meta-analysis of practice testing, Review of Educational Research 87(3), 659-701 · full reference ›

Supported · moderate evidence — Constructing your own flashcards — selecting what matters and rephrasing it in your own words — is itself a learning activity (elaboration plus retrieval), so automating card creation removes part of the learning benefit.

Generative, effortful processing — selecting, organising and rephrasing material rather than receiving it pre-packaged — supports durable learning (the generation effect and elaboration), so offloading card construction to a tool forfeits that benefit. The specific claim about AI-generated decks is newer and less directly tested, hence moderate.

Sources: Bjork, R. A., Dunlosky, J., & Kornell, N. (2013), Self-regulated learning: Beliefs, techniques, and illusions, Annual Review of Psychology 64, 417-444 · Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013), Improving students’ learning with effective learning techniques, Psychological Science in the Public Interest 14(1), 4-58 · full reference ›

Supported · moderate evidence — Restating material in your own words on a flashcard (rather than copying it verbatim) improves learning relative to rote copying.

Elaborative, meaning-focused processing produces better retention and comprehension than shallow verbatim copying; this is well supported by the elaboration and levels-of-processing literatures.

Sources: Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013), Improving students’ learning with effective learning techniques, Psychological Science in the Public Interest 14(1), 4-58 · Bjork, R. A., Dunlosky, J., & Kornell, N. (2013), Self-regulated learning: Beliefs, techniques, and illusions, Annual Review of Psychology 64, 417-444 · full reference ›

Supported · strong evidence — Large language models used to generate study cards can produce confident factual errors, so auto-generated cards need to be checked for accuracy before use.

It is well documented that current large language models hallucinate — generate plausible but false statements — so fact-checking AI-generated study material is a sound precaution.

Sources: Ji, Z., Lee, N., Frieske, R., et al. (2023), Survey of hallucination in natural language generation, ACM Computing Surveys 55(12), 1-38 · full reference ›

Supported · moderate evidence — The subjective ease/fluency of a study tool is a poor guide to how much real learning it produces; effortful study that feels harder often produces more durable memory.

The dissociation between in-the-moment fluency and durable learning, the core of the desirable-difficulties framework, is well documented: learners routinely judge easier, more fluent study to be more effective even when it yields worse delayed-test performance.

Sources: Bjork, R. A., Dunlosky, J., & Kornell, N. (2013), Self-regulated learning: Beliefs, techniques, and illusions, Annual Review of Psychology 64, 417-444 · Kornell, N. (2009), Optimising learning using flashcards: Spacing is more effective than cramming, Applied Cognitive Psychology 23(9), 1297-1317 · full reference ›