Case against cramming
Cramming
SuperMemo makes it possible to review items ahead of time, e.g. before an exam. This goes against the optimum repetition schedule. When items are reviewed very often, we call it cramming (in opposition to the optimum schedule produced by SuperMemo).
Problem: SuperMemo shortens intervals
We noticed long ago that users who set their forgetting index to 3% tend to fall into a trap of astronomical repetition loads. This is why we recommend staying with the default forgetting index (10% forgetting).
The new Algorithm SM-17 in SuperMemo 17 revealed that very short intervals can result in weakening memories.
When memories get weakened, SuperMemo will shorten intervals instead of increasing them.
Intense cramming, e.g. before an exam, can also result in a similar effect.
Keeping intervals on a steady increase, as in older SuperMemos, would resolve the problem of slow progress but would also prevent sticking to forgetting index targets (this was the case in older SuperMemos as well). Moreover, keeping intervals long would prevent collecting the data that revealed the problem in the first place. We would not have learned how memory behaves in super-massed presentation.
On the other hand, after very long breaks in learning, SuperMemo 17 would also shorten intervals that might otherwise go into decades despite the scarcity of repetition data (one good grade after a decade could produce an interval of 20-30 years). In those cases, the user would be asked to determine if he was ready to take the risk on a particular item.
A new interval choice dialog lets users choose between the "honest" interval derived from the DSR model and the default SuperMemo 17 interval with anti-cramming protection (U-Factors cannot drop below set values in shorter interval ranges). In SuperMemo 19, those intervals get lengthened by SuperMemo if they generate a schedule that is too dense. See: I choose minimum interval and get a different value
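The anti-cramming protection can be pictured as a simple clamp. This is a hypothetical sketch (function and parameter names are invented, not SuperMemo's actual code): the schedule keeps the effective U-Factor at or above a set floor, however short an interval the model proposes.

```python
# Hypothetical sketch (names invented): an anti-cramming floor clamps
# the "honest" DSR-model interval so that the effective U-Factor
# (new interval / old interval) cannot drop below a set value.

def protected_interval(model_interval_days: float,
                       last_interval_days: float,
                       min_u_factor: float = 1.1) -> float:
    """Return the interval to schedule: the DSR model may propose an
    interval shorter than the last one, but the anti-cramming floor
    keeps the new/old interval ratio at or above min_u_factor."""
    floor = last_interval_days * min_u_factor
    return max(model_interval_days, floor)

# The model asks for a shorter interval: the floor wins (about 11 days)
print(protected_interval(model_interval_days=8, last_interval_days=10))
# A healthy increase passes through unchanged
print(protected_interval(model_interval_days=25, last_interval_days=10))
```

Choosing the "honest" interval in the dialog would amount to skipping the clamp and taking `model_interval_days` as is.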
Divorce: Algorithm vs. DSR Model
It seems obvious that the only solution to the above dilemmas is to divorce the algorithm from the DSR model in extreme cases of cramming. While the DSR model would allow all honest computations on the behavior of memory, the algorithm would enforce a rational schedule that prevents short intervals and weakened memories.
The advantage of shortening decade-long intervals could be retained by "weakening" the divorce for longer intervals. This would imply a strict U-Factor increase in the first weeks of learning, strict application of the DSR model at very long intervals, and a gradual transition between the two in between.
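The gradual transition could be sketched as a simple blend (the linear weighting and the one-year horizon are invented for illustration; the actual transition function is not specified here):

```python
# Illustrative sketch (constants invented): strict scheduling at short
# intervals, pure DSR model at long ones, linear transition in between.

def blended_interval(model_days: float, rigid_days: float,
                     last_days: float,
                     full_model_at: float = 365.0) -> float:
    """Blend the rigid minimum-increase interval with the raw DSR-model
    interval; the model gets full control beyond full_model_at days."""
    w = min(last_days / full_model_at, 1.0)  # 0 = rigid, 1 = pure model
    return (1 - w) * rigid_days + w * model_days

# Early in learning the rigid schedule dominates
print(blended_interval(model_days=8.0, rigid_days=11.0, last_days=0.0))
# After a year the model may shorten the interval freely
print(blended_interval(model_days=8.0, rigid_days=11.0, last_days=365.0))
```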
The whole concept of "weakened memories" was born when collecting data with the new algorithm. Implementing the divorce will instantly reduce the flow of data. However, it does not mean we would forever stay ignorant about cramming. SuperMemo still allows manual interval intervention, cramming via subset review, and manual interval choice (DSR model choice).
Drop in stability
SuperMemo 17 honestly estimates the increase in stability using recall data. If the data indicate that a drop in stability is possible, it will be reflected in the stability increase matrix. To prevent the impact of outliers and reduce the impact of a stability drop on learning, at this moment (February 2017), the maximum allowed drop in stability is 20% (SInc[D,S,R]=0.8).
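A minimal sketch of the cap described above (the function name is invented): a measured stability "increase" below 1.0, i.e. a drop, is clamped at 0.8, so stability can fall by at most 20% at a single repetition.

```python
# Sketch of the 20% cap on stability drop: SInc entries below 0.8 are
# clamped before being applied to an item's stability.
MIN_SINC = 0.8

def apply_sinc(stability_days: float, sinc: float) -> float:
    """Apply a stability increase entry, honoring the 20% drop cap."""
    return stability_days * max(sinc, MIN_SINC)

print(apply_sinc(100.0, 0.5))  # outlier drop clamped (about 80 days)
print(apply_sinc(100.0, 1.5))  # normal increase passes through (150 days)
```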
Simulations show that a drop in stability may result in "interval stagnation" for difficult items. For example, once the interval reached beyond 2-3 years, it would stop increasing. That would, in theory, necessitate a repetition every 2-3 years without the usual benefit of increasing intervals that make items "fade" in the process.
We have now seen actual data showing a drop in stability for difficult items. This drop, however, can show up much earlier than expected, when items still show low stability (intervals measured in days and weeks).
Feedback loop
Memory literature shows that a repetition can actually weaken memories, and SInc<1 would be an expression of that theoretical phenomenon. This may result in a feedback loop where SuperMemo might want to keep shortening intervals instead of increasing them. This will not happen in actual learning because SuperMemo introduces boundary conditions that ensure all intervals increase for passing grades.
For difficult items or items with very low forgetting index:
- early repetition invokes the spacing effect
- due to the spacing effect, memory weakens
- due to a drop in stability, bound by the forgetting index target, SuperMemo reduces the interval
- due to the drop in interval, the spacing effect intensifies (back to step 1)
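A toy simulation can illustrate how these steps feed on each other. All numbers and the SInc formula below are invented for illustration only: reviewing at a fixed small fraction of stability keeps SInc below 1, so both stability and the interval shrink on every cycle.

```python
# Toy model (formula and constants invented): reviewing at 30% of
# stability (e.g. a very low forgetting index) yields SInc < 1, so each
# review weakens the memory and shortens the next interval.

def toy_sinc(interval: float, stability: float) -> float:
    """Pretend stability 'increase': very early review weakens memory
    (SInc < 1); review near the optimum point strengthens it."""
    return 0.7 + 0.6 * min(interval / stability, 1.0)

stability = 10.0
FI_FRACTION = 0.3            # review far ahead of the optimum point
intervals = []
for step in range(5):
    interval = FI_FRACTION * stability
    intervals.append(interval)
    stability *= toy_sinc(interval, stability)   # SInc = 0.88 here
    print(f"step {step}: interval={interval:.2f} d, stability={stability:.2f} d")
```

Each pass through the loop shrinks stability by the same factor, so the intervals spiral down: the "black hole of endless repetitions" described later in this text.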
Problem in SuperMemo 17
If SuperMemo disallows interval shortening, the feedback loop will not occur; however, SInc matrix entries lower than 1 will result in a very slow increase in intervals. If sufficiently many items showed this behavior, this could clog the learning process and result in lower performance.
The obvious solution would be to ruthlessly eliminate leeches; however, some collections are inherently difficult (e.g. learning Chinese for Europeans).
SuperMemo might tighten interval increase requirements and make leech reporting more pesky. In particular, it could make better use of Postpone, as it has been shown that, paradoxically, forgetting may be helpful in eliminating leeches. Not only does it unclog the learning process, but re-learning may also result in building a different representation of the same memory. Instead of trying to fix a case of a "leaky memory" item, the user would attempt to learn the same item "differently".
Difficulty problem
Another feedback loop with SInc<1 is that short intervals lead to low stabilities and high retrievabilities. A single lapse in a long series of good answers is best explained by assigning the item to a high difficulty category. High difficulty reduces the weight of the lapse, while high retrievabilities mean that other good grades hardly matter, even if many good scores follow the lapse.
One of the considerations for the algorithm to prevent a similar feedback loop was to increase the weights of grade deviations for the most recent repetitions. This would gradually reduce the impact of a single lapse and hopefully help the item recover its "good reputation".
In an extreme case, for very high weights of the most recent repetition, items would easily toggle from high to low difficulty. For equal weights, SInc<1 and short intervals could make items stick to high difficulty with just a single lapse.
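A hedged sketch of the recency-weighting idea (the weighting scheme, the geometric gain, and all numbers are invented, not SM-17's actual fitting procedure): with geometrically growing weights, an early lapse followed by many passes fits an "easy item" hypothesis better than a "difficult item" one.

```python
# Invented weighting scheme: later repetitions get geometrically larger
# weights, so a single old lapse gradually loses influence on the
# difficulty estimate.

def weighted_deviation(grades, predicted_recall, recency_gain=1.3):
    """Weighted mean absolute deviation between binary grades
    (1 = pass, 0 = lapse) and a difficulty hypothesis's predicted
    recall; recent repetitions weigh more."""
    weights = [recency_gain ** i for i in range(len(grades))]
    dev = sum(w * abs(g - p)
              for w, g, p in zip(weights, grades, predicted_recall))
    return dev / sum(weights)

history = [1, 0, 1, 1, 1, 1]                        # one early lapse, then recovery
easy_fit = weighted_deviation(history, [0.95] * 6)  # "easy item" hypothesis
hard_fit = weighted_deviation(history, [0.60] * 6)  # "difficult item" hypothesis
print(easy_fit < hard_fit)                          # the item recovers its reputation
```

With `recency_gain = 1.0` (equal weights), the same comparison still favors the easy hypothesis here, but the margin shrinks; the danger discussed above is that very high gains would let items toggle between difficulty categories too easily.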
A perpetual and unresolved problem is that on a slope of the SInc matrix, items tend to "roll" to one or the other side of the hill, and few items settle at intermediate difficulties between 0 and 1.
Cramming is bad
The discovery of SInc<1 is an important lesson on the dangers of cramming:
- cramming can weaken memories
- a low forgetting index may have a bad influence on long-term outcomes
- natural development of mnemonic skill is an important benefit in long-term use of SuperMemo
- the art of item formulation is growing to be of paramount importance
Example
This example comes from a collection in which Difficulty=100% contributed over 16,000 repetition cases. SInc<1 was recorded in stability bins of 9 to 58 days.
Clarification
Question: meaning of cramming
Are there two meanings of "cramming"?
One, mentioned at the beginning of this page, and the second one - the phenomenon that is caused by the SM-17 algorithm? I have never deliberately crammed items in SuperMemo as I usually use it to learn for myself and not for exams. And I have almost never crammed at school due to my bad memory. I had to learn systematically or not learn at all ;)
Answer
Perhaps we had better define a dictionary entry: cramming.
In the present context, cramming refers to anything that causes frequent review and weakening of memories. This can be intentional (e.g. with subset review), or due to a feedback loop caused by leeches and "negative increase" in stability resulting in short intervals. The latter is a major discovery, a weakness and, you might even say, a bug in the algorithm. If the algorithm slows your learning, you can call it a bug. Simply, nobody predicted that SuperMemo could actually weaken memory. Once the program is defended from the feedback loop, this is actually a very happy discovery. It shows something we always suspected but never measured: massed review can weaken memory.
The simplest explanation of the loop that occurred in the collection shown in the figure is this: leeches -> low recall -> stability increase less than 1 -> weakening memories -> slow increase in stability -> short intervals -> weakening memory further -> shorter intervals -> etc. If we disallow very low U-Factors for shorter intervals, this problem will disappear (naturally, leeches will have to be found and eliminated independently). In addition, very low values of the SInc matrix worked like an "attractor" in the hill-climbing procedure that determines item difficulty. This made the "high difficulty" category more "attractive", and more items were classified as difficult even if they got just one lapse of memory.
(as for bad memory, if you develop some good incremental reading habits, you can discover how much of "memory" is a matter of training, implicit and explicit mnemonic skills -- one day you may say "I had bad memory before ... I started using incremental reading" :)
Question: user perspective
Could you explain how it works from the user’s perspective?
Answer
If you have super-difficult items, and your memory does not work too well, SM-17 may suggest shortening the intervals to meet the forgetting index criteria. But when you shorten intervals, you evoke the spacing effect and make things worse. So it is, at first, a positive feedback loop. You fall into a black hole of endless repetitions.
All older SuperMemos just had a rule: never use an interval increase that is less than 1.1 (that value might differ between versions). This rule is valuable, but it may not be too good if you want to make the best use of the DSR model.
If you get to an item like "codicil" in English after a 10-year break, and you score well, it is a bit risky to send it out to a 20-year interval. You might have heard the word codicil by chance just yesterday. Statistically, the model may say "make it 7 years" (to account for luck, interference, etc.).
So the main idea is to go in between: the old rigid SuperMemo for short intervals, and more freedom to use the memory model at longer intervals (even if this results in shortening intervals).
This helps you cram safely, and still safely take 10-year breaks from learning. The best of both worlds.
Question: pejorative spacing effect
Spacing effect in your words sounds like something bad! I always thought that SuperMemo banks on the spacing effect, which would make it good!?
Answer
Spacing effect implies that you learn better when you space repetitions. When you space them, the spacing effect is good; if you cram them, the spacing effect will weaken memories and that's bad.
Question: super-long intervals
You say "If you get to an item like codicil after a 10-year break, and you score well, it is a bit risky to send it out to a 20-year interval". Yeah! Why not?! It's logical.
Answer
If you happen to have just read about a codicil in a book, it will camouflage as a "false memory". You will get your next chance in 20 years. If you took a long break from a collection of English words and you use English often, you may have a large number of such cases. It is best to rely on statistics and look at what happened in the past. Some strange learning habits can produce strange statistics and strange intervals. SuperMemo 17 is impartial; it just looks at data and makes the best guesses about the future. Some of those guesses result in shortening intervals.
Question: forgetting the obvious
Do you have some research saying that after 7 years you may forget even your father’s name if you don’t use it?
Answer
SuperMemo does not "research" specific knowledge. People rarely store their father's name in SuperMemo. All memories can be forgotten, but some probabilities might be lower than the risk of a nuclear attack. If you store your father's name in SuperMemo and keep scoring 5, SuperMemo will just assume it is an easy item, give it long intervals, and you may discover the item as forgotten at the age of 85, e.g. as a result of Alzheimer's (assuming you will remember to fire up SuperMemo). Until then, your father's name is treated like all other easy items. From your own life, though, you might recall episodes when obvious things cannot be recalled. This happens most often to items that you review very often (e.g. in life) and then take a longer break from, or learn something similar, or just get confused for a while. A credit card PIN is a typical example.
Question: safe cramming
I still don’t understand what risks you see and want to avoid. What do you mean by saying „you can safely cram”? What’s risky about cramming, and what do you achieve by not allowing intervals to get shorter? From my guts, I don’t like this kind of tinkering. It’s making the whole thing more complicated, unpredictable and arbitrary.
Answer
- If intervals are too short, you will not establish good memories and will review ad infinitum
- If intervals are too long, you may forget
All SuperMemos in the past did this kind of "tinkering". The intervals had to increase by some minimum value to prevent massive review and to enforce quality of items. Only now, with the DSR model, can we see that sometimes only a drop in the length of the interval can ensure that the probability of forgetting matches the requested forgetting index. This refers to badly formulated items or items remembered "by chance" (e.g. as a result of real-life encounters).
SuperMemo 17 marries those two worlds: (1) it still prevents massive review for shorter intervals, but (2) it also allows shorter intervals if statistics say an increase would likely cause forgetting.
"Safely cram" means that you do not risk the positive feedback of weakened memories if you load your collection with badly formulated items at a low forgetting index. In other words, SuperMemo will protect you from bad practices that are often encouraged at school (overload, deadlines, list items, etc.).
Question: limits of U-Factors
When you write about "restoring lower limits on U-Factors within the short interval range", what do you mean?
Answer
U-Factors tell you how much intervals increase. In older SuperMemos, U-Factors could not go below a certain value. For example, if the lower limit on U-Factors was 1.1, intervals had to increase by no less than 10%.
Algorithm SM-17 temporarily took off those limits. The main motivation was "to see what happens". What happened is that SuperMemo could easily reduce intervals if stability dropped at review.
If this happens at long intervals, like the "codicil" case mentioned above, this is not a problem. For shorter intervals, this can trigger the spacing-effect positive feedback loop. This is why SuperMemo 17 now restores lower limits on U-Factors for shorter intervals, but those limits are phased out as intervals increase and the power of the spacing effect diminishes.
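One way to picture the phase-out (the 1.1 limit is mentioned above; the linear decay and the one-year horizon are invented for illustration): a floor on the U-Factor that is strict for short intervals and disappears as intervals grow.

```python
# Illustrative sketch (decay shape and horizon invented): the minimum
# allowed U-Factor is strict at short intervals and phased out linearly,
# so the DSR model regains full freedom (including shortening intervals)
# at long ones.

def min_u_factor(interval_days: float,
                 strict_limit: float = 1.1,
                 phase_out_days: float = 365.0) -> float:
    """Lower limit on the new/old interval ratio; 0.0 means no limit."""
    frac = min(interval_days / phase_out_days, 1.0)
    return strict_limit * (1.0 - frac)

print(min_u_factor(7.0))     # near-strict: the interval must still grow
print(min_u_factor(3650.0))  # no floor: the model may shorten freely
```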
Question: can intervals increase at failure?
Do you mean it is possible for intervals to rise even if I fail to remember?
Answer
No. At failure you always go back to Repetition=1. The first interval can occasionally be longer than the short intervals before the failure, but this should not be interpreted as a "rise", only as a "long first interval".