PDF

From SuperMemopedia

PDF PROBLEM

How can I import PDF files to SuperMemo for incremental reading?

ANSWER

PDF began as a proprietary Adobe format and has been an open ISO standard (ISO 32000) since 2008, but SuperMemo does not support it natively. This has always made PDF materials harder to process than ordinary HTML text imported from the web.

There are several approaches to processing PDFs incrementally. You will need to see which one is best for your particular material. You may need to resort to mixed strategies and use different approaches for different texts. The main options are:

  1. SuperMemo Assistant: see: SuperMemo Assistant
  2. SuperMemo 19: makes it possible to import a PDF as plain text with a link to the original. If formatting is essential or pictures are included, you can fill in the gaps while reading incrementally.
  3. Adobe Acrobat or ABBYY Finereader: see: Converting a PDF to HTML using Adobe Acrobat
  4. Converters: using PDF to HTML converters to generate HTML text that can be read in SuperMemo. See: PDF to HTML converters
  5. Pictures: using page snapshots (e.g. with Print Screen) and employing visual learning. See: PDF and Visual Learning or Demo
  6. Manual: using copy and paste to copy PDF to SuperMemo page by page (or picture by picture). See: PDF Copy and Paste
  7. Google: using Google cache. See: PDF Google Cache
  8. Incremental: using an incremental reading approach to read-copy-and-paste while working with PDF opened from SuperMemo. See: How to read PDF incrementally?
  9. Tutorial from Master How to Learn: How I deal with PDFs for incremental reading
  10. Plug-in (highly ranked by users; by Alexis at GitHub): SuperMemo PDF plug-in
  11. OCR: using OCR to convert PDF to text. See: PDF and OCR
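
Options 2 and 4 both end with text that can be imported as HTML while keeping a way back to the original file. As a rough illustration only, the Python sketch below wraps page texts that some external tool has already extracted in a simple HTML document, with a link from each page heading to the same page in the source PDF. The function name `pages_to_html`, the sample file name, and the reliance on the `#page=N` URL fragment (honored by many PDF viewers, but not all) are assumptions made for this example; none of this is part of SuperMemo.

```python
from html import escape
from urllib.parse import quote


def pages_to_html(pdf_path, pages, title="Imported PDF"):
    """Wrap already-extracted page texts in HTML with links back to the PDF.

    pdf_path: relative path or URL of the original PDF (hypothetical value below).
    pages: list of plain-text strings, one per PDF page, from any extractor.
    """
    # Percent-encode spaces and similar characters so the link stays valid.
    link_base = quote(pdf_path, safe="/:")
    parts = [
        '<html><head><meta charset="utf-8">',
        f"<title>{escape(title)}</title></head><body>",
        f"<h1>{escape(title)}</h1>",
    ]
    for number, text in enumerate(pages, start=1):
        # One section per page; the link reopens the original at that page
        # in viewers that understand the #page=N fragment (an assumption).
        parts.append(f'<h2><a href="{link_base}#page={number}">Page {number}</a></h2>')
        # Blank lines in the extracted text become paragraph breaks.
        for paragraph in text.split("\n\n"):
            paragraph = " ".join(paragraph.split())
            if paragraph:
                parts.append(f"<p>{escape(paragraph)}</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


if __name__ == "__main__":
    sample = ["First page.\n\nSecond paragraph with <angle> brackets.", "Page two text."]
    print(pages_to_html("articles/example.pdf", sample, title="Example article"))
```

The resulting HTML can be saved to a file and imported like any other web page. Pictures and layout are lost, which is why this only suits text-heavy material.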

Conversion to HTML is the most convenient and least expensive option. However, some converters and/or some PDFs produce HTML that is quite different from the original, and/or difficult to process in SuperMemo (e.g. requiring extra filtering or extra manual formatting). Page snapshots are a fast way to read and import pages that are difficult to convert or are read-only (e.g. manuals that require a specific page layout). The copy-and-paste approach is best for articles that can easily be selected in their entirety and that do not contain too many pictures. Finally, the incremental approach is the most natural for SuperMemo. However, instead of using read-points, the student needs to make a note of where he or she stopped reading the text.
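
Since the incremental approach depends on noting where reading stopped, here is a minimal Python sketch of one way to keep such notes outside SuperMemo. It stores the last page read for each PDF in a small JSON file and returns a line of text that could be pasted into the element. The function names, the `read_points.json` file name, and the note wording are all hypothetical and chosen only for this example.

```python
import json
from pathlib import Path


def load_read_points(store):
    """Return the saved {pdf name: last page} mapping, or an empty one."""
    path = Path(store)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_read_point(store, pdf_name, page):
    """Record the page where reading stopped and return a note to paste."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    points = load_read_points(store)
    points[pdf_name] = page  # overwrite any earlier read-point for this PDF
    Path(store).write_text(json.dumps(points, indent=2), encoding="utf-8")
    return f"Stopped reading {pdf_name} at page {page}"


if __name__ == "__main__":
    # Hypothetical file names, used only to show the calls.
    print(save_read_point("read_points.json", "example.pdf", 42))
    print(load_read_points("read_points.json"))
```

A plain note typed into the element does the same job; the sketch only shows that the bookkeeping is small enough to automate.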

There is a Reddit thread: Incremental reading of PDF files

BuboFlash

BuboFlash is also worth a mention here. It supports incremental reading of PDFs, as well as export to SuperMemo. One major downside for some users is that it works only as a Chrome extension.

Implementing a PDF reader in Delphi appears trivial

According to this Stack Overflow answer, embedding an Acrobat PDF reader in a Delphi app seems to be fairly trivial. It essentially takes two steps: (1) add the Adobe component to the Delphi project, (2) write a three-line function.

Here is an example screen: https://i.imgur.com/kiwwzpI.png

Since there is currently no way to "incrementally read" a PDF natively, there is no need to replicate pure incremental reading with PDFs either. Simply having the PDF available in a component beside the HTML component would be a huge improvement.

So much of our available knowledge is now in PDF form that it seems inconceivable that an application designed to help us absorb as much knowledge as rapidly as possible cannot support one of the dominant formats. This format is used for ebooks, textbooks, journal articles, magazines, etc. So much content is "locked away" and cannot be used in SM, when the solution appears to be very simple.

Is it at all possible to modify the next version of SuperMemo to accommodate this critical highly-demanded capability?

PDF in SuperMemo

Comment

You can keep PDFs as binary components. The downside is the need to switch between the PDF and SuperMemo. The benefit is that you can use HTML full screen and paste whole pages while reading without focus problems (e.g. focus is kept in HTML all the time).

Read-points and Kami

Alternatively, to read a PDF from SuperMemo and have the last read point remembered, use the Kami extension in Chrome. Whenever your topic with the PDF shows up, you enter the component with the PDF and it opens at the last read point. You don't have to write down your last page manually. See: https://www.youtube.com/watch?v=nF_1TiXUvFI

PDF at SuperMemoPedia