Problems with umlauts in XML exchange

From SuperMemopedia

Question:

I've installed a trial version of SM CE 3.50.

I'm trying to use it with bases I have on my desktop in SM 2004, but I have a problem. When I export items with Polish or German characters, they get converted to UTF-8, and the CE version doesn't recognize that encoding. In the end, the German o umlaut (ö) gets displayed as ¶, etc.

When I export the XML on my iPaq and then try to import it back to SM 2004, the characters are wrong again. This time the o umlaut shows up as a pair of characters (something like Ã¶).


Answer:

SuperMemo CE requires all characters to be represented as Unicode. Your umlaut (ö) should be encoded as a numeric character reference (&#246;) in the XML file. If you use HTML in the original collection, its original representation will be preserved. If your collection uses ANSI/OEM fonts, its texts have to be converted at XML export by choosing the option Convert OEM characters set to Unicode. Your collection should not use UTF-8 for export to SuperMemo CE. If some of your texts happen to be UTF-8 encoded, you will need to decode them first with Text : Convert : Decode UTF-8 on the component menu.
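To illustrate what such a representation looks like, here is a minimal Python sketch. It assumes the Unicode form expected here is an XML numeric character reference (&#246; for ö). It is not part of SuperMemo, and the helper name to_char_refs is made up for this example.

```python
def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with an XML numeric character reference."""
    # The "xmlcharrefreplace" error handler emits &#NNN; (decimal code point)
    # for each character that plain ASCII cannot represent.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_char_refs("schön"))  # sch&#246;n
print(to_char_refs("łódź"))   # &#322;&#243;d&#378;
```

The result is pure ASCII, so it reads the same regardless of which code page the importing program assumes.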

Once your export contains UTF-8 encodings, it will not be recognized by SuperMemo CE. Moreover, it will then be converted to (pairs of) characters corresponding to the UTF-8 codes and displayed as such in SuperMemo 2004. In other words, it will not only be unreadable but also no longer decodable with Decode UTF-8.
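The "pairs of characters" effect can be reproduced outside SuperMemo with a short Python sketch. It assumes the importing side reads the bytes with a single-byte Windows code page (cp1252 here, which is only a guess; a Polish system might use cp1250). The helper name misread_utf8 is hypothetical.

```python
def misread_utf8(text: str, codepage: str = "cp1252") -> str:
    """Show how UTF-8 bytes look when read as a single-byte code page."""
    # Each non-ASCII character becomes two or more UTF-8 bytes; a single-byte
    # code page then turns every one of those bytes into its own character.
    return text.encode("utf-8").decode(codepage)


garbled = misread_utf8("ö")
print(len(garbled))                    # 2
print([hex(ord(c)) for c in garbled])  # ['0xc3', '0xb6']
```

The second code, 0xB6, is the pilcrow (¶), which may be why that symbol appears in place of the umlaut when the first byte is dropped or not displayed.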

For more details see: http://supermemo.com/help/fonts.htm

If you are not sure what encoding is used in a given collection, you can ask the collection's author or write to SuperMemo Library. For collections released with Multimedia SuperMemo or on CD-ROM/DVD (including older versions of SuperMemo), use the address for Multimedia SuperMemo (for addresses see: http://www.supermemo.com/english/contact.htm).