http://www.w3.org/TR/REC-html40/sgml/entities.html lists the HTML 4 entities.
drupal_html_to_text() uses decode_entities() ("Decode all HTML entities"), which in turn uses the PHP get_html_translation_table() function, which supports only part of the HTML 4 entities (see http://ch2.php.net/manual/en/function.html-entity-decode.php#78665).
There's a conscious effort in decode_entities() to add &apos; ('), but others like &euro; (€) and &mdash; (—) remain undefined and are thus not translated into the proper characters.
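To make the gap concrete, here is a minimal sketch of table-driven decoding that covers every HTML 4 named entity plus &apos;. It is written in Python purely as an illustration and is not Drupal's decode_entities() or any proposed patch. The names ENTITY_TABLE and decode_entities_sketch are hypothetical, and the table is assumed to come from Python's html.entities.name2codepoint, which holds the HTML 4 set.

```python
import re
from html.entities import name2codepoint

# Assumption: html.entities.name2codepoint supplies the HTML 4 named
# entities (euro, mdash, ...). &apos; is not part of HTML 4, so it is
# added by hand, mirroring the special case described above.
ENTITY_TABLE = {name: chr(cp) for name, cp in name2codepoint.items()}
ENTITY_TABLE.setdefault("apos", "'")

# Matches &name;, &#123; and &#x1F; style references.
_ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def decode_entities_sketch(text):
    """Hypothetical helper: decode entity references in a single pass."""

    def replace(match):
        body = match.group(1)
        if body[0] == "#":
            try:
                if body[1] in "xX":
                    return chr(int(body[2:], 16))
                return chr(int(body[1:]))
            except (ValueError, OverflowError):
                # Code point out of range: leave the reference untouched.
                return match.group(0)
        # Unknown names are left as-is rather than dropped.
        return ENTITY_TABLE.get(body, match.group(0))

    return _ENTITY_RE.sub(replace, text)
```

Because the substitution runs in one pass, a string such as `&amp;lt;` becomes `&lt;` and is not decoded a second time. A PHP implementation would need an equivalent complete table rather than relying only on what get_html_translation_table() returns.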
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch decode-entities-support-all-entities.212130.32.patch.
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 212130-D6-decode-entities-support-all-entities.patch.
|#20||test-memory-inclusion.patch||8.06 KB||Damien Tournoud|
Invalid PHP syntax in test1.inc.
|#18||212130-decode-entities-support-all-entities.patch||9.98 KB||Damien Tournoud|
Unable to apply patch 212130-decode-entities-support-all-entities.patch.
|#17||212130.patch||0 bytes||Damien Tournoud|
Failed on MySQL 5.0 ISAM, with: 152 pass(es), 6 fail(s), and 0 exception(es).