If the right column looks the same as the left column, you're losing the eighth bit somewhere. If the characters in the right column don't match their descriptions, then your browser is translating incorrectly between ISO 8859-1 Latin 1 and your platform's native character set.
Finally, note that positions 127-159 are not displayable characters in ISO 8859-1 Latin 1, and are not part of any HTML standard, so HTML code such as "&#153;" is incorrect and will be displayed differently in browsers on different platforms (probably often in ways that you did not intend). See the next chart below (Unicode) for the (future) correct way of displaying the characters which are in positions 130-159 in Microsoft Windows -- including such typographical niceties as "curly" quotes, dashes, ellipses, and the trademark symbol.
The following chart only tests the ISO 8859-1 compliance of your browser's non-proportional font.
Dec  Char  Dec  Char  Description
---  ----  ---  ----  -----------
 32        160        Non-breaking space
 33  !     161  ¡     Inverted exclamation
 34  "     162  ¢     Cent sign
 35  #     163  £     Pound sterling
 36  $     164  ¤     General currency sign
 37  %     165  ¥     Yen sign
 38  &     166  ¦     Broken vertical bar
 39  '     167  §     Section sign
 40  (     168  ¨     Umlaut (dieresis)
 41  )     169  ©     Copyright
 42  *     170  ª     Feminine ordinal
 43  +     171  «     Left angle quote, guillemotleft
 44  ,     172  ¬     Not sign
 45  -     173        Soft hyphen
 46  .     174  ®     Registered trademark
 47  /     175  ¯     Macron accent
 48  0     176  °     Degree sign
 49  1     177  ±     Plus or minus
 50  2     178  ²     Superscript two
 51  3     179  ³     Superscript three
 52  4     180  ´     Acute accent
 53  5     181  µ     Micro sign
 54  6     182  ¶     Paragraph sign
 55  7     183  ·     Middle dot
 56  8     184  ¸     Cedilla
 57  9     185  ¹     Superscript one
 58  :     186  º     Masculine ordinal
 59  ;     187  »     Right angle quote, guillemotright
 60  <     188  ¼     Fraction one-fourth
 61  =     189  ½     Fraction one-half
 62  >     190  ¾     Fraction three-fourths
 63  ?     191  ¿     Inverted question mark
 64  @     192  À     Capital A, grave accent ("&Agrave;")
 65  A     193  Á     Capital A, acute accent ("&Aacute;")
 66  B     194  Â     Capital A, circumflex accent ("&Acirc;")
 67  C     195  Ã     Capital A, tilde ("&Atilde;")
 68  D     196  Ä     Capital A, dieresis or umlaut mark ("&Auml;")
 69  E     197  Å     Capital A, ring ("&Aring;")
 70  F     198  Æ     Capital AE diphthong (ligature) ("&AElig;")
 71  G     199  Ç     Capital C, cedilla ("&Ccedil;")
 72  H     200  È     Capital E, grave accent ("&Egrave;")
 73  I     201  É     Capital E, acute accent ("&Eacute;")
 74  J     202  Ê     Capital E, circumflex accent ("&Ecirc;")
 75  K     203  Ë     Capital E, dieresis or umlaut mark ("&Euml;")
 76  L     204  Ì     Capital I, grave accent ("&Igrave;")
 77  M     205  Í     Capital I, acute accent ("&Iacute;")
 78  N     206  Î     Capital I, circumflex accent ("&Icirc;")
 79  O     207  Ï     Capital I, dieresis or umlaut mark ("&Iuml;")
 80  P     208  Ð     Capital Eth, Icelandic ("&ETH;")
 81  Q     209  Ñ     Capital N, tilde ("&Ntilde;")
 82  R     210  Ò     Capital O, grave accent ("&Ograve;")
 83  S     211  Ó     Capital O, acute accent ("&Oacute;")
 84  T     212  Ô     Capital O, circumflex accent ("&Ocirc;")
 85  U     213  Õ     Capital O, tilde ("&Otilde;")
 86  V     214  Ö     Capital O, dieresis or umlaut mark ("&Ouml;")
 87  W     215  ×     Multiply sign
 88  X     216  Ø     Capital O, slash ("&Oslash;")
 89  Y     217  Ù     Capital U, grave accent ("&Ugrave;")
 90  Z     218  Ú     Capital U, acute accent ("&Uacute;")
 91  [     219  Û     Capital U, circumflex accent ("&Ucirc;")
 92  \     220  Ü     Capital U, dieresis or umlaut mark ("&Uuml;")
 93  ]     221  Ý     Capital Y, acute accent ("&Yacute;")
 94  ^     222  Þ     Capital THORN, Icelandic ("&THORN;")
 95  _     223  ß     Small sharp s, German (sz ligature) ("&szlig;")
 96  `     224  à     Small a, grave accent ("&agrave;")
 97  a     225  á     Small a, acute accent ("&aacute;")
 98  b     226  â     Small a, circumflex accent ("&acirc;")
 99  c     227  ã     Small a, tilde ("&atilde;")
100  d     228  ä     Small a, dieresis or umlaut mark ("&auml;")
101  e     229  å     Small a, ring ("&aring;")
102  f     230  æ     Small ae diphthong (ligature) ("&aelig;")
103  g     231  ç     Small c, cedilla ("&ccedil;")
104  h     232  è     Small e, grave accent ("&egrave;")
105  i     233  é     Small e, acute accent ("&eacute;")
106  j     234  ê     Small e, circumflex accent ("&ecirc;")
107  k     235  ë     Small e, dieresis or umlaut mark ("&euml;")
108  l     236  ì     Small i, grave accent ("&igrave;")
109  m     237  í     Small i, acute accent ("&iacute;")
110  n     238  î     Small i, circumflex accent ("&icirc;")
111  o     239  ï     Small i, dieresis or umlaut mark ("&iuml;")
112  p     240  ð     Small eth, Icelandic ("&eth;")
113  q     241  ñ     Small n, tilde ("&ntilde;")
114  r     242  ò     Small o, grave accent ("&ograve;")
115  s     243  ó     Small o, acute accent ("&oacute;")
116  t     244  ô     Small o, circumflex accent ("&ocirc;")
117  u     245  õ     Small o, tilde ("&otilde;")
118  v     246  ö     Small o, dieresis or umlaut mark ("&ouml;")
119  w     247  ÷     Division sign
120  x     248  ø     Small o, slash ("&oslash;")
121  y     249  ù     Small u, grave accent ("&ugrave;")
122  z     250  ú     Small u, acute accent ("&uacute;")
123  {     251  û     Small u, circumflex accent ("&ucirc;")
124  |     252  ü     Small u, dieresis or umlaut mark ("&uuml;")
125  }     253  ý     Small y, acute accent ("&yacute;")
126  ~     254  þ     Small thorn, Icelandic ("&thorn;")
           255  ÿ     Small y, dieresis or umlaut mark ("&yuml;")
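Each code in the right half of the chart is exactly 128 higher than its partner on the left, which is why a channel that drops the eighth bit makes both columns look alike. The following is a minimal Python sketch of that effect, not a description of any particular browser or mail gateway; the function name `strip_eighth_bit` is made up for illustration.

```python
def strip_eighth_bit(text: str) -> str:
    """Simulate a 7-bit channel (hypothetical helper).

    The text is encoded as ISO 8859-1, the high bit of every byte is
    cleared, and the result is read back as plain ASCII.
    """
    raw = text.encode("latin-1")
    return bytes(b & 0x7F for b in raw).decode("ascii")


# 161, 162, 163 lose 128 and become 33, 34, 35.
print(strip_eighth_bit("¡¢£"))  # → !"#
```

If your browser shows "!" where the chart expects the inverted exclamation mark, something between the server and your screen is behaving like this function.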
Some commonly desired characters, such as the trademark symbol and typographical niceties like "curly" quotes, dashes, and ellipses, are not part of the ISO 8859-1 character set, and so cannot be displayed properly in HTML 2.0. If you put a raw 8-bit character in your file and intend it to be understood with a non-ISO 8859-1 meaning, or put a numeric entity reference between 128 and 159 there (such as "&#153;"), then this is incorrect HTML. It will not display as you intended on browsers on other platforms, and maybe not even on other browsers on the same platform -- even when it "looks right" in your own browser.
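One way to check a page for this mistake is to scan it for decimal numeric references in the 128-159 range. The sketch below is only an illustration: the helper name `find_c1_references` and the sample string are invented, and it deliberately ignores hexadecimal references and raw 8-bit bytes.

```python
import re

# Matches decimal numeric character references such as &#153;
NUMERIC_REF = re.compile(r"&#(\d+);")


def find_c1_references(html: str) -> list[int]:
    """Return the code numbers of references that fall in 128-159."""
    codes = (int(m.group(1)) for m in NUMERIC_REF.finditer(html))
    return [code for code in codes if 128 <= code <= 159]


sample = "Acme&#153; &#169; 1997 &#8482;"
print(find_c1_references(sample))  # → [153]
```

Here &#169; (copyright, a real ISO 8859-1 position) and &#8482; (a Unicode position above 255) pass, while &#153; is flagged.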
One correct way to specify such characters in more recent versions of HTML (starting with the "Cougar" proposal -- now superseded by the proposed HTML 4.0 standard -- and/or "internationalized HTML" as specified in RFC 2070) is to use numeric entities greater than 255, which refer to positions in the Unicode character set, as outlined in the Usenet posting below. Unfortunately, these are only beginning to be implemented in some newer browser versions at this moment, but will become more widely implemented in the future. (You can see whether your own browser understands these entities by looking at the third column of the table below.)
(See also http://www.w3.org/pub/WWW/TR/WD-entities (from the "Cougar" draft) or http://www.w3.org/TR/WD-html40-970708/sgml/HTMLmisc.ent (HTML 4.0) for relevant entity lists in the proposed HTML standards.)
[Question: valid HTML or no?]
The characters 128-159 are not used in ISO 8859-1 and Unicode, the character sets of HTML. MS-Windows uses a superset of ANSI/ISO 8859-1, known to experts as "Code Page 1252 (CP1252)", a Microsoft-specific character set with additional characters in the 128-159 range (also known as the "C1" range).
All the CP1252 characters are also available in Unicode. For example, the CP1252 character 146 that you mentioned (RIGHT SINGLE QUOTATION MARK) has the Unicode number 8217; therefore you should use this number in order to conform to the HTML standard. Modern HTML browsers like Netscape 4.0 understand Unicode, and will automatically convert the Unicode character &#8217; back into the character 146 on MS-Windows machines, and into the appropriate character on other systems.
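The mapping from a CP1252 position to its Unicode number can be looked up with Python's built-in "cp1252" codec, and the standard "xmlcharrefreplace" error handler writes the result as a numeric reference. This is a small sketch under that assumption; the function name `cp1252_byte_to_reference` is hypothetical.

```python
def cp1252_byte_to_reference(code: int) -> str:
    """Decode one CP1252 byte and express it as ASCII-safe HTML.

    Plain ASCII characters come back unchanged; anything else becomes
    a decimal numeric reference to its Unicode position.
    """
    ch = bytes([code]).decode("cp1252")
    return ch.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(cp1252_byte_to_reference(146))  # → &#8217;
```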
The official CP1252<->Unicode conversion table is printed in the Unicode 2.0 standard for instance, and is available on
The CP1252 characters that are not part of ANSI/ISO 8859-1, and that should therefore always be encoded as Unicode characters greater than 255, are the following:
Windows    Unicode     Char.
char.      HTML code   test   Description of Character
--------   ---------   ----   ------------------------
ALT-0130   &#8218;     ‚      Single Low-9 Quotation Mark
ALT-0131   &#402;      ƒ      Latin Small Letter F With Hook
ALT-0132   &#8222;     „      Double Low-9 Quotation Mark
ALT-0133   &#8230;     …      Horizontal Ellipsis
ALT-0134   &#8224;     †      Dagger
ALT-0135   &#8225;     ‡      Double Dagger
ALT-0136   &#710;      ˆ      Modifier Letter Circumflex Accent
ALT-0137   &#8240;     ‰      Per Mille Sign
ALT-0138   &#352;      Š      Latin Capital Letter S With Caron
ALT-0139   &#8249;     ‹      Single Left-Pointing Angle Quotation Mark
ALT-0140   &#338;      Œ      Latin Capital Ligature OE
ALT-0145   &#8216;     ‘      Left Single Quotation Mark
ALT-0146   &#8217;     ’      Right Single Quotation Mark
ALT-0147   &#8220;     “      Left Double Quotation Mark
ALT-0148   &#8221;     ”      Right Double Quotation Mark
ALT-0149   &#8226;     •      Bullet
ALT-0150   &#8211;     –      En Dash
ALT-0151   &#8212;     —      Em Dash
ALT-0152   &#732;      ˜      Small Tilde
ALT-0153   &#8482;     ™      Trade Mark Sign
ALT-0154   &#353;      š      Latin Small Letter S With Caron
ALT-0155   &#8250;     ›      Single Right-Pointing Angle Quotation Mark
ALT-0156   &#339;      œ      Latin Small Ligature OE
ALT-0159   &#376;      Ÿ      Latin Capital Letter Y With Diaeresis
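An existing page that already uses the Windows-only numbers can be repaired mechanically by rewriting each reference in the 128-159 range to its Unicode equivalent. The sketch below assumes Python's built-in "cp1252" codec as the conversion table; that codec follows a later revision of the code page and also maps positions 128, 142, and 158, which the list above does not include. The function name `fix_c1_references` is made up for this example.

```python
import re


def fix_c1_references(html: str) -> str:
    """Rewrite &#128; to &#159; references as Unicode references."""

    def replace(match: re.Match) -> str:
        code = int(match.group(1))
        if not 128 <= code <= 159:
            return match.group(0)  # already a legitimate position
        try:
            ch = bytes([code]).decode("cp1252")
        except UnicodeDecodeError:
            return match.group(0)  # position unassigned in CP1252
        return f"&#{ord(ch)};"

    return re.sub(r"&#(\d+);", replace, html)


print(fix_c1_references("&#147;quoted&#148; &#150; Acme&#153;"))
# → &#8220;quoted&#8221; &#8211; Acme&#8482;
```

References to unassigned positions such as &#129; are left untouched so that they can be reviewed by hand.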