Oh, it wouldn’t be an issue if the rules weren’t constantly changing, would
it? How many of you were taught to place a pair of spaces after a period?
Indent a paragraph? How about quotes? Do you remember any specific rules about
quotes? Anyone ever tell you about a “Smart Quote” as you blasted out the
Great American Novel on your Remington portable? I’ll make it easier. How
many of you ever used a typewriter? Do you recall that, for the longest time,
there was no such thing as (or need for) a scalable font? The difference between
an Em-Dash and an En-Dash was relegated to the silent musings of the typesetter,
cursed to a life of reading little metal characters in reverse as he set up
the page, then working out his frustrations by squeezing ink onto paper
with the sheer bulk of his output.
But now we’re in a world of broader rebellion. Everyone has become an author.
Everyone has turned into a typography specialist and all the things you used
to do with print are, well, wrong. That’s okay. Our world changes all the
time and you’ve gotta keep up. Still, those poor folks trying to apply the
new rules to your work often have a wonderful education in exposing the life
in your writing. Too bad they often have no idea how to make your work viewable,
properly, across platforms.
For instance, were you aware that Microsoft Word’s SmartQuotes really aren’t?
Did you know that the byte values Microsoft uses for these characters are not
part of ASCII or of the ISO 8859-1 (Latin-1) set? That if you
produce a Word document, save it as HTML and publish it, the page may render
properly only for Microsoft users?
Recently, a request came in to Dian asking what the ASCII character codes
for left and right SmartQuotes are. The answer, of course, is that there is no
real ASCII representation of these characters. The ONLY way to represent them
in a single byte is to leave the base ASCII code behind and move into the High
ASCII range, which is not the same from one platform to the next. Microsoft Word
itself stores them as the Unicode characters U+201C and U+201D, so the best answer
we could provide was to use characters 147 and 148 from the Windows High ASCII range.
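The numbers 147 and 148 can be checked directly. The following is a minimal Python sketch, assuming the Windows default code page in question is Windows-1252 (Python's `cp1252` codec); the variable names are only for illustration.

```python
# Curly double quotes as Unicode code points (U+201C and U+201D).
left, right = "\u201c", "\u201d"

# In the Windows-1252 code page they are stored as the single bytes 147 and 148.
cp1252_bytes = (left + right).encode("cp1252")
byte_values = list(cp1252_bytes)

# Plain ASCII has no slot for them at all.
try:
    left.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Read as ISO 8859-1 (Latin-1), the same two bytes are invisible control codes.
as_latin1 = cp1252_bytes.decode("latin-1")

print(byte_values)                        # [147, 148]
print(ascii_ok)                           # False
print([hex(ord(c)) for c in as_latin1])   # ['0x93', '0x94']
```

That last line is the whole problem in miniature: a reader that treats the page as Latin-1 gets two control codes where the author typed quotation marks.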
Sheesh!! That’s nuts!! There has to be some order here someplace, some way
to get the stylist satisfied while not completely mucking up your work as
it’s read on a Linux workstation. If you’ve been working with web pages at
all, you’re probably familiar with the concept of the Color Safety Palette
(the 216 color codes which will render the exact same color across all platforms).
Think of the rest of this article as the Type Safety Palette and you’ll probably
find it to be an easier meal to digest.
Here are the universally acceptable ASCII codes that will be properly represented
across platforms. (The following two tables are taken directly from Microsoft’s
own VBA help files.)
Character Set (0 – 127)
| Code | Char | Code | Char | Code | Char | Code | Char |
|---|---|---|---|---|---|---|---|
| 0 | | 32 | [space] | 64 | @ | 96 | ` |
| 1 | | 33 | ! | 65 | A | 97 | a |
| 2 | | 34 | " | 66 | B | 98 | b |
| 3 | | 35 | # | 67 | C | 99 | c |
| 4 | | 36 | $ | 68 | D | 100 | d |
| 5 | | 37 | % | 69 | E | 101 | e |
| 6 | | 38 | & | 70 | F | 102 | f |
| 7 | | 39 | ' | 71 | G | 103 | g |
| 8 | * * | 40 | ( | 72 | H | 104 | h |
| 9 | * * | 41 | ) | 73 | I | 105 | i |
| 10 | * * | 42 | * | 74 | J | 106 | j |
| 11 | | 43 | + | 75 | K | 107 | k |
| 12 | | 44 | , | 76 | L | 108 | l |
| 13 | * * | 45 | - | 77 | M | 109 | m |
| 14 | | 46 | . | 78 | N | 110 | n |
| 15 | | 47 | / | 79 | O | 111 | o |
| 16 | | 48 | 0 | 80 | P | 112 | p |
| 17 | | 49 | 1 | 81 | Q | 113 | q |
| 18 | | 50 | 2 | 82 | R | 114 | r |
| 19 | | 51 | 3 | 83 | S | 115 | s |
| 20 | | 52 | 4 | 84 | T | 116 | t |
| 21 | | 53 | 5 | 85 | U | 117 | u |
| 22 | | 54 | 6 | 86 | V | 118 | v |
| 23 | | 55 | 7 | 87 | W | 119 | w |
| 24 | | 56 | 8 | 88 | X | 120 | x |
| 25 | | 57 | 9 | 89 | Y | 121 | y |
| 26 | | 58 | : | 90 | Z | 122 | z |
| 27 | | 59 | ; | 91 | [ | 123 | { |
| 28 | | 60 | < | 92 | \ | 124 | \| |
| 29 | | 61 | = | 93 | ] | 125 | } |
| 30 | | 62 | > | 94 | ^ | 126 | ~ |
| 31 | | 63 | ? | 95 | _ | 127 | |

* * Values 8, 9, 10, and 13 convert to backspace, tab, linefeed,
and carriage return characters, respectively. They have no graphical representation
but, depending on the application, can affect the visual display of text.
The following High ASCII values will NOT be universally represented. They
are represented as values 128 through 255:
Character Set (128 – 255)
| Code | Char | Code | Char | Code | Char | Code | Char |
|---|---|---|---|---|---|---|---|
| 128 | | 160 | [space] | 192 | À | 224 | à |
| 129 | | 161 | ¡ | 193 | Á | 225 | á |
| 130 | | 162 | ¢ | 194 | Â | 226 | â |
| 131 | | 163 | £ | 195 | Ã | 227 | ã |
| 132 | | 164 | ¤ | 196 | Ä | 228 | ä |
| 133 | | 165 | ¥ | 197 | Å | 229 | å |
| 134 | | 166 | ¦ | 198 | Æ | 230 | æ |
| 135 | | 167 | § | 199 | Ç | 231 | ç |
| 136 | | 168 | ¨ | 200 | È | 232 | è |
| 137 | | 169 | © | 201 | É | 233 | é |
| 138 | | 170 | ª | 202 | Ê | 234 | ê |
| 139 | | 171 | « | 203 | Ë | 235 | ë |
| 140 | | 172 | ¬ | 204 | Ì | 236 | ì |
| 141 | | 173 | | 205 | Í | 237 | í |
| 142 | | 174 | ® | 206 | Î | 238 | î |
| 143 | | 175 | ¯ | 207 | Ï | 239 | ï |
| 144 | | 176 | ° | 208 | Ð | 240 | ð |
| 145 | | 177 | ± | 209 | Ñ | 241 | ñ |
| 146 | | 178 | ² | 210 | Ò | 242 | ò |
| 147 | | 179 | ³ | 211 | Ó | 243 | ó |
| 148 | | 180 | ´ | 212 | Ô | 244 | ô |
| 149 | | 181 | µ | 213 | Õ | 245 | õ |
| 150 | | 182 | ¶ | 214 | Ö | 246 | ö |
| 151 | | 183 | · | 215 | × | 247 | ÷ |
| 152 | | 184 | ¸ | 216 | Ø | 248 | ø |
| 153 | | 185 | ¹ | 217 | Ù | 249 | ù |
| 154 | | 186 | º | 218 | Ú | 250 | ú |
| 155 | | 187 | » | 219 | Û | 251 | û |
| 156 | | 188 | ¼ | 220 | Ü | 252 | ü |
| 157 | | 189 | ½ | 221 | Ý | 253 | ý |
| 158 | | 190 | ¾ | 222 | Þ | 254 | þ |
| 159 | | 191 | ¿ | 223 | ß | 255 | ÿ |

The values in the table are the Windows default. However, values
in the ANSI character set above 127 are determined by the code page specific
to your operating system.
Note that characters 147 and 148 in the table above are listed as
unsupported in Windows. Sadly, these ARE the HIGH ASCII values for SmartQuotes.
Word’s handling of these characters is mystifying, as the byte values it writes
do not match the Unicode code points for the same characters (U+201C and U+201D),
at least as far as I have been able to identify.
Also notice the little note at the end of the table. What’s this
Code Page stuff all about? Well, it’s fairly simple. The code page is the
interpreter between an 8-bit character value and the operating system’s rendering
of that value. Each character in your text reaches the system as an 8-bit value.
That 8-bit value is interpreted by the OS depending on whether it falls
in the lower 128 (high bit clear) or the upper 128 (high bit set). The lower
half is plain ASCII and means the same thing nearly everywhere; the upper half
means whatever the active code page says it means. In
fact, that table may be soooo painfully incompatible that my editor for this
article may just banish me to the couch for a few days.
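To make the code page idea concrete, here is a small Python sketch that decodes one and the same byte under three code pages. The choice of byte 233 and of the three codecs (`cp1252`, `cp437`, `mac_roman`) is mine, purely for illustration.

```python
# One byte from the upper 128, decoded under three different code pages.
raw = bytes([233])

windows = raw.decode("cp1252")      # Windows Western European
dos = raw.decode("cp437")           # original IBM PC / DOS
mac = raw.decode("mac_roman")       # classic Mac OS

# A byte from the lower 128 decodes the same way under all three.
low = bytes([65])
low_results = {low.decode(cp) for cp in ("cp1252", "cp437", "mac_roman")}

# The high bit is what separates the two halves of the table.
is_upper_half = bool(raw[0] & 0x80)

print(windows, dos, mac)    # three different characters
print(low_results)          # {'A'}
print(is_upper_half)        # True
```

Same byte, three different letters. Nothing in the byte itself says which one the author meant.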
Holy CAPS LOCK, Batman!
Next time you’re sitting in front of your machine to compose your
next Op-Ed piece for the NYTimes ezine (that would be the North Yeoman Times
Online), remember that you’ll really look much less the idiot if you confine
yourself, where possible, to the standard Low ASCII set (characters 0-127). Whatever
you create using these characters will render properly in any browser on any
operating system and on any processor made in the last 30 years. Your characters
will also be HTML compliant right out of the box!
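If you would rather not eyeball a draft for stray characters, a few lines of Python will do it for you. This is only a sketch; the function name `non_ascii_chars` and the sample sentence are made up for the example.

```python
def non_ascii_chars(text):
    """Return (position, character, code point) for each character above 127."""
    return [(i, ch, ord(ch)) for i, ch in enumerate(text) if ord(ch) > 127]


draft = "She said \u201chello\u201d and left."
offenders = non_ascii_chars(draft)

for position, char, code in offenders:
    print(position, repr(char), code)
# 9 '“' 8220
# 15 '”' 8221
```

An empty list means the text stays inside the 0-127 range.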
Oh!! And do yourself a favor: disable SmartQuotes and other bits of
funkiness in Word if you’d like to appeal to a broader audience. Perhaps at
that point there will be no further need for tools like the Demoroniser script!
(http://www.fourmilab.ch/webtools/demoroniser/)
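For text that has already been through Word, the general idea behind such cleanup tools can be sketched in Python. To be clear, this is not the script linked above and does not reproduce its behavior; the mapping `ASCII_FALLBACKS` and the function `plain_punctuation` are hypothetical, and the list of characters is only a starting point.

```python
# Hypothetical mapping; extend it with whatever your word processor emits.
ASCII_FALLBACKS = str.maketrans({
    "\u2018": "'",     # left single quote
    "\u2019": "'",     # right single quote / apostrophe
    "\u201c": '"',     # left double quote
    "\u201d": '"',     # right double quote
    "\u2013": "-",     # en dash
    "\u2014": "--",    # em dash
    "\u2026": "...",   # ellipsis
})


def plain_punctuation(text):
    """Swap typographic punctuation for plain ASCII stand-ins."""
    return text.translate(ASCII_FALLBACKS)


cleaned = plain_punctuation("\u201cIt\u2019s fine\u2026\u201d")
print(cleaned)   # "It's fine..."
```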
If you really need to represent a special character from that 128-255
range, be sure to use the HTML character entities as listed at http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html
for the ISO 8859-1 (Latin-1) character set.
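Swapping in those entities by hand gets old quickly. Here is one way it might be automated in Python, using the entity table that ships in the standard library's `html.entities` module; the function name `to_html_ascii` is an assumption of mine, and the sketch deliberately leaves `&`, `<` and `>` alone.

```python
from html.entities import codepoint2name


def to_html_ascii(text):
    """Replace every character above 127 with an HTML character reference."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)                                # plain ASCII passes through
        elif code in codepoint2name:
            out.append("&" + codepoint2name[code] + ";")  # named entity
        else:
            out.append("&#%d;" % code)                    # numeric fallback
    return "".join(out)


print(to_html_ascii("caf\u00e9 \u00a9"))   # caf&eacute; &copy;
print(to_html_ascii("\u201chi\u201d"))     # &ldquo;hi&rdquo;
```

If numeric references alone are good enough, Python's built-in `text.encode("ascii", "xmlcharrefreplace")` does the same job in one call.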
Hey, Big Mouth, How About Some Simple Guidance?
Sure!! When producing documentation in which you don’t know what fonts are
available or even what system will be used to view the information, use only
characters in the ASCII 0-127 range. If you need special characters (and you
will, if you’re doing any scientific publishing), choose carefully from the characters
in the 128-255 range. If you’re producing HTML, use the named character entities
or the numeric character references for that range. Don’t use Microsoft Office Autoformatting
features. If your audience is multinational, use a Unicode (multi-byte) encoding
such as UTF-8, with fonts appropriate to the viewer’s region (assuming you also
know the language).
That should keep you out of trouble.