Are You Suffering from Charset=UTF-8 Blindness?
There are many roads that can lead to becoming a web designer. Some start off as programmers and then see the need to "design" graphics for the sites they are building. Others, like me, start out learning about color, kerning, and strong visual layouts before learning how to "code." Whichever direction you follow, you're bound to learn something new along the way, often something that seems like common knowledge to other designers or programmers but that you simply overlooked. Here is one example of mine.
My Character Set UTF-8 Blindness
I must have built hundreds of webpages with the following meta tag inserted in the head of the page ...
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
...
</head>
I knew I needed it there but I never wondered why it needed to be there.
I just went along building pages with charset blindness. I would often copy international text from Microsoft Word documents, paste it into my HTML page, refresh the browser, and see missing-character symbols (?) everywhere. Well, that is no good, so I would search Google for a table of HTML special characters, find the ones I was missing, and copy and paste them in. Done.
It Turns Out, I Was Doing It Wrong
Someone on Reddit suggested I read an article by Joel Spolsky titled "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)". I would go a step further and say everyone using the internet should read this, or at least understand Unicode.
Joel explains that in the early days of computers there was no consistency between countries and languages. A computer sold in the USA would focus on English characters and have little or no support for languages such as Hebrew or Chinese. Wikipedia defines ASCII as a character-encoding scheme based on the ordering of the English alphabet.
So it is basically an English-centric character encoding. This reminds me of how the US hasn't adopted the metric system or how atlases put the USA in the left corner of the map.
Each language's encoding would keep English at its core and fill the empty character slots with whatever it wished. The inconsistency in what filled those free slots caused a global communication problem.
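To make the "free slots" problem concrete, here is a small Python sketch. The three code pages are my own picks for illustration (the article does not name any), and the byte 0xE9 is just one sample value from above the shared English range. The same byte comes out as a different letter depending on which code page the reading computer assumes.

```python
import unicodedata

# One byte above the 7-bit ASCII range (sample value chosen for illustration).
raw = bytes([0xE9])

# Decode the very same byte with three legacy code pages (assumed examples).
as_western = raw.decode("latin-1")   # Western European
as_hebrew = raw.decode("cp1255")     # Windows Hebrew
as_cyrillic = raw.decode("cp1251")   # Windows Cyrillic

# Print the official character names so the output is plain ASCII.
for label, ch in [("latin-1", as_western), ("cp1255", as_hebrew), ("cp1251", as_cyrillic)]:
    print(label, "->", unicodedata.name(ch))

# The English core is shared, so plain English text survives in every code page.
english = b"Hello"
same_everywhere = (
    english.decode("latin-1") == english.decode("cp1255") == english.decode("cp1251")
)
print("English text identical in all three:", same_everywhere)
```

The English letters agree everywhere, which is why the problem stayed hidden for anyone writing only in English.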
The Internet Set the Stage for Unicode
Before the internet, a computer in China and a computer in the USA could live happy lives. But as soon as they started talking, a clear message quickly turned into gobbledygook.
But of course, as soon as the Internet happened, it became quite commonplace to move strings from one computer to another, and the whole mess came tumbling down. Luckily, Unicode had been invented.
Joel Spolsky
Unicode goes a step further than ASCII by creating an all-encompassing character set for all languages. Each character is given a Unicode reference (a code point) such as U+0041, which is the capital letter A. The "U" stands for Unicode, and the digits are a hexadecimal value, similar to the notation used for colors in CSS.
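Here is a minimal Python sketch of the U+0041 notation. The helper name code_point is my own invention for illustration. It also shows that UTF-8, the charset named in the meta tag, is one way of storing those numbers as bytes.

```python
def code_point(ch):
    """Format one character in U+XXXX notation (hypothetical helper)."""
    return f"U+{ord(ch):04X}"

# The capital letter A is number 65, which is 41 in hexadecimal.
print(code_point("A"))            # U+0041
print(int("0041", 16))            # 65

# A non-English character gets its own number in the same scheme.
alef = "\u05d0"                   # Hebrew letter alef, written as an escape
print(code_point(alef))           # U+05D0

# UTF-8 turns code points into bytes: one byte for A, two for alef.
print("A".encode("utf-8"))        # b'A'
print(alef.encode("utf-8"))       # b'\xd7\x90'
```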
How to Find and Add Unicode to Your Document
The best way I have found to add Unicode characters is to use the character map included with your computer's operating system. Locate the character, then either copy and paste it or drag and drop it. I have noticed I only need to do this when I am copying text from Word documents or webpages encoded in something other than Unicode.
So, if you ever notice a missing-character symbol on your page, first check the encoding declared in your HEAD, and then replace the broken character with a Unicode-encoded character from your OS's character map palette.
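The following Python sketch shows one plausible way those missing characters come about. It rests on an assumption of mine: that the pasted text ended up stored as Windows-1252 bytes while the page declares UTF-8. Your own setup may differ, so treat it as an illustration rather than a diagnosis.

```python
# Curly quotes of the kind Word inserts, written as escapes so the example
# does not depend on how this snippet itself is saved.
text = "\u201cHello\u201d"

# Assumption: the text was stored as Windows-1252 bytes.
legacy_bytes = text.encode("cp1252")
print(legacy_bytes)                      # b'\x93Hello\x94'

# A reader that expects UTF-8 cannot interpret those two quote bytes,
# so each one becomes U+FFFD, the "missing character" symbol.
misread = legacy_bytes.decode("utf-8", errors="replace")
print(ascii(misread))                    # '\ufffdHello\ufffd'

# Stored as UTF-8 and read as UTF-8, the same text survives intact.
round_trip = text.encode("utf-8").decode("utf-8")
print(round_trip == text)                # True
```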
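If you would rather check the declared encoding with a script than by eye, here is a rough Python sketch using the standard library's html.parser. The names CharsetFinder and declared_charset are hypothetical helpers I made up for this example, and it only reads the meta tag. It does not look at HTTP headers or at how the file was actually saved.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Remembers the first charset declared in a <meta> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        # Short form: <meta charset="utf-8">
        if attributes.get("charset"):
            self.charset = attributes["charset"].strip().lower()
            return
        # Long form: <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
        if (attributes.get("http-equiv") or "").lower() == "content-type":
            for part in (attributes.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip().lower()


def declared_charset(html):
    """Return the charset named in the page's meta tag, or None if there is none."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


page = '<head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head>'
print(declared_charset(page))    # utf-8
```

A result of None, or anything other than utf-8, would be the first thing I would fix before hunting for replacement characters.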




