No utf-8 for me

 | Web Design | 3 Comments | 0 TrackBacks

I never figured that the solution would be so easy. Those bothersome question marks within black diamonds that were appearing occasionally on some of my older blog entries were driving me nuts.

All that I had to do was change the following:

<meta http-equiv="Content-Type" content="text/html;
   charset=utf-8" />

into this:

<meta http-equiv="Content-Type" content="text/html;
   charset=iso-8859-1" />

See the difference? Perhaps this is not the most elegant fix, and it is certainly not compatible with every known character set on the planet, but it works.


3 Comments

UTF-8 is a standard way to encode characters outside the ISO-8859-1 repertoire, using sequences of one to four 8-bit bytes. Another way to do this is by using ISO-8859-2 for Eastern Europe, but since each ISO-8859 standard stores every character in a single 8-bit byte, ISO-8859-2 is not fully compatible with ISO-8859-1 and you will lose characters like the copyright symbol.
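To make the point about the copyright symbol concrete, here is a small Python sketch. It relies only on the codecs that ship with Python; the variable names are mine and purely illustrative.

```python
# The copyright sign (U+00A9) is part of ISO-8859-1 but not of ISO-8859-2.
copyright_sign = "\u00a9"

latin1_bytes = copyright_sign.encode("iso-8859-1")  # a single byte, 0xA9
utf8_bytes = copyright_sign.encode("utf-8")         # two bytes, 0xC2 0xA9

try:
    copyright_sign.encode("iso-8859-2")
    in_latin2 = True
except UnicodeEncodeError:
    # ISO-8859-2 has no slot for the copyright sign at all.
    in_latin2 = False

# The same byte value is a different character in each 8-bit set:
# in ISO-8859-2, 0xA9 is a capital S with caron (U+0160).
latin2_char = b"\xa9".decode("iso-8859-2")

print(latin1_bytes, utf8_bytes, in_latin2, latin2_char)
```

So a page that moves from ISO-8859-1 to ISO-8859-2 does not just lose the symbol, it silently shows a different letter in its place.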

The reason some of the entries were not submitted correctly is that some browsers follow their own language setting when submitting the fields inside a form. This results in an ISO-8859-1 submission to a UTF-8 site if no character set is specified in the form itself.
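The mismatch described here can be reproduced in a few lines of Python. This is only a sketch of the mechanism, assuming the browser sent ISO-8859-1 bytes and the page later presented them as UTF-8; the sample word is made up.

```python
# A browser submits the word "café" encoded as ISO-8859-1 ...
submitted = "caf\u00e9".encode("iso-8859-1")  # b'caf\xe9'

# ... and the page later declares those same bytes to be UTF-8.
# The lone 0xE9 byte is not valid UTF-8, so it is shown as U+FFFD,
# the question mark inside a black diamond.
shown = submitted.decode("utf-8", errors="replace")

# Declaring ISO-8859-1 instead makes the same bytes readable again.
fixed = submitted.decode("iso-8859-1")

print(repr(shown), repr(fixed))
```

That is also why switching the meta tag to iso-8859-1 makes the old entries look right: the bytes never changed, only the way they are interpreted.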

Another, and better, way to solve the UTF-8 problem is to make all forms submit in UTF-8, for example by setting accept-charset="utf-8" on the form element.

Thanks for the tip, Art. Now I have figured out an easy way to search for and replace those characters. First I switch over to ISO-8859-1 so that the search field accepts the correct characters. Second, each entry containing such unwanted characters can be scanned and zapped accordingly. Finally, when I am all done, I switch back to the more acceptable UTF-8 and I am back to square one. Piece of cake, really.
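The same clean-up could in principle be scripted rather than done by hand. The following is a minimal sketch, not what I actually ran: `repair` is a hypothetical helper, and it assumes that any entry which is not valid UTF-8 was stored as ISO-8859-1.

```python
def repair(raw: bytes) -> bytes:
    """Return the entry as valid UTF-8 bytes.

    Assumption: an entry that fails to decode as UTF-8 is ISO-8859-1.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave it untouched
    except UnicodeDecodeError:
        # Reinterpret the bytes as ISO-8859-1 and re-encode as UTF-8.
        return raw.decode("iso-8859-1").encode("utf-8")


old_entry = b"caf\xe9"              # stored as ISO-8859-1
new_entry = "caf\u00e9".encode("utf-8")  # stored as UTF-8

print(repair(old_entry), repair(new_entry))
```

An entry that mixes both encodings would fail the UTF-8 check as a whole and have its valid UTF-8 parts mangled, so this only works when each entry uses one encoding throughout.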

Oh yeah, about the copyright symbol. The limitation you mention should not be a problem as long as you use the &copy; thingie instead.

The copy thingie is not part of any character set. It's a workaround introduced in HTML 3, along with some other characters and entities. If you go that way you can use &#number; as well for all the characters which are not in your current character set.
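For what it's worth, the &#number; form can be generated mechanically. Here is a short Python sketch using the standard `xmlcharrefreplace` error handler and `html.unescape`; the sample text is just an illustration.

```python
import html

# Text containing characters that plain ASCII cannot represent.
text = "\u00a9 \u0141\u00f3d\u017a"

# Replace every non-ASCII character with its &#number; reference,
# so the result survives whatever charset the page declares.
ascii_html = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(ascii_html)  # &#169; &#321;&#243;d&#378;

# Named and numeric references resolve to the same character.
print(html.unescape("&copy;") == html.unescape("&#169;"))  # True
```

The number in each reference is the Unicode code point in decimal, which is why it does not depend on the charset of the page.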
