00673: PmWiki changes a character with a [[http://en.wikipedia.org/wiki/Macron|macron]] to a character with a [[http://en.wikipedia.org/wiki/Circumflex|circumflex]] on save

Summary: PmWiki changes a character with a macron to a character with a circumflex on save
Created: 2006-02-21 18:08
Status: Closed - not a bug
Category: Bug
From: Simon
Assigned:
Priority: 4
Version: 2beta26
OS: W2000/Apache2/PHP4
Description

PmWiki changes a character with a macron to a character with a circumflex on save.

See Macron

Problem
A browser or server (not PmWiki) changes a character with a macron to a character with a circumflex on save (see the rightmost column in the table below).
||border=1
||!Character ||!Unicode ||!ISO-8859-1 approximation ||!HTML entity ||!chr ||
||Capital letter A with macron ||U+0100 ||AA ||Ā Ā ||Â ||
||Small letter a with macron ||U+0101 ||aa ||ā ā ||â ||
||Capital letter E with macron ||U+0112 ||EE ||Ē Ē ||Ç ||
||Small letter e with macron ||U+0113 ||ee ||ē ē ||ç ||
||Capital letter I with macron ||U+012A ||II ||Ī Ī ||Î ||
||Small letter i with macron ||U+012B ||ii ||ī ī ||î ||
||Capital letter O with macron ||U+014C ||OO ||Ō Ō ||Ô ||
||Small letter o with macron ||U+014D ||oo ||ō ō ||ô ||
||Capital letter U with macron ||U+016A ||UU ||Ū Ū ||Û ||
||Small letter u with macron ||U+016B ||uu ||ū ū ||û ||
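
A hedged observation that nothing in this report confirms: the characters in the rightmost column of the table above are exactly what appears when each macron letter is stored as a single byte in the Baltic code page Windows-1257 and that byte is then displayed as ISO-8859-1. The Python sketch below checks the ten rows. The choice of `cp1257` is an assumption about what some component might be doing, not a diagnosis, and the helper name `reinterpret` is hypothetical.

```python
# Hypothesis check only. Assumption: some component stores the text as
# Windows-1257 (cp1257) and the page is then displayed as ISO-8859-1.
PAIRS = {
    "Ā": "Â", "ā": "â",
    "Ē": "Ç", "ē": "ç",
    "Ī": "Î", "ī": "î",
    "Ō": "Ô", "ō": "ô",
    "Ū": "Û", "ū": "û",
}

def reinterpret(ch, stored_as="cp1257", shown_as="latin-1"):
    """Encode ch in one 8-bit charset, then decode the same byte in another."""
    return ch.encode(stored_as).decode(shown_as)

for macron, seen in PAIRS.items():
    # Print what the reinterpretation yields next to what the table reports.
    print(macron, "->", reinterpret(macron), "| table shows", seen)
```

If the two columns printed by the loop agree, a charset mismatch of this kind would be one candidate explanation worth ruling in or out.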

References

IE6.02āā
Firefox 1.5.0.1ā
Opera 8.5.2ā
Mozilla 1.8a5ā

ĀĀĀĀ


PmWiki doesn't do any such conversions on its own -- there must be some other component that is performing the conversion. I suspect it's your browser that is actually changing the characters.

In particular, my browser (Firefox 1.5) is automatically converting "Ā" into the numeric character reference "&#256;" prior to sending it to PmWiki.

One thing to note is that macron characters are outside of the 0-255 codepoint range, so unless a site is using UTF-8 a browser might be very confused and simply try to map things onto ISO-8859-1 (Latin-1, PmWiki's default).
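
To illustrate the point about the 0-255 range, here is a small Python sketch (Python is used purely for illustration and is not part of PmWiki). It shows that "Ā" cannot be encoded in ISO-8859-1, that a numeric character reference is one fallback a sender may use, and that UTF-8 represents the character directly. The helper name `encode_for_form` is hypothetical.

```python
def encode_for_form(text, charset="iso-8859-1"):
    """Encode text in the given charset; characters the charset cannot
    represent are replaced by numeric character references (&#NNN;)."""
    return text.encode(charset, errors="xmlcharrefreplace")

# U+0100 is above 255, so a strict Latin-1 encode fails.
try:
    "Ā".encode("iso-8859-1")
except UnicodeEncodeError as err:
    print("not representable in Latin-1:", err.reason)

print(encode_for_form("Ā"))            # b'&#256;'
print(encode_for_form("Ā", "utf-8"))   # b'\xc4\x80'
```

Either outcome keeps the macron recoverable, which is why a plain Latin-1 page can still round-trip the character as an entity while a UTF-8 page needs no fallback at all.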

At any rate, I don't think this is a PmWiki bug but is instead a browser bug. What browser are you using?

Pm

I see. I mainly use IE 6.02, but I also test with Opera 8 and Firefox 1.5, all of which converted as you found above on PmWiki (but not my wikis).
Is it possible it is an (Apache) server setting? E.g. would it help if the content-type was set to a character set?

After more testing, it is related to how the macron character is generated: at home I use the Microsoft Mâori keyboard on XP (see language settings); at work it is a program called Patuhi. Having said all that, I have yet to determine how to stop it happening on my wikis. When I find out I'll tell ya.

thanks to you PM

It's possible that Apache is sending the wrong content-type or character set. I've seen instances where Apache likes to enforce utf-8 on character sets even when the application specifies a different one. Is there a url I could test so that I could see the headers your server is sending back? Perhaps that would provide a clue.
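
As a rough sketch of the header check described above, the Python below reads the charset declared in a `Content-Type` header. The function names `charset_from_content_type` and `fetch_charset` are hypothetical, and this is only one way to look at the headers; any HTTP client that prints response headers would do.

```python
from email.message import Message
from urllib.request import urlopen

def charset_from_content_type(header_value):
    """Return the lower-cased charset parameter of a Content-Type
    header value, or None if no charset is declared."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()

def fetch_charset(url):
    """Fetch url and return (raw Content-Type header, declared charset)."""
    with urlopen(url) as resp:
        raw = resp.headers.get("Content-Type", "")
        return raw, resp.headers.get_content_charset()

print(charset_from_content_type("text/html; charset=ISO-8859-1"))  # iso-8859-1
print(charset_from_content_type("text/html"))                      # None
```

Calling `fetch_charset` on a test page and comparing the result with the charset the wiki is configured for would show whether the server is overriding the application's choice.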

Pm


It is quite confusing. Even using the Microsoft Māori language keyboard (e.g. in this ūpdātē), the character displays correctly in the text area before submission but is wrong afterwards, when displayed.

Yes, please try http://ttc.org.nz/pmwiki/pmwiki.php/Test/Macron. Unfortunately, the site I am having most trouble with is an intranet.