00224: selective utf-8 characters damaged

Summary: selective utf-8 characters damaged
Created: 2004-12-09 06:21
Status: Closed - fixed in 2.0.6
Category: Bug
From: Ken?
Priority: 553
Version: up to current

Description: Certain words are damaged when a page is saved.

I think that all of the utf-8 issues on this page have been resolved for 2.0.6. If I'm wrong about this, open a new issue and give details there. --Pm

Not sure if you can see it, but on this page http://www.pmichaud.com/wiki/PmWikiZhCn/WikiSandbox certain words come out as �

Unfortunately, I found that release pmwiki-2.0.2 still has a problem in xlpage-utf-8.php. For example, I cannot use the Chinese word "Xi Tong" ("system" in English; I write it this way because the correct characters do not display on this page), because the characters are changed after the page is saved. pmwiki-2.0beta55 is still OK.

Also, will anyone follow my advice to save xlpage-utf-8.php in UNIX text format instead of UTF-8 format? The UTF-8 format causes an unexpected warning message on an Apache/Windows XP server. --weijiang

I found that in release pmwiki-2.0.0 the issue reappears, although it was fixed in pmwiki-2.0beta55.

I just overwrote scripts/xlpage-utf-8.php in pmwiki-2.0.0 with the same file from beta55, and then it was OK.

In addition, xlpage-utf-8.php is saved in UTF-8 format, which produces a warning like "http header resent" on an Apache/WinXP platform. After I converted it to UNIX format, it was OK. --weijiang
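
One possible cause of such a warning, not confirmed anywhere in this thread, is a UTF-8 byte order mark (BOM) at the start of the PHP file: the three bytes EF BB BF would be sent as output before any headers. The Python sketch below assumes that is the cause, and checks for and strips a leading BOM. The function names and the file path in the usage note are illustrative only.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path: str) -> bool:
    """Rewrite the file without a BOM; return True if the file was changed."""
    with open(path, "rb") as f:
        data = f.read()
    if not has_utf8_bom(data):
        return False
    with open(path, "wb") as f:
        f.write(strip_utf8_bom(data))
    return True
```

For example, strip_bom_in_file("scripts/xlpage-utf-8.php") would rewrite the file only if it begins with a BOM. If "UNIX format" here refers to line endings rather than a BOM, this sketch does not apply.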

I think it's fixed now -- could you test on PmWikiZhCn.WikiSandbox and let me know for sure?

If it's not fixed, then I'd need to know which characters are having trouble. It would help for me to know the exact codepoint(s) -- i.e., the &#nnnnn; encoding for the character(s) having trouble. --Pm

The problem is still there. So far I have found two characters with the problem:

one is 史

the other is 没
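
For anyone reporting similar problems: the &#nnnnn; form requested above is the character's Unicode code point written in decimal. A small Python sketch (the helper names are illustrative and not part of PmWiki) prints that form together with the UTF-8 bytes for the two characters reported here:

```python
def numeric_entity(ch: str) -> str:
    """Return the decimal HTML numeric character reference for one character."""
    return "&#%d;" % ord(ch)


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of one character as space-separated hex bytes."""
    return " ".join("%02X" % b for b in ch.encode("utf-8"))


# The two reported characters, written as escapes so this source stays ASCII.
for ch in ("\u53f2", "\u6ca1"):
    print(numeric_entity(ch), "U+%04X" % ord(ch), utf8_hex(ch))
```

This prints "&#21490; U+53F2 E5 8F B2" and "&#27809; U+6CA1 E6 B2 A1".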

Okay, try it one more time -- I'm fairly certain I got it right this time. The examples were a *big* help. --Pm

Yes fixed! That was really fast. Thanks! --Ken

Is there anything I need to do? I have updated to beta14 and still have the problem with some characters. A test here shows that the characters are OK, so there must be something I'm missing.

No, it should be working as of beta14 -- something else must be going on? You can reopen this issue if it's not fixed on your site, just give an example url where it's not functioning. Note that you also need the latest version of the xlpage-utf-8.php file (from i18n.tgz). --Pm

In the current 2.0.beta35 there is still a problem with the Unicode symbol &#1074; (the lowercase third letter of the Cyrillic alphabet). An example is available at http://iassen.projectoria.org/wiki/Main/UnicodeTest --Turin

The problem was fixed just by adding "include_once('scripts/xlpage-utf-8.php');" to config.php. I had already tried that variant, but the error I received (PHP was compiled without mbstring support) confused me and I thought it was a bug :) Sorry for the noise, keep up the great work!! --Turin

Please fix the problem described above. The third letter of the Cyrillic alphabet ("ve", "в") still appears corrupted (as "�"), and this is after a clean install of the current 2.0-beta36.

When I try to add "include_once('scripts/xlpage-utf-8.php');" into my config.php, I get lots of error messages which say "Warning: Compilation failed: this version of PCRE is not compiled with PCRE_UTF8 support at offset 10876688 in <...> on line 60".

Even if adding this line worked out for me, this should NOT be required. I mean, a user should not add cryptic lines into the config file just to get correct display of all the damn letters :)

OR this should be well-documented somewhere. --rm

I have a problem with o grave (Unicode U+00F2, UTF-8 bytes C3 B2), which is a surprisingly simple character to cause trouble. Vietnamese, with its combined diacritics, is a good test case for Unicode support. This page shows the effect: the black triangle is shown, a line break is always inserted in the draft, and one is often inserted in the resulting copy.
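
As a cross-check of the values quoted above, this Python sketch encodes the characters reported on this page and shows how a single damaged byte becomes the replacement character when the text is decoded for display. The `damage` helper is hypothetical: it only simulates corruption and is not a description of what PmWiki does internally.

```python
# Characters reported in this issue, keyed by code point.
REPORTED = {
    "U+00F2": "\u00f2",  # o grave
    "U+0432": "\u0432",  # Cyrillic small letter ve
    "U+53F2": "\u53f2",
    "U+6CA1": "\u6ca1",
}


def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 encoding of a character."""
    return ch.encode("utf-8")


def damage(data: bytes, bad: int, replacement: int = 0x20) -> bytes:
    """Simulate corruption by overwriting every occurrence of one byte value."""
    return bytes(replacement if b == bad else b for b in data)


for name, ch in REPORTED.items():
    raw = utf8_bytes(ch)
    hex_form = " ".join("%02X" % b for b in raw)
    # Decoding with errors="replace" yields U+FFFD wherever bytes are invalid.
    shown = damage(raw, 0xB2).decode("utf-8", errors="replace")
    print(name, hex_form, repr(shown))
```

All four reported characters happen to contain the byte B2 in their UTF-8 form (C3 B2, D0 B2, E5 8F B2, E6 B2 A1), which is why the sketch overwrites that byte. The thread does not say whether that byte was the actual cause, so treat this as an observation only.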


It really messes up my translation. :(

I'm having trouble understanding how to translate the XLPageTemplate for pmwiki: is there anyone I can ask? There are strings which don't have entry fields. I filed it as a bug, but now I'm not sure it is: maybe I just don't understand what to do in that case.

from Clytie