00035: Internationalized page names work even worse

Summary: Internationalized page names work even worse
Created: 2004-09-09 08:57
Status: Closed - fixed for 2.0.beta8
Category: Bug
From: Zverik
Priority: 35
Version: all versions
OS: -

Description: Consider the following link: Библиотека

I think it links somewhere... unusual.

And it looks like you'll make the same changes in version 1.0.9 too. Or is this normal behaviour?

I'm having a small problem with redirection to pages with Russian names (in the windows-1251 charset). I've written a translation module, and everything looks good, but when a browser is redirected using a 'Location: $pageurl' header, the last several characters of the page name sometimes disappear. For example (in transliteration), 'Biblioteka' redirects to 'Biblioteka/Biblioteka', but the page finally presented is 'Biblioteka/Bibl'. For now I'm placing redirect tags in those truncated pages that point to the correct ones, as I couldn't find a solution to this problem, but maybe you could?

Thanx, Zverik
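
Illustration (not PmWiki code): the report above does not establish why the redirect loses characters. One common precaution is to percent-encode the page name before it goes into the Location header, so that only ASCII is sent. The Python sketch below assumes that approach; the function name `location_header` and the base URL are hypothetical.

```python
from urllib.parse import quote, unquote_to_bytes


def location_header(base_url, page_name, charset="windows-1251"):
    """Build a Location header whose page name is percent-encoded.

    The name is first encoded to bytes in the wiki's charset, then every
    non-ASCII byte is written as %XX. Slashes are kept as path separators.
    """
    raw = page_name.encode(charset)
    return "Location: " + base_url + quote(raw, safe="/")


# Hypothetical base URL; the page name is the one from the report.
base = "http://example.org/wiki/"
name = "Библиотека/Библиотека"
header = location_header(base, name)

# Decoding the escaped part with the same charset restores the full name.
escaped = header[len("Location: " + base):]
restored = unquote_to_bytes(escaped).decode("windows-1251")
```

Because the header then contains only ASCII, the receiving side must decode the bytes using the same charset the page names were encoded in.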

Ouch. This is caused by HTML character entities being substituted before the links are processed, so the resulting characters are treated as valid page-name characters. It will take me a bit to figure out what to do in this case.
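
Illustration (not PmWiki code): a minimal Python sketch of why the order matters. It assumes a hypothetical letters-only page-name rule, `VALID_NAME`; PmWiki's actual rule is not given here. The raw entity text fails the rule, but the same text passes once the entities have been substituted.

```python
import html
import re

# Hypothetical page-name rule: one or more letters from any script.
VALID_NAME = re.compile(r"[^\W\d_]+")


def is_valid_name(text):
    """Return True if the whole string satisfies the assumed name rule."""
    return VALID_NAME.fullmatch(text) is not None


# Numeric character entities for three Cyrillic letters.
raw = "&#1041;&#1080;&#1073;"

# Before substitution the text contains '&', '#' and ';', so it is rejected.
valid_before = is_valid_name(raw)

# After substitution the entities have become letters, so it is accepted.
decoded = html.unescape(raw)
valid_after = is_valid_name(decoded)
```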

As far as dealing with pages with Russian names, it's hard to validate/test it here because these pages are in charset iso-8859-1, which has different encoding and classification rules than windows-1251. I'll set up some windows-1251 pages on this site and we can experiment with it a bit. --Pm
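
Illustration (not PmWiki code): a small Python sketch of how the two charsets differ in both encoding and classification. The same byte decodes to a different character in each charset, and a byte that is a letter in one can be a symbol in the other. The variable names are only for this example.

```python
# One byte, two charsets: the decoded character differs.
byte_c1 = b"\xc1"
as_latin1 = byte_c1.decode("iso-8859-1")     # Latin capital A with acute
as_cp1251 = byte_c1.decode("windows-1251")   # Cyrillic capital Be

# Classification differs too: 0xD7 is the multiplication sign in
# iso-8859-1 but the Cyrillic capital letter Che in windows-1251.
byte_d7 = b"\xd7"
letter_in_latin1 = byte_d7.decode("iso-8859-1").isalpha()
letter_in_cp1251 = byte_d7.decode("windows-1251").isalpha()
```

A page-name character class built for iso-8859-1 can therefore accept or reject the wrong bytes when the page is actually in windows-1251.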

As for windows-1251, I've already made some files to support that codepage for pmwiki 1.0.8, so I can upload them to you to replace the existing iso-8859-5 structure on the PmWiki site (that ISO codepage is not as widespread here as 1251).