00903: intermittent broken links with umlauts

Summary: intermittent broken links with umlauts
Created: 2007-03-20 14:59
Status: Closed (Server Locale bug, solution provided in the page).
Category: Bug
From: Frank
Priority: 55
Version: 2.2.0-beta37
OS: Linux/Apache 2.0/PHP 5.2.1 (module)

Description: (Sorry, I cannot tell from the other reports whether this issue is already known.)

Sporadically (roughly one in every 5 to 10 browser refreshes), links in the sidebar that contain German umlauts are not URL-encoded and are therefore broken. The umlauts (ü) are replaced by other characters, and in one case the page name is truncated (at the German ß). PmWiki puts a question mark after such links, indicating that the respective page doesn't exist (thus it is not a browser issue).

Having two such problematic links on the sidebar, it can happen that only one of them is broken, or both, or none. I cannot predict after how many refreshes the one or the other occurs.

Furthermore, and even less frequently, one line on a FoxForum index page is missing.

After one more browser refresh, everything is back to normal. The phenomenon occurred on beta34, and also after upgrading to beta37.

Looks like a runtime issue with a timing problem.

See a screenshot here:

The smaller section in the black frame contains the page back to normal.

Regards, Frank

Solution copied from the bottom of this page:

I got my wiki to do umlauts properly now. It turned out that for some still not properly understood reason, the server's locale was changing erratically. Simply setting the XLPage locale entry to ...

  • 'Locale' => 'deu',

--Henning October 28, 2008, at 11:31 AM

I have exactly the same problem with all links in my PmWiki 2.2.0-beta68 installation under Windows (Server 2003 Enterprise Edition) with IIS.

It has an extremely heavy impact as it affects several hundreds of pages and uploads ... any hints towards a fix would be highly appreciated!

--Henning October 06, 2008, at 09:36 AM

Frank and I just compared notes by email, and we found that we are both using server clusters to host our wikis. Our best guess at the moment is therefore that the temporary nature of the problem is an effect of different nodes of the cluster behaving differently.

(For me, the links are either consistently good or consistently broken for several hours, and I haven't noticed any connections to browser refresh attempts so far.)

--Henning October 06, 2008, at 11:01 AM

I have now noticed more umlaut-related problems, which indicate that my problem, at least, is not limited to the link/URL level. I'll look into i18n generally to see if there is something I have missed ...

--Henning October 07, 2008, at 09:19 AM

My wiki installation runs fine now after the service provider changed some "defaults" of the Windows/IIS combination. Unfortunately, he did not tell me exactly which server settings had to be changed to fix the problem, so I'm afraid the only feedback I can provide is that this is in fact a server issue, not a PmWiki issue.

--Henning October 13, 2008, at 07:00 AM

Well, looks like I rejoiced too early. Despite the claims of my service provider, they have not managed to fix anything. The recurring pattern is that I have umlauts in the mornings and lose them around noon. Every morning, the service provider reports the problem as solved ... but in reality, there is no progress at all.

--Henning October 14, 2008, at 09:13 AM

I'm still plagued by this problem ... some new ideas:

  • The pattern is
    • "Präsentation" -> "PrSentation"
    • "Prösentation" -> "PrSentation"
    • "Prüsentation" -> "PrSentation"
    • "Prßsentation" -> "PrSentation"
  • For comparison:
    • "Pr sentation" -> "PrSentation" (Wiki Word!)

This looks like PmWiki somehow treats the special characters like spaces, capitalizing after those.

This also applies to Frank's example above:

  • Frank's example:
    • "Grüße" -> "GrE"
  • identical to the result of:
    • "Gr e" -> "GrE"
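
The observed mappings can be reproduced with a small model. This is only a sketch of the hypothesis above, not PmWiki's actual code: the function name `make_page_name_ascii` is made up for illustration, and it assumes that every character outside ASCII letters and digits is treated as a word separator, with the first letter of each remaining word capitalized.

```python
import re


def make_page_name_ascii(title: str) -> str:
    """Hypothetical model: anything that is not an ASCII letter or digit
    acts as a word separator; each remaining word is capitalized and the
    words are joined without spaces."""
    words = re.split(r"[^A-Za-z0-9]+", title)
    return "".join(w[:1].upper() + w[1:] for w in words if w)


# Escapes are used so the umlauts are single precomposed code points.
samples = [
    "Pr\u00e4sentation",  # "Präsentation"
    "Pr sentation",
    "Gr\u00fc\u00dfe",    # "Grüße"
    "Gr e",
]
for title in samples:
    print(title, "->", make_page_name_ascii(title))
```

Under this assumption both "Präsentation" and "Pr sentation" come out as "PrSentation", and both "Grüße" and "Gr e" come out as "GrE", which matches the broken links reported here.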

I'm not sure why this is intermittent, but as links of the form [[http:www.pmwiki.örg]] are still encoded directly, it appears that some conversion routine in PmWiki that is only applied to wiki links is responsible for the effect we're seeing.

--Henning October 27, 2008, at 12:19 PM

After staring hard at the source for some time, I believe that this might be a problem of the MakePageName function.

It seems that regular expressions rely on the locale to determine the scope of definitions like [:alnum:]. I'm not sure in which Windows menu the locale is defined, but the server location at least is set to "United States", which might explain why my umlauts get filtered out of the wiki links ...
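
As an analogy only (this is Python's `re` module, not the PHP regular expressions PmWiki uses), the following sketch shows how the same pattern can accept or reject an umlaut depending on which character set the engine considers "alphanumeric". The helper name `is_single_word` is hypothetical, and the `re.ASCII` flag merely stands in for a locale that knows no umlauts.

```python
import re


def is_single_word(text: str, ascii_only: bool) -> bool:
    """Return True if every character of text counts as a word character."""
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(r"\w+", text, flags) is not None


title = "Pr\u00e4sentation"  # "Präsentation"
print(is_single_word(title, ascii_only=False))  # True: the umlaut is a letter
print(is_single_word(title, ascii_only=True))   # False: the umlaut is not a letter
```

If the character class shrinks to ASCII, the umlaut falls outside it, which would be consistent with the umlauts being stripped from wiki links.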

With regard to Frank's problem, it might be slightly more complex as it seems to be a common problem that the locale gets overwritten by competing threads ... I did not really understand what I found here, but it seems to apply:


--Henning October 27, 2008, at 01:58 PM

I got my wiki to do umlauts properly now. It turned out that for some still not properly understood reason, the server's locale was changing erratically. Simply setting the XLPage locale entry to ...

  • 'Locale' => 'deu',

forced PmWiki to ignore the locale set by Windows and to use the desired German locale instead. This solved all issues.
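
The general idea of pinning the locale explicitly instead of trusting the server default can be sketched as follows. This is a Python illustration, not PmWiki code: `pin_ctype_locale` is a hypothetical helper, and apart from 'deu' (the value used above) the candidate names are assumptions that may not exist on a given system.

```python
import locale


def pin_ctype_locale(candidates):
    """Try each locale name in order and return the first one that the
    system accepts for character classification, or None if none works."""
    for name in candidates:
        try:
            locale.setlocale(locale.LC_CTYPE, name)
            return name
        except locale.Error:
            continue  # name not available on this system, try the next one
    return None


# 'deu' is the value from this report; the other names are guesses,
# with "C" as a last resort that every system provides.
chosen = pin_ctype_locale(["deu", "German", "de_DE.UTF-8", "C"])
print("LC_CTYPE pinned to:", chosen)
```

Setting the locale at the start of every request does not remove the cause of an erratic server default, but it makes the result independent of that default at the moment it is set.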

Of course I'd have preferred a clean configuration for the server, but as long as I have working umlauts, I won't complain. In fact, I love PmWiki for its ability to simply override the troublesome server settings!

(My problem is solved, but maybe Frank's is not, so I'll leave this PITS "open" for now.)

--Henning October 28, 2008, at 11:31 AM

I've just added my solution to the documentation under PmWiki.Internationalizations - hopefully helping others to avoid the problem from the outset! :-)

--Henning October 28, 2008, at 12:10 PM

For the record: in recent weeks I've had several cases in which the problem reappeared for single page loads despite the above solution being applied. However, it was not the permanent problem I originally had; it was a very rare occurrence which (as in Frank's case) could be worked around by simply reloading. My conclusion: there is another, as yet unsolved, aspect to this problem, though it only affects a very small proportion of page loads. (Accordingly, I'm leaving this PITS on "closed" ... I'm adding this text only for the sake of completeness and in case someone else runs into a variation of this problem.)

--Henning February 02, 2009, at 02:52 AM