Internationalizations-Talk

SEO problem with internationalized sites

When people internationalize a site, they download the translation archives, which contain the documentation in that language, with pages like "PmWikiRu.ConditionalMarkup". By default, these groups are not hidden from indexing by search engines. So the whole translated documentation is indexed on every PmWiki site, which is not good for SEO. This was also discussed here: 00330, but a long time ago.

I have a suggestion for fixing it: extend the core function XLPage() so that it can extend the $MetaRobots array, adding the national group to it, as is done now with the PmWiki group.

UPD: another, easier solution is simply to extend the default $MetaRobots variable, adding to it any group starting with PmWiki*...

Finar April 13, 2021, at 11:59 AM
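
A minimal sketch of the second idea, for local/config.php. It assumes that $MetaRobots holds the string used for the robots meta tag, that setting it in config.php takes precedence over the default, and that translated documentation groups are named "PmWiki" plus a language suffix (PmWikiRu, PmWikiDe, ...). The pattern is only an illustration and would also match unrelated groups whose names happen to start with PmWiki.

```php
<?php if (!defined('PmWiki')) exit();
# local/config.php
# Assumption: $MetaRobots is a plain string such as 'index,follow'.
# Matches PmWikiRu.SomePage or PmWikiRu/SomePage, but not PmWiki.SomePage,
# because at least one extra letter must follow "PmWiki".
if (preg_match('#^PmWiki[A-Z][A-Za-z]+[./]#', $pagename)) {
  $MetaRobots = 'noindex,nofollow';
}
```

Depending on the URL scheme, $pagename may not be fully resolved yet at this point in config.php, so this should be checked on the actual site before relying on it.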

Translation Portal

According to the Library of Congress the language code for Slovenian is "sl", not "si". RandyB August 13, 2010, at 11:43 PM

I think it may be useful to have a central page devoted to translation (aimed at translators); the present page is about i18n for users. The translation portal is a central hub for translators, as a starting point. jdd

Unfortunately, an anonymous contributor deleted the link to the localization page on March 16, so it has disappeared into space. It was intended to be the resource you are searching for. I think these two pages should be merged somehow. The same contributor also did an energetic cleanup of the Internationalization page, and the pertinence of all that erasing should be carefully checked. PRZ

feel free to restore it!

The creation of a "localization" group is a very good idea (I'm too new here to take such a decision :-), but I think keeping at least one page aimed at translators in the PmWiki group is a good idea; it is much simpler to find, I guess. The other translator pages should be moved to the new group. jdd

As suggested by PRZ above and agreed by Petko (see Localization.Localization-Talk), the help for translators will be moved to the Localization group; while keeping the installation, configuration and usage of other language(s) support in the PmWiki documentation. But I have improved the introduction of the Internationalizations page, to take care of your concern that this information is not "easy to find". -- SchreyP April 04, 2010, at 06:38 AM

Multilingual pmWiki-based website, is it possible?

Would it be possible to set up PmWiki in a way that allows maintaining a multilingual website? I realize that when you discuss i18n and l10n you refer to the wiki system itself, but on the content level it may be more than desirable, especially if the user may select his/her desired language for the content. VirtualFlavius

Might be possible and actually sort of a common feature. (A heavy-weight CMS I know offers several language instances of the same page "seamlessly" - which is different from the Wikipedia, where you have to actively change the Wikipedia instance.) Have you had a look at the PITS yet to see if someone already made a feature request for this? --Henning June 25, 2007, at 09:55 AM
I've had some success using (and extending) the SubgroupMarkup recipe for multilingual websites. This allows me to create different language instances of a page by appending a language code as a subgroup. For example, "Main/Homepage,EN" would refer to the English version of "Main/Homepage", "Main/Homepage,FR" would contain the French version, and so on. --gh December 3, 2007

Is it possible to get rid of all accented letters in the distributed pmwiki files?

For example, the French translation contains PmWikiFr.BacÀSable, PmWikiFr.CréationAutomatiqueDeVignettes, etc., which conflict with some servers. --Farvardin

Hi. I agree with you, but I currently see no way other than manually (1) creating new pages without accented names, (2) moving the content and deleting the old pages, (3) fixing all links to point to the new pages. The French docs are on my non-urgent ToDo list, but I wouldn't mind if others could help out. --Petko May 08, 2009, at 05:19 AM

I've converted it to UTF-8. I couldn't get rid of the accented letters, but at least it will use UTF-8. https://www.pmwiki.org/wiki/Cookbook/French --Farvardin

In a fresh PmWiki installation, include_once("scripts/xlpage-utf-8.php") is active. I created a page "Main.Thème"... The file name is Main.Thème. Why is the è not displayed correctly? Is there a way to correct that?

Check if your file manager or ftp browser is configured to display the remote directory listing in the encoding of the wiki. In your case, the encoding is likely UTF-8. For the following programs, set:

 Krusader :  Menu View  > Select Remote Charset > Unicode (UTF-8)
 Konqueror : Menu Tools > Select Remote Charset > Unicode (UTF-8)
 Dolphin :   Menu Tools > Select Remote Charset > Unicode (UTF-8)
 FileZilla : Site Manager > Select a site > Charset tab > Force UTF-8

German Internationalization Problems

I'm trying to sort out some problems with the German-specific characters ("ÄÖÜäöüß") in preparation for moving my PmWiki from one server to another. (Both are Windows machines running IIS.) As I had considerable difficulties making links with these characters work on the old server, I'm trying to understand PmWiki's localization schemes better in order to avoid a repetition of these difficulties.

Analyzing the config.php, I find that in my existing installation, contrary to the PmWiki documentation, I had to comment out the line

 include_once("scripts/xlpage-utf-8.php"); 

If I include it, the umlaut links won't work. I have included the standard line

 XLPage('de','PmWikiDe.XLPage');

while the XLPage does contain a statement

 'Locale' => 'deu',

which five years ago was critical for making umlaut links work at all. I find that today, it doesn't seem to make any difference at all. Note: I'm not sure why it's 'deu' instead of the 'de_DE', but it only worked with 'deu' five years back. (The server is administered by a service provider who changes things around without informing me about it.)

The locale that today is actually set on the old server is

 German_Germany.1252

as reported by the PHP statement

 print setlocale(LC_ALL,NULL); 

On the old server, I have not made any attempt to save the config.php in any specific encoding. I have always edited it either with MS Notepad or with MS Wordpad.

My questions regarding localization:

  • What is the impact of the locale?
  • What is the impact of the inclusion of the xlpage-utf-8.php?
  • What is the impact of the encoding of the config.php?
  • How does PmWiki determine the encoding of the config.php?

I noticed that on the new server, when I use the Visual Studio Editor to save the config.php with encoding defined as Windows-1252, it adds an XML tag to the HTML output:

 <?xml version="1.0" encoding="Windows-1252"?>

However, this is also applied to the content provided by PmWiki in response to action=download requests, breaking downloading with $EnableDirectDownload=0;

Any help in understanding the various elements of PmWiki internationalization would be highly appreciated! Many thanks in advance! --Henning March 01, 2013, at 10:45 AM

The "locale" is used for the dates in RecentChanges and in the page footers, notably the names of the months and the days of the week.

The "encoding" or "character set" or "charset" is a set of rules mapping all written "letters and signs" to "bytes" or sequences of bytes. There are many character sets. Older ones like Windows-1252 or Windows-1251 use one byte per letter or sign, so there are only 256 positions. This is mostly good for the signs in one language family, e.g. Western European languages; for other languages, e.g. Central European languages, Cyrillic, Greek, etc., there is another character set in which about half of the positions are used for other letters or signs (in different charsets: same byte position = different letter). UTF-8 is a newer (since 1993) charset and contains all possible letters and signs of human alphabets, but it uses not a single byte but 1-4 bytes to represent each of them.

When a browser receives some bytes, it needs to know which charset they are in to display the letters correctly, and also, when you edit the page, to post the letters you entered in the correct byte positions. The browser learns the charset either from a server "header" or from a <meta> tag in the HTML source (if none is found, it tries to guess, or uses the default one, ISO-8859-1).
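
To make "same letter, different bytes" and "same byte, different letter" concrete, here is a small standalone PHP sketch. It is only an illustration, not part of PmWiki, and it assumes the iconv extension is available. The letter "ü" is written as explicit bytes so that the result does not depend on how the script file itself is saved.

```php
<?php
// "ü" as explicit UTF-8 bytes (two bytes: C3 BC).
$utf8 = "\xC3\xBC";

// The same letter in the single-byte Windows-1252 charset.
$cp1252 = iconv('UTF-8', 'Windows-1252', $utf8);

echo bin2hex($utf8), ' = ', strlen($utf8), " bytes in UTF-8\n";            // c3bc = 2 bytes
echo bin2hex($cp1252), ' = ', strlen($cp1252), " byte in Windows-1252\n";  // fc = 1 byte

// The very same byte FC, read as Windows-1251 (Cyrillic), is another letter:
// converting it to UTF-8 gives D1 8C, the Cyrillic soft sign.
$misread = iconv('Windows-1251', 'UTF-8', $cp1252);
echo bin2hex($misread), "\n";                                              // d18c
```

This is what happens when a page saved in one charset is announced to the browser as another: the bytes are unchanged, but they are looked up in the wrong table.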

Obviously, the "locale" also needs a "charset" to be able to provide the names of the months in the correct byte positions. Your operating system has a default charset for the locale, and it may be possible to select a different charset with a suffix, for example deu.1252 or de_DE.cp1252. The first part depends on your operating system ("deu" in Windows, "de" or "de_DE" in Linux), and so does the second, so you should read the documentation of your system, or possibly the page https://www.php.net/setlocale and the comments there.
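
Because the locale names differ between systems, a quick standalone test script can show which spelling a given server accepts. This is a sketch, not PmWiki code; the candidate list is an assumption (in particular 'de_DE.UTF-8' is just a common Linux spelling) and should be adapted to the system at hand.

```php
<?php
// The special value "0" only reports the current locale, it changes nothing.
echo "current: ", setlocale(LC_ALL, '0'), "\n";

// Given an array, setlocale() tries each name in turn and returns the first
// one the system accepts, or false if none of them is installed.
$candidates = array('de_DE.UTF-8', 'de_DE', 'German_Germany.1252', 'deu');
$chosen = setlocale(LC_TIME, $candidates);

if ($chosen === false) {
  echo "none of the candidate locales is installed\n";
} else {
  echo "LC_TIME is now: $chosen\n";
}
```

Whatever name is printed on the last line is the one to try as the 'Locale' value in the XLPage.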

Obviously, your config.php file should be saved in the charset of the wiki content, otherwise some of the characters in it will not be on the correct byte positions and will not display correctly.

When PmWiki writes a wiki.d file, it uses the byte positions of the current charset. As long as the file system (FAT32, NTFS, ext3, ReiserFS...) agrees to save a filename with that sequence of bytes and, when asked about the saved filename, replies with that same sequence of bytes, it is fine for PmWiki. In some cases this charset is different from the default one of your operating system (Windows, GNU/Linux, MacOS...), file manager or FTP browser, so it is possible that when you browse the wiki.d directory with your Windows Explorer, some filenames appear partially or completely broken, even if your wiki site displays fine.

PmWiki and PHP don't determine the charset of config.php, they read the bytes as they are. But PmWiki needs to know in which charset the pages are saved. Again, if we tell it that the charset is 1252 but config.php was saved in UTF-8, it will not work as intended. If your text editor doesn't let you select the output encoding, you should use one that does, see LocalCustomizations#encoding for a short list.

Also, when you use ?action=download, PmWiki normally doesn't send a header with the charset of the uploaded file because it doesn't know it, and your browser does what it can to correctly display it. When you directly download a file from the server, PmWiki doesn't interfere and in some cases the server sends the charset header which can be the correct one or not, and in other cases your browser does what it can to correctly display it. If it is not correct, for example in Firefox you can select View>CharacterEncoding>...(try some) to see the correct characters. Note that this is only for uploaded files, not for wiki pages which should display fine or your config.php needs fixing.

The impact of the inclusion of the xlpage-utf-8.php is enabling support for the UTF-8 universal character set for wiki.d pages - their content and their filenames, and for the HTML output sent to the browsers (and also for case insensitive searches). It is strongly recommended for new wikis where you start with an empty wiki. But if you have an existing wiki which you started in another encoding, the one thing it cannot yet fix is the filenames which are in another charset. Newer versions of PmWiki can automatically convert the "content" inside the wiki.d file, but if the "filename" contains more than the plain "A-Za-z0-9-." characters common to different charsets, it will mean nothing to PmWiki.

Lastly, when you comment out xlpage-utf-8.php, PmWiki saves the pages in the ISO-8859-1 charset which is almost the same as Windows-1252, and browsers treat them as interchangeable.

In your case, if you already have pages with umlauts in the filename, and if they are saved with the 1252 charset, you should first try to use the same when you move them to the new server. Test the wiki, see if links and pagelists correctly display the pages with umlauts. If not, when you copied the files from the old to the new server, the filenames may have been recoded from 1252 to UTF-8 - Windows does that sometimes. If this is the case, include xlpage-utf-8.php. If the links and pagelists work fine now, you just need to set the correct UTF-8 locale with the XLPage() line. This should work with a recent version of PmWiki which can automatically recode the "content" of the wiki.d files, even if they are in another charset. --Petko March 02, 2013, at 04:33 AM
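
For the UTF-8 case described above, the relevant part of config.php might look like the following sketch. Both statements are the ones quoted earlier in this thread; the first line is the usual guard at the top of a PmWiki config file. The file itself should be saved in the same charset as the wiki.

```php
<?php if (!defined('PmWiki')) exit();
# local/config.php

# Enable UTF-8 for page content, file names and HTML output.
# If the file names on disk are still in the 1252 charset and
# links to umlaut pages work without it, leave this line commented out.
include_once("scripts/xlpage-utf-8.php");

# Load the German strings; the 'Locale' entry is defined
# in the PmWikiDe.XLPage page itself.
XLPage('de', 'PmWikiDe.XLPage');
```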

Many thanks! Your explanations were a great help for me in sorting out my Umlaut problems! They also helped me to be confident with the thus-found solution ... I always had a bit of a "sword of Damocles"-feeling about the Umlauts before!

On a more technical level, I've managed to set the proper Locale on the new server now with

  'Locale' => 'German_Germany',
  'TimeFmt' => 'm.H:%M Uhr',

I have also removed the xml tag MS Visual Studio added to the config.php file. Of the "ÄÖÜäöüß" combo, I have only used a single "ü" in the config.php file, and as it's the alternative text for a link icon, I replaced it with a &uuml; HTML entity, so I should be on the safe side now (because the config.php file now consists entirely of the standard "US" characters, which hardly ever get confused).

With the removal of the xml tag at the beginning of the config.php file, downloads now work again even with $EnableDirectDownload=0; ... I believe the xml tag added by MS Visual Studio must have been redundant for the encoding and inadvertently messed up pmwiki downloads. (It even affected binary files.)

I do indeed have a considerable number of pages with Umlaut names in my existing wiki, and it seems that after all the changes outlined above, they do in fact display correctly. I'd conclude that Windows (or the particular ZIP software we used) did not perform any automatic UTF-8 conversion during the transfer of the files to the new server, so the existing setup works just fine with include_once("scripts/xlpage-utf-8.php"); commented out.

--Henning March 04, 2013, at 10:09 AM

Unfortunately, I've now hit more Umlaut problems. They are apparently connected to Cookbook.WikiForms, but as this recipe works quite well on the old server while it fails on the new server, there seems to be a configuration-related aspect, too. (The problem with wikiforms is that the summary lists show fields as empty if they contain at least one Umlaut. This also applies to the Edit Form action). --Henning March 12, 2013, at 09:26 AM

I hope the author of the WikiForms recipe can help. --Petko March 14, 2013, at 01:22 AM

I've posted a question on WikiFormsFAQ, but so far, no luck. However, I had a look at the source code, and despite being PHP-illiterate, it occurred to me that some string-handling functions apparently depend on the locale. I've used a modification of the EZLocaleDate code to display the current locale settings, and they read LC_COLLATE=C;LC_CTYPE=English_United States.1252;LC_MONETARY=C;LC_NUMERIC=C;LC_TIME=C if I don't set anything else on the XLPage. I guess this is not a suitable setting to properly process Umlauts!

However, when I include a promising Locale statement in the XLPage so that the locale is read back to me as ...

LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=German_Germany.1252;LC_TIME=English_United States.1252

... (which looks pretty good in my opinion), this still doesn't change the WikiForms behavior.

If I also explicitly change the LC_TIME to German, this results in the read-back being shortened to German_Germany.1252, which I figure is OK as all of the locale's categories have the same value. However, even this does not change the WikiForms behavior.

As WikiForms ran perfectly fine for some five years on the old server, I tend to think that it must be a problem of the server configuration since this is what changed. It's just that I don't seem to be able to find out just exactly the right way to configure the new server ... :-(

--Henning March 19, 2013, at 12:37 PM

Questions about a particular cookbook recipe should be asked where the author of the recipe can notice them - either on the talk pages or maybe contact him via e-mail. --Petko March 20, 2013, at 01:51 AM

Thanks for the suggestion, I already posted over on the WikiForm pages and will try the email route, too!

I hope it's still on topic here to ask two general pmwiki questions, even though my interest of course stems from my problems with a specific recipe?

  • Is it guaranteed that the locale settings defined on the XLPage automatically apply to recipes as well (for example in basic string handling routines like strtoupper), or would a recipe have to be coded in a special way to apply the XLPage locale settings?
  • Would the sequence of the include-statements have an impact on the locale settings applying to a specific recipe?

--Henning March 20, 2013, at 07:58 AM
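
Some general PHP background on the first question, as a hedged sketch rather than a statement about any particular recipe. In PHP the locale is a process-wide setting, so a setlocale() call made while config.php is processed remains in force for code that runs later in the same request. Byte-oriented functions such as strtoupper() consulted that locale in older PHP versions (since PHP 8.2 they no longer do), whereas the mbstring functions take the encoding as an explicit argument. The example below assumes the mbstring extension is installed.

```php
<?php
// "für" as explicit UTF-8 bytes, independent of the file's own encoding.
$word = "f\xC3\xBCr";

// The encoding is named explicitly, so the result does not depend on
// whatever locale setlocale() has selected.
$upper = mb_strtoupper($word, 'UTF-8');

echo bin2hex($upper), "\n";   // 46c39c52, the UTF-8 bytes of "FÜR"
```

A recipe written this way behaves the same regardless of the XLPage locale, provided the encoding it names matches the encoding of the wiki.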

An update: jr, the author of the WikiForms recipe, mailed me a newer version of the recipe, fit for use with PHP 5.4.8, which solved the problem. The problem apparently stemmed from my use of a newer version of PHP on the new server (5.4.8 versus 5.2.5) which defaulted to UTF-8, leading to the observed symptoms. So it seems my general pmwiki internationalization settings were all correct (thanks to Petko for his invaluable help with that!) and don't actually need any further tweaking. Many thanks to Petko and jr, I could never have gotten my wiki to run properly 'i18nized' without your help!

--Henning March 22, 2013, at 06:02 AM


Describe $Name here/Site.PageNotFound problems

My Wiki installation seems unwilling to display anything but "Describe $Name here" when a non-existing page is called. I have a Site.PageNotFound in place, but it doesn't get used. I'm also unable to localize the "Describe $Name here" message in the way described in XLPageCookbookTemplate. My impression is that I might have broken my basically very old and frequently updated PmWiki installation in this regard, so now I'm looking for some pointers on how to fix it. Your help would be much appreciated - many thanks in advance!

--Henning May 07, 2013, at 06:54 AM

You need to translate in your XLPage not this string, but the name of the translated PageNotFound page, for example

  '{$SiteGroup}.PageNotFound' => 'PmWikiFr.PageNotFound',

or

  '{$SiteGroup}.PageNotFound' => 'Site.PageNotFoundFr',

The other string "Describe $Name here" is from (very) old PmWiki versions and is not used now. Also, the XLPageCookbookTemplate is about Cookbook recipes needing translation. For the PmWiki's own strings see XLPageTemplate.

If this doesn't help, which PmWiki version do you use, and can we look at the site? --Petko

Thanks for narrowing it down! After doing everything "by the book" and not having any success, I looked at the recipes and found the problem to be caused by a side effect of Cookbook.Markup Extensions. From Cookbook.Markup Extensions-Talk, it looks like I had first encountered this problem in 2007, working around it by commenting out a line in the original source code. It resurfaced when I recently installed a new version of the recipe ... the hole I dug for myself!

(I'm running pmwiki-2.2.46. The wiki is only accessible from a corporate intranet, unfortunately.)

--Henning May 07, 2013, at 11:17 AM

How often are the international files updated?

How often are the files in https://www.pmwiki.org/pub/pmwiki/i18n/ updated?

Finar January 28, 2020, at 10:00 AM

I update them manually when I see that someone finished editing. Can I update the PmWikiRu files now? --Petko January 28, 2020, at 10:06 AM

Not yet, I'm still working on it. I have some questions here: Features. Finar
Translation is finished. Petko, update the files please. Finar

Thanks, updated. From now on, let's agree that if a trusted person edits a translation, I'll update the download archives within 24-48 hours -- no need to leave me other messages unless I obviously forgot. We already do this for PmWikiDe. You can see the date of the last update in the /i18n/ listing above, the times are apparently in the Pacific Time Zone UTC-8, not sure why. --Petko January 30, 2020, at 06:25 PM

Allow translations of PageNotFound message?

Wouldn't it make sense to wrap the message in Site.PageNotFound with $[] ?
$[The page "{$RequestedPage}" doesn't exist.] ? --SteP February 08, 2020, at 08:02 PM

Please see the previous, previous, section (-2) on this page, you are supposed to provide a full translated page. Otherwise yes, you can edit that page and change it however you like -- when you do this on your own wiki, the new page file gets automatically stored into wiki.d and the default one from wikilib.d is no longer used. --Petko February 08, 2020, at 08:13 PM

This is a talk page for improving PmWiki.Internationalizations.