Dash-Pagenames
Description
Page naming with dashes/hyphens replacing spaces between words, with UTF-8 characters and letter case preserved, for more readable page names and URLs.
This script changes how spaces are treated in links that create or point to pages. Readable page names can easily be created through link markup, or through a 'new page' form like New Page Box Plus, which in many cases makes a Title directive unnecessary, even when using international characters in a UTF-8 enabled wiki.
CamelCase page names are no longer necessary: page names, file names and URLs are created with words separated by dashes/hyphens, and with international characters if UTF-8 is enabled.
In links, spaces are converted to dashes/hyphens instead of the words being joined into WikiWords. If UTF-8 is enabled, UTF-8 characters are preserved. Upper and lower case letters are preserved as well.
Links in the groups PmWiki, Site and SiteAdmin, as well as WikiWord links, keep their functionality, and cross-linking pages between groups needs no special attention.
Installation
Download dash-pagenames.php, copy it to the cookbook folder and include it in config.php, after PmWiki's UTF-8 script, with
include_once("$FarmD/cookbook/dash-pagenames.php");
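To illustrate the include order, here is a minimal sketch of the relevant part of config.php. It assumes UTF-8 is enabled the usual PmWiki way, through scripts/xlpage-utf-8.php; that line is an assumption about your setup and may differ on your wiki.

```php
<?php if (!defined('PmWiki')) exit();

# Enable UTF-8 first (assumed: the standard PmWiki UTF-8 script).
include_once("scripts/xlpage-utf-8.php");

# Load the recipe after the UTF-8 script, as the installation note requires.
include_once("$FarmD/cookbook/dash-pagenames.php");
```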
Configuration
The default configuration allows UTF-8 encoded characters in page names and URLs, with upper and lower case letters preserved (where the language distinguishes between them). This presumes that UTF-8 is enabled on the wiki.
Some other options are available, to be added to config.php before including the script:
1. To force only ASCII characters for page names and URLs:
$ForcePageNamesToASCII = 1;
This strips accents from characters of languages based on the Latin alphabet. The PHP Intl module needs to be enabled on the server. This will not change non-Latin words into ASCII words.
2. To force all characters to lower case (language permitting):
$ForcePageNamesToLowerCase = 1;
3. To add more groups to $CamelCaseGroups, which need to follow PmWiki's native $MakePageNamePatterns.
For example, to add 'MyCamelGroup':
$CamelCaseGroups['MyCamelGroup'] = 1;
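Put together, the options above might appear in config.php as in the following sketch. All three settings are optional and are shown only to illustrate their position before the include; 'MyCamelGroup' is the example group name used above.

```php
# Options are set before the recipe is included.
$ForcePageNamesToASCII = 1;              # strip accents; needs the PHP Intl module
$ForcePageNamesToLowerCase = 1;          # force page names to lower case
$CamelCaseGroups['MyCamelGroup'] = 1;    # this group keeps PmWiki's native naming

include_once("$FarmD/cookbook/dash-pagenames.php");
```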
Usage
URL examples
Easier to read, easier to edit, and possibly better for SEO:
- mydomain.org/music/workshop-notes/hab-Spaß-mit-Flöten
- mydomain.org/art/expositions/l’exposition-rétrospective-de-l’art-français
Pagelink examples:
With UTF-8 enabled, page names can be created which are much easier to read.
- [[número de teléfono]] will link to, or create, a page named número-de-teléfono.
- [[CamelCaseWords]] links to CamelCaseWords.
- [[Camel Case Words]] links to Camel-Case-Words, but in groups PmWiki, Site and SiteAdmin it will still link to CamelCaseWords.
- [[wiki blog/]] will link to group wiki-blog.
- [[workshop notes/ hab Spaß mit Flöten]] links to workshop-notes.hab-Spaß-mit-Flöten.
- [[Calvin & Hobbs]] will link to page Calvin-and-Hobbs (customisable replacement of solitary '&').
- [['''Calvin''' & ''Hobbs'']] will link to page Calvin-and-Hobbs (some inline markup can be used in the link, but will be stripped for the page name).
- [[l'exposition rétrospective de l'art français]] links to l’exposition-rétrospective-de-l’art-français (custom replacement of "'" with "’", enabling use of the apostrophe).
Title and other PageVariables
PageVariables work as normal: {$FullName}, {$Name} and {$Group} will show dash-spaced names, whereas {$Title} and {$Titlespaced} show the page name with spaces instead of dashes.
Special URL characters
As is the case with PmWiki's native $MakePageNamePatterns scheme, some characters need special attention in use:
The dot '.', slash '/', question mark '?', hash '#', percent '%', dollar '$', colon ':'.
The dot '.' and slash '/' are used to separate the group from the name part in a link. Anything from '?' or '#' onwards will be stripped, as these characters introduce the query parameters and the fragment part of a URL.
The dash/hyphen '-' substitutes for the space as word separator. Extra dashes are stripped, as is anything that is not a page name character.
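The rules for '?', '#', spaces and extra dashes can be approximated with a short standalone PHP sketch. The function sketch_dash_name is hypothetical and written only for illustration; it is not the recipe's code and does not use $MakePageNamePatterns. It also leaves out group separators and the removal of other non-page-name characters.

```php
<?php
// Hypothetical helper for illustration only; it is NOT the recipe's code.
function sketch_dash_name(string $text): string {
    // Drop everything from the first '?' or '#' onwards.
    $text = preg_replace('/[?#].*$/su', '', $text);
    // Turn each run of whitespace and/or dashes into a single dash.
    $text = preg_replace('/[\s-]+/u', '-', $text);
    // Remove dashes left over at the start or end.
    return trim($text, '-');
}

echo sketch_dash_name("número de teléfono"), "\n";              // número-de-teléfono
echo sketch_dash_name("wiki  blog -- notes?action=edit"), "\n"; // wiki-blog-notes
```

The second call shows that doubled spaces and dashes collapse into one separator and that the '?action=edit' part never reaches the name.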
Using quotation marks
"Universal" quotation marks, and anything following them, are stripped. This is standard PmWiki behaviour. However, some UTF-8 alternatives can be used, like English “double” quote marks or other language-specific quotation marks. The single quote mark ' (the apostrophe) is replaced by ’ (the Right Single Quotation Mark) rather than being stripped. English ‘Single’ quote marks can also be used.
Some possibilities with UTF-8:
- ’ Apostrophe’s single use
- “ ” English “double”
- ‘ ’ English ‘Single’
- « » French «Christmas trees»
- „ “ German „paws“
- „ ” Polish „paws”
- » « Swedish »reverse«
Using inline markup / HTML entities
HTML entities in text entered for links will be stripped. This means link text can contain and display HTML entities and simple wiki inline markup (like bold and italics), but they will not be part of the page name and URL.
Using En– and Em–dashes
As a dash/hyphen is used to substitute the space between words, an en–dash can be used instead of a hyphen as a word–connector or for number ranges, like 65–75. En– and em–dashes are not turned into spaces by {$Title}, {$Titlespaced} or {$Namespaced}, so if the skin displays the page name as title, the en–dash punctuation remains, whereas any hyphens in the page name are displayed as spaces. In Windows, an en–dash can be typed with Alt+0150 and an em–dash with Alt+0151 (hold down the Alt key while typing the number on the numeric keypad). On a Mac, an en–dash is typed with Option+- and an em–dash with Shift+Option+- (hold down the modifier keys while typing - ).
Notes
PmWiki uses page names as file names. This should not be a problem when using UTF-8. You may wish to read more about UTF-8 use in page names here: PmWiki.UTF-8#toc-4
The script makes use of an alternate version of $MakePageNamePatterns to create page names with dashes/hyphens, and an alternate version of the function MakePageName, which enables automatic switching between the original $MakePageNamePatterns, essential for groups PmWiki, Site and SiteAdmin, and the patterns for dash-spaced names, used for all other groups.
The array $MPN_ReplacePatterns adds some extra replacement patterns, which get processed first. You can add your own replacement rules to this array, or disable any of the included ones. Some characters with special use in URLs are not permitted in page names, therefore some patterns are added to substitute certain special characters. An addition in config.php would look like this:
$MPN_ReplacePatterns["/\&/"] = ' et ';  // 'et' gets substituted for '&'
$MPN_ReplacePatterns["/\@/"] = ' à ';   // 'à' gets substituted for '@'
To do / some day / maybe
Explore possibilities to preserve the original UTF-8 character input in the Title directly at the time of page creation, when page names are forced to be in ASCII. (A new page form from Fox PageManagement using Fox can add a (:title ...:) directive with the submitted name already populated.)
Change log / Release notes
- 2023-02-17: Changed the script to use two new configuration variables: $ForcePageNamesToASCII and $ForcePageNamesToLowerCase. Removed $EnableUTF8PageNames and $MPN_UTF8_ReplacePatterns. Modified the script to optionally force page names to lower case (no letter case preservation). Modified the function which forces page names to ASCII.
- 2023-02-15: Added $EnableUTF8PageNames, set to 0 to force ASCII in page names. Added $CamelCaseGroups so other groups can be added to be exempt from dashed name patterns. Added $MPN_UTF8_ReplacePatterns.
- 2023-02-14: Modified spacing function to use not just dashes, but the native AsSpaced or UTF8 AsSpaced as needed.
- 2023-02-13: Fixed name and group patterns for when utf-8 is not enabled.
- 2023-02-12: Initial release.
See also
- AlternateNamingScheme Use other naming schemes for PmWiki pages
- ISO8859MakePageNamePatterns How to convert ISO 8859 character input for page names to unaccented ASCII equivalents
- Router Router allows a website's url structure to be different from PmWiki's group/page structure. (beta)
- UnaccentUTF8 Diacritics-insensitive page index and searches (Beta)
Contributors
Comments
See discussion at Dash-Pagenames-Talk