Summary: Converts PmWiki markup into unstyled text
Prerequisites: PmWiki2 (developed and tested with 2.2.0-stable)
Questions answered by this recipe
How can I extract just the pure, unstyled text from a given string containing PmWiki markup? Like MarkupToHTML does, but without the HTML tags.
Links should be converted to the usual link text PmWiki produces.
While developing SlimTableOfContents and extending SectionEdit, I needed a recipe that returns the properly unformatted text of the headings.
I ended up with the function MarkupToUnstyled(), implemented in this cookbook.
SlimTableOfContents uses the text-only result as link text in the TOC, and SectionEdit creates the HTML title of the edit link from it.
The cookbooks SlimTableOfContents and SectionEdit (since v 2.2.1-2009-02-26) include this script automatically.
When NOT using those cookbooks:
- activate the script as usual by adding the following line to your local/config.php:
- customize the $MarkupToUnstyledIgnorePattern array depending on the recipes / markup your wiki implements - see Customization
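A minimal sketch of what these two steps might look like in local/config.php. The file name `markuptounstyled.php` is an assumption, since the exact include line is not shown here; use the name under which you saved the script. `$FarmD` is PmWiki's standard farm directory variable.

```php
<?php
// local/config.php
// Assumption: the script was saved as cookbook/markuptounstyled.php;
// adjust this (hypothetical) file name to match your installation.
include_once("$FarmD/cookbook/markuptounstyled.php");

// Extend the ignore list only AFTER the include,
// e.g. with the pattern for the Footnotes recipe.
$MarkupToUnstyledIgnorePattern[] = '\\[\\^(.*?)\\^\\]';
```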
Whenever you need unstyled, text-only output, call
$unstyledtext = MarkupToUnstyled(
$unstyledtext will then contain no markup, no links, no formatting and no HTML tags.
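The argument list of the call is cut off above, so the following is only a sketch. It assumes MarkupToUnstyled() takes the same `($pagename, $text)` arguments as PmWiki's MarkupToHTML(); check the recipe source for the actual signature. `HeadingLinkText` is a hypothetical helper name, not part of the recipe.

```php
<?php
// Hypothetical wrapper; not part of the recipe.
// Assumption: MarkupToUnstyled($pagename, $text) mirrors MarkupToHTML().
function HeadingLinkText($pagename, $markup) {
    // If the recipe is not loaded, fall back to the raw markup unchanged.
    if (!function_exists('MarkupToUnstyled')) {
        return $markup;
    }
    return MarkupToUnstyled($pagename, $markup);
}
```

With the recipe loaded, the helper returns the text-only version of the heading; without it, the markup is passed through untouched, so pages that call it do not break.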
How it works
- redirects all link functions to suppress the generation of <a href></a> tags and to produce only the regular PmWiki link text
[[PageWithTitle|+]] becomes 'TitleOfPageWithTitle'
[[PageNotYetCreated|+]] becomes 'PageNotYetCreated'
- removes markup patterns from the input text that shouldn't be evaluated in step 4, i.e. markup whose output is not wanted in the unstyled text - see Customization
- removes HTML tags BEFORE evaluating markup (e.g. [@..@] might already be wrapped in <code class='escaped'>)
- evaluates markup by calling PmWiki's MarkupToHTML
- removes newlines from result
- removes html tags from result
- replaces non-styling %...% produced by $KeepTokens, which might be restored in step 4
- restores LinkFunctions back to their original function call
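The text-cleanup part of the steps above (steps 2 to 6) can be sketched in plain PHP. This is an illustration, not the recipe's code: `sketch_unstyle` is a hypothetical name, the link-function redirection and the $KeepTokens handling are left out, and the markup evaluation is passed in as a callback so the example runs without PmWiki.

```php
<?php
// Illustrative stand-in for steps 2-6; not the recipe's actual code.
function sketch_unstyle($text, array $ignorePatterns, callable $evaluateMarkup) {
    // Step 2: drop markup that should not be evaluated at all.
    foreach ($ignorePatterns as $pat) {
        $text = preg_replace("/$pat/", '', $text);
    }
    // Step 3: strip HTML tags that are already present in the input.
    $text = strip_tags($text);
    // Step 4: evaluate the markup (MarkupToHTML in the real recipe).
    $html = $evaluateMarkup($text);
    // Step 5: remove newlines from the result.
    $html = str_replace(array("\r", "\n"), '', $html);
    // Step 6: strip the HTML tags that the evaluation generated.
    return trim(strip_tags($html));
}

// Fake evaluator that only understands '''bold''' and wraps the text in <h2>.
$fakeEvaluate = function ($t) {
    return '<h2>' . str_replace("'''", '', $t) . "</h2>\n";
};
$patterns = array('\\[\\^(.*?)\\^\\]');   // Footnotes pattern from Customization
$result = sketch_unstyle("'''Intro'''[^a note^]", $patterns, $fakeEvaluate);
// $result is "Intro"
```

Removing the ignored patterns before evaluation, rather than afterwards, means their output never enters the HTML in the first place, so nothing has to be identified and cut out of the rendered result.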
Customization
$MarkupToUnstyledIgnorePattern holds regex patterns for markup that should be ignored in unstyled text.
These patterns are removed from the input before calling MarkupToHTML.
By default it holds the patterns for [[target |#]] reference links and [[#anchor]] anchors:
"(?>\\[\\[([^|\\]]+))\\|\\s*#\\s*\\]\\]", // [[target |#]] reference links
"(?>\\[\\[#([A-Za-z][-.:\\w]*))\\]\\]" // [[#anchor]]
Depending on the cookbooks / markup your wiki uses, you should extend the $MarkupToUnstyledIgnorePattern array after including the script.
E.g. if you have the Footnotes cookbook installed, you should add the following to your config.php:
$MarkupToUnstyledIgnorePattern[] = '\\[\\^(.*?)\\^\\]';
Cookbook SectionEdit already adds the following pattern:
$MarkupToUnstyledIgnorePattern[] = '\\(:sectionedit.*:\\)';
The $MarkupToUnstyledIgnorePattern array will be extended in future versions. I'm no PmWiki expert, and there may be many more built-in PmWiki markups that should be ignored.
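To see what the two default patterns actually remove, they can be tried in isolation. The sketch below assumes the patterns are wrapped in `/` delimiters before being passed to preg_replace; how the recipe itself applies them is not shown here, and the sample text is made up.

```php
<?php
// The two default patterns, copied from the list above.
$defaults = array(
    "(?>\\[\\[([^|\\]]+))\\|\\s*#\\s*\\]\\]",  // [[target |#]] reference links
    "(?>\\[\\[#([A-Za-z][-.:\\w]*))\\]\\]",     // [[#anchor]]
);

// Made-up sample text containing one anchor and one reference link.
$text = "[[#top]]See the spec[[http://example.com/ |#]] for details";

// Assumption: patterns are delimited with "/" for preg_replace.
foreach ($defaults as $pat) {
    $text = preg_replace("/$pat/", '', $text);
}
// $text is now "See the spec for details"
```

A pattern of your own can be checked the same way before it is added to the array in config.php.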
The recipe is required by cookbooks
Release notes
- (2009-03-01) Added markup to $MarkupToUnstyledIgnorePattern
- (2009-02-26) Initial version
See discussion at MarkupToUnstyled-Talk