Wikipath

Summary: A page text lookup grammar which attempts to understand wiki page structure
Version: 1.1
Prerequisites:
Status: alpha
Maintainer: Martin Fick
Categories:

Questions answered by this recipe

How can I get a list of all the <insert element> sections of a wiki page?

Description

A page text lookup grammar which attempts to understand wiki page structure

If you are familiar with them, think of wikipaths as a cross between regular expressions and XPath. Wikipaths let you query (look up) text in a wiki page by describing the type of text you are looking for, or the type of section that contains it. In a sense they are a glorified version of page text variables mixed with the include directive. For example, as with a page text variable, you could look up a data value such as:

Phone: (777) 222-5678

with {$:Phone}. Now imagine that many people are listed on a page, each with a phone number written just like the one above, and that each person's data is preceded by a heading with their name. You can look up the phone number after the heading Joe like this: {{$!%Joe:Phone}}
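The idea behind that lookup can be sketched outside the wiki. The following Python is only an illustration, not the recipe's PHP code: the function name lookup_after_heading is hypothetical, headings are assumed to be lines starting with one or more ! marks, and a heading section is assumed to end at the next heading with the same number of ! marks or fewer.

```python
import re


def lookup_after_heading(page_text, heading_word, variable):
    """Return the value of a 'variable: value' line found under the first
    heading whose first word is heading_word, or None if there is none."""
    in_section = False
    section_level = 0
    # Assumed form of a colon text variable line: "Name: value".
    value_line = re.compile(r"\s*" + re.escape(variable) + r"\s*:\s*(.*)")
    for line in page_text.splitlines():
        heading = re.match(r"(!+)\s*(.*)", line)
        if heading:
            level = len(heading.group(1))
            if in_section and level <= section_level:
                break  # assumed rule: same or fewer ! marks closes the section
            if not in_section and heading.group(2).split()[:1] == [heading_word]:
                in_section = True
                section_level = level
            continue
        if in_section:
            found = value_line.match(line)
            if found:
                return found.group(1).strip()
    return None
```

Called on a page that holds a !Joe heading followed by the Phone line above, lookup_after_heading(page, "Joe", "Phone") would hand back that phone number, and it returns None when no such heading exists.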

To achieve this, wikipaths have a structured grammar, designed around the wiki's page structuring markup, that lets you define an exact path to your data. A wikipath is an expression (a pattern) that describes the path to the data you are looking up.

Grammar

A wikipath takes on the following form: {{WikiPage$path}} where WikiPage is an optional name of a page (the default is the current page), and path is a path within the page.
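As a rough illustration of this form, here is a small Python sketch that splits a wikipath into its page name and path. It is not taken from the recipe; the function name split_wikipath and the page names used with it are made up for the example, and the split is assumed to happen at the first $.

```python
import re

# {{ + optional page name + $ + path + }}
WIKIPATH = re.compile(r"\{\{([^${}]*)\$(.*)\}\}")


def split_wikipath(expression, current_page):
    """Split '{{WikiPage$path}}' into (page, path).

    An empty page name falls back to current_page, mirroring the
    'default is the current page' rule described above."""
    m = WIKIPATH.fullmatch(expression)
    if m is None:
        raise ValueError("not a wikipath: %r" % expression)
    page = m.group(1) or current_page
    return page, m.group(2)
```

For instance, split_wikipath("{{$#3}}", "Main.HomePage") gives ("Main.HomePage", "#3"), while a wikipath that names a page keeps that page name.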

Paths

A path is written from left to right, from general to specific. You can imagine a path as a filter which expresses which data to retrieve. A blank filter (path) would retrieve the entire text of a page, a simple filter might retrieve a section of a page, while a long intricate filter may retrieve only a word. A path is composed primarily of elements and identifiers. Elements specify the type of element to filter on: headings, list items, lines, words. Identifiers filter further within an element type: heading number 3, the line starting with the word Joe, and so on.

Elements

Markup   Element                Special Notes
!        Heading                Leveled Item
#        Ordered List Item      Leveled Item
*        Unordered List Item    Leveled Item
;        Line
:        Colon Text Variable
::       Definition List Item
.        Word

Leveled Items

With Leveled Items, an exact level can be specified just as you would with normal wiki markup.

i.e. !!! for a third level heading

It is also possible to specify a range of levels by following the level markup with a + to include more levels or a - to include fewer levels.

i.e. ##+ for 2nd or higher level list items (3rd level, 4th level...) or ****- for 4th or lower level list items (3rd, 2nd and 1st)
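The level rules can be written down as a small check. This Python sketch only mirrors the description above; the function names parse_level_spec and level_matches are hypothetical and do not come from wikipath.php.

```python
import re


def parse_level_spec(spec):
    """Split a leveled element such as '##+' into (marker, level, modifier)."""
    m = re.fullmatch(r"(!+|#+|\*+)([+-]?)", spec)
    if m is None:
        raise ValueError("not a leveled element: %r" % spec)
    run, modifier = m.groups()
    return run[0], len(run), modifier


def level_matches(spec, level):
    """Tell whether an item at the given level (1 = top) satisfies spec."""
    _, wanted, modifier = parse_level_spec(spec)
    if modifier == "+":
        return level >= wanted       # this level or more
    if modifier == "-":
        return 1 <= level <= wanted  # this level or fewer
    return level == wanted           # exact level only
```

So level_matches("##+", 4) holds, level_matches("##+", 1) does not, and level_matches("****-", 1) holds.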

Identifiers

Markup          Identifier
letters         Leading Word Identifier
number          Numerical Identifier
number-number   Numerical Range
number-         Single Ended Decreasing Numerical Range
number+         Single Ended Increasing Numerical Range
/ regexp /      Regular Expression Identifier

Leading Word Identifiers are a simple, but not very powerful, way to specify which Elements to retrieve. They will only retrieve elements which begin with the Leading Word Identifier, i.e. {{$!To}} retrieves all the first level headings which begin with the word To.

Regular Expression Identifiers, on the other hand, provide a powerful but more complex way of specifying exactly which elements to retrieve; an element is retrieved only if it matches the regular expression.

The various Numerical Identifiers provide a means to specify elements by position relative to other elements of the exact same type.

Identifier Examples:

{{$#3}} specifies the third top level ordered list item
{{$!!-Name}} specifies all the 1st or 2nd level heading items which begin with the word Name
{{$*3-}} specifies the first three unordered list items
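To make the identifier types concrete, here is a Python sketch that filters a list of element texts (all of one type, in page order) by an identifier. It is an illustration under assumptions, not the recipe's code: the name select is hypothetical, n- is read as "the first n" following the {{$*3-}} example, n+ is assumed to mean "n and later" by symmetry, and a regular expression is assumed to match anywhere in the element.

```python
import re


def select(items, identifier):
    """Filter items by a leading word, numerical, range or /regexp/ identifier."""
    # /regexp/ : keep elements the pattern matches (anywhere, by assumption).
    if len(identifier) >= 2 and identifier[0] == "/" and identifier[-1] == "/":
        pattern = re.compile(identifier[1:-1])
        return [text for text in items if pattern.search(text)]
    # number, number-number, number- or number+ : positions counted from 1.
    m = re.fullmatch(r"(\d+)(?:-(\d+)|([+-]))?", identifier)
    if m:
        first = int(m.group(1))
        if m.group(2):
            low, high = first, int(m.group(2))
        elif m.group(3) == "-":
            low, high = 1, first           # e.g. 3- is the first three
        elif m.group(3) == "+":
            low, high = first, len(items)  # assumed: 3+ is the third onwards
        else:
            low = high = first
        return items[max(low, 1) - 1:high]
    # letters : keep elements whose first word is the identifier.
    return [text for text in items if text.split()[:1] == [identifier]]
```

With items such as "To do", "Tomorrow", "To buy" and "Name list", the identifier To keeps the first and third entries, 3- keeps the first three, and /^To/ keeps everything that starts with the letters To.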

Sections

Sometimes you might want to specify an element in relation to other elements that are not of the same type, that is, by Section. Imagine that you want all the list items under the second heading (but not part of the second heading itself). Section Elements let you do just this: {{$!2#+}}. All of the Leveled Elements are also Section Elements, which means that they can be used to split a wiki page into Sections. Exactly how a Section is defined (where it starts and ends) depends on the section type, specifically:

Heading Sections extend from one heading to (but will not include) the next heading of the same level or higher.

List Sections extend within a list from one list item to (but will not include) the next list item of the same level or higher. Ordered and unordered list items are interchangeable and will readily terminate a section of the other type.
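The two section rules can be sketched with one small Python generator. This is a reading of the rules above and not the recipe's implementation: the names marker_level and split_sections are hypothetical, "same level or higher" is assumed to mean the same number of markup characters or fewer, and a list is assumed to end at the first line that is not a list item.

```python
import re


def marker_level(line, markers):
    """Return how many marker characters start the line (0 if none)."""
    m = re.match("[" + re.escape(markers) + "]+", line)
    return len(m.group(0)) if m else 0


def split_sections(lines, markers, level, end_on_plain_line=False):
    """Yield each section (a list of lines) opened by an item of exactly
    the given level. Use markers '!' for headings and '#*' for lists, so
    that ordered and unordered items terminate each other's sections."""
    current = None
    for line in lines:
        found = marker_level(line, markers)
        ends = 0 < found <= level or (found == 0 and end_on_plain_line)
        if ends and current is not None:
            yield current
            current = None
        if found == level:
            current = [line]      # a new section starts with its own item
        elif current is not None:
            current.append(line)  # everything else belongs to the open section
    if current is not None:
        yield current
```

Heading sections would be requested as split_sections(lines, "!", 1) and list sections as split_sections(lines, "#*", 1, end_on_plain_line=True).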

When a wikipath contains more than one Section Element in a row, all but the last Section Element represent an entire Section of the wiki page. For convenience, the last Section Element simply represents the Element itself. If you want the last Section Element of a wikipath to represent a section as well, simply follow it with a %. You may also use the % to separate two Section Elements of the same type from each other, as in {{$##%###}}, which would retrieve all the 3rd level ordered list items that are in a 2nd level ordered list item section.
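Putting the pieces of the grammar together, a path can be broken into (element, identifier) pairs before any lookup happens. The tokenizer below is a guess at how such a split could work, written in Python purely for illustration: the name tokenize and the exact token pattern are assumptions, and wikipath.php may resolve ambiguous paths differently.

```python
import re

TOKEN = re.compile(r"""
    (?P<element>    !+[+-]? | \#+[+-]? | \*+[+-]? | :: | : | ; | \. | % )
    (?P<identifier> /(?:[^/\\]|\\.)*/ | \d+-\d+ | \d+[+-]? | [A-Za-z]\w* )?
""", re.VERBOSE)


def tokenize(path):
    """Split a path into (element markup, identifier) pairs, left to right.

    The identifier is '' when an element has none. The % separator is
    returned as an element of its own."""
    tokens = []
    pos = 0
    while pos < len(path):
        m = TOKEN.match(path, pos)
        if m is None:
            raise ValueError("unexpected text in path: %r" % path[pos:])
        tokens.append((m.group("element"), m.group("identifier") or ""))
        pos = m.end()
    return tokens
```

For example, tokenize("!2#+") gives [("!", "2"), ("#+", "")] and tokenize("##%###") gives [("##", ""), ("%", ""), ("###", "")].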

Installation Instructions

To install this recipe, download the file wikipath.php into your cookbook/ directory and add include_once('cookbook/wikipath.php'); to your local/config.php

Notes

You can see and edit a live example of this recipe here.

While wikipaths allow you to retrieve lists of elements from a wiki page, they do not yet provide a way to format these lists. Eventually a templating method similar to pagelisttemplates could be implemented to do this.

Release Notes

V 1.1 - 11/28/06 - Works with pmwiki-2.1.10

A bug fix release for some section bugs that I found while developing unit tests.

V 1.0 - 11/09/06 - Works with pmwiki-2.1.10

First release, a good start, but many missing features and probably many bugs and ambiguous or unexpected path definitions. This will be an evolving recipe adding more element types and more identifier types as things progress along with ranges and other high level page structuring concepts. For now I have tried to keep the grammar rather simple and fairly easy to use, I hope that it can grow gracefully. :) This recipe will probably also need a lot of testing and tweaking to properly identify even the simplest elements. Despite my best efforts it really is a rather complicated task and recipe, but I think that it is now strong enough to actually be useful.

Comments

  • FYI: I had to add $lines = array(); in wikipath_lookup_molecule at line 369 (before loop over sections) to shut down php warnings. Aras, 12/04/2006
Thanks, I will look into this. I plan on making a new release soon, as soon as I finish my new table work, which looks promising! :) Martin Fick December 04, 2006, at 04:36 PM

See Also

Contributors
