Grep

Summary: Use regular expressions to control what to include from a page
Version: 20171103
Prerequisites: PmWiki 2.2.58
Status: beta
Maintainer: Petko (original author: Eemeli Aro)
Users: +2
Categories: Includes, Markup, PHP55, PHP72, PHP83
Discussion: Grep-Talk
Download: grep.php
License: GPLv3+

Description

(:grep:) is a wrapper for (:include:) that lets you search and replace using regular expressions within the included text. Internally, it's a wrapper for the PHP functions preg_replace and preg_match_all.

For example, given a page with a list of news items:

[[#news]]
* 14.6.2031: Item one
* 3.12.2030: Item two
* 24.5.2010: Item three
* 17.5.2010: Item four
[[#newsend]]

  • 14.6.2031: Item one
  • 3.12.2030: Item two
  • 24.5.2010: Item three
  • 17.5.2010: Item four

The following directive would allow you to highlight the items with future dates:

(:grep #news pat='(\d\d?)\.(\d\d?)\.(\d{4})' repl="{:if date ..$3-$2-$1:}'''$0'''{:else:}$0{:ifend:}":)

  • '''14.6.2031''': Item one
  • '''3.12.2030''': Item two
  • 24.5.2010: Item three
  • 17.5.2010: Item four
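
Because (:grep:) wraps preg_replace, the substitution step of this example can be sketched in plain PHP. This is only an illustration, not the recipe's own code: the '/' delimiters, the variable names and the hard-coded subject text are assumptions, while the pattern, the replacement and the default m modifier are taken from this page.

```php
<?php
// Sketch only: reproduces the substitution step of the example directive.
// The '/' delimiters and the variable names are assumptions.
$subject = "* 14.6.2031: Item one\n* 24.5.2010: Item three";

// pat= and repl= exactly as written in the directive (nowdoc avoids escaping).
$pat = '(\d\d?)\.(\d\d?)\.(\d{4})';
$repl = <<<'REPL'
{:if date ..$3-$2-$1:}'''$0'''{:else:}$0{:ifend:}
REPL;

// mod defaults to m (PCRE_MULTILINE), so that modifier is appended here.
$out = preg_replace('/' . $pat . '/m', $repl, $subject);
echo $out, "\n";
```

Each date ends up wrapped in its own conditional, with the day, month and year reordered into the year-month-day form that the date condition compares against; the curly-bracket directives are then left for PmWiki to evaluate.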

To install

  • download grep.php to your cookbook directory
  • add the following line to your configuration file:
include_once("$FarmD/cookbook/grep.php");

Using the (:grep:) directive

At its most basic, (:grep:) requires a subject (the source text), a pattern (what to look for) and a replacement (what to output for each match):

(:grep Group.Page#section pat=pattern repl=replacement:)
  • subject: the page or section to search. It may also refer to a section on the current page with (:grep #section ...:), which can be useful for formatting data that is stored on the page itself but hidden using (:if false:). If left empty, the subject defaults to the current page, in which case match=1 is also set to prevent recursion (see below).
  • pat=pattern: a valid PHP regexp pattern, without delimiters or modifiers.
  • repl=replacement: the text with which to replace each match.

As the example above shows, any (:directives:) in the pattern or replacement will work if they are written with curly brackets instead of parentheses: {:directive:}. This is needed to get the order of operations right when the markup is turned into HTML.
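
The curly-bracket convention can be pictured as a simple string substitution applied to the text after the regular expression has run. The sketch below is an assumption about how such a step might look, not the recipe's actual code; toDirectives() is a hypothetical helper name.

```php
<?php
// Sketch only: one plausible way to turn {:directive:} into (:directive:)
// after the regex replacement. toDirectives() is a hypothetical helper,
// not a function provided by grep.php or PmWiki.
function toDirectives($text) {
    // Replace the opening and closing markers in turn.
    return str_replace(array('{:', ':}'), array('(:', ':)'), $text);
}

$replaced = "{:if date ..2031-6-14:}'''14.6.2031'''{:else:}14.6.2031{:ifend:}";
echo toDirectives($replaced), "\n";
```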

All other parameters are optional:

(:grep ... mod=abc match=1 limit=123:)
  • mod= Pattern modifiers: a string of characters from the set imsuxADSUXJ. The PREG_REPLACE_EVAL modifier ('e') is not allowed. If mod is not set, PCRE_MULTILINE is enabled by default (mod=m).
  • match=1 If set, only the matching strings are output. If repl is not set, they are separated by newlines.
  • limit= The maximum number of replacements.
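
To make the optional parameters more concrete, here is a rough sketch of how mod=, match= and limit= could map onto the underlying PCRE calls. grepSketch() and its option handling are hypothetical and are not taken from grep.php; in particular, applying limit to the list of matches and joining replaced matches without a separator are assumptions, as is the requirement that the pattern contains no '/'.

```php
<?php
// Sketch only: how mod=, match= and limit= might map onto PCRE calls.
// grepSketch() is hypothetical, not grep.php's code.
// Assumes the pattern contains no '/' (used here as the delimiter).
function grepSketch($subject, $pat, $repl = null, $opt = array()) {
    // Accept only the listed modifiers; anything else (such as 'e')
    // falls back to the default, m.
    $mod = isset($opt['mod']) ? $opt['mod'] : 'm';
    if (!preg_match('/^[imsuxADSUXJ]*$/', $mod)) $mod = 'm';
    $regex = '/' . $pat . '/' . $mod;
    $limit = isset($opt['limit']) ? (int)$opt['limit'] : -1;

    if (!empty($opt['match'])) {
        // match=1: output only the matching strings.
        preg_match_all($regex, $subject, $m);
        $hits = ($limit >= 0) ? array_slice($m[0], 0, $limit) : $m[0];
        if ($repl === null) return implode("\n", $hits);
        return implode('', preg_replace($regex, $repl, $hits));
    }
    // Default: replace within the whole subject, at most $limit times.
    return preg_replace($regex, (string)$repl, $subject, $limit);
}

$text = "a1 b22 c333";
echo grepSketch($text, '\d+', null, array('match' => 1)), "\n";
echo grepSketch($text, '\d+', '#', array('limit' => 2)), "\n";
```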

(:grep:) also accepts all of the same parameters as (:include:):

(:grep FullName#fromanchor#toanchor ... lines=123 self=0 basepage=abc variable=value:)

Custom processing of the matches

Since version 20171103, you can supply a custom function to post-process the matched results (this only works with match=1). To enable it, set the variable $GrepTextJoinFunction in config.php to the name of your function and define that function, for example:

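 $GrepTextJoinFunction = 'MyGrepJoin';
 function MyGrepJoin($pagename, $opt, $array) {
   # nodup and sort are custom options; !empty() avoids warnings when unset
   if(!empty($opt['nodup'])) $array = array_unique($array);
   if(!empty($opt['sort'])) sort($array, SORT_LOCALE_STRING);

   # join the processed matches into the string to be output
   return implode('', $array);
 }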

The custom function receives three arguments: the current $pagename, the $opt array (including any options not used by GrepText, i.e. you can define your own, such as nodup and sort above), and the array with the results of the replacements. It must return a string, which is output into the page.
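
As a further illustration, the sketch below shows a different join function that emits each match as a wiki list item, and calls it by hand with sample data. MyGrepList and the page name Main.News are hypothetical; in normal use the recipe would call the function named in $GrepTextJoinFunction itself.

```php
<?php
// Sketch only: a join function that outputs each match as a list item.
// MyGrepList and 'Main.News' are hypothetical names for illustration.
function MyGrepList($pagename, $opt, $array) {
    // Custom option: drop repeated matches, keeping the first occurrence.
    if (!empty($opt['nodup'])) $array = array_unique($array);
    $out = '';
    foreach ($array as $match) $out .= "* $match\n";
    return $out;
}

// Calling it directly, as the recipe would with the collected matches.
$matches = array('14.6.2031', '3.12.2030', '14.6.2031');
echo MyGrepList('Main.News', array('nodup' => 1), $matches);
```

Since the function receives the whole $opt array, an option such as nodup=1 would presumably be passed in the directive alongside match=1.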

Notes

I wrote this recipe today for the exact purpose mentioned in the example: to highlight future event dates in a list of news items. It might have been easier to write some custom code for it, but this way the code is more robust and someone else might find it useful one day.

Release notes

  • 20171103: add $GrepTextJoinFunction.
  • 20171102: updated for PHP 7.2 compatibility.
  • 20150816: updated for PHP 5.5 compatibility.
  • 2010-06-17: bugfix for empty lines producing a visible <:vspace> in the output, caused by using PVSE() instead of PVS() (Eemeli)
  • 2010-06-14: first public release (Eemeli)

See also

  • Reminder: Birthday, anniversary and other task reminders (Stable)
  • TextExtract: Search, grep, and extract text from other pages or groups with search terms and regular expressions, using a search form or markup expression (Stable)

Contributors

Comments

See discussion at Grep-Talk
