00007: WikiWord counts not available yet (v2)

Summary: WikiWord counts not available yet (v2)
Created: 2004-08-26 19:56
Status: Closed, fixed for 2.0.beta6
Category: Bug
From: Pm
Assigned:
Priority: 355
Version: 2.0.devel3
OS:

Description: The code for rendering links based on counts of WikiWords has not been implemented yet.

In addition, it would be nice if there were pattern-based mechanisms for enabling WikiWords or converting words to specific page locations.

Steven Leite wrote: I consider this to be a vital function of PmWiki. As a matter of fact, I've never used WikiWords (at least not intentionally). It's particularly frustrating when entering text such as a postal code with mixed numbers and letters, only to have it wikified, and then having to go back and put the [=...=] brackets around the offending section of text. I give this feature a priority of 5.
Morgan wrote: Not to mention that having to add a ` or [=...=] doesn't help when the offending word is auto-generated, as in part of breadcrumbs. Not only does it look wrong, it's linking to a non-existent page (one with spaces, where there shouldn't be any).

Note that it's possible to disable WikiWords entirely (if that's what you want) by setting $LinkWikiWords=0; in config.php. Also, you can prevent a word from being treated as a WikiWord by placing a ` in front of it, as in `WikiWord. --Pm
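
For illustration, a minimal config.php that turns WikiWord links off site-wide might look like the sketch below. Only the $LinkWikiWords setting comes from the note above; the guard on the first line is the usual opening of a PmWiki local configuration file, and any other settings you already have would stay in place.

```php
<?php if (!defined('PmWiki')) exit();
# Disable automatic WikiWord links for the whole site.
$LinkWikiWords = 0;
```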

In my opinion, the earlier method of listing the words that should not show up as WikiWords is a better approach: it's a one-time edit, kept in one place. Please offer this feature as a plugin script, if not as a default one.
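
A rough sketch of how such a word list could work is shown below. It is not part of PmWiki: the function name escapeListedWikiWords and its arguments are hypothetical, and it simply places a ` in front of each listed word in a piece of text, using the escape described in Pm's note. Hooking something like this into page rendering is left open.

```php
<?php
// Hypothetical helper, not a PmWiki API: prefix each listed word with a
// backquote so that it would not be treated as a WikiWord.
function escapeListedWikiWords(string $text, array $words): string {
    if (!$words) return $text;
    // Quote each word so it is matched literally inside the pattern.
    $alternatives = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words));
    // Match a listed word only when it stands alone and is not already
    // preceded by a backquote.
    $pattern = '/(?<![`\w])(' . $alternatives . ')(?!\w)/';
    return preg_replace($pattern, '`$1', $text);
}
```

The list of words would live in one place (for example an array in a local configuration file), which is the one-time edit the request describes.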

Zev Goldberg wrote on 2007-04-17: I hacked together this script from the FmtPageList function in the PageLists script to count all the words in the listed pages. I'm pretty sure this can be cleaned up a little, but I know this works for me. I'm using PmWiki version 2.1.27.

Add the following to your config file:

Markup('wordcount', 'directives',
  '/\\(:wordcount(\\s+.*?)?:\\)/ei',
  "FmtPageListWordCount('\$MatchList', \$pagename, array('o' => PSS('$1 ')))");

function FmtPageListWordCount($outfmt, $pagename, $opt) {
	global $GroupPattern, $FmtV, $FPLFormatOpt, $FPLFunctions;
	# get any form or url-submitted request
	$rq = htmlspecialchars(stripmagic(@$_REQUEST['q']), ENT_NOQUOTES);
	# build the search string
	$FmtV['$Needle'] = $opt['o'] . ' ' . $rq;
	# Handle "group/" at the beginning of the form-submitted request
	if (preg_match("!^($GroupPattern(\\|$GroupPattern)*)?/!i", $rq, $match)) {
		$opt['group'] = @$match[1];
		$rq = substr($rq, strlen(@$match[1])+1);
	}
	# merge markup options with form and url
	$opt = array_merge($opt, ParseArgs($opt['o'] . ' ' . $rq), @$_REQUEST);
	# non-posted blank search requests return nothing
	if (@($opt['req'] && !$opt['-'] && !$opt[''] && !$opt['+'] && !$opt['q']))
		return '';
	# terms and group to be included and excluded
	$GLOBALS['SearchIncl'] = array_merge((array)@$opt[''], (array)@$opt['+']);
	$GLOBALS['SearchExcl'] = (array)@$opt['-'];
	$GLOBALS['SearchGroup'] = @$opt['group'];
	$fmt = @$opt['fmt']; if (!$fmt) $fmt = 'default';
	$fmtopt = @$FPLFormatOpt[$fmt];
	if (!is_array($fmtopt)) {
		if ($fmtopt) $fmtopt = array('fn' => $fmtopt);
		elseif (@$FPLFunctions[$fmt]) 
			$fmtopt = array('fn' => $FPLFunctions[$fmt]);
		else $fmtopt = $FPLFormatOpt['default'];
	}
	$fmtfn = @$fmtopt['fn'];
	if (!is_callable($fmtfn)) $fmtfn = $FPLFormatOpt['default']['fn'];
	$matches = array();
	$opt = array_merge($fmtopt, $opt);
	$out = $fmtfn($pagename, $matches, $opt);
	$FmtV['$MatchCount'] = count($matches);
	if ($outfmt != '$MatchList') 
		{ $FmtV['$MatchList'] = $out; $out = FmtPageName($outfmt, $pagename); }
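	// start word count code: Zev Goldberg
	// (the ereg* functions and each() no longer exist in current PHP, so preg_* and foreach are used)
	$string = $out;

	// place custom $string definitions here.

	// add a space after every tag so neighbouring words don't fuse when tags are stripped
	$string = preg_replace('/(<[^>]+>)/', '$1 ', $string);
	$string = strip_tags($string);
	// split on runs of spaces, dropping empty pieces
	$words = preg_split('/ +/', $string, -1, PREG_SPLIT_NO_EMPTY);
	$word_count = 0;
	foreach ($words as $word) {
		// add or remove characters in the following pattern to customize;
		// the \x ranges are the ISO-8859-1 bytes for À-Ö, Ø-ö and ø-ÿ
		if (preg_match('/[0-9A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\xFF]/', $word)) {
			$word_count++;
		}
	}
	return $word_count;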
}

To display the word count, place (:wordcount:) in a wiki page and add PmWiki.PageLists parameters to select the pages you want to count. So far I've only tested it with what I needed, using the group and fmt parameters.

You can tweak the $string definition statements to include or exclude additional strings and characters. Replace any strings you like; only the final word count is displayed.
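
To try out changes to the counting rules without a wiki, the counting step can be pulled into a standalone function, roughly as sketched here. The name countWords is made up for this example, and the sketch assumes UTF-8 input, so it tests for any Unicode letter or digit rather than the fixed character list used in the script above.

```php
<?php
// Standalone sketch of the counting step; countWords is a hypothetical name.
function countWords(string $html): int {
    // Add a space after every tag so words in adjacent elements stay separate.
    $text = preg_replace('/(<[^>]+>)/', '$1 ', $html);
    $text = strip_tags($text);
    // Split on runs of spaces, dropping empty pieces.
    $words = preg_split('/ +/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $count = 0;
    foreach ($words as $word) {
        // Count a piece only if it contains at least one letter or digit
        // (assumes UTF-8 input; adjust the pattern to customize).
        if (preg_match('/[\p{L}\p{N}]/u', $word)) {
            $count++;
        }
    }
    return $count;
}
```

With this version, pieces made up only of punctuation (such as a lone "--") are skipped, which matches the intent of the character test in the original loop.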