01136: change PTV use in pagelists to be more intuitive

Summary: change PTV use in pagelists to be more intuitive
Created: 2009-09-01 21:26
Status: Open
Category: CoreCandidate
Assigned:
Priority: 52
Version: 2.2.0 beta 65
OS: 5.?

Description: Currently there is no way to differentiate between an undefined PTV as opposed to a defined-blank PTV. And the way to select on a blank PTV is the highly unintuitive $:ptv=-?* while the way to select on a non-blank PTV is the only slightly less unintuitive $:ptv=?*.

Suggestion:

  • To select by defined and undefined:
    • defined: $:ptv=*
    • undefined: $:ptv=-*
  • To select by blank and non-blank:
    • blank: $:ptv=""
    • non-blank: $:ptv=- or $:ptv="-"
      • technically $:ptv=-"" would be clearer, but it doesn't work without changes to ParseArgs which seems too invasive

To accomplish this I offer this modified PageListVariables() function from scripts/pagelist.php below:

function PageListVariables(&$list, &$opt, $pn, &$page) {
  global $PCache;
  switch ($opt['=phase']) {
    case PAGELIST_PRE:
      $varlist = preg_grep('/^\\$/', array_keys($opt));
      if (!$varlist) return 0;
      foreach($varlist as $v) {
        switch ($opt[$v]) {
        case '*':
          $opt['=varexists'][] = preg_replace('/^\$:?/', "", $v);
          break;
        case '-*':
          $opt['=varnoexists'][] = preg_replace('/^\$:?/', "", $v);
          break;
        case '':
          $opt['=varinclp'][$v] = '/^$/';
          break;
        #case '-':  # this case is handled without difficulty below
        default:
          list($inclp, $exclp) = GlobToPCRE($opt[$v]);
          if ($inclp) $opt['=varinclp'][$v] = "/$inclp/i";
          if ($exclp) $opt['=varexclp'][$v] = "/$exclp/i";
        }
      }
      return PAGELIST_ITEM;

    case PAGELIST_ITEM:
      if (@$opt['=varinclp'])
        foreach($opt['=varinclp'] as $v => $pat) 
          if (!preg_match($pat, PageVar($pn, $v))) return 0;
      if (@$opt['=varexclp'])
        foreach($opt['=varexclp'] as $v => $pat) 
           if (preg_match($pat, PageVar($pn, $v))) return 0;
      if (@$opt['=varexists']) {
        PageTextVar($pn, 'x'); // make sure $PCache is primed
        foreach ($opt['=varexists'] as $var)
          if (!isset($PCache[$pn]["=p_$var"])) return 0;
      }
      if (@$opt['=varnoexists']) {
        PageTextVar($pn, 'x'); // make sure $PCache is primed
        foreach ($opt['=varnoexists'] as $var)
          if (isset($PCache[$pn]["=p_$var"])) return 0;
      }
      return 1;
  }
}

NOTE: This would be significantly more usable if there were the capabilities of combining search terms (on the same variable) with "and", but at least this is a step in the right direction.

See also

  • 00880 a similar change is suggested, but there was too much else mixed in there.
  • 01152   Display empty and undefined variables as null

Can you give an example of where it would really be beneficial to know that a PTV has been set to a blank value, as opposed to not set at all? I'd suggest that in those cases it'd be easier to set the PTV to a false-equivalent value such as '0' instead of setting it completely blank, rather than trying to fix this in the core.

Let's say I've got pages related to library resources. On books I've got a "$:reprinted" date which is blank for books which have not been reprinted but has a value if they are reprinted. I also have several DVDs and other resources that don't have the $:reprinted at all. If I can differentiate between blank and undefined then I can choose non-books by choosing $:reprinted as undefined. (Of course, I can set up the data with a "$:resourcetype" or something, but nobody argues you can get around the problem -- the question is whether it wouldn't be better to be able to differentiate.
Another example. Let's say I'm going through group X and setting some variable to some value which could be blank. I'm changing 500 pages and I leave it in the middle and don't remember which pages have been set and I'd rather not have to look at every blank value -- I just want to see the pages that haven't been assigned yet. (Make it a hidden definition so you can't just search for the variable name.) Etc, etc. --Peter Bowers September 02, 2009, at 11:45 PM
This seems like a bit of a stretch. In the first example, you're rather heavily overloading the meaning of $:reprinted, which even then could be set to '0' for books that have not been reprinted; in which case admittedly the conditional for finding reprinted books would need to be something like (:if ( !equal '{*$:reprinted}' && !equal '{*$:reprinted}' 0 ):) instead of (:if !equal '{*$:reprinted}':). Regarding the second example, you can search for a variable name even if it's in the markup as (:varname:blah:).
To clarify, I'm not fundamentally opposed to being able to determine if a PTV has been set or not, but I'm not convinced that there's enough need for it. The danger here is that this would establish a precent for PmWiki variables to have a value of undefined, which they currently can't have, and I rather like it this way. Also, your proposal would leave -?* as the expression to use for what I'd consider the most common case, ie. listing those pages for which a variable isn't set to some non-null value. —Eemeli Aro September 03, 2009, at 05:49 AM
Several months ago I had a "real-life" example where it was important to be able to know whether the ptv was set or not. Unfortunately all I can come up with now are, admittedly, somewhat "stretchy" examples... :-) However, I'm not changing the current feature-set, but rather making use of a current-unused syntactic element with a "new" meaning. If someone doesn't want to differentiate then they just don't use $:ptv=* or $:ptv=-* -- other than adding an additional way of specifying $:ptv="" that everything else remain the same --Peter Bowers September 03, 2009, at 08:15 AM
I could be wrong, but I think PVs don't have an "undefined" state - am I right? Either it's set in $FmtPV or it's not set there... If a PV can be defined on one page and not on another then show me how and I'll look into modifying my implementation. Or another implementation - I'm not picky on how it's done... (Now that I think about it, something would have to be done to allow the $PV="" syntax, so I would need to re-think.) --Peter Bowers September 02, 2009, at 11:45 PM
Just as you said, a variable can either be in $FmtPV or not, and it may well be that some per-group customization adds an entry there, in which case it could be undefined. But my point was more that if something works for $:ptv=, it really should work for $pv= just as well. —Eemeli Aro September 03, 2009, at 05:49 AM
100% agree. --Peter Bowers September 03, 2009, at 08:15 AM

However, I would support making $:ptv=* and $:ptv=-* equivalent of the current $:ptv=?* and $:ptv=-?*, which could be achieved by modifying GlobToPCRE in pmwiki.php, or by changing the first foreach loop in PageListVariables to the following: —Eemeli Aro September 02, 2009, at 06:37 AM

foreach($varlist as $v) {
  list($inclp, $exclp) = str_replace('^.*$', '^.+$', GlobToPCRE($opt[$v]));
  if ($inclp) $opt['=varinclp'][$v] = "/$inclp/i";
  if ($exclp) $opt['=varexclp'][$v] = "/$exclp/i";
}
(See below re '*' as "one or more") Wouldn't you say $:ptv="" is a more intuitive way of saying "$:ptv is set to a blank value" as compared to $:ptv=-?* ? The -?* is logical in a somewhat complex way, but definitely not what someone would intuitively try... (At least I wouldn't have tried it.) --Peter Bowers September 02, 2009, at 11:45 PM
This I might agree on, provided that 'blank' can also mean 'not set'. Here's a different fix that should enable $:ptv='' and $:ptv=-''. —Eemeli Aro September 03, 2009, at 05:49 AM
foreach($varlist as $v) {
  if ($opt[$v]==='') { $opt['=varinclp'][$v] = '/^$/'; continue; }
  list($inclp, $exclp) = GlobToPCRE(preg_replace('/(^|,\s?)[-!][\'"]{2}/', '$1?*', $opt[$v]));
  if ($inclp) $opt['=varinclp'][$v] = "/$inclp/i";
  if ($exclp) $opt['=varexclp'][$v] = "/$exclp/i";
}
Yes, I'm not changing the meaning of "blank" -- it still includes both empty and undefined. --Peter Bowers September 03, 2009, at 08:15 AM

I am not sure we want to change the wildcard meaning. In all other places (group=PmWiki??*, name= , etc.) we have '*' equals 'zero or more' characters, and not 'one or more'. It will be counter-intuitive for the wildcard to behave differently for PageTextVariables. Changing it everywhere (GlobToPCRE) will break existing sites.

I would strongly agree we shouldn't change '*' to mean "one or more" -- anybody that is accustomed to using wildcards would have to remember this exception, complicating rather than simplifying things. If $:ptv=* is set to mean "$:ptv is defined" (as I've suggested) then what you are saying is really "$:ptv has a value which consists of 0 or more characters", a completely logical extension of the wildcard. It would be kind of making it a typed comparison, forcing a string value -- so the value NULL (or undef or whatever) is not considered a "string" and so can't be globbed. --Peter Bowers September 02, 2009, at 11:45 PM

OTOH, the current subversion release allows disabling a core $PageListFilter function, by setting it to a negative value. This makes it possible to use a custom function instead, without replacing the whole pagelist.php script (which was already possible, for wikis which require the other functionality). --Petko September 02, 2009, at 07:19 AM

Hm. You are probably right. If the search is for a single "*" then it is very likely that the user wanted a PTV that has 'one or more' characters, or they wouldn't bother searching for this PTV. Interesting. --Petko September 02, 2009, at 02:24 PM

No, I don't think so. I think defining a PTV as an empty string is a very valid and legitimate value, but should be able to be differentiated from the case where I haven't set a value at all for that PTV on that page. Right now $:ptv=* matches every page and $:ptv=-* matches no page, so they are completely useless and thus unused options. Giving them the meaning of defined/undefined adds functionality and maintains consistency...--Peter Bowers September 02, 2009, at 11:45 PM