Grep-Talk

Summary: Talk page for Grep.
Maintainer: Petko (original author: Eemeli Aro)
Users: +2

This space is for User-contributed commentary and notes. Please include your name and a date along with your comment.

Please add new threads below this line:


Grep Site.AllRecentChanges for a specific user

List 5 most recent pages last edited by gnuzoo:

(:grep Site.AllRecentChanges match=1 pat="^.*~gnuzoo.*$" limit=5:)

This is cool. Can you grep the entire wiki? Can you grep through the history (?action=diff) of all pages? Can you make it list every page a user has ever edited, and not just the entries in Site.AllRecentChanges? --gnuzoo

No, Grep only greps the current text of a page. --Petko

You could try (:pagelist link=Profiles.gnuzoo:). It lists all pages that link to your profile: either pages you signed (so you edited them), or pages where someone else placed a link to your profile as recognition (Ape, ChangeLog...). --Petko


Remove duplicates

Hi Petko, thanks for this recipe, I find it very useful. May I ask for two things?

  1. An upgrade to PHP 7.2;
  2. An optional parameter to avoid matching duplicate strings (maybe something like noduplicate=1).

I need to create a list of certain words from a wiki page (or from a section). That is no problem with pat=... and repl=..., and I know that I can't get the list in alphabetical order, but if a word occurs several times in the text, I need it to be considered (matched and/or replaced) only once.

Thank you.

Frank 2017-11-02 at 06:10 AM
  1. Done, thanks for reminding me.
  2. You can use limit=1 to replace the first match only (limit=2 for the first two matches, etc.)
    The limit argument refers to the total number of replacements; it doesn't remove duplicate results. Let me think about it. --Petko November 02, 2017, at 12:47 AM
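
To illustrate the point about limit, here is a minimal sketch in Python (not the recipe's PHP). It assumes only what is said above: a limit caps the number of matches processed and does nothing about repeats. The function name first_matches and the sample text are made up for the illustration.

```python
import re
from itertools import islice

# Made-up sample text with a repeated word in parentheses.
TEXT = "(APPLE) and (PEACH), then (APPLE) again and (PEAR)"


def first_matches(text, limit, pattern=r"\((.*?)\)"):
    """Return group 1 of at most `limit` matches, in order of appearance."""
    matches = re.finditer(pattern, text)
    return [m.group(1) for m in islice(matches, limit)]


print(first_matches(TEXT, 1))
print(first_matches(TEXT, 3))
```

With a limit of 3 the repeated word still appears twice, so a cap on the number of matches cannot stand in for duplicate removal.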

My English is not perfect, so it is better if I give an example:

[[#test]]
Lorem ipsum dolor sit amet, (APPLE) consectetur adipiscing elit, sed do eiusmod tempor incididunt 
ut labore et dolore (PEACH) magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco 
laboris nisi ut aliquip (APPLE) ex ea commodo consequat. Duis aute irure dolor in reprehenderit in 
voluptate velit (PEAR) esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat 
non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
[[#testend]]

(:grep {*$FullName}#test#testend match=1 self=1 mod=g pat='\((.*?)\)' repl="* $1\n":)

Result:
  • APPLE
  • PEACH
  • APPLE
  • PEAR

Instead, I need this result:

  • APPLE
  • PEACH
  • PEAR
Frank 2017-11-02 at 08:05 AM

In version 20171103 I've added the possibility to use a custom function to process the results; see the recipe page. The example function can sort the results and remove duplicates (note that, as the function is currently written, the nodup=1 option will probably sort the results anyway). See the PHP functions array_unique and sort for the different flags you can use (case insensitive, numeric, etc.). --Petko November 02, 2017, at 04:16 AM
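
The idea of post-processing the matches can be sketched as follows. This is a Python illustration only, not the PHP example function from the recipe page; the name unique_matches and its sort flag are hypothetical. It removes duplicates while keeping the order of first appearance, and sorts only on request.

```python
import re

# Shortened copy of Frank's sample text from the thread above.
TEXT = (
    "Lorem ipsum dolor sit amet, (APPLE) consectetur adipiscing elit, "
    "ut labore et dolore (PEACH) magna aliqua. "
    "laboris nisi ut aliquip (APPLE) ex ea commodo consequat. "
    "voluptate velit (PEAR) esse cillum dolore."
)


def unique_matches(text, pattern=r"\((.*?)\)", sort=False):
    """Return group 1 of every match with duplicates removed."""
    found = re.findall(pattern, text)
    # dict.fromkeys keeps the first occurrence of each item, in order.
    unique = list(dict.fromkeys(found))
    return sorted(unique) if sort else unique


for word in unique_matches(TEXT):
    print("* " + word)
```

Keeping the dedupe step separate from the sort step is a design choice of this sketch; per the note above, the recipe's example function may behave differently and sort in both cases.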

This solution has solved several problems (... and a headache). Thank you very much.

Frank 2017-11-02 at 05:47 PM

Page wildcards, print entire lines

  1. Can the subject include wildcards, e.g. Groupn*/Pagen*?
  2. How do I include the entire line the text is found on?
For example, I tried: (:grep FixtureCard/{$:Year}* pat='\A||.*(bike|biking|cycle|cycling|mtb).*\Z' repl='$0':)
simon May 28, 2016, at 05:56 PM
  1. No.
  2. Use "$0\n", or do not use a "repl=" parameter:
Without a "repl=" parameter:

(:grep Cookbook.Grep#news match=1 pat='^.*\.2010.*$':)

Result:

  • 24.5.2010: Item three
  • 17.5.2010: Item four

With repl="$0\n":

(:grep Cookbook.Grep#news match=1 pat='^.*\.2010.*$' repl="$0\n":)

Result:

  • 24.5.2010: Item three
  • 17.5.2010: Item four

To list only the lines containing the match (like the Unix `grep` utility), include the "match=1" parameter. --Petko May 29, 2016, at 02:35 AM


Automated Table of Contents

Hello,
Thanks for this snippet. I'm using it to automatically generate a table of contents for a page. --tkcusr

The core now has an automated Table of contents that can be enabled. --Petko


Grep table breaks

I'm using grep here, e.g.

||width=100%
||!Date            ||!Trip              ||!Activity ||!Leader         ||!Phone       ||!Book now                
||![+[[June]]+]    ||-||-||||||
(:grep FixtureCard/{$ThisYear}June pat='.*(?:ski|climb|lodge).*' repl="$0\n" mod=i match=1:)
||!                ||-||-||||||
||![+[[July]]+]    ||-||-||||||
(:grep FixtureCard/{$ThisYear}July pat='.*(?:ski|climb|lodge).*' repl="$0\n" mod=i match=1:)
||!                ||-||-||||||
||![+[[August]]+]  ||-||-||||||
(:grep FixtureCard/{$ThisYear}August pat='.*(?:ski|climb|lodge).*' repl="$0\n" mod=i match=1:) 

It seems to be doing something to the included line that breaks the table markup (perhaps adding a character?). If anyone has any ideas, I'd really appreciate it.

  simon 

In the pages being grepped, you had a line outside the table that contained the word "ski", and it got included. I have modified your directive to this:

(:grep FixtureCard/{$ThisYear}June pat='^\|\|.*(?:ski|climb|lodge).*$' mod=im match=1:)

That is, the pattern now matches only lines starting with "||", and the modifier needs to be "im" so that ^ and $ match the start and end of each line instead of the start and end of the page. --Petko
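
The effect of the "m" modifier can be checked outside the wiki. The sketch below uses Python's re module, whose MULTILINE and IGNORECASE flags behave like the "m" and "i" modifiers for these patterns; the page text is invented for the illustration and is not taken from the FixtureCard pages.

```python
import re

# Invented page: one prose line that mentions "ski", then three table rows.
PAGE = (
    "We went to ski last June.\n"
    "||12 June ||Ski trip ||skiing ||Ann ||555-0100 ||yes\n"
    "||19 June ||Picnic ||walking ||Bob ||555-0101 ||no\n"
    "||26 June ||Lodge weekend ||Climb ||Cy ||555-0102 ||yes\n"
)

KEYWORDS = r"(?:ski|climb|lodge)"

# Original pattern: any line containing a keyword, table row or not.
loose = re.compile(r".*" + KEYWORDS + r".*", re.IGNORECASE)
# Revised pattern: the line must start with "||"; needs the "m" flag.
strict = re.compile(r"^\|\|.*" + KEYWORDS + r".*$", re.IGNORECASE | re.MULTILINE)
# Same revised pattern without "m": ^ only matches at the start of the page.
no_multiline = re.compile(r"^\|\|.*" + KEYWORDS + r".*$", re.IGNORECASE)

loose_lines = loose.findall(PAGE)
strict_lines = strict.findall(PAGE)
page_anchored = no_multiline.findall(PAGE)

print(len(loose_lines), len(strict_lines), len(page_anchored))
```

Here the loose pattern also picks up the prose line, which is the kind of stray line that would break a table; the anchored pattern returns only table rows, and without the "m" flag it finds nothing because the page does not begin with "||".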
