[pmwiki-users] Google local site search

Patrick R. Michaud pmichaud at pobox.com
Thu Dec 29 09:51:44 CST 2005


On Wed, Dec 28, 2005 at 08:15:37PM +0000, Hans wrote:
> Tuesday, December 27, 2005, 12:48:20 AM, Patrick wrote:
> > Newer versions of PmWiki (since 2.1.beta8) automatically return 
> > "403 Forbidden" errors to robots for any action other than 
> > ?action=browse, ?action=rss, or ?action=dc.  However, over the
> > years Google has built up a fairly large cache of PmWiki pages,
> > so they'll likely continue to appear in Google's search results
> > until they are somehow expired from Google's database.
> 
> Can this method be extended to other parameters apart from ?action= ?
> It would be nice to have an array where a skin author can add skin
> specific parameters, like ?setskin= ?setcolor= ?setlayout= , so any
> links with these will result in the same "403 Forbidden" error for
> robots as undesired actions.

In this case I think the correct action is not to return a 403 Forbidden,
but to redirect the robot to the page without the additional params.
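
As a rough sketch of what I mean (in Python rather than PmWiki's
PHP, and not PmWiki's actual code), the idea is to drop the
skin-specific parameters and send the robot to whatever URL is left.
The SKIN_PARAMS set and the robot_redirect_target() name are
hypothetical; the parameter names are just the examples given above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list a skin author might extend; not a PmWiki variable.
SKIN_PARAMS = {"setskin", "setcolor", "setlayout"}


def robot_redirect_target(url, strip=SKIN_PARAMS):
    """Return the URL a robot should be redirected to, or None if the
    request carries none of the parameters in `strip`."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if name not in strip]
    if len(kept) == len(pairs):
        return None  # nothing to strip, so serve the page as usual
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

A caller that has already decided the request comes from a robot
would answer with a redirect to the returned URL whenever the result
is not None.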

> And if $EnableRobotCloakActions is set, such links will be cloaked
> likewise, if that is possible.

Yes, that would be possible.

> In fact, should robots not be allowed to follow only those links with
> specific actions (?action=browse, ?action=rss, or ?action=dc) and be
> turned away from any other links using parameters, not just other
> actions (with exception of ?n=pagename)?

I'm not entirely certain that we can safely say that ?n= is the
only valid parameter a robot might use.  Other discussions on the
list have mentioned things like "trail=", which should perhaps be
left uncloaked, and the various options to the feeds.php script
("order=", "count=", etc.) probably need to be left uncloaked as
well.
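
To make the concern concrete, here is a rough sketch (Python, not
PmWiki's code) of the allow-list approach being proposed.  The
ROBOT_ACTIONS and ROBOT_PARAMS names are hypothetical, and the
parameter list is only the set mentioned in this thread; it is
probably incomplete, which is exactly the difficulty.

```python
from urllib.parse import parse_qsl

# Actions robots are allowed to use, per the discussion above.
ROBOT_ACTIONS = {"browse", "rss", "dc"}

# Hypothetical allow-list of parameter names; likely incomplete.
ROBOT_PARAMS = {"n", "action", "trail", "order", "count"}


def robot_may_follow(query):
    """Return True if a robot should be allowed to follow a link with
    this query string, False if it should be turned away."""
    params = dict(parse_qsl(query, keep_blank_values=True))
    # A missing ?action= is treated as the default browse action.
    if params.get("action", "browse") not in ROBOT_ACTIONS:
        return False
    # Every parameter name must be on the allow-list.
    return all(name in ROBOT_PARAMS for name in params)
```

Any parameter missing from the allow-list gets turned away, so each
legitimate one that we overlook (as "trail=" would have been) ends up
hidden from robots.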

Pm



