PITS.00955
Summary: Automatic anchors for document sections
Created: 2007-07-20 22:45
Status:
Category: Feature
From: HaganFox
Assigned:
Priority: 544441
Version: 2.2
OS: All
Description: Jon Abott started a thread on the list[1] regarding automatic anchors:

  Is there a simple way to configure pmWiki such that all headings (H1, H2, etc) are automatically generated with anchor tags so people can link directly to a section or subsection? (I believe MediaWiki has this functionality by default.)

Wikipedia does indeed have this feature (related to a TOC capability, perhaps?), and it would be nice to see it in PmWiki. The section's anchor appears immediately preceding the heading. It has the same text as the heading, with some character substitutions like these:

  [space] => _
  ( => .28
  ) => .29

Essentially, this markup

  !! Heading Name

would produce output similar to what this markup produces now:

  [[#Heading_name]]
  !! Heading Name

--Hagan

[1] http://pmichaud.com/pipermail/pmwiki-users/2007-July/044949.html

Notes: The recipe Cookbook:PageTableOfContents does create an anchor at each heading it reports on, but
This sounds like a useful feature. However, I don't use anchors very often, but I do use h1 and h2 headings quite often. If this becomes a core feature, I vote for a setting in config.php that would allow users to turn this feature off.

--Ian MacGregor

Questions from Pm:
Pm

Suggested answers: yes; yes; and as follows (not knowing what MediaWiki does).

We want to generate a usable anchor, presumably to link to. The simplest way may be to allow the same heading text (sans markup) to be used in the anchor and have PmWiki transliterate it.
Note that in a few instances this may not quite work. For example, the heading

  ! My (example) heading "with" punc-uation; etc! But 'not', markup

might generate the anchor (details to be decided)

  [[#My_example_heading_with_punc-uation_etc_But_not_markup]]

and could be used as follows:

  [[#My (example) heading "with" punc-uation; etc! But 'not', markup | link to heading]]
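
The transliteration suggested here could look roughly like the sketch below. The function name `heading_to_anchor` is hypothetical (it is not a PmWiki function), and the rule it applies (keep ASCII letters, digits and hyphens, turn whitespace into underscores) is only an assumption chosen to reproduce the example anchor above. It makes no attempt to strip wiki markup, and it drops non-ASCII letters.

```php
<?php
// Hypothetical helper (not part of PmWiki): turn heading text into an anchor
// name by dropping punctuation and joining the words with underscores.
function heading_to_anchor($heading)
{
    // Keep ASCII letters, digits, hyphens, and whitespace; drop everything else.
    $text = preg_replace('/[^A-Za-z0-9\-\s]/', '', $heading);
    // Collapse each run of whitespace into a single underscore.
    return preg_replace('/\s+/', '_', trim($text));
}

// Prints: My_example_heading_with_punc-uation_etc_But_not_markup
echo heading_to_anchor('My (example) heading "with" punc-uation; etc! But \'not\', markup'), "\n";
```

The same function would have to be applied both when the heading is rendered and when a `[[#...]]` link target is resolved, so that the two sides agree on the anchor name.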
Obviously the vertical bar would not work, and how to handle markup such as

  ! [[PageName | +

needs to be considered.

Hagan's answers to Pm's questions:
I'd say yes. Hopefully this complication is not a deal-killer.
I'd say yes because it is consistent. As an author, if I see "See also" heading in a page I'll know there's a #See_also anchor that I can use to link to that section.
My hunch is that we need to convert all or none. Here's my answer on pmwiki-users:

  Any character not allowed in an anchor tag, I suppose. The example on the PITS page is just something I discovered when I was looking at how it was done elsewhere. Stripping out disallowed characters may be adequate.

More feedback...

As noted by Dominique Faure on the pmwiki-users list, another consideration is whether or not to let a heading start with a numeral (since the specification is something like "alphanumeric, starting with a letter"). The Wikipedia link to the engine in my car demonstrates that they don't worry about it there.

Yet another consideration is what to do about duplicate headings that occur within a page. Personally, I don't like the idea of enumerating the links so they change when another heading is inserted. That makes the link much less useful, except perhaps for a dynamic table of contents.
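
To illustrate the concern about enumerated links, here is a small sketch comparing the two schemes. Both helper functions are hypothetical and exist only for this comparison; the text-derived scheme is reduced to replacing spaces with underscores, and the headings are assumed to be unique within the page.

```php
<?php
// Hypothetical helpers for illustration only; neither exists in PmWiki.

// Enumerated scheme: anchors are id1, id2, ... in page order.
function enumerated_anchors(array $headings)
{
    $anchors = array();
    foreach (array_values($headings) as $i => $heading) {
        $anchors[$heading] = 'id' . ($i + 1);
    }
    return $anchors;
}

// Text-derived scheme: the anchor depends only on the heading's own text.
function text_anchors(array $headings)
{
    $anchors = array();
    foreach ($headings as $heading) {
        $anchors[$heading] = str_replace(' ', '_', $heading);
    }
    return $anchors;
}

$before = array('Introduction', 'See also');
$after  = array('Introduction', 'Known issues', 'See also');

// The enumerated anchor for "See also" moves when a heading is inserted.
$e1 = enumerated_anchors($before);
$e2 = enumerated_anchors($after);
echo $e1['See also'], ' -> ', $e2['See also'], "\n";   // id2 -> id3

// The text-derived anchor is the same in both versions of the page.
$t1 = text_anchors($before);
$t2 = text_anchors($after);
echo $t1['See also'], ' -> ', $t2['See also'], "\n";   // See_also -> See_also
```

An existing link to the enumerated anchor would silently point at a different section after the edit, which is the behavior objected to above.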
FWIW, if the link will not be similar to the heading text I'd rather use something like the md5() function, so a heading of "See Also" would be linked with

Update: Using the crc32() function would result in shorter links. This test script

  <?php
  $foo = 'See also';
  echo '<pre>';
  echo 'crc32: h'.crc32($foo)."\n";
  echo 'md5: h'.md5($foo)."\n";
  echo 'sha1: h'.sha1($foo)."\n";
  echo '</pre>';

produces this output

  crc32: h-441119427
  md5: h611114a3a55940e855fb96b973f897fc
  sha1: h2d8243a2c0e464492c9d563c4f92c56ae3421bcc

--Hagan

I posted this suggestion to the list, with a correction by dominique.faure@gmail.com:

> I think this is an excellent suggestion. But there is a way to get

I'd like to work on this project, but I haven't dug deep enough into PmWiki code yet. I'm going to go dig deeper in the site and my installation to learn what I can about implementation.

-Mike

John Rankin wrote this in a post to the pmwiki-users list:

On Thursday, 26 July 2007 11:47 AM, H. Fox <haganfox@users.sourceforge.net> wrote:
>On 26 Jul 2007 10:34:35 +1200, John Rankin <john.rankin@affinity.co.nz> wrote:
>> Turning heading text into an anchor guaranteed to be valid would also need
>> a bit of care; eg
>>
>> !!!Christian Ridderström
>>
>> would need to turn the ö into something else,
>
>Maybe there's some regular-expression magic that can do that.

The trick I used in the citations recipe was to pass the text through the htmlentities function and remove the & and ; from the result.

Petko Yotov wrote a descriptive post to the list, including this description of How MediaWiki Does It:

MediaWiki [1] has this capability from version 1.8 [2], one can link to a section like this:

  [[Wiki#Editing wiki pages|Click here]]

will link to:
http://en.wikipedia.org/wiki/Wiki#Editing_wiki_pages
where there is a heading "==Editing wiki pages==". The conversion algorithm heading->anchor_id is quite simple:
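
As a rough sketch of a conversion along these lines, assuming only the substitutions quoted in the Description (a space becomes `_`, `(` becomes `.28`, `)` becomes `.29`): percent-encode the heading and write each escape with a dot instead of a percent sign. The name `mw_style_anchor` is hypothetical, this is not MediaWiki's actual code, and the real rules may differ in detail.

```php
<?php
// Hypothetical helper, not MediaWiki's actual code: percent-encode the
// heading and write each escape with '.' instead of '%'.
function mw_style_anchor($heading)
{
    // Spaces become underscores before encoding.
    $text = str_replace(' ', '_', trim($heading));
    // rawurlencode() leaves letters, digits, '-', '_', '.', and '~' alone
    // and turns every other byte into %XX (uppercase hex).
    return str_replace('%', '.', rawurlencode($text));
}

echo mw_style_anchor('Editing wiki pages'), "\n";   // Editing_wiki_pages
echo mw_style_anchor('Heading (Name)'), "\n";       // Heading_.28Name.29
```

Because the encoding works byte by byte, a non-ASCII character in a UTF-8 heading turns into several `.XX` groups, which is valid but hard to read or guess.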
This makes it possible to produce HTML that validates, and it is very easy for writers (PmWiki philosophy n°1). The conversion function also works in other languages, that is, with UTF-8 encoding: while it is not very nice in the HTML code for non-Latin languages, it works both with the page table of contents and with links from other pages. If the section heading changes order, the links to it will still work (unlike if all anchor_ids are id1, id2...). If the section heading disappears or changes, the anchor_id will also change and then the links will redirect to the top of the page (least surprise).

I would add some items to the above list...
Both of these have the advantage of making the anchor name easier to guess without looking at the page source. The second one (stripping certain characters) also mitigates having a heading start with a non-alphanumeric character. The list becomes
...or something like that. ;-)

--Hagan

Martin Fick wrote this in a post to the pmwiki-users list:

Perhaps the link code could be extended to look for existing/non-existing anchors in pages and display links to non-existing anchors slightly differently making misspelled anchors more obvious?

That would certainly be author-friendly.

--Hagan

Mike Shanley wrote this in a post to the pmwiki-users list:

Would abbreviations be a good option for people? At least in the test page below, not one abbreviation was repeated, and if it were, the _2 and _3 would fix it right up. This would take care of many of the human errors we've been talking about, though in pages with (way too many) headings, it might add some new ones.

  #wiki_style_basics = #wsb
  #scopes = #s
  #wikistyle_attributes = #wa
  #applying_wikistyles_to_block = #awtb
  #enabling_Styles = #es
  #custom_style_shortcuts = #css
  #predefined_style_shortcuts = #pss
  #Examples = #e
  #known_issues = #ki
  #see_also = #sa

Also, how can we use this automatic anchoring to create quick TOCs? Generally speaking, an (:include function that stripped everything but headers, or a (:pagelist toc= that tracked headers instead of trail= tracking bullets would also work... My thought on this is that as long as we are talking about anchors across an entire wiki, we should also provide a way to just as easily index these anchors. Right?

I agree that this feature is naturally related to a TOC feature.

--Hagan

See also PITS.00027

--Simon

Purely "dabble-ware" but maybe does what people are looking for...? Put this somewhere in config.php:
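# Replace the standard heading markup with one that prepends an anchor.
# Note: the /e pattern modifier used below was removed in PHP 7.
include_once("$FarmD/scripts/stdmarkup.php");
DisableMarkup("^!");
Markup('^!#', 'block',
  '/^(!{1,6})\\s?(.*)$/e',
  "'<:block,1><h'.strlen('$1').PSS('>'.Header2Anchor('$2').'$2</h').strlen('$1').'>'");

# Build an <a name=...> anchor from the heading text.
function Header2Anchor($text)
{
  # Drop wikistyles, directives, and links, then lowercase what is left.
  $text = strtolower(trim(preg_replace(array('/%.*?%/', '/\(:.*?:\)/', '/\[\[.*?\]\]/'), '', $text)));
  # Prefix 'h' when the text is empty or does not start with a letter.
  # ($text[0] replaces the $text{0} syntax, which PHP 8 no longer accepts.)
  if ($text === '' || $text[0] < 'a' || $text[0] > 'z')
    $text = 'h'.$text;
  # Replace non-word characters with '_' and collapse runs of '_'.
  $text = preg_replace('/[^\w]/', '_', $text);
  $text = preg_replace('/__+/', '_', $text);
  # Append _1, _2, ... until the anchor has not been used on this page.
  for ($i=0, $sfx=''; TrackAnchors($text.$sfx); $i++, $sfx="_$i");
  return Keep("<a name='$text$sfx' id='$text$sfx'></a>", 'L');
}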
(We could put a call to FmtPagename() in there, but to be safe we should wait on this PITS entry to be implemented before we do that.) If it's helpful I can put it into a cookbook recipe...

--Peter Bowers July 16, 2010, at 05:03 PM
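
The duplicate handling in Header2Anchor() above can be tried outside PmWiki with a stand-in for TrackAnchors(). In this sketch `seen_count` and `unique_anchor` are hypothetical names, and the stand-in assumes that TrackAnchors() returns how many times a name has already been seen and then records the new sighting.

```php
<?php
// Stand-in for PmWiki's TrackAnchors(): returns how many times $name has
// been seen before, then records this sighting. (Assumed behavior.)
function seen_count($name)
{
    static $seen = array();
    if (!isset($seen[$name])) {
        $seen[$name] = 0;
    }
    return $seen[$name]++;
}

// Same loop as in Header2Anchor(): try "name", then "name_1", "name_2", ...
function unique_anchor($name)
{
    for ($i = 0, $sfx = ''; seen_count($name . $sfx); $i++, $sfx = "_$i");
    return $name . $sfx;
}

echo unique_anchor('see_also'), "\n";   // see_also
echo unique_anchor('see_also'), "\n";   // see_also_1
echo unique_anchor('see_also'), "\n";   // see_also_2
```

The suffix a repeated heading receives depends on its position among the duplicates, so these suffixed anchors shift when an earlier duplicate is added or removed.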