ExportHTML


Summary: Export PmWiki pages as "static" HTML pages

Version: 2007-01-12

Prerequisites: any

Status:

Maintainer: Petko (original author: Pm)

Discussion: ExportHTML-Talk

Categories: SystemTools, Administration, PHP72

Questions answered by this recipe

Can I convert PmWiki pages into static HTML?
Can I export my site to a set of static HTML pages?

Description

PmWiki does not come with a built-in HTML export feature, but other tools can be used to grab a "static" copy of PmWiki pages. Here are at least four options.

1. Export using Wget

Wget recursively from a start page

The following wget(1) command grabs every page of a site that can be reached by following links from the starting page, and rewrites links to be relative as needed.

    wget -r -k -np -p -l 0 --reject-regex '[?&]action=' http://example.org/wiki --html-extension

If the site runs with $EnablePathInfo set, a single wiki group can be fetched with:

    wget -r -k -np -p -l 0 --reject-regex '[?&]action=' http://example.org/wiki/SomeGroup --html-extension

Here's the meaning of the options to wget:

    -r    Recursive retrieval.  Wget will follow any links it finds in
          the document.

    -k    Link adjustment.  After retrieving the pages, wget will convert
          all of the downloaded files to have relative links instead
          of absolute ones.

    -np   No parent.  Wget will restrict itself to the path
          given on the command line (in the above case, it would
          only download the pages of SomeGroup).

    -p    Page requisites.  Wget will retrieve local copies of any .css
          files or images needed to display the static copy of the
          page locally.

    -l 0  Infinite follow.  Wget will follow all of the links it encounters
          (subject to the -np restriction above), so that it will completely
          spider the group.

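    --html-extension    Adds the .html extension to each downloaded page so
          that it can be viewed locally.  Some browsers on Windows (Internet
          Explorer, Firefox) identify HTML files by their extension rather
          than by their content.  Newer versions of wget call this option
          --adjust-extension (-E).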

    --reject-regex '[?&]action='   Don't download action links like Edit, History, Attach.
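
For use in a script, the same command may be easier to read with wget's long option names. The following is a minimal sketch rather than part of the recipe: the function name export_wiki and the destination directory are assumptions, and --directory-prefix is added only to keep the download in one place.

```shell
# Sketch: the recursive export above, spelled with wget's long options.
# export_wiki is a hypothetical helper name; URL and directory are placeholders.
export_wiki() {
    url=$1     # start page, e.g. http://example.org/wiki
    dest=$2    # local directory that receives the static copy

    wget --recursive \
         --convert-links \
         --no-parent \
         --page-requisites \
         --level=0 \
         --adjust-extension \
         --reject-regex '[?&]action=' \
         --directory-prefix="$dest" \
         "$url"
}

# Usage (commented out so that sourcing this file downloads nothing):
# export_wiki http://example.org/wiki ./static-wiki
```

Each long option corresponds to one entry in the list above, so the list doubles as documentation for the script.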

Since Wget doesn't respect the rel="nofollow" attribute, it will also follow the action links. To avoid this:

Add $EnableRobotCloakActions = 1; to your config file, and
either add Wget to $RobotPattern or add '-U HTTrack' to the Wget command line, so that PmWiki recognizes Wget as a robot.
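
On the command-line side, the user-agent variant might look like the following sketch. It assumes, as the text above implies, that the wiki's $RobotPattern already matches the HTTrack user agent; export_as_robot is a hypothetical name.

```shell
# Sketch: run the export while identifying as HTTrack, so that a wiki with
# $EnableRobotCloakActions = 1 hides its action links from the crawl.
# export_as_robot is a hypothetical helper name.
export_as_robot() {
    # -U sets the User-Agent header sent with every request.
    wget -r -k -np -p -l 0 -U HTTrack \
         --reject-regex '[?&]action=' --html-extension "$1"
}

# Usage:
# export_as_robot http://example.org/wiki
```

The --reject-regex filter is kept as a second line of defence in case cloaking is not active on the server.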

Wget plus page list

If you want to include pages that cannot be reached by links from the starting page, or you want finer control over which pages are downloaded, you can build the list of pages with a pagelist directive on a wiki page, e.g. (:pagelist group=*,-PmWiki,-Site*:).

To hide unwanted links (actions, sidebar, header, footer) on that page, use (:noaction:) (:noleft:) (:noheader:) (:nofooter:).

Wget should be called with -l 1 so only the pages listed are downloaded.
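
A sketch of such a call follows. The page name Main/ExportList is hypothetical and stands for whatever page holds your (:pagelist:) directive; export_pagelist is likewise an invented helper name.

```shell
# Sketch: download only the pages linked from a page-list page.
# LIST_URL is an assumed location; create the page yourself and put the
# (:pagelist ...:) directive on it.
LIST_URL='http://example.org/wiki/Main/ExportList'

export_pagelist() {
    # -l 1 limits recursion to one level: the list page plus the pages it links to.
    # -np is left out on purpose: the listed pages may live in other groups,
    # which are outside the list page's own path.
    wget -r -k -p -l 1 --reject-regex '[?&]action=' --html-extension "$1"
}

# Usage:
# export_pagelist "$LIST_URL"
```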

Notes

If you want the retrieved pages to have .html extensions automatically added to them, see the HtmlUrls recipe.

How can I log in using AuthUser and wget?

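PmWiki doesn't use HTTP Basic authentication (i.e., wget's --http-user and --http-password options), but expects the credentials to be provided via a POST request. Try the following option, quoted so that the shell does not interpret the & character:

    --post-data='authid=USER&authpw=PASSWD'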
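
Because a recursive export makes many requests, one common pattern (adapted from the cookie example in the wget manual, not from this recipe) is to log in once, save the session cookie, and reuse it for the crawl. The sketch below assumes that the wiki accepts the authid/authpw form on ?action=login and that the login is kept in a session cookie; the function name and the cookie file name are placeholders.

```shell
# Sketch: log in once, then crawl with the saved session cookie.
# login_and_export and cookies.txt are hypothetical names.
login_and_export() {
    url=$1; user=$2; pass=$3
    cookies=cookies.txt

    # Step 1: post the credentials once and keep the session cookie.
    # The page body is discarded (-O /dev/null); only the cookie matters.
    wget --save-cookies "$cookies" --keep-session-cookies \
         --post-data "authid=$user&authpw=$pass" \
         -O /dev/null "$url?action=login" || return 1

    # Step 2: run the usual export, sending the saved cookie with each request.
    wget --load-cookies "$cookies" \
         -r -k -np -p -l 0 --reject-regex '[?&]action=' --html-extension "$url"
}

# Usage:
# login_and_export http://example.org/wiki USER PASSWD
```

Credentials passed this way still appear on the command line, so the caution below applies here as well.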

Caution: On Linux systems, it is unwise to type commands that contain sensitive data such as passwords into the terminal, as they will be stored as plain text in your ~/.bash_history or equivalent file.

In order to avoid this, unset the local HISTFILE variable:
$ unset HISTFILE

Then enter the desired command:
$ command --containing --sensitive --data

The only side effect is that no terminal history will be saved in ~/.bash_history until you next log in (restoring the HISTFILE value after the command doesn't work).
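
An alternative that leaves the shell history alone is to keep the password off the command line entirely. The sketch below prompts for the password, writes the form data to a private temporary file, and hands that file to wget with --post-file. The helper names are hypothetical, the ?action=login URL is an assumption, and the sketch assumes the user name and password contain no characters that need URL-encoding.

```shell
# Write "authid=USER&authpw=PASS" to FILE, readable by the owner only.
write_post_file() {
    ( umask 077; printf 'authid=%s&authpw=%s' "$1" "$2" > "$3" )
}

# Prompt for the password and log in without putting it on a command line.
login_from_prompt() {
    url=$1; user=$2
    postfile=$(mktemp) || return 1

    printf 'Password: ' >&2
    stty -echo                 # stop the terminal from echoing keystrokes
    IFS= read -r pass
    stty echo
    printf '\n' >&2

    write_post_file "$user" "$pass" "$postfile"
    wget --save-cookies cookies.txt --keep-session-cookies \
         --post-file "$postfile" -O /dev/null "$url?action=login"
    status=$?

    rm -f "$postfile"          # remove the credentials file in every case
    return "$status"
}

# Usage:
# login_from_prompt http://example.org/wiki USER
```

Because the password is typed at a prompt and never appears as a command argument, nothing sensitive reaches the history file.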

2. HTTrack

Use HTTrack, a standalone website copier.

3. ExportHTMLWiki

Use the ExportHTMLWiki recipe script, which includes a batch export command.

4. BackupHTMLZip

Use the BackupHTMLZip recipe script, which exports a static copy of your wiki.

Contributors

Comments

See discussion at ExportHTML-Talk
