BackupHTMLZip

Summary: Export your wiki to static HTML then optionally compress/zip it
Version: 20170717
Prerequisites: Recent PmWiki version; if Windows: PHP 5.3
Status: Experimental
Maintainer: Petko
License: GPLv3+

Questions answered by this recipe

  1. I have a wiki containing critically important information that I may need at any time within 5 minutes. Even if there is a nightly backup system in place, in case of a server failure, I must be able to access the information without having to first re-install the operating system, the server and the wiki.
  2. I have a wiki with some public sections (groups) that do not change very often. Can I easily create a static HTML export (frontend) that shares most features with the dynamic wiki (backend)?

Description

Export your wiki to static HTML then optionally compress/zip it.

The recipe was inspired by a critical need to access information stored in a wiki, independent of the wiki itself. In case of a server/os/network/disk failure, wiki failure, deletion, ransomware encryption, fire, flood, sharks, mothers-in-law or any other disaster with the wiki, the information should be readily available within minutes.

As the company has a redundant multi-location backup system already in place, presumably at least one of the backup servers will have the latest snapshot from the previous night. This (maybe a little older) version of the wiki information is acceptable, but it is not acceptable to wait for a consultant to come and re-install the server and the wiki.

The recipe can create, automatically or manually, a partial or full static HTML export of the wiki pages and compress the result into a zip archive. Uncompressing the archive on any computer lets you browse the full snapshot as static pages, as if you were on the wiki.

Alternatively, you can have a frontend-backend type of installation where the editors use PmWiki and, at a convenient moment, the public sections of the wiki are exported as static files to the (static) online website.

Installation

  1. Place backup-html-zip.php in your pmwiki/cookbook directory.
  2. Add a line such as this near the end of config.php:
    include_once("$FarmD/cookbook/backup-html-zip.php");

If you want to use the zip function, you'll need the InfoZIP binaries for your system. Most GNU/Linux and Unix systems have them installed or can install them with the default package manager. For those that don't, or for Windows, get them here: ftp://ftp.info-zip.org/pub/infozip/ . For example, most Windows users should get the archive win32/zip300xn.zip and extract zip.exe to a location where PmWiki can call it.

Please read and understand the Configuration and Usage sections.

Configuration

Here are the configuration options with their default values. They need to be set before the include_once() line above. See also the examples in the Usage section.

$BHZip['pages'] = '*.*,-PmWiki*.*,-SiteAdmin.*,-*.WikiSandbox,-Site.*';
Which pages to export. Use * to match any character(s) and join the patterns with commas (no spaces). To exclude pages or groups, precede the pattern with a "-" (minus).
By default, any pages that are not WikiSandbox, or PmWiki documentation, or site configuration are exported. To export all pages, set $BHZip['pages'] = '*.*';
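As a rough illustration of the pattern syntax (this is not the recipe's own code), the hypothetical helper demo_match_pages() below treats * as "any characters" and a leading "-" as an exclusion, and keeps a page when it matches at least one inclusion and no exclusion. The page names are made up for the example.

```php
<?php
// Hypothetical helper, for illustration only: approximates how a
// comma-separated pattern list with "-" exclusions selects pages.
function demo_match_pages(array $names, $patterns) {
  $include = array();
  $exclude = array();
  foreach (explode(',', $patterns) as $p) {
    $p = trim($p);
    if ($p === '') continue;
    $isExclude = ($p[0] === '-');
    if ($isExclude) $p = substr($p, 1);
    // Escape regex metacharacters, then turn the escaped "*" into ".*"
    $rx = '/^' . str_replace('\*', '.*', preg_quote($p, '/')) . '$/i';
    if ($isExclude) $exclude[] = $rx; else $include[] = $rx;
  }
  $out = array();
  foreach ($names as $n) {
    $keep = !$include; // with no inclusion patterns, everything starts as kept
    foreach ($include as $rx) {
      if (preg_match($rx, $n)) { $keep = true; break; }
    }
    foreach ($exclude as $rx) {
      if (preg_match($rx, $n)) { $keep = false; break; }
    }
    if ($keep) $out[] = $n;
  }
  return $out;
}

$names = array('Main.HomePage', 'PmWiki.BasicEditing', 'Site.SideBar',
  'SiteAdmin.AuthUser', 'Main.WikiSandbox', 'Blog.First');
$default = '*.*,-PmWiki*.*,-SiteAdmin.*,-*.WikiSandbox,-Site.*';
echo implode(', ', demo_match_pages($names, $default)), "\n";
// prints: Main.HomePage, Blog.First
```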
$BHZip['adminperms'] = false;
Export pages as if the user has admin permissions, even if s/he doesn't. This is meant for automatic backups, so that anyone could create an export. This is also needed so the export mostly shows what the admin user would see, even if some sections are hidden from non-admins with (:if auth admin:).
By default, this is false, that means that the user's permissions are used. See also $HandleAuth['bhzip'] below.
$BHZip['skin'] = false;
Use a specific skin for the static export, eg. $BHZip['skin'] = "simple"; A different skin can be used so that there is no ambiguity that this is a static export. Also, the static export does not support searching or any "wiki actions", so the skin could hide the search form and the action links in the headers/footers.
By default, the same skin is used.
$BHZip['css'] = false;
Inject a few custom styles into the skin for the static pages, again to prevent ambiguity. For example you could add a thick dotted border to the header this way:
$BHZip['css'] = '#wikihead {border-bottom: 4px dotted red;}';
By default, no custom styles are injected.
$BHZip['autostart_delay_hours'] = 0;
If this is set, for example to 24, then every 24 hours the export+zip function will automatically start. This is the delay after the previous export, after which the first user that tries to browse any page will be able to trigger the export. By default this is disabled, ie. an export can only be launched by opening a specific URL (?action=bhzip).
$BHZip['squelch_hours'] = 1;
The number of hours to not allow exports, even if called manually. As an export is very CPU-intensive, this variable exists to prevent attacks on the server.
Set to 0 to ignore previous exports (eg. if you manually export a static frontend).
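The interplay of the two timers can be sketched as follows. This is only an illustration under assumptions: the helper name demo_export_state() is hypothetical, and the exact comparison the recipe uses (>= versus >) is not documented here.

```php
<?php
// Hypothetical illustration of 'autostart_delay_hours' and 'squelch_hours'.
// $lastExport and $now are Unix timestamps.
function demo_export_state($lastExport, $now, $autostartHours, $squelchHours) {
  $ageHours = ($now - $lastExport) / 3600;
  return array(
    // A manual ?action=bhzip is refused while the last export is too recent;
    // a squelch value of 0 disables this protection.
    'manual_allowed' => ($squelchHours == 0 || $ageHours >= $squelchHours),
    // The automatic prompt is offered only if enabled and the delay has passed.
    'autostart_due' => ($autostartHours > 0 && $ageHours >= $autostartHours),
  );
}

// Last export 30 minutes ago, defaults (autostart disabled, squelch 1 hour):
var_dump(demo_export_state(0, 1800, 0, 1));
```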
$BHZip['dir'] = 'backup';
The directory, relative to index.php, where the wiki is exported. In case of automatic backups with admin permissions, it is recommended to select a directory outside of the document root, inaccessible from the internet.
In case of a frontend export, define here the document root.
$BHZip['htaccess'] = "Order Deny,Allow\nDeny from all\n";
The contents of the ".htaccess" file to be created in the export directory. By default, access to the directory is denied (on Apache servers).
If you export a frontend, use here an empty string: $BHZip['htaccess'] = "";
If you already have a .htaccess file, it will not be overwritten.
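The default above uses the Apache 2.2 access-control syntax. On Apache 2.4 without mod_access_compat, an equivalent setting would presumably look like this (assuming, as described, that the string is written to the .htaccess file unchanged):

```php
# Apache 2.4 syntax for denying all HTTP access to the export directory
$BHZip['htaccess'] = "Require all denied\n";
```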
$BHZip['namefmt'] = 'backup-%Y%m%d_%H%M%S';
The file name format of the zip files and index files. It is passed through strftime(). By default the files will be named "backup-DATE_TIME.zip", where DATE is the date as YYYYMMDD and TIME is the time as HHMMSS.
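To see what the default format produces, here is a small sketch. Since strftime() is deprecated as of PHP 8.1, the example uses gmdate() with the equivalent format 'Ymd_His'; the helper name demo_backup_name() is hypothetical, and UTC is used only to make the output reproducible (the recipe itself presumably uses the server's local time).

```php
<?php
// Hypothetical helper: builds a name equivalent to the default
// 'backup-%Y%m%d_%H%M%S' format, in UTC.
function demo_backup_name($timestamp) {
  return 'backup-' . gmdate('Ymd_His', $timestamp) . '.zip';
}

$ts = gmmktime(2, 50, 0, 7, 17, 2017); // 17 July 2017, 02:50:00 UTC
echo demo_backup_name($ts), "\n";
// prints: backup-20170717_025000.zip
```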
$BHZip['zipbinary'] = 'zip';
This is the path to the InfoZIP binary, eg. '/usr/bin/zip' or 'cookbook/zip.exe'. By default it assumes that the binary can be found in the default system location for programs, and should work on most GNU/Linux systems.
To disable the compression of the export, set here an empty string: $BHZip['zipbinary'] = '';
$BHZip['zipexclude'] = '*~ .svn *,cache';
File patterns to NOT include in the zip archive. Note that they are separated by spaces, not commas.
$BHZip['zippassword'] = false;
If you want individual files in the zip archive to be encrypted, set the password here, eg. $BHZip['zippassword'] = 'carrot5';. Note, the file names and sizes are readable. Also note that this is a weak protection: if an attacker has access to config.php, s/he could read the password in clear.
$BHZip['delcache'] = true;
Whether or not to delete the static HTML export after compressing it. By default, the static export files and directories are "moved" into the zip archive with the InfoZIP argument "--move".
Set here false to keep both the zip archive and the HTML files. Note, this variable is ignored if you disable zip compression. Also note, if you set here false, then subsequent exports will not delete old HTML pages that no longer exist on the wiki.
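The zip-related options map naturally onto InfoZIP command-line arguments (-r, --move, -P and -x are standard InfoZIP options). The following is a sketch of how such a command line could be assembled, not the recipe's actual code; the function name demo_zip_command() and the archive and directory names are hypothetical.

```php
<?php
// Hypothetical sketch: builds an InfoZIP command line from the options.
function demo_zip_command(array $o, $archive, $dir) {
  if ($o['zipbinary'] === '') return '';   // compression disabled
  $cmd = array($o['zipbinary'], '-r');     // -r: recurse into directories
  if ($o['delcache']) $cmd[] = '--move';   // remove source files once archived
  if ($o['zippassword'] !== false) {       // weak per-file encryption
    $cmd[] = '-P';
    $cmd[] = escapeshellarg($o['zippassword']);
  }
  $cmd[] = escapeshellarg($archive);
  $cmd[] = escapeshellarg($dir);
  if (trim($o['zipexclude']) !== '') {     // space-separated patterns after -x
    $cmd[] = '-x';
    foreach (preg_split('/\s+/', trim($o['zipexclude'])) as $p) {
      $cmd[] = escapeshellarg($p);
    }
  }
  return implode(' ', $cmd);
}

$opts = array('zipbinary' => 'zip', 'delcache' => true,
  'zippassword' => false, 'zipexclude' => '*~ .svn *,cache');
echo demo_zip_command($opts, 'backup-20170717_025000.zip', 'html'), "\n";
```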
$BHZip['zipkeeplatest'] = 10;
How many snapshots to keep, by default the latest 10 (older ones are automatically deleted). Set here 1 to only keep the last zipped export, or set to 0 to keep all exports.
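Because the default file names embed the date and time as YYYYMMDD_HHMMSS, sorting them alphabetically also sorts them chronologically. A pruning step could therefore be sketched like this (the helper demo_snapshots_to_delete() is hypothetical and only illustrates the rule, including the "0 keeps everything" case):

```php
<?php
// Hypothetical helper: returns the archive names that would be deleted
// when only the newest $keep snapshots are retained.
function demo_snapshots_to_delete(array $zips, $keep) {
  if ($keep <= 0) return array();    // 0 means: keep all exports
  rsort($zips);                      // newest first (names sort by date)
  return array_slice($zips, $keep);  // everything beyond the newest $keep
}

$zips = array(
  'backup-20170715_025000.zip',
  'backup-20170717_025000.zip',
  'backup-20170716_025000.zip',
);
print_r(demo_snapshots_to_delete($zips, 2));
// lists only backup-20170715_025000.zip
```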
$BHZip['memory_limit'] = '128M';
This corresponds to the system limits when exporting a snapshot, see SystemLimits.
$BHZip['max_execution_time'] = 180;
This corresponds to the system limits when exporting a snapshot, see SystemLimits. Set here 0 (zero) to let the export run without limits (may be needed if you have many pages or large and complex pagelists).
$HandleAuth['bhzip'] = 'ALWAYS';
What permissions to request from the visitor in order to export a snapshot. Can be 'ALWAYS', 'read', 'edit' or 'admin'.
If you need fully automatic exports, and the priority is the access of the information and less its protection, use 'ALWAYS' here (this is the default).
If your export folder is outside of the document root, you can also keep the default 'ALWAYS'.
If you manually export a static frontend section, you may either use 'ALWAYS' to allow anyone to export it, or 'edit' or 'admin' to allow only editors and admins to export the wiki.
Using anything other than 'ALWAYS' is slower, because every page and group is checked to see whether the current user has the selected permissions. If the user doesn't have these permissions for the current page, the password form will appear, but any other pages that require additional permissions will be silently skipped.

Usage

You can either manually call "pmwiki.php?action=bhzip" to create a snapshot, or configure cron or another task scheduler to call it, or let users click on a special link when the time for a snapshot comes.

Automatic backups with Cron or Task scheduler

This is the simplest "set and forget" installation. Once configured, it works without any intervention or interaction.

You would probably have a configuration more or less like this:

$BHZip = array(
  'adminperms' => true,
  'zipkeeplatest' => 7,
  'skin' => 'BackupSkin', # select installed skin
  'dir' => '~/backup', # unreachable via HTTP, outside of htdocs
);
include_once("$FarmD/cookbook/backup-html-zip.php");

Then you configure your system to launch a repeated task, calling the wiki with ?action=bhzip.

For example, on GNU/Linux, you would open `crontab -e` then add something like this to the crontab file:

  50 2 * * * wget -q -O /dev/null "http://127.0.0.1:8080/pmwiki/pmwiki.php?action=bhzip"

This will create a snapshot at 02:50 AM every night, and will delete the oldest snapshot so that the 7 latest snapshots are kept.

On a Windows task scheduler, [please complete].

Sometimes your hosting provider will allow you to configure Cron tasks in your hosting panel, refer to the documentation.
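Where wget is not available but a PHP command-line binary is, the scheduled task could instead run a small PHP script that requests the export URL. This is only a sketch: the function name demo_trigger_export() is hypothetical, the URL is the one from the crontab example above, and it assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper: requests the export URL and reports success.
function demo_trigger_export($url, $timeoutSeconds) {
  $context = stream_context_create(array(
    'http' => array('timeout' => $timeoutSeconds),
  ));
  // @ suppresses the PHP warning; failure is reported via the return value
  $result = @file_get_contents($url, false, $context);
  return $result !== false;
}

// Usage from a scheduled script (uncomment to run):
// $ok = demo_trigger_export(
//   'http://127.0.0.1:8080/pmwiki/pmwiki.php?action=bhzip', 300);
// exit($ok ? 0 : 1);
```

The scheduler would then call something like "php trigger-backup.php", where the script name and its location are up to you.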

Automatic backups without Cron or other scheduler

Use this if you cannot schedule an external task to call the wiki with the ?action=bhzip action. This could also be configured in addition to the scheduled tasks, if the scheduler is not reliable.

Once a day, one wiki user will see a special page allowing her/him to create or skip a snapshot.

You would probably have a configuration more or less like this:

$BHZip = array(
  'autostart_delay_hours'=> 24,
  'adminperms' => true,
  'zipkeeplatest' => 7,
  'dir' => '~/backup', # unreachable via HTTP, outside of htdocs
);
include_once("$FarmD/cookbook/backup-html-zip.php");

This will launch an export once every 24 hours and will store the 7 latest zip archives in a directory away from the htdocs document root.

When the last export becomes older than 24 hours, the first user that browses the wiki will see a page with 2 links allowing her/him to create or skip a new snapshot.

  • The user can select to skip it, in which case a snapshot will not be created for yet another 24 hours.
  • The user can select to create the snapshot, then after a few seconds s/he'll see the list of created static pages, and at the bottom, a link to resume her/his browsing.

Manual exports

Simply call the wiki with the ?action=bhzip action, eg. "pmwiki.php?action=bhzip".

For static frontend / wiki backend installation, you may have such a configuration:

$BHZip = array(
  'pages' => 'Main.*,Blog.*', # only export Main and Blog groups
  'dir' => '../', # reachable via HTTP, eg. htdocs
  'zipbinary' => '', # no need to zip the export
  'delcache' => false, # or, if zip is enabled, do not delete the export
);

$HandleAuth['bhzip'] = 'edit'; # any editor can export the wiki
# or: $HandleAuth['bhzip'] = 'admin'; # only admins can export the wiki

include_once("$FarmD/cookbook/backup-html-zip.php");

Then you would probably add somewhere a link to [[{$Name}?action=bhzip]] for the editors to click on it when they finish editing.

You may also want to create a static index.html file that redirects to the static HomePage.html, or rename the exported "backup-DATE_TIME.html" in the export directory to "index.html".
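A redirecting index.html could be generated with a few lines of PHP, for instance as below. This is a sketch with a hypothetical helper name, demo_redirect_html(); the target file name "HomePage.html" is taken from the sentence above and may differ in your export.

```php
<?php
// Hypothetical helper: returns a minimal HTML page that redirects
// the browser to $target and offers a fallback link.
function demo_redirect_html($target) {
  $t = htmlspecialchars($target, ENT_QUOTES, 'UTF-8');
  return "<!DOCTYPE html>\n"
    . "<html><head><meta charset=\"utf-8\">\n"
    . "<meta http-equiv=\"refresh\" content=\"0; url=$t\">\n"
    . "<title>Redirecting</title></head>\n"
    . "<body><p><a href=\"$t\">Continue</a></p></body></html>\n";
}

// Usage (uncomment and adjust the path to your export directory):
// file_put_contents('../index.html', demo_redirect_html('HomePage.html'));
echo demo_redirect_html('HomePage.html');
```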

Internationalization

Here are the strings that can be translated, for example in French:

  'Too early for a new snapshot.' => 'Trop tôt pour une sauvegarde.',
  'Page redirects to' => 'La page redirige vers',
  'written' => 'sauvegardé',
  'Continue to' => 'Poursuivre vers',
  'Automatic backup' => 'Sauvegarde automatique',
  'New backup snapshot is scheduled now. This may take up to 3 minutes.' => "Une nouvelle sauvegarde est prévue maintenant. Elle peut prendre jusqu'à 3 minutes.",
  'Create snapshot now (recommended)' => 'Créer la sauvegarde (recommandé)',
  'Skip snapshot for the next %s hours' => 'Annuler la sauvegarde pour %s heures',
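These lines are in the format of a PmWiki XLPage. Assuming they are saved in a wiki page (the page name PmWikiFr.XLPage below is only an example), the translations can be loaded from config.php with PmWiki's standard XLPage() function:

```php
# In config.php: load the French translations from a wiki page
# ('PmWikiFr.XLPage' is an assumed page name, adjust to your wiki)
XLPage('fr', 'PmWikiFr.XLPage');
```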

Notes

The recipe will try to create symbolic links to the "pub" and "uploads" directories in the export directory. This probably requires at least PHP 5.3 on Windows (any version should work on GNU/Linux).

Besides the pages, the compressed archive will contain the full "uploads", "pub" and "$FarmPub" directories. This should allow for most JavaScript recipes to work fine in the static export (Untoggle, AutoTOC, Mini, Ape, DeObMail, Formula, Math*, etc.).

The PageActions are hidden with '(:noaction:)' in the export.

Understand that this is a "static export". The search function will not work in the static export. Actually not a single "?action=XXXX" will work in the static export. If you have pagelists that can be paginated or re-ordered, only the first page will appear in the static export, like when the dynamic page is accessed without any parameters.

The export requires a lot of CPU usage. An older processor (Intel Core Solo U3500) can export a not very complex wiki with about 300 pages and the default skin in 17-20 seconds. A wiki with many pagelists and inclusions will take more time. A complex skin with many included sections (right bar, top bar, footer, header, ... besides SideBar) will take more time.

Change log / Release notes

  • 20170717 fixed omission with filesystem permissions.
  • 20170527 Remove French translation (encoding may mismatch, better to use XLPage, documented), minor changes.
  • 20170522 Add French translation, minor changes.
  • 20170521 First public release, ready to be tested.

See also

Other backup/export/conversion recipes:

EPUBCreation  Assembles wiki page collections into an EPUB e-book file (EPUB output is ready for beta testing)
ExportHTML  Export PmWiki pages as "static" HTML pages
ExportHTMLWiki  Create a wiki using static HTML pages as front end for visitors (stable)
PrintGroup  Export WikiGroup pages to one large HTML file, and/or to PDF (Experimental)
SiteDump  creates a .tar.gz file of the complete site for download (stable)
WikiConversion  Recipes for converting other wikis from and to PmWiki

Contributors

  • Recipe written and maintained by Petko (5ko [snail] 5ko [period] fr).

Comments

See discussion at BackupHTMLZip-Talk.
