FastCache

Summary: Caches complete wiki pages for very fast retrieval
Version: 2020-10-17
Prerequisites: PmWiki 2.2.58 or newer; full functionality also requires Apache mod_rewrite
Status: beta
Maintainer: Maintainer needed! (original author: Eemeli Aro)
Discussion: FastCache-Talk
Categories: Administration, CMS, PHP72

Questions answered by this recipe

  • How can I really speed up my PmWiki access times?
  • I want to use PmWiki as a CMS engine, but serving dynamic pages is so slow. What can I do?

Description

When PmWiki renders a page for viewing, it uses PHP to generate the HTML from the markup and combine the different page elements into a single page, which is then served. Setting $EnableHTMLCache helps quite a bit, but there's still a huge chunk of overhead involved in starting PHP in the first place.

FastCache stores the full HTML of a wiki page, allowing for Apache's mod_rewrite rules to provide a cache hit without starting PHP. Even if you can't use mod_rewrite, FastCache will provide a cache hit as fast as possible using PHP.

FastCache won't serve pages from the cache if the user is logged in (using any password or user id), if the requested URL contains a query string, or if the request is a POST (i.e. processing a form). Pages are added to the cache only when viewed anonymously.
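
The three exclusion rules above can be read as a single yes/no check. The following is only a sketch of that reading, not code taken from fastcache.php: the function name fc_request_is_cacheable and its parameters are hypothetical, and it assumes the caller already knows whether the visitor is logged in.

```php
<?php
// Hypothetical helper, not part of fastcache.php.
// Returns true only when none of the three exclusion rules applies.
function fc_request_is_cacheable(string $method, string $queryString, bool $loggedIn): bool
{
    // A POST means a form is being processed.
    if (strtoupper($method) === 'POST') {
        return false;
    }
    // Any query string means this is more than a plain page view.
    if ($queryString !== '') {
        return false;
    }
    // Logged-in users (any password or user id) always get a fresh page.
    if ($loggedIn) {
        return false;
    }
    return true;
}
```

For example, fc_request_is_cacheable('GET', '', false) is true, while a request with the query string action=edit is not cacheable.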

To install this recipe

  • download fastcache.php to your cookbook directory
  • add the following line to your config.php file:
    include_once("$FarmD/cookbook/fastcache.php");
  • edit your server settings (see the CleanUrls recipe for instructions where to find them) as follows:

mod_rewrite

The following assumes that you have some idea of what you're doing and that you have $EnablePathInfo set. These rules are based on the CleanUrls recipe. Note that you'll need to replace the placeholder paths and the PHPSESSID cookie name (all listed in the example after the rules) with whatever's appropriate for your setup.

RewriteEngine On
# Define the rewrite base. This is the URL of the directory that the
# .htaccess file (if using) resides in
RewriteBase /url/path/to/rewrite/root
# Catch requests for index.anything
RewriteRule ^index / [R=301]

# Non-capital first letter means that it's not a wiki page
RewriteRule ^[a-z].*$ - [L]


## Define the rules that exclude things from the cache:
# If any of these conditions are true ...

# Posting a form request
RewriteCond %{REQUEST_METHOD} POST [OR]
# A session exists -> the user is signed in
RewriteCond %{HTTP_COOKIE} PHPSESSID [OR]
# A query string exists; ie. we're not just viewing a page
RewriteCond %{QUERY_STRING} .

# ... serve the file using pmwiki.php
RewriteRule ^(.*)$ relative/url/path/to/pmwiki.php?n=$1 [QSA,L]


## If we're this far in the script it's ok to serve files from the cache

# Root means that we want the wiki homepage
RewriteCond %{REQUEST_URI} ^/$
# Does the file exist?
# Change 'Main.HomePage' here if you're using a different page
RewriteCond /absolute/path/to/htmlcache/Main.HomePage.html -f
# If so, serve it.
# Change 'Main.HomePage' here if you're using a different page
RewriteRule ^$ relative/url/path/to/htmlcache/Main.HomePage.html [L]

# No . or / in the URL means that we want the group main page
RewriteCond %{REQUEST_URI} ^/([^./?]+)[./]?$
# Does the file exist?
# Change '%1.%1' here if you're using a different group main page format
RewriteCond /absolute/path/to/htmlcache/%1.%1.html -f
# If so, serve it.
RewriteRule ^. relative/url/path/to/htmlcache/%1.%1.html [L]

# We want Group.Page
RewriteCond %{REQUEST_URI} ^/([^./]+)[./]([^./]+)/?$
# Does the file exist?
RewriteCond /absolute/path/to/htmlcache/%1.%2.html -f
# If so, serve it.
RewriteRule ^. relative/url/path/to/htmlcache/%1.%2.html [L]


# Cache misses and anything that doesn't fit the above goes to pmwiki.php
RewriteRule ^(.*)$ relative/url/path/to/pmwiki.php?n=$1 [QSA,L]

For example, if your wiki appears in the site root with an absolute path /var/www/example.com, but is located in /var/www/example.com/pmwiki, and you're using the default settings for FastCache, then:

/url/path/to/rewrite/root is /
PHPSESSID is whatever you defined with session_name() in your config.php, or PHPSESSID by default
relative/url/path/to/pmwiki.php is pmwiki/pmwiki.php
/absolute/path/to/htmlcache is /var/www/example.com/pmwiki/pub/htmlcache
relative/url/path/to/htmlcache is pmwiki/pub/htmlcache
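
To make the three cache-lookup rules easier to follow, here is a sketch of the same URL-to-file mapping written in PHP. It is illustrative only and is not part of the recipe: the function name fc_cache_file_for_uri and its parameters are hypothetical, it assumes the rewrite base is /, and it returns a candidate path without checking that the file exists (the rewrite rules do that with -f).

```php
<?php
// Hypothetical helper mirroring the cache-lookup rewrite rules; not part of fastcache.php.
// Returns the cache file a request URI would map to, or null when pmwiki.php should handle it.
function fc_cache_file_for_uri(string $uri, string $cacheDir, string $homePage = 'Main.HomePage'): ?string
{
    // A lowercase first letter means the request is not for a wiki page.
    if (preg_match('#^/[a-z]#', $uri)) {
        return null;
    }
    // Site root: the wiki home page.
    if ($uri === '/') {
        return "$cacheDir/$homePage.html";
    }
    // A single name, optionally followed by "." or "/": the group main page (Group.Group).
    if (preg_match('#^/([^./?]+)[./]?$#', $uri, $m)) {
        return "$cacheDir/{$m[1]}.{$m[1]}.html";
    }
    // Group and page separated by "." or "/", with an optional trailing slash.
    if (preg_match('#^/([^./]+)[./]([^./]+)/?$#', $uri, $m)) {
        return "$cacheDir/{$m[1]}.{$m[2]}.html";
    }
    // Anything else falls through to pmwiki.php.
    return null;
}
```

With the example paths above, a request for /Main/HomePage would map to /var/www/example.com/pmwiki/pub/htmlcache/Main.HomePage.html.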

Cache HTTP headers

Logging in and out may not always appear to work, since your browser might believe it doesn't need to re-fetch the page from the server. To fix this, add the following rules to your Apache configuration for the htmlcache directory, either in a <Directory> block or in the directory's .htaccess file (the Header directives require mod_headers):

FileETag None
Header set Expires "Tue, 01 Jan 2002 00:00:00 GMT"
Header set Cache-Control "public, must-revalidate"

What it does

FastCache works in three different places:

  • Editing a page: When a page is posted, FastCache invalidates the cache by deleting the cached files. If your wiki pages don't contain any elements that depend on other wiki pages, you can set $FastCacheInvalidateAllOnUpdate = FALSE; in your config.php before including fastcache.php to only invalidate the currently edited page.
  • Viewing an uncached page: If not logged in as any user and viewing an uncached page, FastCache captures the printed output and saves it to disk. FastCache isn't invoked if you're logged in using some password or user id. To disable FastCache for a specific page, use the (:nofastcache:) directive.
  • Viewing a cached page: The mod_rewrite rules defined above let the request be served directly from cache, skipping PHP altogether. If for whatever reason you can't use mod_rewrite, FastCache will still serve pages from cache as fast as possible using PHP.

As implemented on the system I'm maintaining, the Apache benchmark for serving pages without FastCache is at most 10 pages/second. With FastCache, the server can handle 1500 pages/second.

How it works

FastCache uses a modified version of PmWiki's HandleBrowse to buffer the printed output when viewing a page, and stores it to a file. FastCache also inserts a function (FastCacheUpdate) into the $EditFunctions array to handle page updates.
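
The buffering idea can be sketched with PHP's standard output-buffering functions. This is a minimal illustration under stated assumptions, not the recipe's actual code: fc_capture_to_file and fc_invalidate_all are hypothetical names, the callable stands in for the page-browse handler, and writing via a temporary file is a choice made for this sketch so that a partly written page is never served.

```php
<?php
// Hypothetical helpers illustrating the technique; fastcache.php itself may differ.

// Run $render, capture everything it prints, save a copy to $cacheFile,
// and return the HTML so the caller can send it to the browser.
function fc_capture_to_file(callable $render, string $cacheFile): string
{
    ob_start();              // start buffering printed output
    $render();               // stands in for the page-browse handler
    $html = ob_get_clean();  // fetch the buffer and stop buffering

    $dir = dirname($cacheFile);
    if (!is_dir($dir)) {
        mkdir($dir, 0777, true);
    }
    // Write to a temporary file, then rename it into place.
    $tmp = $cacheFile . '.' . getmypid() . '.tmp';
    file_put_contents($tmp, $html);
    rename($tmp, $cacheFile);
    return $html;
}

// Delete every cached page in $cacheDir and return how many files were removed.
function fc_invalidate_all(string $cacheDir): int
{
    $removed = 0;
    foreach (glob("$cacheDir/*.html") ?: [] as $file) {
        if (unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}
```

A page-update hook in the spirit of FastCacheUpdate could call something like fc_invalidate_all after a page is saved.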

Markup

FastCache adds just one directive to PmWiki:

(:nofastcache:)
Disable FastCache caching for this page. To disable FastCache for a group, add this to the GroupHeader.

Configuration

The following variables may be set in your config.php file to configure the way FastCache works.

$EnableFastCache
default TRUE
Set FALSE to disable FastCache altogether.
$FastCacheInvalidateAllOnUpdate
default TRUE
Set FALSE to only invalidate the currently edited page from the cache.
$FastCacheDir
default "$FarmD/pub/htmlcache"
IMPORTANT The absolute path of the directory in which the cached files are written. This directory should contain nothing but the cached files, as its contents will be erased when the cache is invalidated! In other words, don't use your $WikiDir or $WorkDir. If the directory doesn't exist, PmWiki will create it for you when it's first accessed.
$FastCacheValid
default:
(
empty($_POST) && 
( empty($_GET) || ( array_keys($_GET) == array('n') ) ) && 
( empty($_SESSION) || ( $_SESSION == array( 'authid' => '' ) ) ) &&
( empty( $_REQUEST[ ini_get('session.name') ] ) ) # suggested by MagicBeanDip
)
The conditions under which a page should be cached. In addition, $FastCacheValid is set to FALSE by the (:nofastcache:) directive and when PageExists() returns FALSE.
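
As a sketch, a config.php fragment using these variables might look like the following. The values are examples only. The recipe states that $FastCacheInvalidateAllOnUpdate must be set before fastcache.php is included; this fragment assumes the same holds for the other variables.

```php
# In config.php: set the FastCache options first, then load the recipe.
$EnableFastCache = TRUE;                  # set FALSE to switch the recipe off
$FastCacheInvalidateAllOnUpdate = FALSE;  # only drop the edited page from the cache
$FastCacheDir = "$FarmD/pub/htmlcache";   # dedicated directory: its contents get erased
include_once("$FarmD/cookbook/fastcache.php");
```

Only set $FastCacheInvalidateAllOnUpdate to FALSE if your pages don't include content from other wiki pages, as noted under "What it does".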

Notes

There are some issues with using the script as is in a standard PmWiki distribution, as it's based on my own setup. Perhaps most significantly, FastCache doesn't read $NoHTMLCache, so some of the markups mentioned in this post don't stop caching.

Features under development

  • Prevent caching for markups mentioned here

Release Notes

See Also

Contributors

Comments

See discussion at FastCache-Talk
