Emulating MemCache in Pure PHP

posted by Stephan Brumme

Soccer World Cup 2010

You might wonder: what has soccer to do with MemCache ?
Well, in 2006 I started a just-for-fun website (in German) where my colleagues could bet on the games of the Soccer FIFA World Cup (2006 & 2010) and UEFA Cup (2008). The first price wasn't money but eternal fame - and I won twice ;-)
Disclaimer: I didn't cheat at all, just plain luck (and a very defensive betting strategy) !

The back-end consists of PHP5 and SQLite. Everything is created on the fly which requires quite a lot of SQL queries (about 300 queries for World Cup 2010). I added more and more features for each soccer tournament, mostly focussed on data mining. As a result, the 2008 version became noticely slow over time and needed about 500ms to generate the front page (2006: 200ms).

I expected the 2010 version to need more than 1 second for the front page. However, after adding a MemCache-like wrapper around the SQL-queries, I was able to generate the front page in less than 30 milliseconds.

Key/Value Storage

There is a MemCache PHP extension but I don't have sufficient rights to install it on my shared host. So I wrote a simple wrapper which consists of only a few lines PHP without any extensions. In my PHP code I replaced all SQL requests by a call to a new function query:
hide function query($sql) { global $db,$memcache; // already cached ? if (isset($memcache[$sql])) return $memcache[$sql]; // retrieve from database $query = $db->query($sql); $result = $query->fetch(PDO::FETCH_ASSOC); $query->closeCursor(); // fill cache $memcache[$sql] = $result; return $result; }
It's nothing special: just a lookup in an in-memory array called $memcache, and if the SQL query wasn't found, the query is forwarded to the SQLite database (actually its PDO wrapper, so you can use any database). Finally, the result returned from the database is inserted into the cache.

I manually have to take care of UPDATE and INSERT operations since they may render some or all cache entries invalid. Usually I just clear the whole cache - even though it's more efficient to delete only affected entries:
$memcache = array();

Persistency

Because less than 1% of all queries are UPDATEs or INSERTs, I made the cache persistent to make use of this skewed read/write ratio.
My initial attempt was to write a PHP dump of $memcache to disk:
// write cache to disk $serialized = "<"."? \$memcache = ".var_export($memcache, true)."?".">"; file_put_contents("memcache.data", $serialized);
... and for the next website access load the cache directly into PHP:
// load persistent cache from disk include_once './memcache.data'
To my surprise I found that this "native" approach is actually much slower than plain JSON. So far, I have no good explanation why JSON outperforms native PHP.
My best guess is that the PHP compiler spends too much time on optimizations. Here is my JSON code:
// write cache to disk (JSON) $serialized = json_encode($memcache); file_put_contents("memcache.json", $serialized);
and:
// load persistent cache from disk (JSON) $serialized = file_get_contents("./memcache.json"); $memcache = json_decode($serialized, true);
Obviously, there is no need to write the cache if it wasn't modified, e.g. because the user just looked at the results without entering new bets or modifying existing ones.
A dedicated cache entry called $memcache["memcacheModified"] is only set to true if a write operating was performed on the database/cache. I skipped it from the code above to keep it simple.

My SQLite database is very small: only 120 KBytes. The JSON cache is a bit smaller at 72 KBytes. Now that the World Cups are over, both the database and the JSON cache remain static. If you have much bigger databases then you are clearly better off using the real MemCache.

Tuning SQLite

During my tests I found the following SQLite settings to be the fastest:
hide PRAGMA journal_mode = MEMORY; PRAGMA temp_store = MEMORY; PRAGMA encoding = 'UTF-8'; PRAGMA synchronous = OFF; PRAGMA cache_size = 2048;
homepage