Send your database on vacation by using CakePHP + Memcached

I don’t think it’s any secret that in most modern, dynamic web applications the DB becomes the bottleneck, umm…. let’s say, 96.7% of the time.

Thankfully, we have a relatively painless way to speed up any DB-driven application by employing some caching mechanism.
Memcached is a popular choice because it is fast, simple to install and use, and works very well for alleviating the DB load. Not to mention that it comes nicely integrated into the CakePHP core.

Before we proceed there are a couple of simple things to keep in mind:

1. Memcached is best used to cache small, often repeated queries. By small, I mean queries that do not return hundreds or thousands of results (well, you really shouldn’t have queries like that to begin with ;))
2. Be mindful of the cache duration. You don’t want to draw results from cache if the query results change very often, so set your caching duration appropriately.

As always, let’s jump into an example…

We have a site, which neatly displays five of the most recent articles on a few select pages. The site is relatively popular and gets over 30,000 visitors a day. New articles, however, get published once a day or every other day.
You can imagine that going to the database for the same exact query (and results) over and over again is going to put absolutely unnecessary stress on your DB.

Therefore, this is a perfect example of where cache can be a life (or database) saver.

First, some prerequisites…

You need to download, install and have the memcached server (daemon) up and running. The installation should be very simple, and to keep this post reasonably short, I’ll let you google for that on your own (or check this post).
Secondly, you need to ensure that your PHP installation has the memcache extension enabled. The easiest way to do so is to check phpinfo() and look for the “memcache” section.
If it’s not available, please refer to the PHP manual on how to install it… in many cases it could be as simple as un-commenting the memcache extension in php.ini and restarting your web server.
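If you’d rather not scan the whole phpinfo() page, a tiny script can do the check for you. This is only a sketch; it assumes the extension is registered under the name “memcache” and exposes a Memcache class, which is what the phpinfo() section above suggests.

```php
<?php
// Check whether the PHP memcache extension is available.
$hasExtension = extension_loaded('memcache'); // extension registered with PHP?
$hasClass = class_exists('Memcache');         // client class defined?

if ($hasExtension && $hasClass) {
    echo "memcache extension is loaded\n";
} else {
    echo "memcache extension is missing; check php.ini\n";
}
```

Keep in mind that the command line and the web server may read different php.ini files, so run it the same way your app runs.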
Last, but not least, you need to switch the CakePHP caching engine from the “FileEngine” (default) to “Memcache”.

Just by making a little adjustment in core.php:

Cache::config('default', array('engine' => 'Memcache'));
(This assumes that you are using all defaults, see core.php for more detailed options).
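For reference, a fuller configuration might look roughly like the fragment below. The option names follow the commented examples in core.php; the values (duration, prefix, server address) are placeholders I picked for illustration, so adjust them to your own setup.

```php
// Fragment for app/config/core.php; all values below are illustrative assumptions.
Cache::config('default', array(
    'engine' => 'Memcache',      // required
    'duration' => 3600,          // default lifetime of a key, in seconds
    'prefix' => 'myapp_',        // hypothetical key prefix
    'servers' => array(
        '127.0.0.1:11211'        // assumes memcached on localhost, default port
    ),
    'compress' => false          // compression saves memory but costs CPU
));
```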

If all goes well, and you have the default homepage for CakePHP, it should tell you that: “The MemcacheEngine is being used for caching”.

Now we are ready for the fun stuff…

Going back to our example, we’ll assume an Article model, with a getRecentFive() method to return the five most recent articles.

function getRecentFive() {
    return $this->find('all', array(
        'conditions' => array('Article.status' => 1),
        'limit' => 5,
        'order' => 'Article.updated DESC',
        'recursive' => -1
    ));
}

So far, so good…

To keep things rather plain we’ll simply call it from the Articles Controller, like so:

pr($this->Article->getRecentFive());

It goes without saying, that we should see an array returned by the find() with the five most recent articles.

Finally, let’s see how we can take advantage of the MemcacheEngine to easily save our DB a few hits by adjusting our getRecentFive() method with a couple of lines of code…

function getRecentFive() {
    // Try the cache first; on a miss, fall through to the query.
    if (!$recentFive = Cache::read('recentFive')) {
        $recentFive = $this->find('all', array(
            'conditions' => array('Article.status' => 1),
            'limit' => 5,
            'order' => 'Article.updated DESC',
            'recursive' => -1
        ));

        // Store the result for one day (86400 seconds).
        Cache::write('recentFive', $recentFive, 86400);
    }

    return $recentFive;
}

Let’s take a quick look at the code above.

First, we check whether anything is cached under the key “recentFive”, assigning whatever we read to the $recentFive variable in the same step.

If the cache is not present, we do a good ol’ find('all') and write the results to cache…

Now, let’s break down this line of code: Cache::write('recentFive', $recentFive, 86400); We are writing to cache under a key called “recentFive” (it can be any arbitrary string). The data we are writing is, of course, the result of our find('all'). Lastly, we set the duration in seconds; in this example I set it to one day.

So, to play out a little scenario: the first time we call the getRecentFive() method, the cache is not available. Therefore, we do a find(), get the results, write them to cache and return the result back to our controller. If you are trying out the example, you’ll see a simple query displayed in the SQL debug.

The second time, however, the cache is already available, so rather than hitting our DB, we read the results from cache and return it back to the controller. Now, you’d notice that there is no query executed in the DB, yet the same result as previously is returned back to the controller.
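If you’d like to see the two calls play out without a CakePHP install, here is a minimal stand-alone sketch. ArrayCache and ArticleSketch are hypothetical stand-ins I made up for Cake’s Cache class and the Article model; they only imitate the read/write behavior described above and count how often the “database” is hit. Note that the sketch compares against false strictly, so that an empty result set would still count as a cache hit.

```php
<?php
// Hypothetical in-memory stand-in for CakePHP's Cache class (not the real API).
class ArrayCache {
    private static $store = array();

    // Returns the cached value, or false on a miss.
    public static function read($key) {
        return array_key_exists($key, self::$store) ? self::$store[$key] : false;
    }

    public static function write($key, $value) {
        self::$store[$key] = $value;
        return true;
    }
}

// Hypothetical stand-in for the Article model; counts simulated DB queries.
class ArticleSketch {
    public $queries = 0;

    // Pretends to run the find('all') query from the post.
    private function findRecent() {
        $this->queries++;
        return array('Article 5', 'Article 4', 'Article 3', 'Article 2', 'Article 1');
    }

    public function getRecentFive() {
        $recentFive = ArrayCache::read('recentFive');
        if ($recentFive === false) {
            // Cache miss: query the "database" and store the result.
            $recentFive = $this->findRecent();
            ArrayCache::write('recentFive', $recentFive);
        }
        return $recentFive;
    }
}

$article = new ArticleSketch();
$first = $article->getRecentFive();   // miss: runs the query
$second = $article->getRecentFive();  // hit: served from the cache
echo $article->queries . " query executed\n"; // prints "1 query executed"
```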

Thus, we save precious hits on the DB and happily send it to take a little vacation.

P.S. This example can be extended to automatically generate cache using the afterSave() and afterDelete() call-backs, but that might wait for another post or something for you to play with ;)
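To give a rough idea of that callback approach, here is a stand-alone sketch. SketchCache and ArticleWithCallbacks are hypothetical stand-ins, not Cake classes; in a real model you would put the equivalent of Cache::delete('recentFive') into afterSave() and afterDelete(). This version simply drops the key and lets the next read rebuild it, rather than regenerating the cache right away.

```php
<?php
// Hypothetical in-memory stand-in for CakePHP's Cache class (not the real API).
class SketchCache {
    private static $store = array();

    public static function read($key) {
        return array_key_exists($key, self::$store) ? self::$store[$key] : false;
    }

    public static function write($key, $value) {
        self::$store[$key] = $value;
        return true;
    }

    public static function delete($key) {
        unset(self::$store[$key]);
        return true;
    }
}

// Hypothetical model stand-in with Cake-style afterSave()/afterDelete() callbacks.
class ArticleWithCallbacks {
    private $rows = array();
    public $queries = 0;

    public function save($title) {
        $this->rows[] = $title;
        $this->afterSave(true);
    }

    public function delete($title) {
        $this->rows = array_values(array_diff($this->rows, array($title)));
        $this->afterDelete();
    }

    // Both callbacks drop the stale list so the next read rebuilds it.
    public function afterSave($created) {
        SketchCache::delete('recentFive');
    }

    public function afterDelete() {
        SketchCache::delete('recentFive');
    }

    public function getRecentFive() {
        $recentFive = SketchCache::read('recentFive');
        if ($recentFive === false) {
            $this->queries++; // simulated DB hit
            $recentFive = array_slice(array_reverse($this->rows), 0, 5);
            SketchCache::write('recentFive', $recentFive);
        }
        return $recentFive;
    }
}

$article = new ArticleWithCallbacks();
$article->save('First post');
$article->getRecentFive();            // miss: first query
$article->getRecentFive();            // hit: no query
$article->save('Second post');        // callback clears the cached list
$latest = $article->getRecentFive();  // miss again: second query
```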

  • http://identoo.com/dirk.olbertz Dirk Olbertz

    I’m looking forward to your strategies for invalidating the cache – this always seems to be the most problematic case for me.

    In your example it’s pretty obvious that you need to invalidate the cache whenever an article is added, edited or deleted – but only if this would in fact change the initial query. If you edit or delete the sixth article, you would not need to invalidate the cache.

    But if you have a query in which you get the latest articles *plus* their latest comments, you now have to invalidate the whole cache whenever an article is added, edited or deleted, or whenever a comment is added to one of the cached articles.

    There is currently no way – or at least I’m not aware of one – to let Cake handle this itself. You would need a central mechanism to observe all your caches. This observer would also need to know how the different caches are linked to each other.

    Right now, the cache is not that useful when you have a lot of different queries on the same data. The overhead of maintaining the cache is enormous. And my basic concern here is not that you would need a lot of code. The danger is always that you show old data because you didn’t invalidate your cache.

    How do others solve this kind of issues?

    • http://www.livelovecode.com john_8

      @dirk olbertz

      In my dealings with caching things of this nature, it seems best to invalidate or empty the cached value in the action that adds or edits an article. Add is the obvious case, since any added article would need to clear “recentFive” from memcache. With update it would be simple enough to check if the article is in the top 5 and, if so, empty “recentFive” so that it can repopulate.

  • Howard

    Hey there, currently I use APC for shared memory caching – is memcached better than this implementation?

  • http://fbuser.com FBuser.com

    memcache is cool but xcache is much better :)

  • http://www.aquitanda.com Éber F Dias

    I use xCache to cache my app… Is it possible to mix things? Use xCache for everything and Memcache only for DB?

    I guess it is setting a different cache handler in the core, right?

  • http://deizel.co.uk deizel.

    There are some nice constants for time in the Cake core, meaning you can use DAY instead of 86400:

    https://trac.cakephp.org/browser/trunk/cake/1.2.x.x/cake/basics.php#L27

  • http://teknoid.wordpress.com teknoid

    @Dirk Olbertz

    For more complicated cases, you’d need more complex solutions. As I mentioned in the beginning, such a cache is best utilized for quick and simple result sets.
    Also, it highly depends on your specific needs. For example, with seriously heavy traffic even a 5 minute cache can dramatically improve performance.

    The best way to handle updating the cache is, as mentioned, using the afterSave() and afterDelete() methods.

    @Howard

    As in many similar situations there is no right or wrong answer here. For this example the approach would be essentially the same, since every cache engine is abstracted by CakePHP in pretty much the same manner.
    Memcached has the benefit of being a distributed cache, where you can share keys among various web servers and connect to a pool of memcached servers.
    Both APC and Xcache are designed to optimize PHP code specifically, while memcached caches whatever in whatever language you need.

    @Fbuser.com

    It’s probably faster on the local filesystem, but has drawbacks in larger installations. I would agree that it is a good replacement for APC, but a lot of people will probably disagree with me ;)

    @Éber F Dias
    Yes, I think that’s a good approach overall, however I’m not sure how to implement that exactly in the CakePHP environment. AFAIK, there is no way to mix various caching engines out-of-the-box. If anyone has a suggestion, do share…

    @deizel

    Thanks. Good point.

  • Paolo Gabrielli

    Thanks for sharing that: memcached is the new trend and it rocks :-)
    But if the db gets a new article once a day or more, can we just rely on MySQL’s Query Cache?

    Thanks again,
    P.

  • http://teknoid.wordpress.com teknoid

    @Paolo Gabrielli

    Not really, MySQL query cache serves a very different purpose and is actually not very efficient for what we are looking to do.

    I might do a follow-up to show how to employ afterSave() and afterDelete() methods to rebuild the cache…

    Or you can get a good hint from this article:
    http://teknoid.wordpress.com/2008/08/20/dynamic-menus-without-requestaction-in-cakephp-12/

  • http://www.aquitanda.com Éber F Dias

    Hey!

    According to the API you can define the config you wanna use to store your cache information:

    http://api.cakephp.org/class/cache#method-Cachewrite (third argument)

    So I guess that in your core.php you can define a cache config with another name and use it alongside!

  • http://teknoid.wordpress.com teknoid

    @Éber F Dias

    Very good point. I think that should work.

  • unclezoot

    Teknoid, to follow up your post on why MySQL caching is bad, this is taken from the memcached manual (near the bottom – http://www.danga.com/memcached):

    MySQL query caching is less than ideal, for a number of reasons:

    * MySQL’s query cache destroys the entire cache for a given table whenever that table is changed. On a high-traffic site with updates happening many times per second, this makes the cache practically worthless. In fact, it’s often harmful to have it on, since there’s an overhead to maintain the cache.
    * On 32-bit architectures, the entire server (including the query cache) is limited to a 4 GB virtual address space. memcached lets you run as many processes as you want, so you have no limit on memory cache size.
    * MySQL has a query cache, not an object cache. If your objects require extra expensive construction after the data retrieval step, MySQL’s query cache can’t help you there.

    If the data you need to cache is small and you do infrequent updates, MySQL’s query caching should work for you. If not, use memcached.

  • http://teknoid.wordpress.com teknoid

    @unclezoot

    Thank you for sharing this. Very good points and very much spot on.

  • http://chrisparaiso.com Chris Paraiso

    Thank you for this. I’ve been searching around for a SIMPLE explanation of how to use memcache in cake. all of the articles are bloated. this one is simple. thanks again.

  • http://teknoid.wordpress.com teknoid

    @Chris Paraiso

    No problem, glad it helped.

  • http://ultimate.in.rs Marko

    As Éber F Dias already mentioned, the third parameter for Cache::write, according to the Cake api documentation is

    $config string Optional – string configuration name

    and not duration as in your example.

    Is $config int an undocumented cake feature or a bug in your code?

  • http://teknoid.wordpress.com teknoid

    @Marko

    It’s possible that there was a change since the writing of the article.
    However, looking at the method it should still work…

    Let me know if you find otherwise.

  • http://www.v25media.com darren_n

    Question: Can I store View Cache in APC or Memcached? This seems like an ideal way to rapidly retrieve ‘pre-compiled’ pages instead of doing the I/O hit of normally cached views on disk.

  • http://teknoid.wordpress.com teknoid

    @darren_n

    Sure, memcached doesn’t care about the content being stored.
    APC is different, it optimizes PHP code specifically.

  • http://www.v25media.com darren_n

    @teknoid – was asking ’cause APC also has a caching feature as well as creating an opcode cache. So I’ve used it quite a bit to store queries in like you’re doing with memcached, was just wondering if the procedure was similar to get Cake to store cached views in memory instead of on disk – since this doesn’t happen in the model.
