main website home
  • About this blog

    This blog features updates, opinions, and technical notes from Caucho engineers about Caucho products, the enterprise Java industry, and PHP. Caucho Technology is the creator of the Resin Application Server and the Quercus PHP in Java engine. A leader in Java performance since 1998, Caucho is a Sun JavaEE licensee with over 9000 customers worldwide.
  • Tags

    ajaxworld bam candi cdi cloud cluster comet configuration deploy devoxx eclipse ejb embedded flash flex google app engine hessian hmtp ioc java ee 6 javaone javazone jms messaging newsletter nyjug osgi php pomegranate quercus resin resin 4.0 REST servlet sfjug silicon valley code camp spring testing training tssjs watchdog webbeans web profile websockets wordpress
  • Meta

    • Register
    • Log in
    • Entries RSS
    • Comments RSS
    • WordPress.org
« Developer Training Course
Caucho Newsletter »

Distributed Caching

resin cloud

Because the main purpose of Resin 4.0 was support for dynamic servers, we needed to upgrade our distributed session management to handle the case where servers can appear and disappear frequently. Resin 3.1 session replication relies on a static set of servers to choose backup and triplicate servers. Dynamic servers change that model because a 3.1 session backup might be shut down indefinitely. So we needed a new architecture.

At the same time as we redesigned sessions, I wanted to generalize the distributed store to support standard caching and storage using the javax.cache API, while retaining the scalability and reliability of our original design.

  • A hub of 3 fully-redundant servers for reliability (the triad)
  • Other servers dynamically appear and disappear for deployment flexibility
  • Updates using lightweight messaging using BAM/HMTP
  • Cache entry ownership and leasing for performance
  • Support storage (infinite expire), caching (timed expire), and session (timed idle invalidation)
  • Support serialized objects (using Hessian), and binary data

The cache architecture is heavily influenced by the messaging and extra complications of distribution.

Splitting Data from Map.Entry

The most fundamental design decision was splitting the data (payload) of a cache entry from the Map.Entry tuple. The split has a few benefits:

  • small, fixed size Map.Entry updates for get requests, put updates, and startup synchronization
  • synchronization and transaction uses small messages, so rollbacks and conflicts are cheap to resolve
  • data is constant, named by its hash like the .git repository, so never needs rollbacks or synchronization
  • startup and crash recovery can query with small, efficient packets and only needs to transfer missing data
  • put/remove are unified with a single message (simplifying testing considerably)

The Map.Entry is a small, fixed-length item with three key pieces of information: the key hash (sha-256), the value hash (sha-256), and an update version:

  <key, value, version>

The size of the tuple is fixed because the key and value are fixed-length hashes, and the version is a 64-bit integer: 32 bytes for the key, 32 bytes for the value, and 8 bytes for the version.

  <348f...c0a1, 8efa...365f, 45>

The 32 byte hash for the key and value is used to lookup the actual payload. If a server doesn’t have the data, it can query a triad server for the missing information. On startup, a server can use any data is already has locally, and only needs to exchange messages for the Map.Entry to validate changes.

Replication and Scaling the Triad

The triad serves as the backup for all the servers in a cluster pod containing up to 64 servers. Clusters with more than 64 servers split into multiple pods. Each cache Map.Entry and data is assigned a triad server as its owner, based on the 32-byte hash. The data hash determines the owning server for the data, and the key hash determines the owner server for the Map.Entry. So it’s entirely possible for a cache.put() to have different owners for the Map.Entry and the actual data.

Assigning ownership helps the triad manage load, because each triad server only owns a third of the keys and a third of the data. So the traffic and cpu requirements are split into three. The ownership also helps with synchronization because the owning server for a key can manage the cache versions in a single JVM, avoiding the complications of a fully-distributed synchronization.

Dynamic Servers

Each dynamic server stores a local copy of a cache entry, synchronizing with the triad as necessary. Because the dynamic server only has a cached copy, it can be safely taken down or started at any time. For efficiency, a server can request a lease on a cache entry, minimizing traffic with the triad as long as the lease is active. A HTTP session will use a 5 minute lease together with sticky sessions to keep all traffic local to a server for the duration of the session, making distributed sessions almost as fast as a locally managed session.

Onward to messages

To make the cache discussion concrete, it’s probably best to show the actual messages (somewhat in progress.) Hopefully by tomorrow I’ll have some time to add another post with those details.

Tags: caching, cloud, cluster, elastic, resin 4.0

This entry was posted on Wednesday, February 18th, 2009 at 11:37 am and is filed under Engineering. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

2 Responses to “Distributed Caching”

  1. Ophir Radnitz Says:
    February 23rd, 2009 at 3:48 pm

    Would it be possible to use Terracotta for distributing the cache? have you considered an official Terracotta support (TIM)?

    Thanks,
    Ophir

  2. ferg Says:
    February 23rd, 2009 at 8:23 pm

    That doesn’t make sense in this case, because Resin users can always use Terracotta. Having Terracotta distribute the cache really would mean using Terracotta, so we wouldn’t be adding any value with that kind of integration.

    This cache is primarily intended to replace something like ehcache in situations where adding and removing dynamic servers is important.

Leave a Reply

You must be logged in to post a comment.


Caucho Technology is proudly powered by WordPress and Quercus®
Entries (RSS) and Comments (RSS).

  • HOME |
  • CONTACT US |
  • DOCUMENTATION |
  • BLOG |
  • WIKI 4 |
  • WIKI 3 |
  • Resin: Java Application Server
Copyright (c) 1998-2012 Caucho Technology, Inc. All rights reserved.
caucho® , resin® and quercus® are registered trademarks of Caucho Technology, Inc.
resin® is a cloud optimized, java® application server that supports the java ee webprofile ®