Scaling Applications across Multiple JVMs

PCQ Bureau

10 Mar 2007 08:21 IST

Clusters and grids remain the inevitable choice when you have to battle issues
of scalability, high availability, high performance or even fail-over. They have
been adopted for Java applications too, but there the clustering or grid is
mostly API based. Thus, you can cluster application servers, but not the
objects or, more importantly, the state of these objects when you plan to expand
your application over a cluster or grid. In this series, we look into a
clustering solution called Terracotta that does clustering at the JVM level.
Terracotta allows clustering without changing the existing code of your
application. It follows a declarative approach, in which you configure your
application for JVM-level clustering. This part looks into Terracotta DSO
(Distributed Shared Objects), the core technology for clustering JVMs, and the
basic concepts related to clustering.

Direct Hit!

Applies To: Java developers

USP: Using Terracotta for distributing applications across
multiple JVMs

Primary Link:
www.terracotta-tech.com

Google Keywords: JVM clustering, POJO grid

Terracotta basics

We talked about clustering of application servers a while ago and weren't all
praise for it. Let us examine why. We used a commercially available application
server and created a cluster on 10 workstations, each running its own JVM, and
deployed a Web app to test how efficient this clustering is in comparison to
running the application in a single application server on a server machine. We
could configure which modules of the sample application would be installed on
a part of this cluster or throughout the cluster, or even choose a few nodes
for installing the application and assign fail-over nodes. So far so good, but
there are a few differences between this clustering and the one Terracotta
provides. For one, you have multiple application servers running as one unit,
but is there any in-built mechanism for state management or concurrency? By
in-built, we mean whether these things can be managed without topping up your
application with additional APIs. Such a cluster doesn't necessarily ensure
that an object's state in (say) the heap of one node is known globally. If that
is needed, it has to be taken care of by developers. Two, it is certainly not a
cluster of the objects in our application. As for overheads, you would
certainly know the dreaded 'S' (Serialization) that comes into play in Java EE
clustering.

This is where Terracotta is different. Clustering at the JVM level means that
all the JVMs act as a single unit. This eases a lot of the burden: once there
is a single JVM acting globally, any change to a shared object in the heap of a
particular JVM is replicated throughout the network. In case network overhead
is the next issue on your mind, that's not a bottleneck, because a change in
any field of a globally shared object requires replication of that field only.
We will show a little later in the series that Terracotta does exactly that.
So, instead of a cluster of application servers with their own heaps invisible
to each other, we have a cluster of JVMs that gives a global virtual heap. A
lock becomes a cluster-wide lock; a shared object's update becomes a
cluster-wide update. Most importantly, you don't have to explicitly make such
things visible throughout the cluster.
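
To make the idea concrete, here is a minimal sketch of the kind of plain Java
class this approach targets. The names SharedCounter and SharedCounterDemo are
hypothetical and are not taken from the Terracotta documentation. Nothing in
the class imports a clustering API, and as written it runs in a single JVM
only. The assumption, following the description above, is that once such an
object is declared shared in the Terracotta configuration, its synchronized
methods would behave as cluster-wide locks and the update to its field would
become visible to the other JVMs.

```java
// Plain Java: no clustering imports, runs as-is in a single JVM.
// SharedCounter and its members are hypothetical names for illustration.
class SharedCounter {
    private int count; // state that would be shared if declared so in the config

    // In one JVM this is an ordinary monitor lock. Assumption: with the object
    // declared shared, Terracotta would treat the same lock as cluster-wide.
    synchronized int increment() {
        return ++count;
    }

    synchronized int get() {
        return count;
    }
}

class SharedCounterDemo {
    public static void main(String[] args) throws InterruptedException {
        SharedCounter counter = new SharedCounter();
        // Two threads stand in for two nodes updating the same object.
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counter.get()); // 2000, since increments are synchronized
    }
}
```

The point of the sketch is what is absent: there is no serialization code and
no replication call, only the locking a careful developer would write anyway.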

Distributed Shared Objects

Distributed Shared Objects (DSO) is the core technology enabling clustering
services at the JVM level. The best part is that you don't need to import any
DSO libraries for clustering your application. DSO works at the byte-code level.
When you declaratively configure your application for clustering, the relevant
changes are applied at the JVM through what Terracotta terms hooks. These hooks
then keep track of field updates in the instantiated objects that have been
declared as 'Shared Roots'. The changes are reported to the Terracotta Server,
which in turn replicates them across VMs. Configuring the application requires
declaring which objects are to be shared, or replicated, across VMs, and which
classes to instrument, i.e. those classes whose instances are declared shared
roots or that have distributed locks or distributed methods.
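
The flow described here (a hook notices a field update, reports it to a central
server, and the server passes it on to the other VMs) can be pictured with a
toy model. This is not how DSO is implemented: the real hooks are injected at
the byte-code level, whereas this sketch uses hand-written setters, and every
name in it (FieldChange, ToyServer, TrackedAccount) is a hypothetical one chosen
for illustration. It only aims to show that a single changed field, rather than
the whole object, is what travels.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: all names are hypothetical, and real DSO does this through
// byte-code instrumentation rather than hand-written setters.

// One field update: which field changed and its new value.
class FieldChange {
    final String field;
    final Object value;

    FieldChange(String field, Object value) {
        this.field = field;
        this.value = value;
    }
}

// Stands in for the central server that forwards changes to the other VMs.
class ToyServer {
    private final List<Map<String, Object>> replicas = new ArrayList<>();

    // Each map plays the role of the object's copy on another VM.
    Map<String, Object> attachReplica() {
        Map<String, Object> replica = new HashMap<>();
        replicas.add(replica);
        return replica;
    }

    // Only the changed field is applied to each replica, not the whole object.
    void publish(FieldChange change) {
        for (Map<String, Object> replica : replicas) {
            replica.put(change.field, change.value);
        }
    }
}

// A shared object whose setters play the role of the hooks.
class TrackedAccount {
    private final ToyServer server;
    private String owner;
    private long balance;

    TrackedAccount(ToyServer server) {
        this.server = server;
    }

    void setOwner(String owner) {
        this.owner = owner;
        server.publish(new FieldChange("owner", owner));
    }

    void setBalance(long balance) {
        this.balance = balance;
        server.publish(new FieldChange("balance", balance));
    }
}

class FieldReplicationDemo {
    public static void main(String[] args) {
        ToyServer server = new ToyServer();
        Map<String, Object> replicaA = server.attachReplica();
        Map<String, Object> replicaB = server.attachReplica();

        TrackedAccount account = new TrackedAccount(server);
        account.setOwner("alice");
        account.setBalance(100L);
        account.setBalance(150L); // only "balance" is sent again, not "owner"

        System.out.println(replicaA.get("owner"));
        System.out.println(replicaB.get("balance"));
    }
}
```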

To do all this, you just have to create the appropriate configuration files
that enable your application for DSO. So, the critical phase of the entire
exercise is working out what has to be declared as shared and distributed,
where locks are to be applied and which kinds of locks to use.

Clustering with Terracotta

Let us look into the procedure for enabling an application for Terracotta. As we
said earlier, you do not need to import any libraries into your code for this.
Instead, you work on generating the 'Terracotta-Config' file, which describes
the clustering for your application. First, you have to determine which objects
are to be shared. This largely depends on the characteristics of your
application. But a general rule to start with is to look for objects in your
application that represent state information and form the core of your
application's logic. Be careful not to include objects that appear to be shared
but, on closer scrutiny, are actually tied to a particular JVM (for example,
sockets, connections, etc). You then mark fields in these objects to be
replicated across the cluster. Terracotta follows a distinct approach when it
comes to initialization of the shared objects. A shared root is initialized
only once within the cluster: the very first non-null reference assigned to it
is the one the cluster keeps. Any subsequent assignment of a reference is
ignored. Terracotta implements this semantic to prevent application code from
changing the root once the cluster has it.
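
The root-assignment rule described above (the first non-null reference wins,
later assignments are ignored) can be mimicked in a few lines of ordinary Java.
This is a toy model under stated assumptions: ClusterRoot is a hypothetical
wrapper class invented for this sketch, whereas Terracotta applies the rule to
fields declared as roots in its configuration, with no wrapper involved.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of the root-assignment rule. ClusterRoot is a hypothetical name;
// it is not part of Terracotta.
class ClusterRoot<T> {
    private final AtomicReference<T> ref = new AtomicReference<>();

    // Returns the value the root holds after the call: the first non-null
    // value ever assigned. Null and later assignments leave it unchanged.
    T assign(T candidate) {
        if (candidate != null) {
            // Succeeds only while the root is still unset.
            ref.compareAndSet(null, candidate);
        }
        return ref.get();
    }

    T get() {
        return ref.get();
    }
}

class ClusterRootDemo {
    public static void main(String[] args) {
        ClusterRoot<String> root = new ClusterRoot<>();
        root.assign(null);                              // ignored: still unset
        System.out.println(root.assign("from node 1")); // from node 1
        System.out.println(root.assign("from node 2")); // from node 1
    }
}
```

Seen this way, a node that starts late and runs its usual initialization code
does not overwrite the shared state; it simply ends up holding the value that
was assigned first.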

This is why we said earlier that you have to work out what has to be shared,
rather than tearing apart your existing modules or adding code to the existing
application. While shared objects are integral to clustering, so are locks, as
you need to maintain the integrity of data members throughout the cluster.
These locks are termed 'Distributed Locks' in Terracotta lingo. They help keep
data members' values synchronized cluster wide. Locks can also be 'Named', so
that methods with similar names use the same lock. These are of help with
methods written without any thread-safety considerations.

In conclusion

So far, we have outlined the basic concepts of Terracotta DSO and clustering.
In the upcoming parts of the series, we will look into basic examples, such as
how to go about configuring an application for DSO and, later on, how to enable
your Spring objects for Terracotta. The product documentation also lists a few
simple examples, such as the Slider Application, which you can refer to for a
good idea of how to use the Terracotta Eclipse Plugin and how to configure an
application for 'Terracotting' it.
