
Scaling Applications across Multiple JVMs

PCQ Bureau

Clusters and grids remain the inevitable choice when you have to tackle scalability, high availability, high performance, or fail-over. Java applications have adopted them too, but clustering and grid support there is mostly API-based. Thus, you can cluster application servers, but not the objects, or more importantly the state of those objects, when you expand your application over a cluster or grid. In this series, we look at a clustering solution called Terracotta that does clustering at the JVM level. Terracotta allows clustering without changing the existing code of your application: you configure the application for JVM-level clustering declaratively. This part looks at Terracotta DSO (Distributed Shared Objects), the core technology for clustering JVMs, and the basic concepts related to clustering.


Direct Hit!

Applies To: Java developers

USP: Using Terracotta for distributing applications across multi-JVMs

Primary Link: www.terracotta-tech.com

Google Keywords: JVM clustering, POJO grid

Terracotta basics



We talked about clustering of application servers a while ago and weren't all praise for it. Let us examine why. We used a commercially available application server and created a cluster of 10 workstations, each running its own JVM, to deploy a Web app and test how efficient this clustering is compared with running the application in a single application server on a server machine. We could configure which modules of the sample application would be installed on part of the cluster or throughout it, or even choose a few nodes for installing the application and assign fail-over nodes. So far so good, but there are a few differences between this clustering and the one Terracotta provides. One, you have multiple application servers running as one unit, but is there any built-in mechanism for state management or concurrency? By built-in, we mean whether these things can be managed without topping up your application with additional APIs. Such a cluster doesn't necessarily ensure that an object's state in, say, the heap of one node is known globally; if that is needed, developers have to take care of it. Two, it is certainly not a cluster of the objects in our application. As for overheads, you would certainly know the dreaded 'S' (Serialization) that comes into play in Java EE clustering.


This is where Terracotta is different. Clustering at the JVM level means that all the JVMs act as a single unit. This eases a lot of the burden: once the JVMs act as one global JVM, any change in the heap of a particular JVM is replicated throughout the network. In case network overhead is the next issue on your mind, it is not a bottleneck, because a change in any field of a globally shared object requires replication of that field only. We will show a little later in the series that Terracotta does exactly that. So, instead of a cluster of application servers with their own heaps invisible to each other, we have a cluster of JVMs that gives a global virtual heap. A lock becomes a cluster-wide lock; a shared object's update becomes a cluster-wide update. Most importantly, you don't have to explicitly make such things visible throughout the cluster.
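To make the "only the changed field travels" idea concrete, here is a minimal sketch in plain Java. It is a toy model, not Terracotta's API: the names FieldDelta and ReplicaHeap and the "cart-1" object are invented for illustration, and a map stands in for another JVM's view of the shared heap.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of field-level replication. None of these names come from Terracotta.
final class FieldDeltaDemo {

    // The unit that travels over the network: one object id, one field, one value.
    static final class FieldDelta {
        final String objectId;
        final String fieldName;
        final Object newValue;

        FieldDelta(String objectId, String fieldName, Object newValue) {
            this.objectId = objectId;
            this.fieldName = fieldName;
            this.newValue = newValue;
        }
    }

    // Stands in for another JVM's copy of the shared heap: objectId -> (field -> value).
    static final class ReplicaHeap {
        private final Map<String, Map<String, Object>> objects = new HashMap<>();

        void put(String objectId, String fieldName, Object value) {
            objects.computeIfAbsent(objectId, id -> new HashMap<>()).put(fieldName, value);
        }

        // Applying a delta touches exactly one field; the rest of the object is left alone.
        void apply(FieldDelta delta) {
            put(delta.objectId, delta.fieldName, delta.newValue);
        }

        Object get(String objectId, String fieldName) {
            Map<String, Object> fields = objects.get(objectId);
            return fields == null ? null : fields.get(fieldName);
        }
    }

    public static void main(String[] args) {
        ReplicaHeap replica = new ReplicaHeap();
        replica.put("cart-1", "owner", "alice");
        replica.put("cart-1", "itemCount", 2);

        // Only itemCount changed on the originating JVM, so only itemCount is sent.
        replica.apply(new FieldDelta("cart-1", "itemCount", 3));

        System.out.println(replica.get("cart-1", "owner"));     // alice
        System.out.println(replica.get("cart-1", "itemCount")); // 3
    }
}
```

The contrast is with serialization-based replication, where the whole object graph would be re-sent for the same one-field change.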

Distributed Shared Objects



Distributed Shared Objects (DSO) is the core technology enabling clustering services at the JVM level. The best part is that you don't need to import any DSO libraries to cluster your application. DSO works at the byte-code level. When you declaratively configure your application for clustering, the relevant changes are applied at the JVM as what Terracotta terms hooks. These hooks then keep track of field updates in the instantiated objects that have been declared as 'Shared Roots'. The changes are reported to the Terracotta Server, which replicates them across VMs. Configuring the application requires declaring which objects are to be shared, or replicated, across VMs, and which classes to instrument, i.e., those classes whose instances are declared as shared roots or that have distributed locks or distributed methods. For all this, you only have to create the appropriate configuration files to enable your application for DSO. So, the critical phase of the entire exercise is working out what has to be declared as shared and distributed, where locks are to be applied, and which kind of locks they should be.
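As an illustration of what such an application looks like before any configuration is written, here is a small plain-Java class with no clustering imports. The class name VisitLog and its field are hypothetical; the comments only mark which parts a DSO configuration would plausibly name, and running it as shown exercises a single JVM only.

```java
import java.util.ArrayList;
import java.util.List;

// Ordinary Java: nothing here refers to Terracotta or any clustering library.
final class VisitLog {

    // Candidate shared root: the configuration, not the code, would name this field.
    private static List<String> visits = new ArrayList<>();

    // Candidate distributed lock: ordinary synchronization the configuration could
    // promote to a cluster-wide lock. VisitLog would also be a class to instrument.
    static void record(String page) {
        synchronized (VisitLog.class) {
            visits.add(page);
        }
    }

    static int count() {
        synchronized (VisitLog.class) {
            return visits.size();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                record("/home");
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(count()); // 2000 when run on its own in one JVM
    }
}
```

The point of the sketch is the design decision the text describes: the code already states what is shared (the list) and where mutual exclusion is needed (the synchronized blocks), so the remaining work is declaring those choices in configuration.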


Clustering with Terracotta



Let us look at the procedure for enabling an application for Terracotta. As we said earlier, you do not need to import any libraries into your code for this. Instead, you work on generating the 'Terracotta-Config' file, which describes the clustering for your application. First, you have to determine which objects are to be shared. This largely depends on the characteristics of your application, but a good starting point is to look for objects that represent state information and form the core of your application's logic. Be careful not to include objects that appear to be shared but, on closer scrutiny, are tied to a particular JVM (e.g., sockets, connections, etc.). You then mark fields in these objects to be replicated across the cluster. Terracotta follows a distinct approach when it comes to initializing shared objects. A shared root is initialized only once within the cluster: the very first non-null reference assigned to it is the one the cluster keeps, and any subsequent assignment is ignored. Terracotta implements this semantic to prevent application code from changing the root once the cluster holds it.
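The "first non-null assignment wins" rule for roots can be mimicked in plain Java with a compare-and-set. The sketch below is only an analogy under that assumption: RootSlot is an invented class, not part of Terracotta, and it models the behavior inside one JVM rather than across a cluster.

```java
import java.util.concurrent.atomic.AtomicReference;

final class RootSemanticsDemo {

    // Hypothetical stand-in for a cluster-held root; not a Terracotta class.
    static final class RootSlot<T> {
        private final AtomicReference<T> ref = new AtomicReference<>();

        // Returns the value the "cluster" keeps: the first non-null one ever assigned.
        T assign(T candidate) {
            if (candidate != null) {
                // Succeeds only while the slot is still empty; later assignments are ignored.
                ref.compareAndSet(null, candidate);
            }
            return ref.get();
        }
    }

    public static void main(String[] args) {
        RootSlot<String> root = new RootSlot<>();
        System.out.println(root.assign(null));         // null: nothing assigned yet
        System.out.println(root.assign("from-jvm-1")); // from-jvm-1
        System.out.println(root.assign("from-jvm-2")); // from-jvm-1, second assignment ignored
    }
}
```

In practice this means startup code such as "create the root if it does not exist" can run unchanged on every node: whichever node gets there first supplies the object, and the others end up seeing that same object.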

This is why we said earlier that the real exercise is deciding what has to be shared, rather than tearing apart your existing modules or adding code to the existing application. While shared objects are integral to clustering, so are locks, as you need to maintain the integrity of data members throughout the cluster. These locks are termed 'Distributed Locks' in Terracotta lingo. They help keep data members' values synchronized cluster-wide. Locks can also be 'Named', so that methods of similar names use the same lock. These help with methods written without any thread-safety considerations.
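The idea of a named lock guarding methods that were written without thread safety can be sketched in plain Java as a registry of locks keyed by name. Everything here is an assumption made for illustration: the registry, the "accountLock" name, and the deposit and withdraw methods are invented, the lock works within one JVM only, and in Terracotta the association would be declared in configuration rather than coded by hand.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

final class NamedLockDemo {

    // Hypothetical registry: one lock per name, shared by every caller using that name.
    private static final Map<String, ReentrantLock> LOCKS = new ConcurrentHashMap<>();

    static int balance = 0; // unguarded state, as in code written without thread safety

    static ReentrantLock named(String name) {
        return LOCKS.computeIfAbsent(name, n -> new ReentrantLock());
    }

    // Runs the body while holding the lock registered under the given name.
    static void withNamedLock(String name, Runnable body) {
        ReentrantLock lock = named(name);
        lock.lock();
        try {
            body.run();
        } finally {
            lock.unlock();
        }
    }

    // Two methods with no synchronization of their own.
    static void unsafeDeposit() { balance++; }
    static void unsafeWithdraw() { balance--; }

    public static void main(String[] args) throws InterruptedException {
        Thread in = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                withNamedLock("accountLock", NamedLockDemo::unsafeDeposit);
            }
        });
        Thread out = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                withNamedLock("accountLock", NamedLockDemo::unsafeWithdraw);
            }
        });
        in.start();
        out.start();
        in.join();
        out.join();
        System.out.println(balance); // 0, because both methods share one lock
    }
}
```

Because both methods go through the same name, they exclude each other even though neither was written with locking in mind. The trade-off is coarser locking: unrelated methods that happen to share a name also serialize against each other.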

In conclusion



So far, we have outlined the basic concepts of Terracotta DSO and clustering. In the upcoming parts of the series, we will look at basic examples, such as how to go about configuring an application for DSO and, later on, how to enable your Spring objects for Terracotta. The product documentation also lists a few simple examples, such as the Slider application, which you can refer to for a good idea of how to use the Terracotta Eclipse plugin and how to configure an application for 'Terracotting' it.





