
Source code is available at:
svn checkout https://whiteboardjunkie.googlecode.com/svn/trunk/serialization-compare serialization-compare

Marshalling and unmarshalling request and response parameters across the client-server boundary has long been a performance bottleneck for high-throughput systems. For Java developers, the native solution is to implement the Serializable or Externalizable interface. Lately, Java serialization has gone out of fashion, and people increasingly use XML, JSON, custom serialization techniques and so on to achieve the same end. Google recently open-sourced its Protocol Buffers, adding another option for serialization across the wire. With so many options, I wanted to judge for myself the merits of choosing one over the other, with speed of execution and ease of implementation as the deciding parameters. To that end I created an easily extensible testing framework to compare the competing technologies.

The preliminary results show a clear performance lead for Google Protocol Buffers, which is at least twice as fast as its closest competitor, Java serialization. Note, however, that the number of lines of code (plus the process) needed to achieve the same result was more than 10 times greater with Protocol Buffers. My conclusion is that better tooling support is required before Protocol Buffers can be adopted as a lingua franca by the rest of the world outside Mountain View, CA.

The Use Case

The system should be able to serialize and deserialize fairly complex POJOs across a socket connection.

The Solution

Implement an echo server that accepts a serialized version of a complex POJO. The echo server reconstructs the object as a POJO of the original type and then sends it back to the invoking client. On the client side we deserialize the stream and compare the result with the original to assert that the echo is accurate.

The Design

For the complex domain object I chose a classic Person object. A Person has a name and an Address, as well as friends and children, both of which are of type Person (vicious recursion introduced). A Person also has a gender, specified as an enum. In my opinion such an object possesses characteristics complex enough to make it a good specimen for serialization.

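import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sex (an enum) and Address are not shown here.
public class Person implements Serializable {
    String id = "";
    String name = "";
    int age;
    Sex sex;
    Address address;
    // Children and friends are Persons themselves, which makes the graph recursive.
    List<Person> children = new ArrayList<Person>();
    Map<String, Person> friends = new HashMap<String, Person>();
}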
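
The recursion matters in practice: two people who are friends of each other form a cycle in the object graph. The sketch below uses a cut-down `Node` class as a stand-in for `Person` (the names `Node`, `CyclicGraphDemo` and `roundTrip` are illustrative and not taken from the project). It suggests that built-in Java serialization copes with such a cycle, because the stream writes a back-reference for any object it has already written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class CyclicGraphDemo {
    // Serializes the graph to an in-memory byte array and reads it straight back.
    static Node roundTrip(Node root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(root);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Node alice = new Node("alice");
        Node bob = new Node("bob");
        alice.friends.put(bob.name, bob);   // alice -> bob
        bob.friends.put(alice.name, alice); // bob -> alice closes the cycle

        Node copy = roundTrip(alice);
        Node bobCopy = copy.friends.get("bob");
        // True when the cycle survived: bob's friend is the very same copy of alice.
        System.out.println(bobCopy.friends.get("alice") == copy);
    }
}

// Illustrative stand-in for Person: only a name and a map of friends.
class Node implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final Map<String, Node> friends = new HashMap<String, Node>();

    Node(String name) {
        this.name = name;
    }
}
```

A serializer without this kind of reference tracking would need the cycle to be broken or encoded by hand, which is one reason a recursive Person is a demanding test object.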

Next I implemented a simple server socket receiver using JBoss Remoting as the toolkit. This choice makes it possible to test invocation performance seamlessly against various transports such as HTTP, RMI and SOCKET.

public class EchoServer {
        public EchoServer(String invocationUri, EchoInvocationHandler<?, ?> invocationHandler){
                try {
                        this.invocationHandler = invocationHandler;
                        setupServer(invocationUri);
                } catch (Exception e) {
                        throw new RuntimeException(e);
                }
        }
        public synchronized void setupServer(String locatorURI) throws Exception {
                InvokerLocator locator = new InvokerLocator(locatorURI);
                logger.info("Starting remoting server with locator uri of: "
                                + locatorURI);
                connector = new Connector(locator);
                connector.create();
                logger.info("Created server with locator uri of: " + locatorURI);
                connector.addInvocationHandler("wsainvocation", invocationHandler);
                connector.start();
                logger.info("Started Connector successfully....");
        }
	//...
}

To give it the character of a test framework, I implemented the invocation handler as a generic abstract class. The user must implement the methods ‘serialize’ and ‘deSerialize’, which are specific to the type of stream being echoed.

public abstract class EchoInvocationHandler<T,P> implements ServerInvocationHandler{
        
        protected ServerInvoker invoker = null;
        protected MBeanServer mbeanServer = null;
        public final T invoke(InvocationRequest request) throws Throwable {
                T payLoad = (T)request.getParameter();
                P p = deSerialize(payLoad);
                T responsePayLoad = serialize(p);
                return responsePayLoad;
        }

        protected abstract T serialize(P p);
        protected abstract P deSerialize(T payLoad);
}

To clarify the generics, let me add a usage example. For an invocation where the client serializes the Person object to byte[], the invocation handler is defined as follows:

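new EchoInvocationHandler<byte[], Person>() {
	@Override
	protected Person deSerialize(byte[] payLoad) {
		// ... rebuild the Person from the bytes (body omitted here)
	}
	@Override
	protected byte[] serialize(Person p) {
		// ... turn the Person back into bytes (body omitted here)
	}
}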
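
To show how the two methods might be filled in for the Java serialization case, here is a minimal sketch. It deliberately drops the JBoss Remoting dependency: `EchoHandler` is a simplified stand-in for `EchoInvocationHandler`, `Contact` stands in for `Person`, and the `echo` method mirrors `invoke`. All of these names are assumptions made for illustration and are not code from the project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class EchoHandlerDemo {
    public static void main(String[] args) {
        JavaSerializationEchoHandler handler = new JavaSerializationEchoHandler();
        Contact original = new Contact("Ada", 36);

        byte[] request = handler.serialize(original);   // client side
        byte[] response = handler.echo(request);        // server side
        Contact echoed = handler.deSerialize(response); // client side again

        System.out.println("echo matches original: " + original.equals(echoed));
    }
}

// Simplified stand-in for EchoInvocationHandler, without JBoss Remoting.
abstract class EchoHandler<T, P> {
    // Mirrors invoke(): rebuild the object, then serialize it again.
    final T echo(T payLoad) {
        return serialize(deSerialize(payLoad));
    }

    protected abstract T serialize(P p);

    protected abstract P deSerialize(T payLoad);
}

// Hypothetical payload type standing in for Person.
class Contact implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final int age;

    Contact(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Contact)) {
            return false;
        }
        Contact that = (Contact) other;
        return age == that.age && name.equals(that.name);
    }

    @Override
    public int hashCode() {
        return 31 * name.hashCode() + age;
    }
}

class JavaSerializationEchoHandler extends EchoHandler<byte[], Contact> {
    @Override
    protected byte[] serialize(Contact contact) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(contact);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return bytes.toByteArray();
    }

    @Override
    protected Contact deSerialize(byte[] payLoad) {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(payLoad))) {
            return (Contact) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }
}
```

Swapping in another technology would then mean changing only the two overridden methods, for example using `String` as the payload type for a JSON handler.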

The Test Case

The test case is a very simple aggregator that invokes the testSerialization() method on each SerializationTester implementation we would like to test. The method accepts a StopWatch and a MAX_MESSAGES count and returns the StopWatch reference at the end of execution. The stopwatch records only the time consumed by the part of the execution that performs client-server communication. At present I am comparing Java serialization, JSON serialization (with Json-lib) and Protocol Buffers serialization. Each of them fulfils the use case, but at varying speeds.
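
The idea of timing only the communication step can be sketched roughly as follows. This is not the project's SerializationTester: the `timeRoundTrips` helper is hypothetical, `System.nanoTime()` is used in place of the StopWatch class, and a `UnaryOperator<byte[]>` stands in for the real client call to the echo server.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.TimeUnit;
import java.util.function.UnaryOperator;

class TimedEchoDemo {
    // Sends maxMessages payloads through echoCall and returns the time, in
    // milliseconds, spent inside the calls only. Payload creation and the
    // equality check are left outside the timed region.
    static long timeRoundTrips(UnaryOperator<byte[]> echoCall, int maxMessages) {
        long elapsedNanos = 0L;
        for (int i = 0; i < maxMessages; i++) {
            byte[] payload = ("message-" + i).getBytes(StandardCharsets.UTF_8);

            long start = System.nanoTime();
            byte[] response = echoCall.apply(payload);
            elapsedNanos += System.nanoTime() - start;

            if (!Arrays.equals(payload, response)) {
                throw new IllegalStateException("echo mismatch at message " + i);
            }
        }
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    }

    public static void main(String[] args) {
        // A local copy stands in for a real network round trip.
        long millis = timeRoundTrips(bytes -> bytes.clone(), 1000);
        System.out.println("1000 round trips took " + millis + " ms");
    }
}
```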

	static SerializationTester[] serializationTesters = {
			new JavaSerializationTester("socket", 12000),
			new JsonSerializationTester("socket", 12000),
			new ProtocolBufferSerializationTester("socket", 12000),
			new JavaSerializationTester("http", 12000),
			new JsonSerializationTester("http", 12000),
			new ProtocolBufferSerializationTester("http", 12000),
			new JavaSerializationTester("rmi", 12000),
			new JsonSerializationTester("rmi", 12000),
			new ProtocolBufferSerializationTester("rmi", 12000) };

	static final String FORMAT = "|%1$-40s|%3$-10s|%2$-5s|%n";
	static final String HR = "__________________________________________________________";

	@Test
	public void testOnWireSerializationPerformance() throws Throwable {
		for (SerializationTester aSerializer : serializationTesters) {
			Formatter formatter = new Formatter();
			formatter = formatter.format(FORMAT, new String[] {
					aSerializer.getClass().getSimpleName(),
					String.valueOf((aSerializer.testSerialization(
							new StopWatch(), MAX_MESSAGES).getTime())),
					aSerializer.PROTOCOL.toUpperCase() });
			results.add(formatter);
		}
	}

	@AfterClass
	public static void printResults() {
		System.out.println("\n\n");
		System.out.println();
		System.out.format(FORMAT, new String[] { "Serializer", "Time",
				"Protocol" });
		System.out.println(HR);
		for (Formatter aResult : results) {
			System.out.format(aResult.toString());
		}
		System.out.println(HR);
		System.out.println("\n\n");
	}
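
One detail of the code above that is easy to miss is that FORMAT uses explicit argument indexes (`%1$`, `%3$`, `%2$`). The arguments are passed as name, time, protocol, but they are printed as name, protocol, time. The small sketch below reuses the same format string; the `formatRow` helper is a hypothetical name introduced only for this illustration.

```java
class FormatRowDemo {
    // Same format string as in the test: argument 1, then argument 3, then argument 2.
    static final String FORMAT = "|%1$-40s|%3$-10s|%2$-5s|%n";

    // Arguments arrive in the order (serializer, time, protocol), as in the test code.
    static String formatRow(String serializer, String time, String protocol) {
        return String.format(FORMAT, serializer, time, protocol);
    }

    public static void main(String[] args) {
        System.out.print(formatRow("Serializer", "Time", "Protocol"));
        System.out.print(formatRow("JavaSerializationTester", "1982", "SOCKET"));
    }
}
```

The `-` flag left-justifies each value within its column width (40, 10 and 5 characters), which is what keeps the result tables aligned.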

The results after sending 1000 randomly created Person objects across the wire and echoing them back to the client are as follows (the number is the time taken in milliseconds). As you can see, the test runs very easily against different transports, and the results differ between them.

Results when the server is bound to 127.0.0.1

/*
__________________________________________________________
|Serializer                              |Protocol  |Time |
__________________________________________________________
|JavaSerializationTester                 |SOCKET    |1982 |
|JsonSerializationTester                 |SOCKET    |19983|
|ProtocolBufferSerializationTester       |SOCKET    |967  |
|JavaSerializationTester                 |HTTP      |1524 |
|JsonSerializationTester                 |HTTP      |19247|
|ProtocolBufferSerializationTester       |HTTP      |456  |
|JavaSerializationTester                 |RMI       |1506 |
|JsonSerializationTester                 |RMI       |19777|
|ProtocolBufferSerializationTester       |RMI       |496  |
__________________________________________________________
*/

Results when the server is bound to 0.0.0.0

/*
__________________________________________________________
|Serializer                              |Protocol  |Time |
__________________________________________________________
|JavaSerializationTester                 |SOCKET    |3945 |
|JsonSerializationTester                 |SOCKET    |21627|
|ProtocolBufferSerializationTester       |SOCKET    |1498 |
|JavaSerializationTester                 |HTTP      |4710 |
|JsonSerializationTester                 |HTTP      |22855|
|ProtocolBufferSerializationTester       |HTTP      |1590 |
|JavaSerializationTester                 |RMI       |3962 |
|JsonSerializationTester                 |RMI       |22355|
|ProtocolBufferSerializationTester       |RMI       |1122 |
__________________________________________________________
*/
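
A quick way to read these tables is to turn the timings into ratios. The sketch below does that for the 127.0.0.1 run, using the figures exactly as listed above; the `ratio` helper and the class name are illustrative only.

```java
class SpeedupDemo {
    // How many times longer the slower run took than the faster one.
    static double ratio(long slowerMillis, long fasterMillis) {
        return (double) slowerMillis / fasterMillis;
    }

    public static void main(String[] args) {
        // Timings in milliseconds from the 127.0.0.1 table.
        String[] protocols = {"SOCKET", "HTTP", "RMI"};
        long[] javaMillis = {1982, 1524, 1506};
        long[] jsonMillis = {19983, 19247, 19777};
        long[] protobufMillis = {967, 456, 496};

        for (int i = 0; i < protocols.length; i++) {
            System.out.printf("%-6s Java/ProtoBuf = %.2f, JSON/Java = %.2f%n",
                    protocols[i],
                    ratio(javaMillis[i], protobufMillis[i]),
                    ratio(jsonMillis[i], javaMillis[i]));
        }
    }
}
```

On the loopback figures, Protocol Buffers come out roughly 2 to 3.3 times faster than Java serialization, while JSON is roughly 10 to 13 times slower than Java serialization.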

Where to go from here

Get the source code from the Google SVN link and check my implementation for flaws that bias the results one way or the other. You can also very easily test another serialization technology using the test harness that comes with this code. If you think there is something obviously wrong with the way the JSON serialization handling is written, please let me know. Frankly, I never thought JSON would take 10 times as long as Java serialization.

3 Thoughts on “Serialization Options Compared”

  1. Interesting results, thanks for the comparison!

    Why not compare with JAXB or some other form of XML serialization as well? (Did I miss that in the article?)

  2. No, you did not miss it in the article. IMO, with XML being so verbose, the results would be much poorer unless we used an efficient parser like Woodstox; I could be wrong here on all counts. A suggestion: why don’t you take a crack at putting in a JAXB comparison through this framework?

  3. What would be interesting to test is the performance of using Externalizable also. Serializable adds a lot of overhead with its attempt to be backward and forward compatible.
