Thrift and Protocol Buffers performance in Java

I have used Thrift for a logging client in our system, and I plan to use Protocol Buffers as the internal communication protocol between our XMPP servers. However, I find it hard to believe the result of the Thrift and Protocol Buffers Python performance comparison, which found Protocol Buffers to be 4-10 times slower than Thrift. So I ran a similar test in Java.

The test is very similar to the Python test. The .proto and .thrift files are copied from that Python test.

The .thrift content:

struct dns_record {
1: string key,
2: string value,
3: string type = 'A',
4: i32 ttl = 86400,
5: string first,
6: string last
}

typedef list<dns_record> biglist

struct dns_response {
1: biglist records
}

service PassiveDns {
biglist search_question(1:string q);
biglist search_answer(1:string q);
}

The .proto content:

package passive_dns;

message DnsRecord {
required string key = 1;
required string value = 2;
required string first = 3;
required string last = 4;
optional string type = 5 [default = "A"];
optional int32  ttl = 6 [default = 86400];
}

message DnsResponse {
repeated DnsRecord records = 1;
}

According to the documentation, optional fields and default values are among the benefits of both serialization libraries: a field left at its default value does not need to be included in the serialized output.
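To make the size effect concrete, here is a toy sketch of the idea. It is not the wire format of either library: the class name DefaultSkippingEncoder, the one-byte field tags, and the use of DataOutputStream are all assumptions made for illustration. It simply writes a field only when its value differs from the default.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Toy encoder (hypothetical, not Thrift's or Protocol Buffers' real format):
// each field is written as a one-byte tag followed by its value, and fields
// equal to their default are skipped entirely.
final class DefaultSkippingEncoder {
    static final String DEFAULT_TYPE = "A";
    static final int DEFAULT_TTL = 86400;

    static byte[] encode(String key, String value, String type, int ttl) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeByte(1);
            out.writeUTF(key);      // 2-byte length prefix + UTF-8 bytes
            out.writeByte(2);
            out.writeUTF(value);
            if (!DEFAULT_TYPE.equals(type)) {   // skip when type is the default
                out.writeByte(3);
                out.writeUTF(type);
            }
            if (ttl != DEFAULT_TTL) {           // skip when ttl is the default
                out.writeByte(4);
                out.writeInt(ttl);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] defaults = encode("example.com", "10.0.0.1", "A", 86400);
        byte[] custom = encode("example.com", "10.0.0.1", "MX", 3600);
        System.out.println("with defaults:    " + defaults.length + " bytes");
        System.out.println("without defaults: " + custom.length + " bytes");
    }
}
```

A reader of this toy format would fill in "A" and 86400 whenever tags 3 and 4 are absent, which is the same contract the default values in the .thrift and .proto files describe.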

I wrote a simple test program to compare Thrift and Protocol Buffers. I timed serialization and deserialization together, because that round trip is the most frequently called part in most scenarios.
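The shape of such a test loop can be sketched as follows. This is not my actual test code: DnsRecordStub and SerdesLoop are hypothetical names, and java.io.DataOutputStream stands in for the Thrift and Protocol Buffers serializers so the sketch runs without either library. Unlike the real test, it times object creation and the round trip together.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the generated dns_record / DnsRecord classes.
final class DnsRecordStub {
    final String key, value, type, first, last;
    final int ttl;

    DnsRecordStub(String key, String value, String type, int ttl, String first, String last) {
        this.key = key;
        this.value = value;
        this.type = type;
        this.ttl = ttl;
        this.first = first;
        this.last = last;
    }
}

final class SerdesLoop {
    // Stand-in serializer: writes the fields in declaration order.
    static byte[] serialize(DnsRecordStub r) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(r.key);
            out.writeUTF(r.value);
            out.writeUTF(r.type);
            out.writeInt(r.ttl);
            out.writeUTF(r.first);
            out.writeUTF(r.last);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in deserializer: reads the fields back in the same order.
    static DnsRecordStub deserialize(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            return new DnsRecordStub(in.readUTF(), in.readUTF(), in.readUTF(),
                    in.readInt(), in.readUTF(), in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs `loops` serialize/deserialize round trips and returns the total bytes produced.
    static long run(int loops) {
        long totalBytes = 0;
        for (int i = 0; i < loops; i++) {
            DnsRecordStub r = new DnsRecordStub("key" + i, "10.0.0." + (i % 256),
                    "A", 86400, "first", "last");
            byte[] bytes = serialize(r);
            DnsRecordStub back = deserialize(bytes);
            if (!back.key.equals(r.key)) {
                throw new IllegalStateException("round trip mismatch at " + i);
            }
            totalBytes += bytes.length;
        }
        return totalBytes;
    }

    public static void main(String[] args) {
        int loops = 1_000_000;
        long start = System.nanoTime();
        long totalBytes = run(loops);
        long elapsedMs = Math.max(1, (System.nanoTime() - start) / 1_000_000);
        System.out.printf("Loop           : %,d%n", loops);
        System.out.printf("Serdes         : %,dmsec%n", elapsedMs);
        System.out.printf("Objs per second: %,d%n", loops * 1000L / elapsedMs);
        System.out.printf("Total bytes    : %,d%n", totalBytes);
    }
}
```

To compare real libraries, the bodies of serialize and deserialize would be swapped for the Thrift or Protocol Buffers calls, keeping the loop and the timing unchanged so both are measured the same way.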

Test 1: 10,000,000 times

ProtoBuf Loop  : 10,000,000
Get object     : 15,130msec
Serdes protobuf: 68,600msec
Objs per second: 145,772
Total bytes    : 829,996,683

Thrift Loop    : 10,000,000
Get object     : 12,651msec
Serdes thrift  : 36,904msec
Objs per second: 270,973
Total bytes    : 1,130,000,000

Test 2: 1,000,000 times

ProtoBuf Loop  : 1,000,000
Get object     : 1,094msec
Serdes protobuf: 7,467msec
Objs per second: 133,922
Total bytes    : 83,000,419

Thrift Loop    : 1,000,000
Get object     : 524msec
Serdes thrift  : 5,969msec
Objs per second: 167,532
Total bytes    : 113,000,000

The Serdes rows are the time needed to serialize a Java object to a byte[] and deserialize it back.

In Java, Protocol Buffers came out roughly 1.25 to 1.9 times slower than Thrift (versus 4-10 times in the Python test), while its binary output is smaller: about 83 bytes per object against 113 for Thrift. I think this is acceptable, and Google may improve Protocol Buffers performance in a future version.
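For anyone who wants to check the arithmetic, the small sketch below recomputes the derived figures from the raw numbers in the two tests. The class and method names are made up for this example; the inputs are the loop counts, Serdes times, and byte totals reported above.

```java
// Recomputes the derived figures from the reported raw measurements.
// ResultMath and its method names are hypothetical, chosen for this sketch.
final class ResultMath {
    // Objects per second, truncated to a whole number as in the tables.
    static long objectsPerSecond(long loops, long serdesMs) {
        return loops * 1000 / serdesMs;
    }

    // How many times slower Protocol Buffers was than Thrift.
    static double slowdown(long protobufMs, long thriftMs) {
        return (double) protobufMs / thriftMs;
    }

    // Average serialized size of one object.
    static double bytesPerObject(long totalBytes, long loops) {
        return (double) totalBytes / loops;
    }

    public static void main(String[] args) {
        System.out.printf("Test 1 protobuf objs/sec: %,d%n", objectsPerSecond(10_000_000L, 68_600L));
        System.out.printf("Test 1 thrift objs/sec  : %,d%n", objectsPerSecond(10_000_000L, 36_904L));
        System.out.printf("Test 1 slowdown         : %.2f%n", slowdown(68_600L, 36_904L));
        System.out.printf("Test 2 slowdown         : %.2f%n", slowdown(7_467L, 5_969L));
        System.out.printf("protobuf bytes/object   : %.1f%n", bytesPerObject(829_996_683L, 10_000_000L));
        System.out.printf("thrift bytes/object     : %.1f%n", bytesPerObject(1_130_000_000L, 10_000_000L));
    }
}
```

Note that the "Objs per second" rows are based on the Serdes time only, not on the "Get object" time.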

Download my test code in Java: thrift-protocol-buffers-java.tgz

More information about Thrift and Protocol Buffers: Thrift, Protocol Buffers installation and Java code howto

Update: Another article, Thrift vs. Protocol Buffers, compares the non-performance factors.

UPDATE 2 (Apr 17): Protocol Buffers has a performance tuning parameter, optimize_for = SPEED (thanks Steve Chu). Please see my next performance test: Thrift and Protocol Buffers performance in Java Round 2

Thrift, Protocol Buffers installation and Java code howto

I. Thrift installation and Java code

1. Build and install Thrift

# install boost
cd <boost_root>/tools/jam
./build_dist.sh
# the linux* directory name depends on the platform
cp stage/bin.linux*/bjam <boost_root>
# build boost; using bjam is faster
cd <boost_root>
./configure --without-icu --prefix=/usr/local/boost
./bjam -toolset=gcc --build-type=release install --prefix=/usr/local/boost

# build thrift
./bootstrap.sh
./configure --with-boost=/usr/local
make
make install

2. Build the Thrift Java library

Install Apache Ant if necessary.

cd lib/java/
ant

This produces libthrift.jar.

3. Create a .thrift file and generate Java code

(See http://wiki.apache.org/thrift/Tutorial for more on writing .thrift files.)
tim.thrift:

struct dns_record {
1: string key,
2: string value,
3: string type = 'A',
4: i32 ttl = 86400,
5: string first,
6: string last
}

service TestDns {
dns_record test(1:string q);
}

<thrift_root>/bin/thrift --gen java tim.thrift
The code is generated in gen-java/*.java

4. Write java code

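import org.apache.thrift.TDeserializer;
import org.apache.thrift.TSerializer;
import org.apache.thrift.protocol.TBinaryProtocol;

// create a new object
dns_record dr = new dns_record(key, value, type, ttl, first, last);
// serialize it to a byte[] using the binary protocol
TSerializer serializer = new TSerializer(new TBinaryProtocol.Factory());
byte[] bytes = serializer.serialize(dr);
// deserialize the bytes into an empty object
TDeserializer deserializer = new TDeserializer(new TBinaryProtocol.Factory());
dns_record dr2 = new dns_record();
deserializer.deserialize(dr2, bytes);

See also: http://wiki.apache.org/thrift/ThriftUsageJava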

II. Protocol Buffers installation and Java code

1. Build and install Protocol buffers

./configure
make
make install

2. Build the protobuf Java library

Install Maven if necessary.

cd protobuf/java
mvn test
mvn package

The jar is written to target/protobuf-java-x.x.x.jar

3. Create a .proto file and generate Java code

tim.proto:

package dns;

message DnsRecord {
required string key = 1;
required string value = 2;
required string first = 3;
required string last = 4;
optional string type = 5 [default = "A"];
optional int32  ttl = 6 [default = 86400];
}

message DnsResponse {
repeated DnsRecord records = 1;
}

bin/protoc --java_out=. tim.proto

4. Write Java code

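// Protocol Buffers objects are created through a builder.
// The outer class name Dns assumes option java_outer_classname = "Dns";
// by default protoc derives the outer class name from the .proto file name.
Dns.DnsRecord.Builder b = Dns.DnsRecord.newBuilder();
b.setKey("key");
b.setValue("value...");
...
Dns.DnsRecord dr = b.build();

// serialize to a byte[] and parse it back
byte[] bytes = dr.toByteArray();
Dns.DnsRecord dr2 = Dns.DnsRecord.parseFrom(bytes);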

III. Resources

Thrift: http://incubator.apache.org/thrift/

Protocol Buffers: http://code.google.com/apis/protocolbuffers/

Tim’s Blog: https://timyang.net/

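BlackBerry 8700 pros and cons

I have been using the BlackBerry 8700 for a week or two. Here are my impressions.

1. The built-in browser is very good. I find it easier to use than Opera Mini, and it supports shortcuts on the full QWERTY keyboard.
2. Typing is fast and the keyboard is well designed, especially for English. It is easy to jot down passing ideas; this post, for example, was written just now on the subway.
3. It can sync the address book, calendar, and tasks with a computer.
4. It supports multitasking and copy/paste.
5. The 8700 supports all the common BlackBerry software.
6. You can set a password on the phone, so you need not worry about data security if it is lost.
7. It can act as a modem for a PC, going online over China Mobile GPRS/EDGE.
8. It is dirt cheap, probably the cheapest phone in the company. The current price is 400-600 yuan.

Cons
1. Browsing often hit request timeouts (4.2 ROM). Flashing to 4.5 fixed this, but 4.5 has no built-in Pinyin input method, and third-party input methods are somewhat less compatible.
2. Standby time is not long enough: with 1-2 hours of browsing per day, the battery lasts 2-3 days.
3. The body is a bit wide, so one-handed operation, such as moving between the trackwheel and the Enter key, takes some effort. The looks are average, though I think they are fine.
4. There is no camera and no expandable memory card.

Also, the 8700's claim to fame is being the phone used in Prison Break.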
