Monday, February 05, 2007

Measuring Software Architecture Performance

Do you know how your software is performing on an hourly, daily, or monthly basis? If you operate a distributed or multi-server software architecture, you probably should. My friend Kirk Pepperdine, at http://www.javaperformancetuning.com/, will show you how to find performance problems with his performance anti-patterns. I will talk about how to build performance monitoring so that you can know how your systems are doing at any time.

First, instrumenting the software is EASY. There is really no reason for everyone not to do it. Start by deciding what you want to capture. The basic measurement is the elapsed time between the start and end of a method.


public void doApiMethod() {
    long start = System.currentTimeMillis();
    // do some work
    long finish = System.currentTimeMillis();
    // 29 is the ID assigned to doApiMethod
    Log.logPerformance(29, finish - start);
}

Ideally, for SOA, you would embed this into your architecture so that no hand-written instrumentation code is needed in each method.
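One way to get timing without touching each method is a JDK dynamic proxy that wraps a service interface. The sketch below is only an illustration under assumptions: QuoteService, SimpleQuoteService, and TimingHandler are hypothetical names, and the in-memory list stands in for a real performance log.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical service interface standing in for a real API.
interface QuoteService {
    int quote(int customerId);
}

// Plain implementation with no timing code in it.
class SimpleQuoteService implements QuoteService {
    public int quote(int customerId) {
        return customerId * 2;
    }
}

// Times every call made through the proxy and records "methodName:elapsedMs".
class TimingHandler implements InvocationHandler {
    private final Object target;
    // Stand-in for the performance log (assumption, not a real logging API).
    final List<String> recorded = Collections.synchronizedList(new ArrayList<>());

    TimingHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        long start = System.nanoTime();
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the exception the target actually threw
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            recorded.add(method.getName() + ":" + elapsedMs);
        }
    }

    // Builds a proxy for the given interface that routes every call through the handler.
    static <T> T wrap(Class<T> iface, TimingHandler handler) {
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }
}
```

Callers would hold a QuoteService obtained from TimingHandler.wrap(QuoteService.class, new TimingHandler(new SimpleQuoteService())) and never see the timing code.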

Create a logging class, or reuse one of your current ones, to store a method ID (29 = doApiMethod in this example) and the elapsed time. The data should be stored in a non-critical system database so that usage does not affect your overall system performance. That's it.

1. Create a logging method
2. Write performance data from logging to a database.
3. Instrument methods to write to logging method.
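The three steps above might be sketched as follows. This is a rough outline, not a finished logger: PerfRecord, PerformanceLog, and CustomerApi are hypothetical names, and the sink passed to the constructor is an assumption standing in for whatever database insert you use. The queue keeps the instrumented method from waiting on the database.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// One timing sample: which method ran, how long it took, and when.
final class PerfRecord {
    final int methodId;
    final long elapsedMs;
    final long recordedAt;

    PerfRecord(int methodId, long elapsedMs, long recordedAt) {
        this.methodId = methodId;
        this.elapsedMs = elapsedMs;
        this.recordedAt = recordedAt;
    }
}

// Steps 1 and 2: a logging method that hands samples to a background writer.
final class PerformanceLog {
    private final BlockingQueue<PerfRecord> queue = new LinkedBlockingQueue<>(10_000);

    // The sink is where a real system would run its database INSERT (assumption).
    PerformanceLog(Consumer<PerfRecord> sink) {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    sink.accept(queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly when interrupted
            }
        }, "perf-log-writer");
        writer.setDaemon(true); // never keeps the JVM alive on its own
        writer.start();
    }

    // Never blocks the caller: returns false and drops the sample if the queue is full.
    boolean logPerformance(int methodId, long elapsedMs) {
        return queue.offer(new PerfRecord(methodId, elapsedMs, System.currentTimeMillis()));
    }
}

// Step 3: an instrumented method that reports to the logging method.
final class CustomerApi {
    private final PerformanceLog log;

    CustomerApi(PerformanceLog log) {
        this.log = log;
    }

    void doApiMethod() {
        long start = System.currentTimeMillis();
        try {
            // do some work
        } finally {
            log.logPerformance(29, System.currentTimeMillis() - start);
        }
    }
}
```

Dropping samples when the queue is full is a deliberate choice here: losing a few measurements is cheaper than slowing the system being measured.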

Once you have data for a period of a month or so, you can start to look at your system and watch for performance issues and changes. Issues are methods that take much longer on average than they should. This is where Kirk can help you. Changes are cases where the average method time increases suspiciously or dramatically. For example, a method that writes a new customer to the database averages 60ms and then moves to 300ms. What just happened?
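A simple way to spot that kind of change is to compare a recent average against a baseline average. The sketch below assumes the samples have already been read back from the performance database; RegressionCheck and the factor parameter are hypothetical, and a real check would probably want more robust statistics than a plain mean.

```java
import java.util.List;

final class RegressionCheck {
    // Arithmetic mean of elapsed times in milliseconds.
    static double average(List<Long> samplesMs) {
        if (samplesMs.isEmpty()) {
            throw new IllegalArgumentException("no samples");
        }
        long total = 0;
        for (long sample : samplesMs) {
            total += sample;
        }
        return (double) total / samplesMs.size();
    }

    // True when the recent average is more than `factor` times the baseline average.
    static boolean hasRegressed(List<Long> baselineMs, List<Long> recentMs, double factor) {
        return average(recentMs) > average(baselineMs) * factor;
    }
}
```

With a baseline of 55, 60, and 65 ms and recent samples of 290, 300, and 310 ms, a factor of 2.0 flags the method, which matches the 60ms to 300ms jump described above.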

For immediate problems you might start with code that says:

public static void logPerformance(int methodID, long elapsedMs) {
    // write to db...
    // check for performance issues
    if (methodID == 29 && elapsedMs > 200) {
        doSomeAction();
    }
}


This is the real reason for performance monitoring: being able to identify changes in system performance both immediately and over time. Once you can do this, you can monitor how code, database, system, network, and other dependencies affect your system, and manage it proactively instead of reacting when the system crashes or slows so much that no work gets done.

Wednesday, May 17, 2006

JavaOne Johnny-Come-Latelies

Since posting my last blog entry, it has been brought to my attention that someone else is speaking at JavaOne regarding my work. Hotels quickly went through four architects after my departure and I have never heard of this person. From the JavaOne catalog, "Brad Schneider, director of architecture and research, Hotels.com", TS-4069 Travel Web Services: Marrying Business Innovation with Java Technology.

I was working with Travelnow.com, helping build and stabilize the Travelnow site and affiliate network just prior to building Hotels.com. The affiliate network there became the Hotels.com Affiliate network, along with several other key pieces of technology I wrote, such as the Tnow data warehouse and the real-time error tracking system.

I guess if you're late to the game, you grab someone else's work and stake your claim.

Saturday, May 13, 2006


Aisus.Com, SOA, and Hotels Architecture

Little did I know I was on the cutting edge of SOA in 2000...



I was heavily influenced by the heavy weight of EJB and session replication early in 2000. I was working on the largest Windows installation of BEA Weblogic at the time. The system performance was very poor. Studying the problem led to several revelations.

First, EJBs that should have existed in only one place in memory actually multiplied until the max limit was reached. For instance, a StateEJB which should have had at most 50 instances in memory would hit the threshold at 500. If I set it to 1000, it would go to 1000, etc. Somewhere just before 5000, the server would choke and die.

The second issue was sessioning. I was in the midst of switching from Weblogic 4.52 to 6.0. I could not get a cluster of more than four servers running. It turned out the cluster mechanism was flooding the network with 15 MB of session replication traffic every few seconds. OUCH! This caused real alarm as network congestion suddenly brought computers all over the building to a halt. This led to putting my test servers into their own DMZ.

Further study led me to the following conclusions. EJBs were really overkill and not very efficient. This is recognized now and dealt with better in later J2EE but not in 2000. The other was that if your servers were stable you did not need to use clustering. I was working in the travel industry and dropping a couple of reservations a month was acceptable. This led to using a stateless session bean for an API container and POJO beans for business logic. I first tested this by implementing it on Aisus.com servers and refining the processes. This was fast! A test on a laptop ran 100,000 insurance quotes in ~45 seconds. About this time, a large insurance company in Dallas reported with great fanfare the ability to run several thousand quotes a day on a mainframe. I knew we had something!

In November of 2001, I got a call asking if I wanted to be the architect and lead developer for a skunkworks travel project, very hush hush. I knew immediately the architecture we would use. This site went up in just under four months, averaging $1M USD per day the first thirty days with a loss of less than 50 reservations. This is a totally unheard of feat in the travel industry. Orbitz is reputed to have spent in excess of $100M and two years on their system.

Little did I know then, but this system of architectural tiering would later become popularized as SOA, or service-oriented architecture. I was on the cutting edge in 2000 and didn't know it. This architecture uses a stateless session EJB to encapsulate all calls to an API. At this point, all calls can be profiled, data warehouse logged, server logged, and any situation alerts can be sent. The calls are sent to POJO bean business entities. Any RMI, JSP, XML over JSP or RMI, etc. can call into the SOA.
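The shape of that design can be sketched in plain Java. This is not the Hotels.com or Aisus.com code: ApiFacade stands in for the stateless session bean, ReservationLogic is a hypothetical POJO, and the two lists are placeholders for the warehouse logging and alerting that a real bean would hand off to external systems rather than hold in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// POJO business entity: no container, no timing, no logging.
class ReservationLogic {
    String reserve(String hotelId, int nights) {
        return hotelId + "-" + nights;
    }
}

// Stand-in for the stateless session bean: every API call funnels through profiled().
class ApiFacade {
    private final ReservationLogic reservations = new ReservationLogic();
    private final long alertThresholdMs;
    // Placeholders for external logging and alerting (assumption); not thread-safe.
    final List<String> callLog = new ArrayList<>();
    final List<String> alerts = new ArrayList<>();

    ApiFacade(long alertThresholdMs) {
        this.alertThresholdMs = alertThresholdMs;
    }

    // Public API method: delegates to the POJO through the single choke point.
    String reserve(String hotelId, int nights) {
        return profiled("reserve", () -> reservations.reserve(hotelId, nights));
    }

    // The one place where calls are timed, logged, and checked for alerts.
    private <T> T profiled(String apiName, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            callLog.add(apiName + ":" + elapsedMs);
            if (elapsedMs > alertThresholdMs) {
                alerts.add(apiName);
            }
        }
    }
}
```

Because the business logic lives in plain objects, it can be tested and reused without the container, while the facade stays the only place that needs to know about profiling.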

This architecture has allowed Aisus.com, Hotels.com, and several other enterprise systems Aisus has built to grow and scale without the normal problems encountered in building EJB systems.

Saturday, April 29, 2006

Hello,

My name is Mica Cooper and I am the CEO and President of Aisus.com, a site for online insurance services. I formerly worked at AMS (Agency Management Services) for ten years and was the architect for www.hotels.com. Hotels is one of the highlights for Aisus.com, as it was deemed the biggest e-launch of 2002 and sold for $3.2 billion in 2004.

With this blog I hope to highlight technology issues for the insurance industry. The industry, along with banking, seems to be kicking and screaming its way into the 21st century. I would like to ease that transition into the e-world.