Wednesday, November 25, 2009

Mission Critical SOA

I was trawling through some old presentations over the weekend and stumbled across something I’d presented on “Mission Critical SOA” at an Enterprise Java Australia event a couple of years ago. A colleague and I put this together a little earlier in the evolution of SOA, when it was closer to the height of its hype cycle and promising to be the answer to every CIO’s problems. Having worked through a number of challenging SOA implementations since then, I find the guts of this presentation still very relevant, so I figured I’d reproduce the main ideas here in a blog entry. The original presentation can be downloaded here.

What is Mission Critical?


It’s clear that we’ve become increasingly reliant on technology as we’ve evolved. You only have to watch how kids interact today to see technology embedding itself deeper into the way we function. I used to talk to my friends for fun, but these days it’s not uncommon for kids to communicate by occasionally passing each other an earphone for a quick listen, followed by a smile and a nod, then back to the iPhone. I must be getting real old. Whilst this is hardly a mission critical situation, the basic foundations upon which we live are supported by technology.
  • We flick the switch and we expect the lights to turn on.
  • We turn the tap and we expect water to flow.
  • We get on a flight and we expect to arrive at our destination safely.
  • We dial ‘000’ and we expect to get an emergency operator.
We expect mission critical technology to just work. If it doesn’t, bad stuff happens: lives may be lost, someone may lose plenty of dough, or someone’s reputation gets caned.

So mission critical can be seen as the “technology pillars of life”. No doubt we’ve made our lives easier, but to the extent that we’ve become complacent about the risk of these pillars crumbling, we’ve also made our lives more dangerous.
Mission critical technologies have to just work; failure is not an option. Behind this simple façade, how do we actually address real mission critical concerns: making sure solutions never go down, handling exceptions elegantly and ensuring data accuracy under massive throughput?

We’ve become reasonably good at dealing with many of these concerns, but do mission critical and SOA work together?


SOA + Mission Critical


In 2007, Gartner predicted a few things:


"SOA will be used in more than 50% of new mission-critical operational applications and business processes designed in 2007 and in more than 80% by 2010."

"New software products for SOA have hit the market, but given their immaturity, have disappointed users in terms of reliability, performance and productivity."
We’re nearly at 2010, and whilst 80% is a big call, no doubt everyone seems to be implementing SOA, which is now on the “slope of enlightenment” and beginning to meet our expectations. Or, as Matt Wright commented at last week’s EJA futures event, this may be more about a shift in our expectations of SOA. The one thing Gartner did say back then that resonates strongly is that in many cases “SOA principles have been applied too rigidly, and this has led to unsatisfactory outcomes as projects became too costly and didn’t meet deadlines”. We are still some way from maturing to the extent that we can reliably deliver Mission Critical SOA solutions. Addressing this challenge requires us to distinguish between means and ends.

Means and Ends

The fundamental business outcome (ends) we are striving for in any SOA delivery is business agility: the ability for the business to adapt to changing needs. We’re looking for rapid delivery cycles, shorter time to value, lower delivery risk and only incremental delivery costs when introducing new capabilities.
Underpinning agility we have the “enabling” outcomes. These are the outcomes we strive for that naturally result in agility. We want maximised reuse, infinite extensibility and maximised interoperability, all leading to agility. It’s obvious why we strive for these outcomes:
  • We want reuse, so we abstract service designs to produce agnostic services that are not tied to a specific business process.
  • We want extensibility, so we design loosely coupled SOA solutions that minimise dependencies between services allowing easy adaptation to future needs.
  • We want interoperability, so we follow industry standards to maximise the possibility for re-use and easy integration.
So far so good… This is all a part of the standard formula for SOA benefits realisation, however, when you add Mission Critical to the equation, a tension arises between our means for SOA outcomes and fundamental mission critical requirements such as Performance, Reliability and Availability.



Abstraction and Reuse


High levels of re-use on individual services lead to increased performance, reliability and availability requirements on these reused services.
A mobile subscriber service at the centre of the universe for a Telco can become a single point of failure should the enterprise rely heavily on this service to deliver core business functionality. This service must now meet the performance needs of all consumers that depend on its functionality. The service must also be as available and reliable as its neediest consumer requires. The key point here is that the more we centralise solution logic into reusable services, the more we need to consider the ability of the service to meet non-functional requirements (NFRs) now and into the future.
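As a rough illustration of this point, here is a minimal Python sketch of how the targets for a shared service fall out of its consumers. Everything in it is hypothetical: the consumer names, the numbers and the `shared_service_targets` helper are made up for the example, and it assumes the worst case where every consumer peaks at the same time.

```python
from dataclasses import dataclass


@dataclass
class Consumer:
    name: str
    peak_tps: int          # peak transactions per second sent to the shared service
    availability: float    # availability this consumer needs, e.g. 0.999


def shared_service_targets(consumers):
    """Return (throughput, availability) the shared service has to meet."""
    # Worst-case assumption: all consumers hit their peak at the same time.
    required_tps = sum(c.peak_tps for c in consumers)
    # The service must be at least as available as its neediest consumer.
    required_availability = max(c.availability for c in consumers)
    return required_tps, required_availability


# Hypothetical consumers of a shared mobile subscriber service.
consumers = [
    Consumer("billing", 200, 0.999),
    Consumer("self_service_portal", 50, 0.99),
    Consumer("provisioning", 120, 0.9999),
]

tps, availability = shared_service_targets(consumers)
print(f"{tps} tps at {availability:.2%} availability")
```

Each new consumer that reuses the service can only push these targets up, which is why the NFRs need to be revisited every time the service picks up another dependant.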

Extensibility


Similar tensions exist when considering extensibility. In focussing on designing loosely coupled services, we distribute functionality across a service taxonomy. In doing so, we significantly increase the number of service-to-service interactions, leading to performance overheads, especially when using a standard protocol such as SOAP/HTTP.
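To put some back-of-the-envelope numbers on this, the sketch below compares one coarse service call against the same work spread over a chain of fine-grained services. It is a simplification under stated assumptions: calls are sequential and synchronous, each call pays a fixed protocol overhead (marshalling, transport), and services fail independently. The function names and figures are hypothetical.

```python
import math


def chain_latency_ms(service_times_ms, per_call_overhead_ms):
    """Total latency of a sequential chain of synchronous service calls."""
    # Every hop pays the fixed per-call protocol overhead on top of its own work.
    return sum(service_times_ms) + per_call_overhead_ms * len(service_times_ms)


def chain_availability(availabilities):
    """Availability of a chain in which every service must be up (independent failures)."""
    return math.prod(availabilities)


# Same 30 ms of business logic, as one service or split across five services.
coarse = chain_latency_ms([30], per_call_overhead_ms=5)
fine = chain_latency_ms([6] * 5, per_call_overhead_ms=5)
fine_availability = chain_availability([0.999] * 5)

print(coarse, fine, round(fine_availability, 4))
```

With these made-up figures the single call costs 35 ms and the five-hop chain 55 ms, and five services at 99.9% each give roughly 99.5% for the chain as a whole.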


Interoperability


Whilst adopting standards such as SOAP/HTTP is a great idea in our quest for ultimate interoperability, we also adopt its baggage: it is a verbose communications protocol, leading to runtime performance overheads, and on its own it does not support reliable messaging or transactional integrity, leading to reliability concerns.


Whilst there are many Web Service standards (WS-*) aiming to address these issues, they are at differing levels of maturity and as such are not supported by all SOA stacks available.

Some recommendations

Firstly, it’s important to acknowledge that many of the levers sit with the technology rather than the architecture. The most brilliant architecture, implemented poorly, will ultimately result in failed outcomes. So:
  1. Review the SOA principles and determine which ones are important to you.
  2. Define a set of standards and patterns for the organisation to follow. The key is to treat these guidelines as the default position and deviate only where the benefits outweigh the costs.
As an example of this, on a recent project for one of our customers we’ve implemented an SOA solution with a typical mission critical profile:
  • 24x7 uptime
  • Transactional integrity is paramount
  • Transactions per day are in the millions
Some key considerations are:
  1. Solution is to be based on standard communications protocols (SOAP/HTTP)
  2. Solution is to be developed using the provided SOA stack and technologies (e.g. BPEL) as far as possible
  3. Solution must scale up and down as far as possible
Given this, we have two options:
  1. Throw a wall of silicon at it just to get it to run at the required volumes
  2. Deviate from the standards where necessary to get the required performance.
In this case we’ve taken the decision to make use of native communication protocols via WSIF (RMI-IIOP) and built-in SOA stack optimisations to significantly improve transactional performance and to provide transactional integrity support between service calls. The idea here is to define standards and use them as the ‘default’ wherever they are fit for purpose.

Break the rules provided you are doing it for the right reasons, and in a controlled manner.
The right reasons means we must understand the rules, why they are there and what their limitations are. We must also understand the technology, not only the standards but how the product sets implement them, in order to understand any traps. A controlled manner means we must ensure there is some governance to avoid throwing the baby out with the bathwater. Establish an Architecture Review Committee, and ensure they don’t operate in a vacuum. There should be a good mix of architects, business representatives and hands-on technical specialists to get the best holistic “outcome”.
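One way to picture “standard by default, deviation by exception” is a small registry that hands out the standard binding unless a deviation has been recorded with a reason and an approver. This is only a sketch in Python to make the idea concrete; it is not how our project or any particular SOA stack implements it, and `BindingPolicy`, the service names and the reasons are all hypothetical.

```python
DEFAULT_BINDING = "SOAP/HTTP"


class BindingPolicy:
    """Hands out the default binding unless a deviation has been approved."""

    def __init__(self, default=DEFAULT_BINDING):
        self.default = default
        self._deviations = {}  # service name -> (binding, reason, approved_by)

    def approve_deviation(self, service, binding, reason, approved_by):
        # A deviation without a recorded reason and approver is not "controlled".
        if not reason.strip():
            raise ValueError("a deviation needs a recorded reason")
        if not approved_by.strip():
            raise ValueError("a deviation needs an approver")
        self._deviations[service] = (binding, reason, approved_by)

    def binding_for(self, service):
        entry = self._deviations.get(service)
        return entry[0] if entry else self.default


policy = BindingPolicy()
policy.approve_deviation(
    service="payment_posting",  # hypothetical high-volume service
    binding="RMI-IIOP via WSIF",
    reason="SOAP/HTTP cannot meet the throughput target",
    approved_by="Architecture Review Committee",
)

print(policy.binding_for("payment_posting"))  # approved deviation
print(policy.binding_for("customer_lookup"))  # falls back to the default
```

The point of the design is that the default costs nothing to use, while every exception leaves a trail the review committee can revisit when the technology or the requirements change.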

The upshot: if managed effectively, we can achieve most of the benefits of SOA and meet all of our mission critical drivers without introducing prohibitive cost.

When should we use SOA?

So when is it appropriate to adopt a vanilla SOA approach versus an approach that requires deviation from SOA principles?

This can be modeled via the following quadrant: the horizontal axis shows increasing levels of change and/or reuse, and the vertical axis shows an increasing level of mission criticality.



  • Sweet spot (green): In the bottom right hand corner, with high levels of re-use and low levels of mission criticality, we get the most out of a vanilla SOA approach. Just follow the rules and watch the benefits roll in.
  • Easy Cases (yellow): Here we have low mission criticality, but also low levels of re-use. Here we need to understand the business case for SOA. Do the additional SOA overheads such as governance really make sense? Here we should optimise for budget.
  • Hard Cases (red): As we increase the mission-criticality, we need to start thinking about the trade-offs. Do we compromise robustness? Budget (additional effort and hardware)? The hard cases are where we need to optimise the use of stock standard SOA for the outcomes we are trying to achieve and this is the arena I’ve been discussing.
  • Mission Critical (blue): It’s these cases where we really need to think about what we are trying to achieve. Does it make sense for this to be an SOA solution? What are the real tangible benefits that we can derive from an SOA approach? What defines success? Can SOA deliver it?


SOA doesn’t replace everything that preceded it, and it is not “the one true path”. Like any technology, it builds on successful ideas from the past and leverages new technology innovations. There is great value in SOA but there is even greater value in perspective. Understand your business first. Then do what makes sense.

The mapping between concept and reality is not transparent. The truth of the matter today is that to realise your vision, you do need expertise in the underlying technologies — what works well and what doesn’t.


While all of us work in an abstract industry, we are here, at the end of the day, to deliver tangible things. No matter how elegant, how compliant and how service oriented an architecture is, the business just wants a robust solution that works.