Monday, February 9, 2009

SOA/BPM performance best practices (BPEL process definitions)

WebSphere Process Server (WPS) can handle two different types of BPEL flows: long running flows (also known as macro flows or interruptible flows) and short running flows (also known as microflows or non-interruptible flows).

Long running flows
A business transaction that is represented by a long running flow has a lifetime that can span minutes, hours, days or even months, and it is typically divided into several technical transactions (each bracketed by begin and commit). The state of such a process instance is persisted in a database (the BPE DB) between two transactions, so that operating system resources are only occupied while a transaction is in flight. WPS allows the technical transaction boundaries to be tuned: at process definition time the developer can, for example, extend the scope of a transaction by combining several transactions into a single one, which saves some of the transaction handling overhead.

Short running flows
The other type of flow, the short running flow, is used when the corresponding business transaction is fully automated, completes within a short time frame, and has no asynchronous request/response operations. Here the entire set of flow activities runs within a single technical transaction, navigation is done entirely in memory, and intermediate state is not saved to a database. Such short running flows can run between 5 and 50 times faster than comparable long running flows and should be preferred where possible.
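
To make the effect of transaction boundaries concrete, here is a toy cost model in Java. It is only a sketch and not WPS code: the class FlowCostModel and its method are hypothetical, and it assumes that a long running flow writes its state once per committed technical transaction while a short running flow writes no intermediate state at all.

```java
// Toy cost model only: it counts state writes and uses no WPS or BPEL API.
final class FlowCostModel {

    /**
     * Estimated number of times process state is written to the database.
     * Assumption: a long running flow writes its state once per committed
     * technical transaction; a short running flow never writes intermediate state.
     */
    static int stateWrites(int activities, int activitiesPerTransaction, boolean longRunning) {
        if (activities < 0 || activitiesPerTransaction < 1) {
            throw new IllegalArgumentException(
                    "activities >= 0 and activitiesPerTransaction >= 1 required");
        }
        if (!longRunning) {
            return 0; // navigation happens in memory inside one transaction
        }
        // Ceiling division: a partially filled last transaction still commits.
        return (activities + activitiesPerTransaction - 1) / activitiesPerTransaction;
    }

    public static void main(String[] args) {
        System.out.println(stateWrites(12, 1, true));  // one transaction per activity
        System.out.println(stateWrites(12, 4, true));  // transactions merged in groups of four
        System.out.println(stateWrites(12, 4, false)); // short running flow
    }
}
```

Under these assumptions the three calls print 12, 3 and 0. The model ignores everything else that contributes to the real difference (message queuing, locking, commit latency), so it shows the direction of the effect and not its size.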

Programming at the business level
BPEL can be considered a programming language. This should not, however, lead to the assumption that it is a suitable basis for developing applications that are usually written in languages like C++ or Java. BPEL should be considered an interpreted language, although there have been some investigations into the possible advantages of compiled BPEL. When the internal execution of a BPEL flow is compared with the interaction with and invocation of the orchestrated services provided through SOA, it becomes apparent that a considerable share of the overall execution path length is attributable to SOA's invocation mechanisms, for example SOAP or messaging. A BPEL compiler would therefore optimize only the smaller part of the overall execution path length. And with compiled BPEL one might lose some of the flexibility and interoperability that current implementations offer.
In the 1980s, one of the trends in the IT industry was called Business Process Re-engineering. The solutions developed for it were usually more or less large monolithic programs with the business level logic hard coded in their modules. In BPEL based business applications, this business level logic is moved to the BPEL layer. Some thought should be given to where to draw the dividing line between the BPEL layer and the orchestrated lower level business logic services. The more flow logic detail is put into BPEL, the larger the BPEL related share of the overall processing becomes, and one might end up doing low level, fine grained programming in BPEL. This is not what BPEL is meant for. After all, the “B” in BPEL stands for “business”, so BPEL should be used for programming on the business logic level only.

Business process data
Every business process deals with some amount of variable data that might be used for decisions within the flow or as input or output parameters of flow activities. The size of that data can have a considerable impact on the amount of processing that needs to take place. For large business objects the memory needed may quickly exhaust the available JVM heap, and in long running processes the size of the business objects directly determines the amount of data that needs to be saved to and retrieved from a data store at each transaction boundary. CPU capacity is affected as well, because the objects have to be serialized and de-serialized. The advice is to use as little data as possible within a business process instance. For example, instead of passing large images through a flow, pass a pointer to the image in the form of a file name or image id, which causes much less overhead in the business flow engine.
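
The difference between carrying the image and carrying a pointer to it can be sketched in plain Java. This is an illustration under assumptions: the classes ClaimWithImage and ClaimWithImageRef are hypothetical, and standard Java serialization stands in for whatever format the process engine really uses to persist business objects.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class ProcessDataSize {

    // Hypothetical business object that carries the image bytes through the flow.
    static final class ClaimWithImage implements Serializable {
        private static final long serialVersionUID = 1L;
        final String claimId;
        final byte[] image;

        ClaimWithImage(String claimId, byte[] image) {
            this.claimId = claimId;
            this.image = image;
        }
    }

    // Hypothetical business object that carries only a reference to the image.
    static final class ClaimWithImageRef implements Serializable {
        private static final long serialVersionUID = 1L;
        final String claimId;
        final String imageId;

        ClaimWithImageRef(String claimId, String imageId) {
            this.claimId = claimId;
            this.imageId = imageId;
        }
    }

    // Size in bytes of the Java-serialized form, used here as a proxy for the
    // amount of data written at a transaction boundary.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        byte[] image = new byte[2 * 1024 * 1024]; // stand-in for a 2 MiB scan
        System.out.println("with image:     "
                + serializedSize(new ClaimWithImage("claim-1", image)));
        System.out.println("with reference: "
                + serializedSize(new ClaimWithImageRef("claim-1", "img-42")));
    }
}
```

The first object serializes to slightly more than the size of the image, the second to a few hundred bytes at most. In a long running flow that cost would be paid again at every transaction boundary.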

Invocation types
SCA environments offer different kinds of invocation mechanisms. Some invocations can be done synchronously, some asynchronously. Synchronous invocations typically involve less internal processing than asynchronous invocations. Asynchronous invocations also typically require the currently open transaction to be committed so that the outgoing request message becomes visible to the consumer it is targeted at. Even when the binding used is synchronous, unnecessary serialization and de-serialization can be avoided if the target service resides within the same JVM, so that internally just object pointers are passed.
If the target service resides within the same module, some internal name lookup processing can be saved as well.
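
The difference between passing an object pointer and marshalling a message can be illustrated with a small Java sketch. It uses no SCA API: the class InvocationCost, the Order type and both methods are hypothetical, and Java serialization stands in for the marshalling a remote binding would perform.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class InvocationCost {

    // Hypothetical message type exchanged between caller and target service.
    static final class Order implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;

        Order(String id) {
            this.id = id;
        }
    }

    // Same JVM: the target receives the caller's object reference, no copy is made.
    static Order passByReference(Order order) {
        return order;
    }

    // Stand-in for a binding that marshals the message: the target receives
    // a copy that had to be serialized and de-serialized first.
    static Order passByValue(Order order) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(order);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Order) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("marshalling failed", e);
        }
    }

    public static void main(String[] args) {
        Order order = new Order("order-7");
        System.out.println(passByReference(order) == order); // same object
        System.out.println(passByValue(order) == order);     // a copy
    }
}
```

The by-value path allocates buffers and walks the whole object graph twice, and that work grows with the size of the message. The by-reference path does none of it.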

Audit logging
Most BPM engines allow you to keep a record of whatever is happening on the business logic level. Producing such audit logs doesn't come for free, since at least some I/O is associated with it. The recommendation is to restrict audit logging to those events that are really relevant to the business and to omit all others, keeping the logging overhead as small as possible.
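
The idea of logging only business relevant events can be sketched as a simple filter. In a real engine this selection is normally a configuration setting and not application code, so the following Java is purely illustrative: the class AuditFilter, the event kinds and the counter that stands in for the actual log write are all hypothetical.

```java
import java.util.EnumSet;

final class AuditFilter {

    // Hypothetical event kinds; a real engine defines its own event catalogue.
    enum EventKind {
        PROCESS_STARTED, PROCESS_COMPLETED, PROCESS_FAILED,
        ACTIVITY_STARTED, ACTIVITY_COMPLETED, VARIABLE_UPDATED
    }

    private final EnumSet<EventKind> relevant;
    private int written; // stand-in for the number of audit records written

    AuditFilter(EnumSet<EventKind> relevant) {
        this.relevant = EnumSet.copyOf(relevant); // defensive copy
    }

    // Returns true if the event was recorded, false if it was dropped.
    boolean record(EventKind kind, String processId) {
        if (!relevant.contains(kind)) {
            return false; // dropped before any I/O would take place
        }
        written++; // a real implementation would write the audit record here
        return true;
    }

    int written() {
        return written;
    }

    public static void main(String[] args) {
        AuditFilter audit = new AuditFilter(
                EnumSet.of(EventKind.PROCESS_STARTED, EventKind.PROCESS_COMPLETED,
                        EventKind.PROCESS_FAILED));
        for (EventKind kind : EventKind.values()) {
            audit.record(kind, "process-1");
        }
        System.out.println(audit.written() + " of " + EventKind.values().length
                + " events written");
    }
}
```

With only the three process level events enabled, half of the six event kinds in this example never reach the log.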