Abstract

Right now the software industry is like a cacophony of birds, each proudly singing its own theme. While each member of the chorus produces wonderful sounds and melodies, there is no cooperation and little harmony between the parts. What we need is a conductor capable of shaping these individual voices into a coherent theme, as John Williams molds dozens of instruments into symphonies.

There is growing interest in the interoperability of workflow systems and large component software systems. Workflow systems handle business processes using computer automation, coordinating the efforts of hundreds of employees. Design languages that facilitate component synthesis in software accomplish the same task with independent programs and databases. These languages will need to integrate workflow systems along with other software. All of this is taking place as computing continues to move toward distributed heterogeneous networks, and the lines of communication between these types of systems grow apace.

In this paper we study two workflow protocols, the OMG jFlow specification and the Simple Workflow Access Protocol (SWAP), in comparison to the CHAIMS Protocol for Autonomous Megamodules (CPAM). We examine the interfaces required by jFlow and SWAP and inspect the methods of communication employed by each protocol. We attempt to highlight differences between these two protocols and CPAM, and to demonstrate the steps necessary for CPAM to access systems using them. We do not go into detail about the CPAM specifications, but recommend that interested parties consult them independently (see http://www.db.stanford.edu/CHAIMS/Doc/Details/Protocols/Protocols_Overview.html).

 

 

SWAP

SWAP Basics

The Simple Workflow Access Protocol (SWAP) is under development by the Internet Engineering Task Force (IETF) for the purpose of integrating work providers across the Internet. Much of the information here was obtained from the SWAP Working Group (http://www.ics.uci.edu/~ietfswap/). The IETF proposes to make SWAP applicable to a variety of situations and devices, and to give clients control and monitoring capabilities. SWAP was designed with the workflow community in mind, because many of the methods invoked through SWAP create asynchronous processes that run for days, weeks, or months. SWAP will operate using newly defined HTTP method requests and will return information encoded in XML.

SWAP Interfaces

SWAP is built around three interfaces: the Process Definition, the Process Instance, and the Observer.

Each Process Definition is a specification of a service provided to companies, such as a financial function or an information search. Clients using SWAP-supporting tools will have a menu of Process Definitions to choose from, each of which is associated with a Uniform Resource Identifier (URI). Using this menu of services, clients create Process Instances by sending an HTTP request to the URI of the chosen Process Definition.
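As a rough illustration of what such a request might look like on the wire, the sketch below assembles the text of an HTTP/1.1 request that uses a SWAP-defined method name and carries an XML body. The helper `build_swap_request`, the example URIs, and the XML element names are all assumptions made for this sketch; they are not taken from the SWAP draft, and nothing is actually sent over the network.

```python
from urllib.parse import urlsplit
from xml.etree import ElementTree as ET


def build_swap_request(method, uri, properties):
    """Return the text of an HTTP/1.1 request that uses a SWAP-style method.

    The XML element names are illustrative placeholders, not the
    vocabulary defined by the SWAP draft.
    """
    parts = urlsplit(uri)
    root = ET.Element("request")  # hypothetical wrapper element
    for name, value in properties.items():
        child = ET.SubElement(root, name)
        child.text = value
    body = ET.tostring(root, encoding="unicode")
    lines = [
        f"{method} {parts.path or '/'} HTTP/1.1",
        f"Host: {parts.netloc}",
        "Content-Type: text/xml",
        f"Content-Length: {len(body.encode('utf-8'))}",
        "",  # blank line separates headers from body
        body,
    ]
    return "\r\n".join(lines)


# Ask a (hypothetical) Process Definition to create a Process Instance.
request_text = build_swap_request(
    "CREATEPROCESSINSTANCE",
    "http://provider.example/definitions/loan-approval",
    {"observer": "http://client.example/observers/42"},
)
print(request_text)
```

The only SWAP-specific part is the method name on the request line; everything else is ordinary HTTP framing.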

Process Instances represent actual tasks invoked by clients using SWAP.  Each Process Instance has a URI as well.  While it is in progress, a Process Instance can be stopped, restarted, altered or eliminated by HTTP requests. Data can be sent or results can be checked.

The Observer is any interface that can receive reports from a Process Instance, most importantly when the Process Instance has completed its work. The Observer's URI is provided to the Process Instance, and the Process Instance will notify the Observer of its state in a number of different situations. Although the client can keep sending messages to a Process Instance to check whether it has completed or not, the IETF views this as an unnecessary waste of resources on the part of both client and work provider.
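The difference between polling and notification can be sketched with a small in-memory model. The classes `ToyProcessInstance` and `ToyObserver` below are hypothetical stand-ins written for this illustration, not the SWAP interfaces themselves; the counter simply makes visible how many status requests the provider had to answer.

```python
class ToyProcessInstance:
    """Toy stand-in for a SWAP Process Instance (not the real protocol)."""

    def __init__(self, name):
        self.name = name
        self.state = "running"
        self.result = None
        self._observers = []
        self.polls_answered = 0

    def subscribe(self, observer):
        self._observers.append(observer)

    def get_state(self):
        # Each poll costs the provider one request/response round trip.
        self.polls_answered += 1
        return self.state

    def finish(self, result):
        # On completion, push one notification to every observer.
        self.state = "completed"
        self.result = result
        for observer in self._observers:
            observer.complete(self)


class ToyObserver:
    def __init__(self):
        self.completed = []

    def complete(self, instance):
        self.completed.append((instance.name, instance.result))


credit_check = ToyProcessInstance("credit-check")
watcher = ToyObserver()
credit_check.subscribe(watcher)

# Polling: three status checks that all come back "running".
for _ in range(3):
    credit_check.get_state()

credit_check.finish("approved")
print(watcher.completed)            # [('credit-check', 'approved')]
print(credit_check.polls_answered)  # 3
```

The observer learns of the result through a single message, while every poll before completion was wasted work for both sides.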

The interfaces and actors described in the SWAP protocol may be modeled by a food court at a shopping mall. The clients are the customers, and the service providers (e.g. a bank, a travel agency, or a research center) are the food vendors. The services provided are the various foods and drinks offered by each vendor. The basic idea of the food court is clear: the customers wish to select and order food items of interest from the vendors. Similarly, the clients imagined in the SWAP protocol are interested in obtaining services and results from other suppliers, agencies, or groups.

The SWAP protocol defines the language of interaction between customers and vendors. The Process Definition, Process Instance, and Observer interfaces can also be modeled by the food court. The Process Definitions are the items on the menus of the various vendors. Basic information is available from the menu: description and cost. Further information may be obtained by asking one of the staff about the item. By itself, the menu description doesn't do anything but inform the customer (the description is distinct from the food itself). The Process Definition is similar: most clients will likely know only the URI at first, but they can query the Process Definition interface for more information, and the Process Definition does no work by itself.

The Process Instances are like actual food orders, placed by the customer (using the menu) with the intent of receiving food or drink. Many people can place orders with a vendor at one time, and those customers may request similar items. A customer may make specific requests or specify certain aspects of their order. Customers may even check on the status of their order, change it, or cancel it. They do not, however, have any knowledge about the details of the preparation (the kitchen is closed), nor do they know exactly when the order will be ready. This is the black-box behavior envisioned by the SWAP protocol. Clients can control and monitor Process Instances they have created, but they don't know how the work is done. A provider could be handling many Process Instances invoked from one Process Definition at one time.

The Observer interface is chosen by the client to receive notification of a completed order. At many food courts, the customer can wait at the counter for the food, listen for their number over a speaker system, or carry an electronic beeper back to their table. When the vendor has completed the order, they signal the customer, and the customer picks up the food. This is very similar to the Observer interface. As described in the protocol, the Observers could be email addresses, FTP servers, even phone numbers. The provider signals the Observer when the work is completed, and the client checks the results.

Because workflow systems are designed to handle a wide variety of tasks, the analogy can’t be stretched much further. There are several more performers in the SWAP system apart from these interfaces, such as the Activity Observer, Work Items, and Work Item Factories. The Activity Observer is created when a Process Instance is waiting on the results of one or more other systems. It allows clients and providers to look up information on both the calling process and the sub-processes. Work Items are a type of Process Instance, which don’t do any work but represent a task being accomplished by a person. Work Item Factories are the Process Definitions which create Work Items.

SWAP Methods

A large portion of the draft protocol covers the methods defined by SWAP for the interactions of these interfaces. SWAP intends to use the HTTP 1.1 format for messaging. The protocol document answers questions about user authentication, error handling, and data structures. Anonymous messages are allowed in HTTP, but most workflow systems will use authenticated messages in order to protect their resources (time, money, and information). All SWAP-defined methods will report exceptions and error situations in their return values. Attention is given to the interpretation of numbers, booleans, and date/time values.

There are two basic methods, shared by several of the interfaces. One is PROPFIND, a method which returns all the values of the interface at once, whether they are defaults or were set previously. These include such things as the name, the URI, the description, the state of a process, or the results. The other is PROPPATCH, a method for setting any number of the interface values. Uses include setting the priority of a Process Instance or notifying an Observer of results.
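The behavior of these two methods can be sketched with a simple property store. The class `SwapResource` and its property names are assumptions chosen for illustration; a real implementation would exchange these values as XML over HTTP rather than as Python dictionaries.

```python
class SwapResource:
    """In-memory stand-in for an interface that exposes properties."""

    def __init__(self, **defaults):
        self._properties = dict(defaults)

    def propfind(self):
        # Return every property at once, defaults and updated values alike.
        return dict(self._properties)

    def proppatch(self, **changes):
        # Set any number of properties in a single call.
        self._properties.update(changes)
        return self.propfind()


search = SwapResource(
    name="literature-search",
    uri="http://provider.example/instances/7",  # hypothetical URI
    state="running",
    priority=3,
    result=None,
)
search.proppatch(priority=1)
snapshot = search.propfind()
print(snapshot["priority"], snapshot["state"])  # 1 running
```

Returning a copy from `propfind` keeps callers from changing the stored values except through `proppatch`.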

The Process Instance interfaces respond to the messages PROPFIND, PROPPATCH, TERMINATE (self-explanatory), SUBSCRIBE (a message from an observer), UNSUBSCRIBE, and GETHISTORY (which returns a transaction log of the Process Instance).

The Process Definition interfaces have the message PROPFIND, but not PROPPATCH. They also implement CREATEPROCESSINSTANCE and LISTINSTANCES.

Observer interfaces should receive PROPFIND, PROPPATCH, and three signals from their associated Process Instance: COMPLETE for a normal completion, TERMINATED for a terminated process, and NOTIFY for any events prior to completion (requests for data, intermediate results, etc.).

SWAP also defines PROPFIND, PROPPATCH, and COMPLETE for Activity Observers, and it defines PROPFIND and CREATEPROCESSINSTANCE for EntryPoint Interfaces (referenced by Process Definitions when there are multiple ways of creating a Process Instance).
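The method assignments listed above can be collected into a single lookup table, which is how a dispatcher might decide whether a given interface should accept an incoming request. The table contents follow the lists in this section; the dictionary keys and the helper functions `accepts` and `interfaces_accepting` are names invented for this sketch.

```python
# Methods each interface responds to, as listed in the text above.
SWAP_METHODS = {
    "ProcessInstance": {
        "PROPFIND", "PROPPATCH", "TERMINATE",
        "SUBSCRIBE", "UNSUBSCRIBE", "GETHISTORY",
    },
    "ProcessDefinition": {"PROPFIND", "CREATEPROCESSINSTANCE", "LISTINSTANCES"},
    "Observer": {"PROPFIND", "PROPPATCH", "COMPLETE", "TERMINATED", "NOTIFY"},
    "ActivityObserver": {"PROPFIND", "PROPPATCH", "COMPLETE"},
    "EntryPoint": {"PROPFIND", "CREATEPROCESSINSTANCE"},
}


def accepts(interface, method):
    """Return True if the named interface responds to the method."""
    return method in SWAP_METHODS[interface]


def interfaces_accepting(method):
    """Return the sorted names of all interfaces that respond to the method."""
    return sorted(name for name, methods in SWAP_METHODS.items() if method in methods)


print(accepts("ProcessDefinition", "PROPPATCH"))  # False
print(interfaces_accepting("COMPLETE"))           # ['ActivityObserver', 'Observer']
```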

The charter of the IETF working group posits the completion of the SWAP specifications, goals, and requirements by September, 1999. Although a tentative date of November, 1998 was set for the draft version of the goals and scenarios documents, they are not yet available. The requirements document delves into the terminology of SWAP and goals for the interoperability of SWAP with other workflow management systems. It also places priority on scalability, security considerations, and quality of service.

SWAP and CPAM

There are two principal differences between SWAP and the CHAIMS Protocol for Autonomous Megamodules (CPAM). First, SWAP by itself makes no estimate of the amount of time needed for the completion of a task, while CPAM gives client programs the ability to call ESTIMATE on ongoing processes to determine the amount of time required. A CHAIMS-compliant wrapper of SWAP would perhaps need to come up with estimates for each Process Instance beforehand. Second, SWAP makes use of the Observer interface as a sort of replacement for giving an estimate, whereas CHAIMS megaprograms make all requests to megamodules directly. A wrapper megamodule for SWAP procedures could therefore create an Observer interface for each Process Instance it creates on behalf of a request from a megaprogram.
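One way such a wrapper might bridge the two models is sketched below: it acts as the Observer for every Process Instance it creates, records how long past instances took, and answers ESTIMATE requests from that history. The class `SwapEstimateWrapper`, its method names, the averaging rule, and the use of caller-supplied timestamps are all assumptions of this sketch; neither SWAP nor CPAM prescribes how estimates are computed.

```python
class SwapEstimateWrapper:
    """Hypothetical wrapper megamodule in front of a SWAP provider."""

    def __init__(self, default_estimate):
        self.default_estimate = default_estimate
        self._durations = {}  # definition name -> list of past durations
        self._started = {}    # instance id -> (definition name, start time)

    def invoke(self, instance_id, definition, now):
        # Called when the wrapper creates a Process Instance for a megaprogram.
        self._started[instance_id] = (definition, now)

    def complete(self, instance_id, now):
        # Called in the wrapper's role as Observer, on a COMPLETE signal.
        definition, start = self._started.pop(instance_id)
        self._durations.setdefault(definition, []).append(now - start)

    def estimate(self, instance_id, now):
        # Answer a CPAM-style ESTIMATE with the expected remaining time.
        definition, start = self._started[instance_id]
        history = self._durations.get(definition)
        expected = sum(history) / len(history) if history else self.default_estimate
        return max(0.0, expected - (now - start))


wrapper = SwapEstimateWrapper(default_estimate=60.0)
wrapper.invoke("a", "search", now=0.0)
wrapper.complete("a", now=40.0)
wrapper.invoke("b", "search", now=100.0)
wrapper.complete("b", now=120.0)
wrapper.invoke("c", "search", now=200.0)
print(wrapper.estimate("c", now=210.0))  # 20.0
```

Until a Process Definition has some history, the wrapper falls back to a fixed default, which corresponds to coming up with an estimate beforehand.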

The other CPAM primitives (SETUP, GETPARAM, SETPARAM, INVOKE, EXAMINE, EXTRACT, TERMINATE and TERMINATEALL) can be easily replicated by similar methods in the SWAP protocol. SWAP data is designed to be easy to parse rather than easy to read, so translation to and from CPAM data should likewise be simple.
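Neither protocol document spells out this correspondence, so the table below is only one plausible guess at how a wrapper might translate CPAM primitives into SWAP method calls. The mapping itself, the `translate` helper, and the example URIs are assumptions made for illustration and should not be read as part of either specification.

```python
# One possible correspondence, assumed for illustration only.
CPAM_TO_SWAP = {
    "SETUP": ("PROPFIND", "definition"),
    "GETPARAM": ("PROPFIND", "instance"),
    "SETPARAM": ("PROPPATCH", "instance"),
    "INVOKE": ("CREATEPROCESSINSTANCE", "definition"),
    "EXAMINE": ("PROPFIND", "instance"),
    "EXTRACT": ("PROPFIND", "instance"),
    "TERMINATE": ("TERMINATE", "instance"),
    "TERMINATEALL": ("TERMINATE", "all instances"),
}


def translate(primitive, definition_uri, instance_uris):
    """Return the (SWAP method, URI) pairs a wrapper might send."""
    method, target = CPAM_TO_SWAP[primitive]
    if target == "definition":
        return [(method, definition_uri)]
    if target == "instance":
        # Assumes the primitive refers to the most recently created instance.
        return [(method, instance_uris[-1])]
    # "all instances": repeat the method once per live Process Instance.
    return [(method, uri) for uri in instance_uris]


calls = translate(
    "TERMINATEALL",
    "http://provider.example/def",  # hypothetical URIs
    ["http://provider.example/i/1", "http://provider.example/i/2"],
)
print(calls)
```

In this guess, several CPAM primitives collapse onto PROPFIND, since SWAP returns state, parameters, and results through the same property mechanism.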

OMG jFlow

Intro to jFlow

jFlow addresses the requirements for interoperability between different workflow object implementations in a CORBA environment. The interfaces specified are intended to be sufficiently general to support a wide range of application domains. These domains include healthcare, electronic commerce, manufacturing, insurance, finance, transportation, printing and publishing. jFlow specifies interfaces for workflow execution control, monitoring, and interoperability between workflows defined and managed independently of each other. The interfaces are based on a model of workflow objects which includes their relationships and dependencies with requesters, assignments, and resources.

Workflow Interfaces

The core of jFlow is a set of interfaces that a compliant system must implement and support. Systems then use these interfaces over IIOP to communicate and interact. A UML class diagram of the primary interfaces follows:

The core workflow interfaces are:

subtypes of this interface are defined to record change of the state of a workflow object, process data associated with it, and change in the assignment of resources to WfActivities.

Process Enactment

To start a given workflow process, a Requester that is responsible for the process must either be created or selected from the set of existing Requesters. Next, an appropriate Process Manager (the process factory, as in the Factory pattern) is identified, and the Process is created using the create_process operation of that manager. The Requester is associated with the Process when it is created and will receive status change notifications from the Process. When the Process is instantiated, it might create a set of Activities representing the steps necessary to complete the process. Additionally, context data may be used to parameterize the process (e.g., to identify resources to be used by the process). The Process is actually started by invoking its start operation; the process implementation will use context data and built-in logic to determine which Activities are to be activated. It may also initiate other (sub-) Processes.

When an Activity is activated, its context is set and resources may be assigned to it by creating Assignments linking it to Resources. The resource selection mechanism is not defined here, but an implementation might, e.g., use another Process to determine which resources to assign to a particular Activity, using the Activity’s context information and other process parameters. An Activity might be implemented by another (sub-) Process, i.e., it can be registered as the Requester of that Process; the sub-process can be initiated when the Activity is activated. In this case, the Activity is completed when the sub-Process completes, and the result of the Activity is obtained from the result of the Process. An Activity can also be realized by an application that uses the Activity’s set_result and complete operations to return results and signal completion of the Activity.

When an Activity is completed, its results will be used by the workflow logic to determine follow-on Activities; the results can also be used to determine the overall result of the container Process. A Process is completed when there are no more Activities to be activated; it will signal its completion to the associated Requester and make available the results of the process. A small but important footnote is that intermediate results may be accessible while the Process is running, which may in turn be used to determine the degree of completion and what is causing any delays. The overall status of a Process can be queried using the state, get_context and get_result operations. The Requester associated with a Process also receives notifications about status changes of the Process.

More detailed information on the status of the process steps can be obtained by navigating the step relationship between a Process and its Activities and using the status inquiries provided by the Activity interface. Navigation of nested workflows is supported by the process relationship between an Activity (which is a Requester) and potential sub-Processes. Whenever an Execution Object (Process or Activity) performs a workflow-relevant status change, an Event Audit is recorded. For each Execution Object, the history of Event Audit items can be accessed to analyze the execution history of that object. Event Audits might be published using the OMG Notification Service.

One useful extension might be to build more analytical features on top of this history and use the analysis to predict the behavior of currently running processes. For example, a good starting place might be to keep a record of completion times for a given type of Process and then use that information to predict how much longer a currently running Process will probably need, or how long a given Process would take to run.
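The enactment sequence described above can be sketched in a few small classes. This is a simplified, single-threaded model written for illustration: the class names are shortened stand-ins for the jFlow interfaces, Activities are assumed to run strictly one after another, and the `history` list is only a rough analogue of Event Audit records. The operation names create_process, start, set_result, complete and get_result follow the text; everything else is an assumption.

```python
class Requester:
    def __init__(self):
        self.notifications = []

    def receive_event(self, process, state):
        # Status change notification from an associated Process.
        self.notifications.append((process.name, state))


class Activity:
    def __init__(self, name, process):
        self.name, self.process = name, process
        self.state = "not_started"
        self.result = None

    def set_result(self, result):
        self.result = result

    def complete(self):
        self.state = "completed"
        self.process.activity_completed(self)


class Process:
    def __init__(self, name, requester, steps, context):
        self.name, self.requester, self.context = name, requester, context
        self.state = "not_started"
        self.history = []  # rough stand-in for Event Audit records
        self.activities = [Activity(step, self) for step in steps]

    def _change_state(self, state):
        self.state = state
        self.history.append((self.name, state))
        self.requester.receive_event(self, state)

    def start(self):
        self._change_state("running")
        self._activate_next()

    def _activate_next(self):
        # Assumed workflow logic: activate steps in order, one at a time.
        for activity in self.activities:
            if activity.state == "not_started":
                activity.state = "running"
                self.history.append((activity.name, "running"))
                return
        self._change_state("completed")  # no Activities left to activate

    def activity_completed(self, activity):
        self.history.append((activity.name, "completed"))
        self._activate_next()

    def get_result(self):
        return {a.name: a.result for a in self.activities}


class ProcessManager:
    """Factory for Processes of one kind."""

    def __init__(self, name, steps):
        self.name, self.steps = name, steps

    def create_process(self, requester, context=None):
        return Process(self.name, requester, self.steps, context or {})


requester = Requester()
manager = ProcessManager("claim", ["review", "payout"])
process = manager.create_process(requester, {"claim_id": 17})
process.start()
while process.state == "running":
    # An application performs the running Activity and reports back.
    activity = next(a for a in process.activities if a.state == "running")
    activity.set_result(f"{activity.name} done")
    activity.complete()

print(requester.notifications)  # [('claim', 'running'), ('claim', 'completed')]
print(process.get_result())     # {'review': 'review done', 'payout': 'payout done'}
```

The Requester sees only the Process-level state changes, while the per-step detail stays available through the Process's Activities and its history.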

jFlow vs. CHAIMS/CPAM

Now that we have looked at the specifics of jFlow, it is interesting to compare it to the solution presented in CHAIMS/CPAM. The main difference between jFlow and CHAIMS is the scope of the problem each tries to solve. CHAIMS attempts to solve the much more generic problem of interconnecting components in general, whereas jFlow is specifically designed for workflow management systems. As a result, CHAIMS is far more flexible but also more complicated. Creating a system of communicating components in CHAIMS requires writing megamodules for each component and then writing the megaprogram to control the flow. If your various workflow management systems support jFlow, however, interconnecting them should be as simple as setting up your workflow processes. Additionally, jFlow requires an environment that supports CORBA and IIOP, which in practice steers most vendors toward C++, whereas CHAIMS also allows RMI and thus Java. Our conclusion is that if you are in the market for a workflow management system, it would be an excellent idea to find one that supports jFlow. However, jFlow is specialized for workflow, so if your components are not workflow systems, CHAIMS would be the superior option. One interesting facet of the comparison is that, because jFlow essentially creates a standard CORBA interface for workflow management systems to support, it would be relatively easy to create a standard CHAIMS-compliant megamodule wrapper for all jFlow-compliant workflow management systems. From there it would be very easy to integrate workflow management systems into larger systems of CHAIMS components.

 

References

OMG jFlow Specification ftp://ftp.omg.org/pub/docs/bom/98-06-07.pdf

SWAP Requirements (in progress) http://www.ietf.org/internet-drafts/draft-sreddy-swap-requirements-02.txt

SWAP Protocol (in progress) http://www.ietf.org/internet-drafts/draft-swenson-swap-prot-00.txt

CPAM Protocol (CHAIMS) http://www.db.stanford.edu/CHAIMS/Doc/Details/Protocols/Protocols_Overview.html