Model free scheduling of invocation, extract and other primitives in CHAIMS

Dorothea Beringer, Laurence Melloul, Neal Sample, Gio Wiederhold
Draft, Jan 7, 1998, by db

Abstract
1. Introduction

2. Optimization objectives

The purpose of the scheduling of primitives in CHAIMS is:

Minimal overall execution time of a megaprogram.

the effective time needed by an invoked remote method to provide the desired result
the time needed to transfer invocation parameters to and results back from the megamodule

Minimal overall fee from the execution of a megaprogram.

the fee charged for the providing of results by a remote method (per invocation, or per result)
the fee charged for the recourses needed by an active invocation (per time)
the fee charged for the transfer of input parameters or results over the network

In cases where these two things are contradictory, a cost function is necessary to determine the end-user specific optimum.

Techniques for reducing overall execution time:

exploiting the inherent parallelism of remote methods
invoking methods even if it is not yet known if their results are needed
direct dataflows between megamodules (deferred: we do not yet investigate this issue in depth in the following, but we should keep it in mind)

Techniques for reducing overall fees:

invoke methods only if it is clear that their results are needed
terminate invocations as soon as they are no longer needed (saves resources at the megamodule side which could also be charged to the client)

3. Preprocessing

A scheduling preprocessor reads a megaprogram and writes a new, hopefully optimized, megaprogram. In the new megaprogram statements may have been moved around as well as new statements introduced.

Simple optimizations:

Moving all EXTRACT statements down to where its results are used by another statement (most often an IF or an INVOKE statement).
Moving all INVOKE statements up as far as possible.

Adding TERMINATE statements whenever no more results will be extracted from an invocation.

Implemented in the existing preprocessor, see also readmes to that preprocessor.
Experiences: not yet integrated, so no experimental results exist.

Optimization with cost function:

Moving an INVOKE statement up and outside an IF-block, based on the evaluation of a cost function. For the cost function ESTIMATE statements have to be inserted for the method whose invocation statement is under consideration as well as for all other methods that are invoked as well as extracted between the IF-block and the new location of the INVOCATION statement.

Preprocessor with access to megamodules

Much more optimization possibilities arise when the preprocessor can execute ESTIMATE calls to the megamodules.

Based on the results of ESTIMATE calls the order of EXTRACT and INVOKE statements can reflect the estimated duration of the various methods, thus overcoming the problem mentioned under simple optimization.
Insertion of rescheduling points for runtime environment (see NGS proposal)

Other possibilities

yet to be investigated...

Limits of a pure preprocessing approach to scheduling

yet to be listed....

4. Runtime with graph

Runtime scheduling in an execution machine based on a graph of the megamodule is totally different from the idea of optimizing a sequential program by rearranging and maybe adding statements. The megaprogram is a mere collection of statements, some of these statements are related to each other by preconditions on their possible order. These preconditions are given mainly by the dataflow (only by the dataflow?). The runtime system takes a megaprogram as input, converts it into a dependancy graph of statements, and then executes that graph.

The setup of runtime scheduling

Figure: program - parser - interpreter and scheduler
...

The megaprogram graph

The megaprogram is represented as an (acylic?) directed graph.
Nodes: represent either a statement or a sub graph
Edges: represent dependencies, edges can have additional conditions.

Conditions on edges:

The default condition of edges between an invoke and an extract statement of the same invocation is that the extract only takes place if the results to be extracted are ready.

When creating the tree, all EXWDONE statements can be replaced by an EXTRACT statement, connected by an edge to the corresponding INVOKE statement. This edge has the default condition mentioned before.

....

How to represent IF and WHILE blocks? How to handle the cases where a statement could be executed already earlier on, but it is not yet clear, if its results are needed?

Should we rather take a variant of a Petri Net?

Complexity of this approach?

Execution of the graph

The runtime system typcically has several ongoing invocations, and waits for one of them to fulfill the conditions of one or several edges so the next node (most often an INVOKE statement) can be executed. As CPAM does not know any callbacks from a megamodule to a client, these testing the readiness of invocations is up to the runtime system. Easiest way is probably polling.
Whenever the graph contains a serie of statements executed locally (e.g. assignments), these are executed without further optimization considerations, as we assume that their time consumption is neglectable.

5. Scheduling in other systems

Compilers

who is adding something here?

Workflow Management Systems

who is adding something here?

Disclaimer

This is a paper in progress, many more cases have to be considered!
Add ons and input is welcome!