Draft, prepared for the September 1998 AFCEA meeting, San Diego CA

 

An Interface Language for Projecting Alternatives in Decision-Making

Gio Wiederhold, Rushan Jiang, and Hector Garcia-Molina

Stanford University
Computer Science Department
Gates Computer Science Building 4A
Stanford CA 94305-9040
650 725-8363 fax 725-2588
<gio@cs.stanford.edu>

Abstract

We have performed proof-of-concept research on a general language interface to aid in the projection of the effects of alternate decisions, when multiple future Courses-of-Action (CoAs) are being considered. The objective is to significantly augment Command and Control decision services by making results of simulations as accessible as other information components, as databases and web-based data are today. The central concept is that an interface language allows separation of customers and providers, and that the autonomy created allows progress to be made independently.

Motivation

Basic information systems, as exemplified in the military by Command and Control (C2) systems, have grown into broader information systems that encompass the communication and analysis capabilities that are now broadly available. These C4I (C2, Communication, Computing, and Information) systems may include a variety of logistics, intelligence, and tactical databases, message links, and geographical base data, providing essential background data for C2 decision-making. However, the military commander also has to plan and schedule actions beyond the current point-in-time. Databases can make past and near-current data available, but another family of diverse simulation tools must come into play for projecting the future effect of decisions to be made now. These tools range from spreadsheets to war-gaming simulations. They provide information that is complementary to the information about the past provided by databases, and help in selecting the best course-of-action [LindenG:92]. Quoting from "New World Vistas, Air and Space Power for the 21st Century" [McCall:96]: The two `Capabilities Requiring Military Investment in Information Technology' are:

1. Highly robust real-time software modules for data fusion and integration;

2. Integration of simulation software into military information systems.

Today rapid progress is being made in information fusion from heterogeneous resources such as databases, text, and semi-structured information bases [WiederholdG:97]. Much of this work is ready for transfer to practical settings. However, the predictive requirements for decision-making have rarely been addressed in terms of integration and fusion [Orsborn:94]. Specifically, many war-gaming simulations are very costly and most are impossible to reuse [Zyda:97].

Past Work:

We were funded by DARPA DSO for a small investigation to define and demonstrate a simulation access language, `SimQL'. Such a language is NOT intended for writing new simulations, but for providing information systems access to the results of existing predictive tools, via a wrapper infrastructure. The tools we explored ranged from simple spreadsheets, to weather forecasting, to computational assessments of future resource availability. Within the limited demonstration, we were unable to access fully distributed simulations as performed in military training exercises (SIMNET and its successors [MillerT:95]), although we used such data in a related project [MalufWLP:97].

Infrastructure:

Technology has made great strides in accessing past information, stored in databases, object-bases, or the World-Wide Web. Access to information about current events is also improving dramatically. We now must expand the scope to project the effect of candidate events into the future. Decision-making in planning, both in military and business environments, depends on knowing past, present, and likely future situations, as depicted in Figure 1. For the latter we must access simulations. Many simulations are available from remote sites [FishwickH:98]. Distributed simulations also communicate with each other, using highly interactive protocols [IEEE:98], but are rarely accessible to be part of general information systems [Singhal:96]. Simulation access should handle local, remote, and distributed simulation services. If the simulation is a federated distributed simulation, as envisaged by the HLA protocol, then one of the federates may supply the data to the decision-making support system, typically first aggregating data from detailed events that occur many times per second to the level of minutes or hours that is typical for initiating planning interactions.

Concepts:

The concept of our simulation access language, SimQL, mirrors that of SQL for databases. Modern versions of SQL now also provide remote access [DateD:93]. An ability to access simulations as part of an information system adds a significant new capability, by allowing simultaneous and seamless access to factual data and projections (e.g., logistics data with future deployment projections). Interfaces such as SimQL should adhere closely to emerging conventions for information systems. For instance, they should use a CORBA communication framework, and `Java' for client-based services. Such use of COTS technology will facilitate the integration of our SimQL interface with other systems that provide access to diverse non-predictive data resources.

Figure 1: The Place of Simulation Access in Information Systems

To make the results obtained from a simulation clear and useful for the decision maker, the interface should use a simple model. Computer screens today focus on providing a desktop image, with cut and paste capability, while relational databases use tables to present their contents, and spreadsheets a matrix. To be used effectively, simulations should also present a coherent interface model. Since we expect to often have to integrate past information from databases with simulation results, we start with the relational model. However, the objects to be described have a time dimension and also an uncertainty associated with them. We hence will use a simple object model as the descriptive interface for SimQL.

Predictive Tools

Projecting the outcome of current decisions into the future requires some form of simulation. Traditionally such a simulation is performed in a planner's mind. That is, the planner would sketch reasonable scenarios, mentally developing alternate courses-of-action, focusing on those that had been worked out in earlier situations. Mental models are the basis for most decisions, and only fail when the factors are complex, or the planning horizon is long. Human short-term memory can only manage about 7 factors at a time, so that a game such as chess can be played nearly as well by a dumb computer with a lot of memory as by a trained chess master [Miller:56]. As the plans become complex, tools are used for pruning, presentation, and assessment. For example, sand tables are still used for the training of military planners, although they are increasingly being replaced by computer-based simulations. Similarly, spreadsheets are used frequently in business. Simple heuristics sometimes help the planners in pruning, but they are often barely relevant in complex situations. For instance, traditional military doctrine demands air superiority, sufficient armaments to overwhelm the enemy, and enough uncommitted forces and logistical support for backup. While these "heuristics" still hold, the world is getting more complex, with fewer and more specialized resources, so that the heuristics may not always be sufficient. Recent discussions of military doctrine consider smaller, flexible forces, with a small local footprint, augmented by considerable information system resources to make them effective [McGregorK:97].

Rapid, ad hoc, access to information for planning is extremely important. This is exemplified by the use of DART during the preparations for Desert Storm. This system effectively superseded many ponderous support planning systems that had been developed using older technology. However, DART could not execute arbitrary planning scenarios, and only one planning tool (originally developed for airport gate allocation by American Airlines) was adapted for use. Today, simulations are crucial in all aspects of military and commercial planning. For instance, logistics plans are developed by simulating alternate transport modalities, capacities, and risks. Production planners execute simulations to see how they can best exploit their resources. Financial planners use spreadsheets to work out alternate budgets. Most importantly, tactical plans are fully or partially simulated in all military exercises. The expectation, as cited in [McCall:96], is that simulation technology will transfer into military operations in the future. Even limited situational assessment of current status requires projection. In particular, since C2 data are often out-of-date, commanders must routinely make undocumented projections even to judge the current readiness situation and obtain a complete tactical picture.

Objectives

We focus on accessing pre-existing predictive tools. Wrappers are used to provide compatible, robust, and `machine-friendly' access to their model parameters and execution results [HammerEa:97]. Our wrappers also convert the uncertainty associated with simulation results (say, 50% probability of rain) to a standard range (1.0 to 0.0), and may estimate a value if the simulation does not report its uncertainty.
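The uncertainty conversion performed by a wrapper can be illustrated with a minimal Python sketch. It is not the code of our wrappers: the function name `normalize_certainty`, the accepted input formats, and the fallback value `DEFAULT_CERTAINTY` are assumptions chosen only for illustration.

```python
from typing import Optional

# Assumed fallback used when a simulation reports no uncertainty at all.
DEFAULT_CERTAINTY = 0.5


def normalize_certainty(raw: Optional[str]) -> float:
    """Map a reported confidence such as '50%' or '0.25' onto 0.0 to 1.0."""
    if raw is None or not raw.strip():
        # The wrapper estimates a value when the simulation gives none.
        return DEFAULT_CERTAINTY
    text = raw.strip()
    if text.endswith("%"):
        value = float(text[:-1]) / 100.0
    else:
        value = float(text)
    # Clamp so that consumers can rely on the standard range.
    return max(0.0, min(1.0, value))
```

With this sketch a report of "50%" becomes 0.5, and a simulation that is silent about its uncertainty receives the assumed default.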

In terms of system structure, we follow the SQL approach. SQL is not a language in which to write a database system; those may be written in C or Ada. Rather, it is a language to select results for further use in information systems. Similarly, we expect that SimQL will provide access to the growing portfolio of simulation technology and predictive services maintained by others. In doing so, having a language interface will break the bottleneck now experienced when predictions are to be integrated with larger planning systems.

In particular, there are two aspects of SQL that SimQL mimics:

  1. The existence of a Schema that describes the accessible content to an invoking program, its programmers, or its customers.
  2. The existence of a Query Language that provides the actual access to information resources.

There are also some differences, of course, and some of them will require further investigation to assure effectiveness and seamless interoperation of SimQL with advanced information systems.

  1. Not all simulation information is described in the schema. Simulations are often controlled by hundreds of variables, and mapping all of them into a schema is inappropriate. Only those variables that are needed for querying results and for controlling the simulation will be made externally accessible. The rest will still be accessible to the simulation developer. Defining the appropriate schema requires the joint efforts of the developer, the model builder, and the customer.
  2. Predictions always incorporate uncertainty. Thus, measures of uncertainty are being reported with the results. There have been multiple definitions of uncertainty [BhatnagarK:86]. In time, the information systems that process the results will have to take uncertainty explicitly into account, so that the decision-maker can weigh risks versus costs.
  3. Interoperation with past information is required. SimQL must integrate past, present, and simulated information, providing a continuous view. Furthermore, it must indicate what information is valid when. This is especially important since most databases are not fully up-to-date. The time-of-validity capability alone, while modest, can be of great value to decision-makers, and also help in validation of the reliability of the tools being accessed.
  4. Multiple courses-of-action (CoAs) must be supported. Multiple candidate alternatives may be valid simultaneously in some future domains. Thus, systems that access both databases and predictive information must deal with multiple courses-of-action.
  5. We do not expect to need persistent update capabilities in SimQL. Model updates are the responsibility of the providers of the simulations. The queries submitted to SimQL supply temporary variables that parameterize the simulations for a specific instance, but are not intended to update the simulation models.
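Differences 2 through 4 suggest what a single returned item might carry: a value, a time of validity, a certainty, and the course-of-action it belongs to. The following Python sketch shows one possible shape. The class `Result`, its field names, and the label "observed" for stored facts are hypothetical and are not part of the SimQL specification.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Result:
    name: str
    value: float
    valid_at: datetime          # when the value is (or will be) valid
    certainty: float            # 1.0 for stored facts, lower for projections
    course_of_action: str = "observed"


def database_fact(name: str, value: float, recorded_at: datetime) -> Result:
    """Stored data are treated as certain at their time of recording."""
    return Result(name, value, recorded_at, certainty=1.0)


def by_course_of_action(results: Iterable[Result]) -> Dict[str, List[Result]]:
    """Group results per alternative, each group ordered by validity time."""
    grouped: Dict[str, List[Result]] = defaultdict(list)
    for result in sorted(results, key=lambda r: r.valid_at):
        grouped[result.course_of_action].append(result)
    return dict(grouped)
```

Keeping the course-of-action as an explicit field lets an application hold several candidate futures side by side instead of overwriting one projection with another.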

Assessing the current state

We have focused on using simulation to assess the future. There is, however, an important task for SimQL in assessing the current state. Databases can never be completely current. Some may be a few minutes behind, others may be several days behind in reporting the state of resources and materiel. Intelligence information about foreign forces often lags even further behind, although it is the most crucial element in decision-making. The traditional, consistency-preserving approach in database technology is to present all information at the same point in time, which reduces all information to the worst lag of all sources. It would be better to use the latest data from each source, and then project the information to the current point-in-time. In fact, that is certainly what a commander does today when faced with data of varying times of validity. SimQL can support this approach since it provides an interface that is consistent over databases (assumed to have data with probability 1.0) and simulations, as shown in Figure 2.

Figure 2: Even the present needs SimQL

This extrapolation of last known database states to the current point-in-time will help in providing a consistent, even if less certain, picture of, say, where the supply trucks are now, where river crossings are stressed, and where the troops are that need the materiel. This picture is more useful than a perfect picture of the situation 2 days ago.
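As an illustration of such an extrapolation, the Python sketch below projects the last reported position of a supply truck to the current time and lowers the certainty as the report ages. The constant-speed movement model and the exponential decay with an assumed half-life are illustrative choices only; they are not taken from any of the simulations we wrapped.

```python
from datetime import datetime
from typing import Tuple


def project_to_now(last_km: float, speed_kmh: float,
                   reported_at: datetime, now: datetime,
                   half_life_hours: float = 24.0) -> Tuple[float, float]:
    """Return (estimated position in km, certainty) at time `now`."""
    lag_hours = (now - reported_at).total_seconds() / 3600.0
    if lag_hours < 0:
        raise ValueError("report is dated after the current time")
    # Assumed movement model: constant speed along the route.
    position = last_km + speed_kmh * lag_hours
    # Assumed decay: certainty halves every `half_life_hours`.
    certainty = 0.5 ** (lag_hours / half_life_hours)
    return position, certainty
```

A report that is two days old then yields a position with certainty 0.25 under the assumed half-life, whereas a report taken at the current instant keeps certainty 1.0, matching the treatment of database data.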

Specifics

The research carried out under the proof-of-concept support included two phases:

  1. Wrapping several existing simulations to assess the generality of the SimQL concept
  2. Defining an initial specification for SimQL.

This paper focuses on the language aspects, but we will first list the simulations that were wrapped to provide information to a SimQL interface.

  1. A spreadsheet containing formulas that projected business costs and profits into the future. Inputs were investment amounts, and results were made available for years into the future.
  2. A short-range weather forecast available from NOAA on the world-wide web. Temperature and precipitation results were available for major cities, with an indication of uncertainty, which rapidly increased beyond 5 days.
  3. A long-range agricultural weather forecast for areas that overlapped with the cities. The initial uncertainty here was quite high, but increased little over a period of several months.
  4. A discrete simulation of the operation of a gasoline station, giving required refill schedules and profits.

A customer application can invoke multiple SimQL simulations. Our experiments only combined simulations 2 and 3, selecting the forecast based on the lowest uncertainty. Still, these experiments with a few real-world simulations demonstrated the applicability of SimQL to a range of settings and provided a foundation for the design of SimQL.
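Selecting between two forecasts by lowest uncertainty can be sketched in Python as follows. The two forecast functions are stand-ins with invented uncertainty curves that only mimic the qualitative behavior described above (rapid growth beyond 5 days for the short-range source, a high but nearly flat level for the long-range source); the returned temperatures are placeholders, not forecast data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    source: str
    temperature_c: float   # placeholder value, not real forecast data
    uncertainty: float     # 0.0 (certain) to 1.0 (no information)


def short_range(days_ahead: int) -> Forecast:
    # Invented curve: low at first, rising rapidly beyond 5 days.
    base = 0.1 + 0.02 * min(days_ahead, 5)
    extra = 0.15 * max(days_ahead - 5, 0)
    return Forecast("short-range", 20.0, min(1.0, base + extra))


def long_range(days_ahead: int) -> Forecast:
    # Invented curve: high from the start, growing little over months.
    return Forecast("long-range", 18.0, min(1.0, 0.5 + 0.001 * days_ahead))


def best_forecast(days_ahead: int) -> Forecast:
    """Pick whichever source reports the lowest uncertainty."""
    candidates = [short_range(days_ahead), long_range(days_ahead)]
    return min(candidates, key=lambda f: f.uncertainty)
```

Under these assumed curves the short-range source is chosen for the next few days and the long-range source for a date a month ahead.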

Language design

By borrowing ideas from SQL and the database programming paradigm we can make the schema language and the query language as easy to grasp as possible. Also, we try to preserve the simplicity of the language and its interface by providing only the minimum functionalities and data types needed. Since our experience was modest we made the language flexible and scalable for more complex simulations by taking concepts from object-oriented programming and leaving "hooks" in the language for future expansion.

While designing the languages, it is also very important to draw distinctions between the customers, who only need the results, wrapper developers (people who write SimQL-compliant interface code for simulations), simulation developers (people who write and maintain simulations), and finally SimQL system developers, our own domain at this point. Only the customers are the users of SimQL; system development can take place in languages such as C and C++, and the actual simulations are often written in specialized languages.

The SimQL language system shows many parallels to database management systems. It includes catalogs, clusters, schema, and models that are analogs to SQL catalogs, clusters, schema, and views. Given this similarity, the SimQL query language employs concepts from SQL, CORBA, and KQML, while the syntax of the SimQL schema language was defined using concepts from SQL, ILU, IDL, and Ontolingua.

The SimQL environment consists essentially of a SimQL server, several SimQL clients, an interface to wrappers, several wrappers for simulations, and several actual simulators.

Schema facilities

The first thing a wrapper developer needs to do after writing a SimQL wrapper for a simulation is to let SimQL know that the wrapper exists, so the wrapper has to be represented in SimQL. A REGISTER statement therefore enables a wrapper developer to "register" a wrapper with SimQL. Because wrappers can come in different "shapes", we borrowed some object-oriented concepts to make the REGISTER statement flexible and scalable enough to handle complex wrappers. A wrapped simulation can be viewed as having a number of attributes and simulation methods. In a REGISTER statement, a wrapper developer can specify different ATTRIBUTEs of a simulator, such as its idle time and its accuracy in the past, and the METHODs available to the customers for invoking the simulation.

Once a wrapper is registered in SimQL, the wrapper developer needs to create a simulation model for each intended customer based on the registered wrapper. This is because

Therefore, a CREATE MODEL statement is added to enable the wrapper developers to do just that. By using the CREATE MODEL statement, a wrapper developer can specify a simulation model for each customer based on the registered wrapper along with its input/output variables (specified by IN, OUT, or INOUT) and its associated method (specified in the AS clause). The CREATE MODEL statement constructs the core of the SimQL schema language. Other schema language elements include DROP MODEL, HELP, etc.
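The bookkeeping implied by REGISTER, CREATE MODEL, and DROP MODEL can be pictured with a small Python sketch. It models only the information these statements record, not their SimQL syntax, and the names `Wrapper`, `Model`, and `Catalog` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass
class Wrapper:
    """What a REGISTER statement records about a wrapped simulation."""
    name: str
    attributes: Dict[str, object]            # e.g. idle time, past accuracy
    methods: Dict[str, Callable[..., dict]]  # entry points for customers


@dataclass
class Model:
    """What a CREATE MODEL statement records for one customer."""
    name: str
    wrapper: str
    method: str                  # the method named in the AS clause
    inputs: Tuple[str, ...]      # IN variables
    outputs: Tuple[str, ...]     # OUT variables


class Catalog:
    def __init__(self) -> None:
        self.wrappers: Dict[str, Wrapper] = {}
        self.models: Dict[str, Model] = {}

    def register(self, wrapper: Wrapper) -> None:
        self.wrappers[wrapper.name] = wrapper

    def create_model(self, name: str, wrapper: str, method: str,
                     inputs: Iterable[str], outputs: Iterable[str]) -> Model:
        # A model can only be built on a registered wrapper and method.
        if wrapper not in self.wrappers:
            raise KeyError(f"wrapper {wrapper!r} is not registered")
        if method not in self.wrappers[wrapper].methods:
            raise KeyError(f"wrapper {wrapper!r} has no method {method!r}")
        model = Model(name, wrapper, method, tuple(inputs), tuple(outputs))
        self.models[name] = model
        return model

    def drop_model(self, name: str) -> None:
        del self.models[name]
```

Several models can refer to the same registered wrapper, which is how one wrapped simulation can be presented differently to different customers.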

The model created for the business model represented in the spreadsheet only exposed the investment amounts and the year for which the result was desired as IN variables, and the value of the investment, with a probability, as the OUT variable. Interest rates, taxes, and growth assumptions and computations remained hidden from the customer, and were only under control of the author of the spreadsheet. A more realistic scenario would have given the customer also a choice of investment policies.
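The separation between exposed and hidden variables can be shown with a short Python sketch. The growth rate, base year, and per-year certainty factor are invented stand-ins for the spreadsheet's hidden assumptions; the actual formulas of the spreadsheet are not reproduced here.

```python
# Hidden assumptions, controlled only by the spreadsheet's author.
# All three values are invented for illustration.
_GROWTH_RATE = 0.05
_BASE_YEAR = 1998
_CERTAINTY_PER_YEAR = 0.9


def investment_model(amount: float, year: int) -> dict:
    """IN: amount, year. OUT: projected value with a probability."""
    years_ahead = year - _BASE_YEAR
    if years_ahead < 0:
        raise ValueError("the model only projects forward from the base year")
    value = amount * (1.0 + _GROWTH_RATE) ** years_ahead
    # Assumed: confidence in the projection shrinks with each year.
    probability = _CERTAINTY_PER_YEAR ** years_ahead
    return {"value": value, "probability": probability}
```

The customer sees only the two inputs and the two outputs; changing the growth assumption requires no change on the customer's side.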

All these SimQL schema language elements are very similar to the SQL views (i.e. CREATE MODEL is analogous to CREATE VIEW, etc.), both in syntax and concept. These similarities make the language easy to understand and expand for someone trained in database technology.

Query facilities

The next step was to specify the initial SimQL query language, which was built around the SIMULATE statement. Just as SELECT in SQL is used to query data in a created table or a view, SIMULATE in SimQL is used to invoke a simulation and obtain the results from a created simulation model. Simulation customers specify the target simulation models via the FROM clause, the input variables via the WHERE clause, and the conditions of simulation via the HAVING clause.
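The roles of the clauses can be illustrated by a Python sketch that represents a SIMULATE request as a data structure and executes it against a table of models. This is not the SimQL grammar. The class `SimulateRequest`, the function `execute`, and in particular the reading of HAVING as a predicate that the result must satisfy are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class SimulateRequest:
    outputs: Tuple[str, ...]                 # variables asked for
    model: str                               # FROM clause
    inputs: Dict[str, object] = field(default_factory=dict)   # WHERE clause
    having: Optional[Callable[[dict], bool]] = None           # HAVING clause


def execute(request: SimulateRequest,
            models: Dict[str, Callable[..., dict]]) -> Optional[dict]:
    """Invoke the named model and return only the requested outputs."""
    if request.model not in models:
        raise KeyError(f"unknown model {request.model!r}")
    result = models[request.model](**request.inputs)
    # Assumed interpretation: HAVING filters out unacceptable results.
    if request.having is not None and not request.having(result):
        return None
    return {name: result[name] for name in request.outputs}
```

A request for the precipitation from a hypothetical rain model, with a condition on minimum certainty, would then either return the requested value or nothing at all.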

Despite all the similarities, SimQL is different from SQL in many ways, among which the following are the most prominent.

With all the schema and query language elements sketched, we developed the formal language specifications for SimQL.

Figure 3: The SimQL prototype implementation

Implementation

The proof-of-concept implementation was achieved by modifying an existing public SQL implementation. This approach allowed rapid implementation, although the result is not as tight as a purpose-built implementation would have been. The benefit was to gain early experience with compiling SimQL. The functions to be implemented were

The SimQL implementation includes four components, as depicted in Figure 3.

Written in Lex and Yacc, the SimQL parser takes SimQL statements from the customers and interprets them. After simple syntactical checking, the parser parses each statement to generate a parse tree and interprets the statement by resolving all the nodes on the tree. During the interpretation, more complex syntactical checking is performed. Depending on the type of the SimQL statement (schema vs. query), the parser packages the parsed statement accordingly and sends it to the SimQL Schema Manager or the SimQL Query Manager. A lower-level SimQL Metadata Manager was implemented to handle the file operations required by the Schema Manager and the Query Manager. The metadata files on disk store permanent information about registered wrappers and their corresponding attributes and methods, defined simulation models and their input/output variables as well as their corresponding wrappers. These metadata files are read-only to the SimQL Query Manager, which does schema lookup before accessing a required simulation.
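The routing step, in which a parsed statement goes either to the Schema Manager or to the Query Manager, can be sketched in Python. The actual parser is written in Lex and Yacc and the managers in C++; the sketch replaces parsing by a simple test on the leading keyword, and the function names `classify` and `dispatch` are hypothetical.

```python
from typing import Callable

# Statement keywords named in this paper, grouped by the manager
# that would handle them.
SCHEMA_KEYWORDS = ("REGISTER", "CREATE MODEL", "DROP MODEL", "HELP")
QUERY_KEYWORDS = ("SIMULATE",)


def _starts_with(text: str, keyword: str) -> bool:
    return text == keyword or text.startswith(keyword + " ")


def classify(statement: str) -> str:
    """Return 'schema' or 'query' from the statement's leading keyword."""
    text = " ".join(statement.split()).upper()   # normalize whitespace and case
    if any(_starts_with(text, k) for k in SCHEMA_KEYWORDS):
        return "schema"
    if any(_starts_with(text, k) for k in QUERY_KEYWORDS):
        return "query"
    raise ValueError(f"not a recognized statement: {statement!r}")


def dispatch(statement: str,
             schema_manager: Callable[[str], object],
             query_manager: Callable[[str], object]) -> object:
    """Hand the statement to the manager responsible for its type."""
    handler = schema_manager if classify(statement) == "schema" else query_manager
    return handler(statement)
```

Keeping the two managers behind a single dispatch point mirrors the division in the prototype, where only the Schema Manager writes metadata and the Query Manager reads it.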

The data structures used in all four components of the SimQL implementation originated from the SQL implementation and were adapted for simplicity. The whole implementation has about 6,000 lines of C and C++ code and is divided into those four modules. The SimQL Schema Manager, the SimQL Query Manager, and the SimQL Metadata Manager are written in C++, with each manager represented by a super C++ class and each SimQL statement having a method in a class. The use of object-oriented programming here has made those managers very scalable and expandable. Each of the managers can be independently compiled for testing purposes.

Results

The SimQL implementation realized the following SimQL elements/features.

The system was tested on the wrapped weather-forecasting model in a local setting and performed as planned.

Future work

We have not yet presented SimQL to any real simulation customers and thus we do not know how receptive they will be towards the language. For realistic technology transfer we will have to address deficiencies typical for an early, academic exercise:

We plan to seek further support for the development of SimQL concepts in a setting where a realistic evaluation by potential customers can take place.

Conclusion

We have investigated the feasibility of SimQL and gained experience for a more realistic SimQL project. We have some early results, indicating that highly diverse predictive tools may be accessed in an integrated fashion via a language such as SimQL. Despite the limitations of our initial prototype, we believe that high-level simulation access has the potential to be a major augmentation for future C4I systems. The SimQL concept is, of course, not restricted to military simulations. An increasing number of simulations are available on the Web, but these also are difficult to integrate into information systems without an access language. Because of the importance of simulations to decision-making, we expect that concepts such as those demonstrated in SimQL will in time enter large-scale information systems and become a foundation that will make a crucial difference in the way simulations are accessed and managed.

Acknowledgments

This research was supported by DARPA DSO, with Pradeep Khosla as Program Manager, and awarded through NIST, Award 60NANB6D0038, managed by Ram Sriram. The original SQL compiler was written by Mark McAuliffe, of the University of Wisconsin - Madison, and modified at Stanford by Dallan Quass and Jan Jannink. James Chiu, a Stanford CSD Master's student, provided and wrapped the gas station simulation. Julia Loughran of ThoughtLink provided useful comments on an earlier version of this paper [WiederholdJG:98].

References

[BeringerTJW:98] D. Beringer, C. Tornabene, P. Jain, G. Wiederhold: "A Language and System for Composing Autonomous, Heterogeneous and Distributed Megamodules"; Stanford CSD, submitted for publication 1998, available at http://www-db.stanford.edu/CHAIMS.

[BhatnagarK:86] Bhatnagar and L.N. Kanal: "Handling Uncertain Information: A Review of Numeric and Non-numeric Methods"; in Kanal and Lemmer (eds.): Uncertainty in AI, North-Holland publishers, 1986.

[DateD:93] C.J. Date and Hugh Darwen: A Guide to the SQL Standard, 3rd ed; Addison Wesley, June 1993.

[FishwickH:98] Paul Fishwick and David Hill, eds: 1998 International Conference on Web-Based Modeling & Simulation; Society for Computer Simulation, Jan 1998, http://www.cis.ufl.edu/~fishwick/webconf.html.

[GarciaMolinaBP:92] Hector Garcia-Molina, D. Barbara, and D. Porter: "The Management of Probabilistic Data"; IEEE Transactions on Knowledge and Data Engineering, Vol. 4, No. 5, October 1992, pp. 487-502.

[HammerEa:97] J. Hammer, M. Breunig, H. Garcia-Molina, S. Nestorov, V. Vassalos, R. Yerneni: "Template-Based Wrappers in the TSIMMIS System"; ACM Sigmod 26, May, 1997.

[IEEE:98] P1561, Draft IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA); IEEE, 1998.

[INEL:93] Idaho National Engineering Laboratory: "Ada Electronic Combat Modeling"; OOPSLA'93 Proceedings, ACM 1993.

[Jiang:96] Rushan Jiang: Report on the SimQL project; submitted to Prof. Wiederhold, CSD Stanford, August

[LindenG:92] Ted Linden and D. Gaw: "JIGSAW: Preference-directed, Co-operative Scheduling"; AAAI Spring Symposium: Practical Approaches to Scheduling and Planning, March 1992.

[Kohavi:96] Ron Kohavi: Wrappers for Performance Enhancement and Oblivious Decision Graphs; PhD thesis, 1996.

[MalufWLP:97] David A. Maluf, Gio Wiederhold, Ted Linden, and Priya Panchapagesan: "Mediation to Implement Feedback in Training"; CrossTalk: Journal of Defense Software Engineering, Software Technology Support Center, Department of Defense, August 1997.

[McCall:96] Gene McCall (editor): New World Vistas, Air and Space Power for the 21st Century; Air Force Scientific Advisory Board, April 1996, Information Technology volume, p. 9.

[McGregorK:97] Doug McGregor and Donald Kagan: Breaking the Phalanx: A New Design for Landpower in the 21st Century; Praeger, 1997.

[Miller:56] George Miller: "The Magical Number Seven ± Two"; Psych.Review, Vol.63, 1956, pp.81-97.

[MillerT:95] Duncan C. Miller and Jack A. Thorpe: "SIMNET: The Advent of Computer Networking"; Proceedings of the IEEE, August 1995, Vol.83 No.8, pages 1116-1123.

[Orsborn:94] Kjell Orsborn: "Applying Next Generation Object-Oriented DBMS for Finite Element Analysis"; ADB conference, Vadstena, Sweden, in Litwin and Risch (eds.): Applications of Databases, Lecture Notes in Computer Science, Springer, 1994.

[Singhal:96] Sandeep Singhal: Effective Remote Modeling in Large-Scale Distributed Interactive Simulation Environments; PhD Thesis, Stanford CSD, 1996.

[Wiederhold:93] Gio Wiederhold: "Intelligent Integration in Simulation"; MORS Mini-symposium, Fairfax VA, Military Operations Research Society, Alexandria VA, November 1993.

[WiederholdG:97] Gio Wiederhold and Michael Genesereth: "The Conceptual Basis for Mediation Services"; IEEE Expert, Intelligent Systems and their Applications, Vol.12 No.5, Sep-Oct.1997.

[WiederholdJG:98] Gio Wiederhold, Rushan Jiang, and Hector Garcia-Molina: "An Interface for Projecting CoAs in Support of C2"; Proc. 1998 Command & Control Research & Technology Symposium, Naval Postgraduate School, June 1998, pp.549-558.

[Zyda:97] Michael Zyda, chair: Modeling and Simulation, Linking Entertainment and Defense; Committee on Modeling and Simulation, National Academy Press, 1997.