CS73N Meeting 08 Notes: Databases and Prediction in Information Systems

Entered by Gio Wiederhold, 9 March 2002, updated 20 March 2002, 26Feb, 1, 8 March 2004.

Topics Covered briefly

Databases

Definition: organized and maintained collections of data, i.e., factual information or observations.

Examples:

The results will be exactly those that match a query -- sometimes too much, sometimes too little.

Much of such data is indirectly available on the web, sometimes for free, sometimes for a price.

When a client accesses such a web site, a query must be made, specifying what in particular is wanted. The system that supports the web site will then retrieve the relevant data from the database and present it as an HTML page. (Use of XML may simplify that process.)
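The query-to-page step can be sketched in a few lines of Python. This is only an illustration: the flights table, its columns, and the function names are made up for the example, and a real site would sit behind a web server rather than print the page.

```python
import sqlite3
from html import escape


def build_demo_db():
    # Hypothetical example data; the table and column names are assumptions.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flights (origin TEXT, dest TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO flights VALUES (?, ?, ?)",
        [("SFO", "JFK", 320.0), ("SFO", "BOS", 350.0), ("LAX", "JFK", 300.0)],
    )
    return conn


def query_to_html(conn, origin):
    # The client's query says what in particular is wanted (here: one origin).
    rows = conn.execute(
        "SELECT origin, dest, price FROM flights WHERE origin = ? ORDER BY price",
        (origin,),
    ).fetchall()
    # Only the rows that exactly match the query end up on the generated page.
    cells = "".join(
        "<tr><td>{}</td><td>{}</td><td>{:.2f}</td></tr>".format(
            escape(o), escape(d), p
        )
        for o, d, p in rows
    )
    return "<html><body><table>" + cells + "</table></body></html>"


conn = build_demo_db()
page = query_to_html(conn, "SFO")
print(page)
```

The page exists only as the answer to one specific query; nothing is generated for the rows nobody asks about.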

 

However, such information is not available to browsers, who cannot (afford to) ask every possible query of web sites that generate HTML pages from their databases. That part of the web is referred to as the deep web. Some people consider it much larger (100 x) than the visible web, but that is in part due to the very large satellite image collections and similar material that are included.

 

Browsers present multiple entries, and the top entry has been selected by optimization. Subsequent entries may be similar and so may not provide good alternatives for selection.

 

Observational, factual databases only provide information about the past. When getting data about the past there should be only one correct past history, the time-line of the past.

Using Data for Decision-making

Decision-making means projecting into the future - but the future is always uncertain. For decision-making we also need to know what effect decisions made today have in the future. Other alternatives in the future will not be under our control. Those have to be considered as well. Since we don't know what the right future is we have to project multiple futures:

For planning, the decision maker has to make a plan covering multiple futures. If we draw the plan for the future, it looks like a bush. At the end of the bush will be outcomes, say financial gains or losses.

 

 

A decision maker in an enterprise is expected to make decisions that have a positive effect on its future. Information systems should support these activities. Today, databases and web-based resources, accessed through effective communications, make information about the past rapidly available. To project the future, the decision maker either has to use intuition or employ other tools, initializing them with information obtained from an information system. An effective information system should also support forecasting the future. Since choices are to be made, including the case of not doing anything, such a system must also support the comparative assessment of the effects of alternate decisions. We recommend the use of an SQL-like interface language to access existing tools for assessing the future, such as spreadsheets and simulations. Making results of simulations as accessible as other resources of integrated information systems has the potential of greatly augmenting their effectiveness and of really supporting decision-making.

 

The process involves creating a list of alternative future actions, say, failure to get a product out, and of alternatives not under the control of the decision-maker, say good weather causing customers to go out and not use the Internet, or a competitor gaining market share. After that, new branches are created, based on the reactions to those events, say having to increase advertising or lower your price. That sequence can go on for a long time. Very low probability branches can be omitted or combined.

 

 

At the end of the bush will be outcomes, say financial gains or losses.

 

Then the decision-makers (or the person/computer helping them) can work backwards to assess the possible positive or negative effects of each prior decision.
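Working backwards through the bush can be sketched as follows. This is a minimal illustration, not a description of any particular tool: the node layout (dictionaries with a "kind" field), the probabilities, and the payoffs are all invented for the example. Branches under our control are resolved by taking the best alternative; branches not under our control are weighted by their probability.

```python
def rollback(node):
    """Return the expected value of a node, working backward from the leaves."""
    kind = node["kind"]
    if kind == "outcome":
        # End of the bush: a financial gain or loss.
        return node["value"]
    if kind == "chance":
        # Events not under our control: weight each branch by its probability.
        return sum(p * rollback(child) for p, child in node["branches"])
    if kind == "decision":
        # Actions under our control: assume the best alternative is chosen.
        return max(rollback(child) for _, child in node["options"])
    raise ValueError("unknown node kind: " + kind)


def best_option(node):
    """Name of the alternative with the highest expected value at a decision node."""
    return max(node["options"], key=lambda opt: rollback(opt[1]))[0]


def outcome(value):
    return {"kind": "outcome", "value": value}


# Hypothetical bush: all numbers are made up for illustration.
tree = {
    "kind": "decision",
    "options": [
        ("launch now", {"kind": "chance", "branches": [
            (0.6, outcome(100.0)),                      # competitor stays put
            (0.4, {"kind": "decision", "options": [     # competitor gains share
                ("increase advertising", outcome(20.0)),
                ("lower price", outcome(-10.0)),
            ]}),
        ]}),
        ("do nothing", outcome(0.0)),
    ],
}

print(best_option(tree), rollback(tree))
```

With these made-up numbers, launching now is worth about 68 in expectation, against 0 for doing nothing. Note that not doing anything appears as an explicit alternative.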

As time goes on, the decision will have been made, explicitly or by default. Now the entire bush can be recalculated, maybe with some new probabilities due to external changes as well, and be used to make a subsequent decision.

A refinement would have to include the uncertainties at each point, so that the current risk can be computed as well.
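One simple way to attach a risk figure to a course of action is to look at the spread of its outcomes as well as their average. The sketch below uses the standard deviation for that; this choice of risk measure, the function name, and the numbers are assumptions made for illustration.

```python
import math


def risk_profile(leaves):
    """leaves: list of (probability, outcome) pairs for one course of action.

    Returns (expected value, standard deviation of the outcome).
    """
    total_p = sum(p for p, _ in leaves)
    if not math.isclose(total_p, 1.0):
        raise ValueError("probabilities must sum to 1")
    mean = sum(p * v for p, v in leaves)
    variance = sum(p * (v - mean) ** 2 for p, v in leaves)
    return mean, math.sqrt(variance)


# Two hypothetical courses of action with the same expected value.
safe = [(0.5, 10.0), (0.5, 30.0)]
risky = [(0.9, -20.0), (0.1, 380.0)]

print(risk_profile(safe))
print(risk_profile(risky))
```

Both alternatives average out to about the same gain, but the second has a far larger spread, which an expected value alone would hide.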

 

Since the process gets complex, many tools have been developed, such as simulations, planning models, and an integration language (SimQL). But much is still done today by intuition, supported by databases and spreadsheets. Intuition is suspect when decision-makers enter new business areas, or when the complexity is such that more than 7±2 factors have to be juggled.

 

I know of no business that is doing it all in a coherent manner. It seems to be a great opportunity for the future.

Who will do it, who will use it first?

 

Institutions and People

Is the government the proper source for Internet support? We discussed the differences between European/Asian and U.S. approaches. To invest in risky ventures requires a setting where the value of the potential gain outweighs the cost of the potential loss. For an investor who has not-needed cash -- or can collect such a group of people -- the equation is simply:

gain * p(success) * n(success) > loss * p(failure) * n(failure)

If the amount of gain is much larger (>>, perhaps 100 x) than the amount of loss (the investment made), then lower values of p(success) versus p(failure) and n(success) versus n(failure) can be tolerated.
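The inequality can be tried out with a few numbers. The sketch below takes the formula literally; the function names and the example values are made up, and the break-even helper additionally assumes that p(failure) = 1 - p(success) and that n(success) = n(failure), which the notes do not state.

```python
def worth_investing(gain, loss, p_success, p_failure, n_success=1, n_failure=1):
    """Literal form of: gain * p(success) * n(success) > loss * p(failure) * n(failure)."""
    return gain * p_success * n_success > loss * p_failure * n_failure


def break_even_probability(gain, loss):
    """Smallest p(success) worth considering.

    Assumes p(failure) = 1 - p(success) and n(success) = n(failure),
    so the inequality reduces to p(success) > loss / (gain + loss).
    """
    return loss / (gain + loss)


# Gain 100 x the loss: a 5% chance of success is still acceptable.
print(worth_investing(gain=100.0, loss=1.0, p_success=0.05, p_failure=0.95))

# Gain equal to the loss: the same odds are not acceptable.
print(worth_investing(gain=1.0, loss=1.0, p_success=0.05, p_failure=0.95))

print(break_even_probability(gain=100.0, loss=1.0))
```

Under these assumptions a gain of 100 x the loss tolerates a success probability of only about 1%, while a gain equal to the loss requires better than even odds.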

That also seems to be true for a government, since it represents a very large group of individuals, so large that in total they wouldn't be very sensitive to loss. However, the actions are executed by actual people, bureaucrats. An employee of the government, as an individual, is not very tolerant of loss. If money is gained, the bureaucrat will gain little, perhaps a promotion or a 10% raise a few years earlier than would happen otherwise. However, a failure will cause loss of promotion and raises. So for the bureaucrat the value of gain = loss.

There are some ancillary conclusions from that reasoning:

1.      Don't blame the poor bureaucrat, it is the setting and the reward mechanism.

2.      Relieve bureaucrats of responsibility when dealing with them: Never ask them `Can you do X', but rather `How should I do X'.

In business, dealing with people is of paramount importance, and all institutions have people in their interfaces.

 

Notes:

See the research papers on the topic, and the slides, at http://www-db.stanford.edu/pub/inprogress.html#SimQL. A summary paper is Information Systems that Really Support Decision-making.