Case-Oriented Paper

Journal of the Operational Research Society advance online publication 14 May 2008; doi: 10.1057/palgrave.jors.2602616

The value of information to decision makers: an experimental approach using card-based decision gaming

J Medhurst1, I M Stanton2, H Bird2 and A Berry2

  1. Larrainzar Consulting Solutions Ltd, Salisbury, Wiltshire, UK
  2. Defence Science and Technology Laboratories, Salisbury, Wiltshire, UK

Correspondence: J Medhurst, 1 The Maples, Devizes Rd, Salisbury, Wilts, SP2 7LL, UK. E-mail: john@larrainzar.co.uk

Received February 2007; Accepted March 2008; Published online 14 May 2008.

Abstract

This paper describes a novel experimental method for determining the value of different types of information to military decision makers. The experimental method used a simple scenario and a set of serials constructed from cards, each presenting a single piece of information and presented sequentially. Each of a number of pairs of players was taken through the scenario and asked to judge when they would make each of a pair of escalating responses to the situation. The data proved well suited to analysis using a probit model and are consistent with the hypothesis of a Bayesian decision mechanism with normally distributed 'action points'. The methodology allowed the determination of weights for each of a number of different classes of information, together with estimates of the human and situational elements of variation, including estimates of the 'prior belief' of the different pairs of players.

Keywords:

command and control, decision making, military, experimentation, biological, detectors

Introduction

Background

Modern warfare is increasingly dependent upon the flow of information. Modern initiatives such as Network Centric Warfare depend for their potential benefits on the ways in which the increasing amounts of information passed around the battlefield are used to improve decision making. One of the problems of assessing the impact of this information on operations is a lack of understanding of how decisions are actually made by commanders, and how those decisions depend upon the different kinds of information that have been received—the value of that information.

This problem is particularly acute in the area of chemical, biological, radiological and nuclear (CBRN) defence. Since the UK does not pursue an offensive capability in chemical and biological weapons, the defence system has always relied upon commanders taking decisions to adopt various protective and preventive measures. The defensive system as a whole can be divided into those elements requiring decisions to use—such as putting on protective suits and respirators—and those elements providing the information to fuel such decisions—chemical and biological detectors.

Setting the balance in investment across the different areas requires an understanding of the relationship between the information provided and the decisions made by commanders. Critical to this is an understanding of the value of different types of information in making decisions.

Modelling command decision making

Although commanders can be questioned to elicit their 'rules of thumb', there always remains the nagging suspicion that the true behaviour may be different. Studies in the psychology literature (Kahneman et al, 1982; Kahneman and Tversky, 2000) indicate that human beings often employ heuristics of whose limitations they are only partially aware.

Previous studies of command and control decision making have tended to indicate either that information had little effect on decision making, or that any effects from information were dominated by variability between decision makers (Mathieson, 2001; Daniel et al, 2002).

Despite this, work continues to represent command decision making in models (Forder, 2004). This work is based upon particular psychological theories of decision making in situations of uncertainty and pressure, often referred to as naturalistic or recognition-primed decision making (Zsambok and Klein, 1997; Klein, 2001).

An important recent development has been the introduction of mathematical models to represent decision making in combat models (Moffat, 2000, 2002). This approach has been built on subsequently (Moffat and Witty, 2002; Moffat et al, 2004; Perry and Moffat, 2004; Dodd et al, 2006).

Selecting an initial problem

These models provide a starting point for understanding decision making, but it has yet to be demonstrated experimentally that the models are a good representation of command decision making. Although the models make sense from a mathematical and psychological point of view, do human beings really make decisions that way, and if so what is the level of requisite variety needed to represent the relevant human decision-making process to an acceptable level of accuracy?

The role of information in the many different aspects of CBRN defence is potentially a huge area of research. It was decided that the best way to develop the necessary techniques to understand that role was to identify a particular problem that could be used as a starting point for the investigation. The problem selected was that of making decisions in response to biological detector alarms, such as might be encountered in a field situation where a threat of biological attack exists.

Biological warfare (BW) and biological weapons are banned under international conventions, but both terrorists and states have the potential to develop and use such weapons in contravention of such agreements. The consequences of such attacks can be very serious, and such attacks can be delivered without any immediate obvious signs or indicators.

Systems exist to detect such attacks, using both generic and specific sensors. These sensors detect either elevated concentrations of biological material in the air or rely upon unique identification of particular threat agents. Alarms from these sensors are passed to a central authority responsible for issuing warnings and taking action.

These decisions will be made using not only the incidence of alarms, but also taking into account other information to build an overall picture of the situation. The point at which such decisions are made is obviously important in determining the effectiveness of the system as a whole.

This problem is well defined, with certain clear inputs—the detector alarms—but with a good deal of additional subjective information to complicate the decision-making process and to ensure that decisions are not straightforward.

The problem is also an important one: understanding the way that the information is used matters not just from the point of view of information systems design but also has potentially serious operational and political implications. Failure to order precautionary measures when an attack has occurred could lead to serious casualties, and conversely over-frequent responses could lead to a reluctance to take such orders seriously (a 'Cry Wolf' effect).

Isolating information

Having identified a potential problem, the next challenge is to identify a way of treating information experimentally. For our purposes, information can be considered as the semantic content of communications. We are not concerned with the form of presentation; the important thing is to have a controlled method of presenting information unambiguously and in such a way that we can isolate the impact of particular pieces of information.

In a more general context, of course, the form of presentation can be very significant in determining whether or not information is absorbed and what weight is placed upon it. For the purposes of this experiment, however, we want to strip information down to its essentials as far as possible. We require a method of providing information that will firstly allow us to separate out individual pieces of information and secondly ensure as far as possible that the information is absorbed by the test subjects. This also removes the means of presentation as a variable in the problem, allowing us to concentrate upon the semantic content, the fact that is communicated.

The mechanism chosen to allow this is the separation of information into individual pieces or 'motes'. These 'motes' of information correspond to what can be represented in a single, simple sentence—a logical proposition. Examples of these 'motes' of information are sentences such as 'Enemy forces are observed carrying out tactical ballistic missile (TBM) firing drills' or 'Bill lives in Guildford'.

These simple propositions or 'motes' are presented to the subjects on individual cards, as shown in Figure 1. These cards are the fundamental experimental mechanism, and setting the number and types of card to be presented, together with their order, defines the individual experiment.


Experimental method

Outline of the method

The experiment was conducted in a similar format to a desktop training exercise. The players were presented with an introduction to the scenario, containing a minimum amount of contextual information, together with a map of the area of operations and a short verbal briefing on the situation. They were then presented with a number of serials, each consisting of a number of cards. Each card gave a single, simple piece of information. The cards were presented in sequence, one at a time, with the next card being presented when the players requested it. The players could retain all of the cards in the current serial and arrange them as they wished. The players mostly operated in pairs, with the pairs being permitted to discuss any proposed course of action and being required to come to a joint decision.

The introductory scenario material asked the decision makers to decide, after each card, whether or not to take either or both of two courses of action:

  1. Issuing a precautionary alert to troops in theatre, leading to the donning of individual protective equipment.
  2. Declaring a probable attack—including recommending the taking of medical countermeasures and reporting up the chain of command as a probable biological attack.

After each serial was completed, the players were asked to 'clear their minds' of the information from the previous serial and the next serial was begun. The order of the serials for the individual pairs of players was determined using a randomized Latin Square design. This ensured that each pair of players encountered the serials at a different point in the sequence and following on from each of the other serials, thereby allowing the analysis to compensate for learning and carryover effects.
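The paper does not say how the Latin square was constructed. As a minimal sketch, one standard construction with the properties described (each serial appears once in every position, and each serial follows every other serial exactly once) is the cyclic Williams design for an even number of treatments. The function names, the seed and the use of this particular construction are assumptions made for illustration.

```python
import random

def williams_square(n):
    """Cyclic Latin square in which each treatment follows every other once (n even)."""
    if n % 2:
        raise ValueError("this construction needs an even n")
    # First row 0, 1, n-1, 2, n-2, ...: successive differences are all distinct mod n
    first = [0] + [(j + 1) // 2 if j % 2 else n - j // 2 for j in range(1, n)]
    # Remaining rows are cyclic shifts of the first row
    return [[(v + shift) % n for v in first] for shift in range(n)]

def randomised_orders(n, seed=None):
    """Relabel the serials at random and shuffle which row each pair receives."""
    rng = random.Random(seed)
    labels = list(range(1, n + 1))
    rng.shuffle(labels)
    rows = [[labels[v] for v in row] for row in williams_square(n)]
    rng.shuffle(rows)
    return rows

orders = randomised_orders(8, seed=1)
for pair, row in enumerate(orders, start=1):
    print(f"pair {pair}: serials {row}")
```

Relabelling the serials and shuffling the rows preserves both the Latin property and the carryover balance, so the randomization does not disturb the design.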

As the players went through the serials, they were asked to maintain a log sheet, showing for each card what action they took, together with any comments. Data were also collected by two observers in the form of comments and observations on the responses of the players to the individual cards. At the end of each serial, the players were asked to confirm at what point they had taken either of the actions described above.

Map and scenario

The map was deliberately made abstract and high level in order to minimize the amount of uncontrolled information given to the players, while remaining sufficiently realistic to ensure that the players retained a sense of context and treated the situation seriously. To keep the compilation of serials as simple as possible, the detector locations were specified on the map as six locations, labelled A to F. These locations were fixed and constant between individual serials.

The scenario description was kept as short as possible (two sides of A4) to minimize the amount of uncontrolled information being presented to the players. From observing discussions between pairs of players, it seemed that the absorption of the individual items of information within the scenario description was good, with the most significant items being picked up by one or the other of the players and taken account of in group decisions.

The scenario information contains a description of the overall military situation, a brief description of the individual detectors and their capabilities, some information on prevailing meteorology and a description of the responsibilities of the players within the scenario, including the two courses of action on which they were required to decide.

Briefing material

In addition to the written scenario description, the players were also given an introductory verbal briefing read out by the exercise controller. The verbal briefing reiterated the players' responsibilities. The exercise controller was also given a single-page guide outlining how the experiment was to be conducted.

Information cards

The key to the experiment was the use of the information cards. Examples of these cards are shown in Figure 1. The information element of each card was contained in a short sentence; for the purposes of the experiment, each card corresponds to one 'mote' of information.

In addition to the short sentence, each of the cards has a reference number, used to identify the card, and a symbol. Each symbol corresponds to a category of card, and the categories can be thought of as similar to suits in a deck of regular cards. The different categories used are shown in Figure 2. The symbol is included on the card to give the players an immediate identifier and thereby aid them in absorbing the information presented.


Five categories of information were used:

  1. Alarms. These were split for the experiment into generic alarms, where the only indication is that there is an unusual level of biological material or particle activity in the atmosphere, and specific alarms, which indicate that a particular agent has been detected.
  2. Indicators. These correspond to the kind of attack indicators that might accompany a biological attack, including incoming TBMs, aircraft observed flying low and crosswind, and possible indicators of unconventional attack.
  3. Context. This was information of the sort that might come in through intelligence or from the media, general information on the state of the enemy, progress of negotiations to resolve the conflict, likely meteorological conditions and indicators that the enemy might have changed his posture relative to use of weapons of mass destruction.
  4. Negative. This category was similar to the context information but generally indicated a situation in which the enemy was less likely to carry out an attack including unfavourable meteorology and loss of offensive capability.
  5. Confirmation. This category covered the sort of information that might come in after an attack has taken place. It included general corroboration such as unexpected illness, reports of test results from the medical chain and media reports of suspicious illness in the civilian population.

The six categories of information (including the two different types of alarm) would be the basis for the analysis. Because different subjective information was presented on each card, the number of categories was an important decision: with too many categories the analysis would have insufficient data to produce results; with too few, different types of information would end up being crammed together in the same category, blurring the different effects. Since this was the first trial of the methodology, the number of categories was kept as low as possible; if necessary, the tracking of individual cards would allow later analysis by sub-category.

Compiling serials

The information cards in the deck were assigned to the serials at random. To do this, a small spreadsheet, essentially a simple random number generator, was used to determine:

  1. How many specific and generic alarms there would be and at what locations.
  2. How many of each of the other types of cards would be present.

In Table 1, each of the '1' entries indicates that a card should be present in this serial. For the specific and generic alarms, this also shows the location of the alarm. Note that the specific and generic alarms are generated independently of each other. The probability listed at the top left is used to determine the likelihood of a '1' appearing in each of the cells shown.


For the other four types of information card, the spreadsheet only indicates how many will be present. Together with the alarms, there could be up to 36 cards in a serial; the chosen probability of 33% would produce an average of 12 cards per serial and an average of 96 cards across the eight serials of the experiment. Deciding which cards should be used was done by simply selecting cards at random from the stack of cards of that type and allocating them to serials, with some discretion to select cards to maintain credibility. The cards were then shuffled to introduce randomness and run through briefly, with the sequence in some cases being adjusted where cards could not logically be present before others.
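The generation step can be sketched in a few lines. This is an illustration under stated assumptions rather than a reproduction of the spreadsheet: it assumes six cells for each of the six card types (which gives the 36-card maximum quoted), with each cell set to '1' independently with probability 0.33. The names `draw_serial` and `serial_size` and the seed are hypothetical.

```python
import random

LOCATIONS = "ABCDEF"
ALARM_TYPES = ["generic alarm", "specific alarm"]
OTHER_TYPES = ["indicator", "context", "negative", "confirmation"]
MAX_PER_TYPE = 6  # assumption: six cells per card type, giving the 36-card maximum

def draw_serial(p=0.33, rng=random):
    """Return (alarm locations by alarm type, card counts for the other types)."""
    # Alarms: one independent draw per location, for each alarm type
    alarms = {t: [loc for loc in LOCATIONS if rng.random() < p] for t in ALARM_TYPES}
    # Other card types: only the number of cards is decided here
    counts = {t: sum(rng.random() < p for _ in range(MAX_PER_TYPE)) for t in OTHER_TYPES}
    return alarms, counts

def serial_size(alarms, counts):
    """Total number of cards in a generated serial."""
    return sum(len(v) for v in alarms.values()) + sum(counts.values())

rng = random.Random(2008)
serials = [draw_serial(rng=rng) for _ in range(8)]
for i, (alarms, counts) in enumerate(serials, start=1):
    print(i, alarms, counts, serial_size(alarms, counts))

# Expected cards per serial: 36 cells, each filled with probability 0.33
expected_per_serial = 0.33 * MAX_PER_TYPE * (len(ALARM_TYPES) + len(OTHER_TYPES))
print(f"expected cards per serial: {expected_per_serial:.2f}")
```

Choosing the actual cards for each '1', and adjusting the order where one card could not logically precede another, remained a manual step in the experiment and is not modelled here.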

This was particularly important with the confirmation (corroboration) cards, many of which would logically only appear in the latter part of the sequence. An overview of the frequency of cards is shown in Table 2.


Choice of subjects

The game required subjects with a detailed knowledge of CBRN, at least at the standard of the JFHQ CBRN cell controllers that the players were intended to represent. The closer the subjects were to the likely decision-making population in terms of experience and background, the more representative the results are likely to be of the behaviour of the target population.

The subjects consisted of seven pairs of military CBRN specialists. Two pairs were drawn from Porton internal military staff, one from the CBRN staff at RAF Strike Command, two from the Defence CBRN Centre at Winterbourne Gunner and two from the Joint CBRN Regiment. All had a thorough CBRN background and, in the case of the pair from Strike Command and the two from the CBRN regiment, recent relevant operational experience. Apart from an initial trial with a single subject, all trials were conducted as pairs, although in two cases one member of a pair had to leave part way through the trial.

Conduct of trials

Although the initial single trial subject went through the serials in numerical order, the remaining pairs were allocated at random to an 8×8 Latin square design.

Efforts were made to keep the presentation of the serials and the circumstances in which the trials were carried out as consistent as possible, although the fact that many of the subjects were difficult to get hold of meant that in some cases the trial had to conform to the convenience of the subjects rather than vice versa.

All trials were conducted in conditions of good lighting in an office or similar environment, with a sufficient flat area to allow the subjects to lay out the map, the scenario description and the cards as they received them. The players were given access to water and hot drinks and trials were conducted for the most part within normal working hours.

The same experimenter was responsible for administering the trial throughout, presenting the cards to the players one by one as requested. On occasions, clarifications of technical points were made but for the most part, if players were interested in further information about the card they were asked to note down what they would want to know on their log-sheets.

The length of time taken to conduct the experiment varied widely and in some cases not all of the serials could be completed. Time to complete a single serial varied between 10 min and almost 2 h. All teams took longer over the first serial they played. This seemed to be due to uncertainty as to exactly what was wanted. Most teams seemed to speed up as the trial proceeded.

Players seemed to take the trials seriously, with several commenting that the trials were quite realistic in representing the way in which information is received in the field. Some combinations of cards were not felt to be particularly likely, and a problem raised by all players was the lack of positional information on the indicator cards. Future trials may address the impact of including more positional information. Another piece of information that was felt to be missing from the scenario data was the CBRN alert state. When it was pointed out to the players that, as the CBRN cell in JFHQ, they would themselves recommend the alert state, most settled for starting the serials at CBRN medium.

Results

Two types of data were collected from the trials: qualitative data, including the completed log sheets for the subject pairs, and quantitative data, including the level of response the players had reached as the serial progressed.

Qualitative results

The qualitative results include the players' completed log sheets, showing for each card what action they would take, including seeking more information, and any comments on the information on the card. Some players combined the two columns or filled in mostly one column or the other.

Not all pairs were able to complete all serials. Since the trial used a Latin Square design, with lines from the square allocated at random without replacement, this should not bias the results obtained.

Quantitative results

Two types of quantitative results were collected. The first was the response level of the players to the information provided. This was recorded for each card presented and checked with the players at the end of each serial, and was measured on a scale of zero for no response, one for precautionary alert issued and two for probable attack declared. The second was the time taken by the players in considering each card before requesting another. These times were collected as an alternative potential measure of the importance of the information.

The job of the observers was to note down what action the players took at what point during the sequence of cards. This was then confirmed with the players at the end of the serial, to ensure that the correct point of action had been identified. The issue of probable attack always went with or followed the issuing of a precautionary alert and on only one occasion did a player revert from a previous precautionary alert.

Analysis

The quantitative data were analysed using the MINITAB statistics package. A variety of statistical approaches were attempted, including the use of the general linear model (GLM) and logistic regression, with and without a probit or normit link. The main effort was put into analysis of the likelihood of action using probit analysis. This seemed to fit the data well and gave good results. Results of analysis of the timings data are also presented.

The statistical model used

The use of probit analysis proved very successful in analysing the response data.

The model used is as follows.

There are three possible actions:

  • j=0 (no alert issued)
  • j=1 (precautionary alert issued)
  • j=2 (probable attack issued)

An ordinal logistic regression with a probit link is made with a number of possible covariates (x1,x2,...,xn).

$$\Phi^{-1}\bigl(\gamma_j(x_1, x_2, \ldots, x_n)\bigr) = \alpha_j + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n, \qquad j = 0, 1$$

where $\Phi^{-1}$ is the inverse cumulative distribution function of the standard normal distribution, $\gamma_j(x_1, x_2, \ldots, x_n) = \Pr(Y \le j \mid x_1, x_2, \ldots, x_n)$ is the cumulative probability that the action $Y$ will be in category $j$ or less given observation of the covariates $(x_1, x_2, \ldots, x_n)$, $\alpha_j$ is the intercept for category $j$ and $\beta_1, \ldots, \beta_n$ are the coefficients of the covariates.

The covariates (x1,x2,...,xn) can be numeric variables such as the number of cards of a given type, or conditions, such as the pair carrying out the trial or the serial number.

Thus, for any combination of circumstances, the probability of a given level of action in those circumstances is transformed into its equivalent on the cumulative standard normal distribution. A probability of 5% becomes -1.64, for example, and a probability of 75% becomes +0.67.
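The transformation can be checked directly. The sketch below uses the standard normal distribution from the Python standard library; the helper name `probit` is introduced here for illustration only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def probit(p):
    """Map a probability to its position on the standard normal scale."""
    return std_normal.inv_cdf(p)

for p in (0.05, 0.50, 0.75):
    print(f"p = {p:.2f} -> probit = {probit(p):+.2f}")
```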

Once the probability values have been transformed, a multivariate regression can be carried out. In the models used here, two lines are generated: one for the probability of no alert being issued (j=0) and one for the probability of either no alert or a precautionary alert being issued (j=0 or j=1). The model treats the effects of the covariates (ie the beta values) on the two probabilities as being the same; the two lines are therefore parallel, but with different intercepts (alpha values).

The slopes and intercepts of the regression lines can be used to calculate the probability of having taken a given level of action or less. The obtained regression is used to calculate a value of the function for the particular set of covariates (x1,x2,...,xn) and this is then used to read back onto the cumulative normal distribution to obtain a probability.

Therefore, if the regression were to have an intercept of +1 and a coefficient of -0.25 per card, then after four cards the regression value would be 1 + 4 × (-0.25) = 0. This is the zero point on the cumulative normal distribution, equivalent to a probability of 50%.

Things are slightly more complicated if we want to know the probability of action at a given level or above, rather than at a given level or below. For this, we simply use one minus the probability that the action will be at a level below this.

$$\Pr(Y \ge j \mid x_1, x_2, \ldots, x_n) = 1 - \Pr(Y \le j - 1 \mid x_1, x_2, \ldots, x_n), \qquad j = 1, 2$$
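As a minimal sketch of these two steps, the worked example above (an intercept of +1 and a coefficient of -0.25 per card) can be evaluated for increasing numbers of cards, together with the complementary probability of action at the next level or above. The function names are hypothetical and the parameter values are those of the worked example, not fitted values.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # cumulative standard normal distribution

def prob_at_or_below(intercept, slope, n_cards):
    """Pr(Y <= j) for the regression line with this intercept and per-card slope."""
    return PHI(intercept + slope * n_cards)

def prob_above(intercept, slope, n_cards):
    """Pr(Y >= j + 1), obtained as one minus Pr(Y <= j)."""
    return 1.0 - prob_at_or_below(intercept, slope, n_cards)

for k in range(0, 9, 2):
    below = prob_at_or_below(1.0, -0.25, k)
    above = prob_above(1.0, -0.25, k)
    print(f"{k} cards: Pr(level or less) = {below:.3f}, Pr(higher level) = {above:.3f}")
```

With a negative coefficient, each additional card lowers the probability of remaining at the given level or below, so the probability of having escalated rises along the S-curve.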

Use of probit analysis

The use of probit analysis in analysing response to information is analogous to the use of such analysis in toxicology (Finney, 1971). Here the amount of information presented is equivalent to the dose of toxic material and the proportion of the population responding to that level of information takes the same role as the proportion of the test population suffering the effects of the toxic material. One of the advantages of probit analysis is that it is inherently probabilistic. The human response to information is not deterministic and using a statistical methodology that recognizes the probabilistic element seems to have advantages.

There are some important differences between the use of probit analysis in toxicology and the approach adopted here. First, multivariate analysis is used extensively to account for what are essentially multiple treatments, the different types of information. Second, information is fed piece by piece, with the opportunity to respond after each piece. Third, a previous response may influence the current response. Fourth, the total number of cards is used as the measure of information, rather than the log of the dose as is normal in toxicology. The use of the log of the number of cards was also investigated but did not give nearly as good a fit to the data.

There are reasons for supposing that the number of cards may already be a 'logarithmic' quantity. Assume that each card has associated with it a probability that the event it describes might have arisen by chance, in the absence of an enemy attack. As time goes on, and more and more cards are presented, the probability that the observed events are all due to chance, rather than to an attack, steadily reduces.

The probability of more than one card occurring by chance is the product of the probabilities of the individual cards, $P_1 \times P_2$, assuming no correlation. For this to be consistent with an additive model of the value of the cards, it makes sense to consider the value of each card to be proportional to $\log P$, where $P$ is the probability that the event described on the card would have occurred under non-attack conditions (the false alarm probability in the case of detectors).
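The correspondence between multiplying probabilities and adding card values can be illustrated numerically. The chance probabilities below are hypothetical, and the value of a card is taken as the negative log of its probability so that less likely events carry more weight; the constant of proportionality is an assumption.

```python
import math

# Hypothetical chance probabilities for three cards (illustrative values only)
chance_probs = [0.2, 0.5, 0.1]

# Probability that all three events arose by chance, assuming no correlation
joint = math.prod(chance_probs)

# Additive 'value' of the same cards, taking each value as -log(P)
value_sum = sum(-math.log(p) for p in chance_probs)

# Converting the summed value back recovers the joint probability
print(joint, math.exp(-value_sum))
```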

One advantage of the probit transformation, which transforms probabilities into a linear, normal-distribution space, where the y-axis is measured in standard deviations, is that it naturally generates the characteristic S-curve of the cumulative normal distribution.

One of the implications of the probit model is that the probability of action for no cards received is not zero. In fact, the values of the intercept for the different actions show that these probabilities are approximately 0.5% for the probable attack and 8% for the precautionary alert. These could reasonably be interpreted as a measure of the 'prior belief' of the decision makers that such a course of action would be required.
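The link between these zero-card probabilities and the intercepts can be sketched by working backwards. The intercepts below are back-calculated from the approximate percentages quoted above and are not the fitted values. The sketch also assumes that the 8% figure refers to a precautionary alert or higher, and the variable names are introduced for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Approximate zero-card probabilities of action quoted in the text
p_alert_or_higher = 0.08   # precautionary alert (assumed to mean alert or higher)
p_probable_attack = 0.005  # probable attack

# The intercepts refer to Pr(Y <= j), so take the complement before transforming
alpha_0 = std_normal.inv_cdf(1 - p_alert_or_higher)  # line for j = 0 (no alert)
alpha_1 = std_normal.inv_cdf(1 - p_probable_attack)  # line for j = 0 or j = 1

print(f"implied intercepts: alpha_0 = {alpha_0:.2f}, alpha_1 = {alpha_1:.2f}")
```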

The probability predictions that result from the probit model assume that some element of the situation remains uncontrolled, be it individual variation or differences between individual exposures. The greater the number of elements included in the probit model, the fewer the remaining elements that are assumed to be stochastic and whose influence is represented in the probabilistic nature of the output. So, the question is not necessarily what model is the best explanation for the data, but rather what model includes those elements reasonably under our control in the situation of interest, and what elements are outside that control and therefore have to be treated as probabilistic.

The simple model

The analysis discussed in this paper is based upon what is termed the simple model. This allowed for two intercepts, one for each of the two types of response, and a coefficient, calculated from multivariate probit analysis, for each of the six card types. This is the simplest model giving a good fit to the data, although more complex models were also investigated.

Models with previous response as an explanatory variable were fitted to see whether the data were better described using a time series approach. Models were also fitted with factors to describe the last card seen, factors to allow for differences in the serials not accounted for by the measures of information received, factors to identify which pair of players was responding, trends over time within and between serials, and influences of sub-types of card. More influential parameters were fitted with and without less influential parameters in the model.

The card types model can be used to generate estimates of probability of action as a function of the kinds of information to which the players have been exposed, as measured by the different cards they have seen. To do this we use the formula value—calculated from the intercept for the appropriate decision—together with the contribution from the individual cards, to generate a value on the cumulative normal distribution.

This can be expressed using Equations (1) and (2).

$$z_j = \alpha_j + \sum_{i=1}^{6} \beta_i x_i \qquad (1)$$

$$\Pr(Y \ge j) = 1 - \Phi(z_{j-1}), \qquad j = 1, 2 \qquad (2)$$

where $x_i$ is the number of cards of type $i$ seen so far, $\beta_i$ is the weight for that card type and $\alpha_j$ is the intercept for response level $j$.

Equation (1) gives us the total value of the intercept and the individual weights for the covariates. This is then used to generate a value expressed in standard deviations that can be used to calculate a probability taken from the cumulative standard normal distribution ($\Phi$). Since our calculated parameters are for probability of action j or less, the probabilities shown in the graphs, which are for action j or greater, need to be calculated using one minus this value (Equation (2)).
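A minimal sketch of how the simple model tracks the probability of action through a serial is given below. The intercepts and card-type weights are hypothetical placeholders chosen only to show the mechanics; they are not the fitted values from the trials, and the function name and example serial are likewise assumptions.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # cumulative standard normal distribution

# Hypothetical parameters for illustration only; not the fitted values from the paper
ALPHA = {0: 1.4, 1: 2.6}  # intercepts for Pr(Y <= 0) and Pr(Y <= 1)
BETA = {
    "generic alarm": -0.10, "specific alarm": -0.40, "indicator": -0.30,
    "context": -0.10, "negative": 0.15, "confirmation": -0.50,
}

def action_probabilities(cards_seen):
    """Return (Pr(alert or higher), Pr(probable attack)) after the given cards."""
    contribution = sum(BETA[card] for card in cards_seen)
    z0 = ALPHA[0] + contribution      # Equation (1) for j = 0
    z1 = ALPHA[1] + contribution      # Equation (1) for j = 1
    return 1 - PHI(z0), 1 - PHI(z1)   # Equation (2)

serial = ["context", "generic alarm", "indicator",
          "specific alarm", "negative", "confirmation"]
for k in range(len(serial) + 1):
    p_alert, p_attack = action_probabilities(serial[:k])
    print(f"after {k} cards: alert or higher {p_alert:.3f}, probable attack {p_attack:.3f}")
```

Because the two lines share the same weights and differ only in their intercepts, the two probabilities step up (or, after a negative card, down) at the same cards.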

Applying this model to Serial 2 gives the results shown in Figure 3, showing the probability of a precautionary alert or higher, and Figure 4, showing the probability of a probable attack being declared.

Figure 3. Probability of declaring precautionary alert (Serial 2).

Figure 4. Probability of declaring probable attack (Serial 2).

The match between probabilities estimated from the simple formula and the actual values is strikingly good, reproducing not just the overall trend but also taking steps up in the same places as the actual track.
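The calculation behind a probability track of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the card type names, the weights and the intercept are all hypothetical, and the sign convention (negative weights push towards action) is an assumption that follows from taking one minus the cumulative probability.

```python
import math


def phi(z):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def action_probability_track(intercept, weights, cards):
    """Return P(action at this level or higher) after each card in turn.

    intercept -- intercept for the decision of interest (hypothetical here)
    weights   -- dict mapping card type to its weight (hypothetical here)
    cards     -- card types in the order they were presented
    """
    score = intercept
    track = []
    for card in cards:
        score += weights.get(card, 0.0)  # unknown card types contribute nothing
        track.append(1.0 - phi(score))   # one minus the cumulative probability
    return track


# Hypothetical values for illustration only; not the fitted values from the trial.
WEIGHTS = {"detector_alarm": -0.6, "medical_report": -0.9, "routine_traffic": 0.0}
SERIAL = ["routine_traffic", "detector_alarm", "routine_traffic",
          "medical_report", "detector_alarm"]

if __name__ == "__main__":
    for card, p in zip(SERIAL, action_probability_track(1.5, WEIGHTS, SERIAL)):
        print(f"{card:16s} P(action) = {p:.3f}")
```

Because each card simply adds its weight to a running score, the track steps up only when an informative card arrives, which is the stepped shape described for Figures 3 and 4.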

Performance of the simple model

Further analysis of the performance of the simple model is shown in Figures 5 and 6, which plot the modelled probability of reacting against the actual proportion of the sample responding. If the model were perfect, and there were enough pairs to show the observed probabilities with greater precision, all points would lie along a straight line through the origin.

Figure 5. Modelled P versus actual P, precautionary alert.

Figure 6. Modelled P versus actual P, probable attack.

Figure 5 shows the performance of the simple model for the precautionary alert. The simple model accounts for 78% of the variance in probability of taking the action. The apparent vertical bands arise because the actual probabilities of responding come from a comparatively small group of pairs, so only a few distinct proportions are possible. The banding would be even more pronounced had all pairs been able to complete all serials.

Figure 6 shows the same graph but for the probability of issuing a probable attack. This shows greater dispersion, particularly at the lower probabilities of response. The simple model accounts for 63% of the variance in probability of declaring a probable attack.
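A rough sketch of how a "variance accounted for" figure of this kind might be computed is given below. It assumes the figure is the squared correlation between modelled and observed values, which the text does not confirm, and the number of pairs and all data points are hypothetical. It also shows why observed proportions from a small group fall into bands: with a fixed number of pairs, only a handful of distinct proportions can occur.

```python
def variance_explained(modelled, observed):
    """Squared Pearson correlation between modelled and observed values."""
    n = len(modelled)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_m = sum(modelled) / n
    mean_o = sum(observed) / n
    s_mo = sum((m - mean_m) * (o - mean_o) for m, o in zip(modelled, observed))
    s_mm = sum((m - mean_m) ** 2 for m in modelled)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    if s_mm == 0.0 or s_oo == 0.0:
        raise ValueError("both sequences must vary")
    return (s_mo * s_mo) / (s_mm * s_oo)


# Hypothetical data for illustration only; not taken from the trial.
N_PAIRS = 8                                    # assumed number of responding pairs
MODELLED = [0.05, 0.10, 0.30, 0.45, 0.60, 0.85]
RESPONDING = [0, 1, 2, 4, 5, 7]                # pairs that had acted at each point
OBSERVED = [k / N_PAIRS for k in RESPONDING]   # can only be multiples of 1/8

if __name__ == "__main__":
    print(f"variance explained: {variance_explained(MODELLED, OBSERVED):.2f}")
```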

These graphs show that the model does not completely describe the behaviour of the decision makers. This is quite reasonable; we could never hope to understand decision makers completely. But it does describe a predictable component of behaviour that has the potential to be useful in military OA.

Analysis of the residuals from fitting the model shows that they are not strictly normal, but are from a symmetric distribution. Because the assumption of normality of the residuals is not met, making decisions based on significance tests of the coefficients would not be justified. However, the model fitting still gives us best estimates of these coefficients to aid our understanding of the decisions made.

Serial and pair variability

The statistical analysis can also be used to investigate the impact of the different sets of players and the impact of the variation from serial to serial due to the subjective variation in the story being told by the cards. This is useful in that it will give a relative impression of the degree of variability from each of these sources. The spread of parameter values will also give a useful baseline, allowing comparison with other groups that may undertake the experiment.

Serial and pair have to be considered together because not all pairs did all serials. Considering one without the other may distort the results, with serial effects being mixed up with pair effects and vice versa.

The directing staff solution

Once the trials had been completed, it was decided to attempt to create a 'directing staff solution' for the experiment. Several of the subjects had asked how many of the serials they 'got right'. The original intention was for there not to be any right or wrong answers; the experiment was designed to see how people responded to information, not to test their accuracy. The situations were, after all, not produced by running dispersion models or indeed with any clear view as to whether or not an attack had taken place. Since the serials were generated randomly, they are in some senses 'white noise' and players saw in them what they projected onto them. There are a lot of non-attack events and information mixed in, and the serials are deliberately challenging, with information coming in that needs to be pieced together in a similar way to the information coming into a real headquarters. Despite this, even if it is not a reasonable question to ask whether an attack had taken place, it is reasonable to ask whether the players ought to have responded to the information. In the final analysis, the information in any of the serials could have been consistent with an attack, since a real attack might generate very little information. The question is where we choose to set the threshold for action.

Analysing time data

The second sort of quantitative data collected was on the time taken to respond to each individual card, measured by the time taken to request the next card. This was analysed using the GLM. The best fits were to log (time) and so the model was essentially a linear regression on log time. MINITAB was used to produce an analysis of variance to test for significance.

One large effect was from serial order, with cards being dealt with more quickly as the players moved through the serials. This was a noticeable effect in the trials, with the first serial in particular always taking a long time to complete as the players became familiar with the format of the trial and began to understand what was expected of them. As the serials progressed, the format became more familiar and the lengthy discussions on what to do about a particular card became less common as there was a greater likelihood of having encountered a similar card before. This learning effect was quite pronounced and also doubtless incorporated some team-building effects as the pairs learnt to work together.
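The idea of a linear regression on log time can be illustrated with a short Python sketch. The study itself used MINITAB and a model with several factors; the version below is a deliberately simplified stand-in with a single covariate (the order in which a serial was played), and the response times are hypothetical.

```python
import math


def fit_log_time(serial_order, seconds):
    """Ordinary least squares fit of log(seconds) = a + b * serial_order.

    Returns (a, b). A negative b means cards were handled faster in later serials.
    """
    if len(serial_order) != len(seconds) or len(seconds) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    logs = [math.log(t) for t in seconds]
    n = len(logs)
    mean_x = sum(serial_order) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in serial_order)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(serial_order, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical response times in seconds; not data from the trial.
ORDER = [1, 1, 2, 2, 3, 3, 4, 4]
SECONDS = [95.0, 120.0, 70.0, 82.0, 55.0, 61.0, 40.0, 48.0]

if __name__ == "__main__":
    a, b = fit_log_time(ORDER, SECONDS)
    # exp(b) is the multiplicative change in typical time per later serial
    print(f"intercept = {a:.3f}, slope = {b:.3f}, ratio per serial = {math.exp(b):.2f}")
```

Working on the log scale means the fitted slope describes a proportional speed-up from one serial to the next rather than a fixed number of seconds, which suits a learning effect that is largest early on.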


Discussion

This section discusses the implications of the results presented in the rest of the report, both for the future direction of this work programme and for the overall problem of providing an adequate system of BW detection and warning.

Utility of the method

The results of the experiment have shown that it is possible to get at the relative value of different types of information to decision makers. The method is flexible and offers a number of advantages both in characterizing information problems and in terms of providing a structured and controllable format for presenting information, collecting data and analysing results. The use of the cards allows for a clear audit trail from experimental design to final results and should allow for further analysis—looking for example at other sub-sets of cards—without needing further replications.

The use of probit analysis seems to fit well with the probabilistic nature of decision making under conditions of uncertainty.

One area that remains to be tested is how well the method will work with a less monotonic decision-making process. If the players were expected to move an alert state up and down in response to information then it might require a larger sample to show this effect. Another aspect that may need to be explored is the effect of different conditions on decision-making behaviour. For example, in a variant scenario in which CBRN had already been used, we might have seen radically different behaviour from the players.

Also not yet proven is the consistency between different groups of players or different replications of the test. This is important, for example, if we wished to compare performance in the current game with performance with changed doctrine or different numbers and types of sensors.

Validation of the model

The problem of validation of the experiment is an interesting one. There are two principal senses in which an experiment like this can be said to be valid. The first is whether the results observed in the experiment are likely to carry over into the real-world situations that the experiment is designed to mimic. If we have real people making real decisions, will they make them in a similar way to our experimental subjects in an experimental situation? Clearly there are many differences between real biological alarm situations and the experiment. These are essentially questions about the closeness of the subject population to the potential decision-making population and the closeness of the experimental situations to the possible real situations. Usefully, the experiment provides measures of the variation due to these elements of the problem, which will help when comparing results across populations.

The stress levels are not likely to be as high in the experiment, since in the real situation lives are at risk. There may be an 'operational degradation' factor that applies in real situations and either raises or lowers the barrier to action. The design of the experiment is by its nature artificial, in order to make the situation as controllable as possible. The types of information were limited and the variety of information—and non-information—was not necessarily as great as in a real situation.

The second sense of validation is a more difficult one to assess. Is the behaviour that is being generated the 'right' behaviour for the overall performance of the system?

The behaviour of the decision makers can be thought of as being part of the overall system for providing timely warning. Just as with any detection and warning system, there is a risk of making two types of error: calling an alarm when there is nothing there and failing to alarm when something is there. Most detection systems such as sonar or radar are tuned to a level of sensitivity that balances the probability of detection against the probability of false alarm. The human element of the detection system is in this case an important part of the overall sensitivity of the system. Rational design of the system as a whole must decide on the acceptable level of these two types of error.

The level of sensitivity of the human element of the system can be adjusted using training. The problem is deciding what that level of sensitivity ought to be.
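The trade-off between the two types of error can be illustrated with a small Python sketch built on the same probit form used for the card model. Everything in it is an assumption made for illustration: the two evidence scores, the idea of representing training as a shift in the decision maker's threshold, and the function names. The serials in the experiment had no ground truth, so the sketch is not derived from the trial data.

```python
import math


def phi(z):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def alarm_probability(evidence_score, sensitivity_shift):
    """P(alarm) under a probit response; a larger shift means a more sensitive decision maker."""
    return phi(evidence_score + sensitivity_shift)


# Hypothetical evidence scores in standard deviation units.
QUIET_SCORE = -1.5   # typical evidence when nothing is there
ATTACK_SCORE = 0.5   # typical evidence when something is there


def error_rates(sensitivity_shift):
    """Return (false alarm probability, missed alarm probability) for a given shift."""
    false_alarm = alarm_probability(QUIET_SCORE, sensitivity_shift)
    missed = 1.0 - alarm_probability(ATTACK_SCORE, sensitivity_shift)
    return false_alarm, missed


if __name__ == "__main__":
    for shift in (-1.0, 0.0, 1.0):
        fa, miss = error_rates(shift)
        print(f"shift {shift:+.1f}: false alarm = {fa:.3f}, missed = {miss:.3f}")
```

Raising the sensitivity lowers the chance of a missed alarm only at the cost of more false alarms, so the choice of level is a judgement about the relative cost of the two errors rather than something the model can settle on its own.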

The card game described above also provides a potential mechanism for training potential cell controllers in the kinds of analysis and thinking that are necessary to interpret the information coming into the CBRN cell. Through training, personnel can learn to recognize what a BW attack might look like and can establish standards of comparison that will allow them to act appropriately, with or without expert advice, when action is necessary. With the introduction of more sophisticated detectors and information systems, CBRN training in general needs to move away from training in procedures and the mechanics of plotting and towards training people in the kind of thinking that they will need to do on a CBRN battlefield, including dealing with ambiguous and incomplete information, competing requirements and multi-layered problems. The card game used for this trial can help develop the kind of mental agility and thought processes necessary to deal with these kinds of situations.

Potential exploitation

The most obvious area of potential exploitation is the use of the game as a training aid as has been discussed above. There will need to be some changes to the cards to give more positional and time information and there will need to be a process of 'training the trainers' but, with a 'DS Solution' identified, little further work is required. If a training needs analysis were to be conducted, it would identify the most effective use of this game approach.

The algorithms developed for the decision-making process have longer term potential for exploitation in the Battlefield Information System Application (BISA) as a command aid and 'wake-up' system that will alert commanders to situations where the evidence is beginning to build up to a point where they should consider taking action.

There is also potential to use these algorithms in other simulations that require models of human decision making, such as the virtual CBRN battlespace and many of the synthetic environments. This would also include OA tools such as IMPACT, where decision-making analysis could be combined with atmospheric dispersion modelling to explore the implications of different levels of sensitivity for the overall performance of the BW detection system.

Some of the results of this study can also be used to inform the future direction of the BW detection programme, although some care is needed in interpretation of the results, since the results are based upon the perception by the players of the capabilities of the current equipment. Nonetheless, there seems to be a relation between the rate of false positives and the value placed on the information that will offer some insight into the kinds of performance level required.

Finally, there is potential to apply the method developed here to other problems in command and control and information warfare. The methodology is potentially applicable to a wide variety of decision problems. There are also indications that some of the results are likely to be compatible with some of the models of decision making set out in Moffat (2000, 2002). A particularly important point is the successful use of probit analysis to investigate decision making. A key feature of the models set out in Moffat (2002) is the importance of normally distributed decision variables. Bayesian methods are also central to those models, and it seems that some of the features of the decision-making behaviour observed are compatible with Bayesian concepts such as that of prior belief.


Conclusions

From the results outlined in this paper it is concluded that

  1. The use of the methodology to investigate the importance of information in decision making has been successful.
  2. The impact of different types of information can be separated using statistical analysis.
  3. The use of probit analysis has great potential for modelling the decision-making process.
  4. A simple formula has been developed using a probit model that appears to account for 60–80% of the variability in the observed data.
  5. Models of this sort are likely to share some underlying features with mathematical models of decision making such as those described in Moffat (2002).


References

  1. Daniel D, Holt J and Mathieson GL (2002). What influences a decision, 19th International Symposium on Military Operations Research (ISMOR), http://www.dcmt.cranfield.ac.uk/ismor.
  2. Dodd L, Moffat J and Smith J (2006). Discontinuity in decision making when objectives conflict: A military command decision case study. J Opl Res Soc 57: 643–654.
  3. Finney D (1971). Probit Analysis. Cambridge University Press: Cambridge.
  4. Forder RA (2004). Operational research in the UK Ministry of Defence: An overview. J Opl Res Soc 55: 319–332.
  5. Kahneman D and Tversky A (eds) (2000). Choices, Values and Frames. Cambridge University Press: Cambridge.
  6. Kahneman D, Slovic P and Tversky A (1982). Judgement Under Uncertainty: Heuristics and Biases. Cambridge University Press: Cambridge.
  7. Klein G (2001). Sources of Power, How People Make Decisions. MIT Press: Cambridge, MA.
  8. Mathieson GL (2001). The impact of information on decision making, 18th International Symposium on Military Operations Research (ISMOR), http://www.dcmt.cranfield.ac.uk/ismor.
  9. Moffat J (2000). Representing the command and control process in simulation models of conflict. J Opl Res Soc 51: 431–439.
  10. Moffat J (2002). Command and Control in the Information Age, Representing its Impact. TSO: London.
  11. Moffat J and Witty S (2002). Bayesian decision making and military command and control. J Opl Res Soc 53: 709–718.
  12. Moffat J, Campbell I and Glover P (2004). Validation of the mission-based approach to representing command and control in simulation models of conflict. J Opl Res Soc 55: 340–349.
  13. Perry W and Moffat J (2004). Information Sharing Among Military Headquarters, The Effects on Decisionmaking. RAND Corporation: Santa Monica, CA.
  14. Zsambok C and Klein G (eds) (1997). Naturalistic Decision Making. LEA: Mahwah, NJ.

Acknowledgements

The study team would like to thank the players for their patience and good humour when carrying out the trials.