Introduction
When customers arrive in an environment within which service may be provided for them at one of several stations, a natural question that arises concerns how those customers might be assigned to those stations in an optimal manner. Models of such routing problems have proved difficult to analyse, even very simple ones. See, for example, Whitt.1 The reader is referred to Liu and Righter2 for a helpful overview of the theoretical literature. In the main, work has focused on the routing of a single class of arriving jobs to a collection of homogeneous stations. For such problems, simple round robin policies and Bernoulli routing with equal probabilities (which we call the equal splitting policy, abbreviated to ESP) have been shown to provide optimal regimes when little information is available to guide decisions regarding the destination of incoming jobs. See, for example, Koole.3 When full information on the queue lengths at each station is available, the 'join the shortest queue' strategy has been shown to be optimal for a range of models. See, for example, Johri.4 The motivating applications for much of this work have often concerned the control and design of computer and communication systems. Houck's5 work concerned applications in telephony, while Kleinrock6 discussed the optimal design of data networks. Altman7 gave a helpful overview of contributions that relate to communication networks.
However, the single customer class/homogeneous stations set-up encountered in much of the classical literature is too simple to address many contemporary applications. Consider, for example, the following:
1. The Grid: With the advent of low-cost, high-performance computing hardware and the widespread availability of high-speed network access, various novel types of service provision environments have become commercially viable. The provision of computing and storage resources as a commodity (a concept known as 'the grid') is becoming practical and is exciting considerable interest. The reader is referred to Foster and Kesselman8 for a fuller discussion. In a grid environment, a provider offers a number of different services to the public, using a collection of networked machines, which may or may not have other tasks to perform. Users submit requests for services (which may be of many different kinds) without necessarily knowing, or caring, where they will be executed. The routing problem is how to distribute these heterogeneous requests among the servers, so as to make the best possible use of the available resources and provide the best possible quality of service. Braun et al9 gave a detailed discussion of high-performance heterogeneous computing (HC) environments, which are well suited to meet the computational demands of large diverse groups of applications. They stated that 'one key feature in achieving the best possible performance from HC environments is the ability to assign effectively the applications to machines and schedule their execution'.
2. Distributed expertise: Becker et al10 considered a routing problem motivated by call centres of companies producing a range of products. Customers telephone such centres with requests for service or technical support. These calls are then routed to agents. Calls concerning a particular product should be preferably assigned to an agent with the requisite expertise. However, if the rate of such requests is not reasonably consistent across the company's products, such an approach may overload the agents with expertise in, say, new products with a high sales volume. To reduce the load on these servers, and possibly also to reduce customer waiting time, some calls may need to be routed to less expert servers. Becker et al10 discussed how to do that routing optimally.
The above scenarios underline the need to study routing problems in systems where many classes of customer seek service provided by servers of differing capabilities. These characteristics are also present in the so-called lane selecting models of Schwartz11 (also analysed by Green12), who discussed a range of further applications. In their work concerning routing within distributed computer systems, Ross and Yao13 pointed to two additional features that may be present in such problems. First, some arriving customers or jobs may be dedicated, in that they can only be processed at a specified location. The routing problem then concerns only the so-called generic jobs for which the choice of service station is open. Second, when a station is processing many classes of jobs, the controller may wish to implement local scheduling rules for prioritising tasks within its workload. This was also a feature of the models of Becker et al.10
To the authors' knowledge, this is the first paper to consider the problem of how dynamic (ie queue length dependent) routing policies may be developed in contexts which reflect the complexities of the previous paragraph. Our model is described in the next section and allows for many customer classes, heterogeneous servers and local scheduling capability. The problems and accompanying analysis are necessarily complex and so to keep the development as simple as possible, we shall initially suppose that each service station may be modelled as a multi-class M/M/1 queue. In fact, our approach is considerably more general than that, and extensions are flagged in a later section. These include the case in which stations are modelled as multi-class M/G/1 queues, as assumed in the work of Becker et al10 and Ross and Yao,13 cited above. In the development of dynamic routing policies, we will apply a policy improvement step to an optimal static (ie queue length independent) policy. The major theoretical achievement of the paper is the demonstration that this results in a class of policies that has a simple and intuitive structure, and which generalises 'join the shortest queue' to our complex multi-class environment in a natural way. Under these policies, arriving jobs in the system are routed to the least congested station, where the measure of station congestion is linear in the class-specific queue lengths there. The coefficients of this linear measure reflect both the individual station dynamics and the class of the arriving job. Further simplifications arise in special cases. A computational investigation is reported and shows that these policies enjoy an increasing cost advantage over 'join the shortest queue' as the heterogeneity of the job population increases. They were found to be very close to optimal in all cases studied.
Routing problems for systems of multi-class M/M/1 queues
A distributed system comprises a set of stations M={1, 2, ..., M} and a system controller who routes the incoming jobs to stations on the basis of information received. Jobs from a number of different classes arrive at the system for processing. Jobs (and the classes to which they belong) are either generic or dedicated. Dedicated jobs can be processed only by a specified station and are assumed to arrive directly at that station. For generic jobs, the choice of station is open. We denote the set of classes of generic jobs by G and the set of classes of jobs dedicated to station m by Dm, m
M. Hence E=G
D1

DM is the set of job classes allowed access to the system, while Em=G
Dm is the set of job classes allowed access to station m
M. Jobs arrive at the system in independent Poisson streams. Those dedicated to station m arrive there at rates {
jm}j
Dm. The routing problem of interest concerns the choice of station to process the generic jobs. These arrive at the system with rates {
g}g
G. We formulate this as a decision problem as follows:
(i) We write N(t)={Nm(t)}m
M for the state of the system at time t
+, with Nm(t) for the state of station m. The latter is given by Nm(t) = {Njm(t)}j
G
Dm, where Njm(t) is the number of class j jobs present at station m (including any in service) at time t.
(ii) The decision epochs for the routing problem are the times at which generic jobs arrive at the system and will be the event times of a Poisson process with rate
g
G
g. We say that a type g decision epoch occurs when the generic arrival is from class g
G. At a type g epoch, the actions available to the system controller are Ag={a1g, a2g,..., aMg}, where amg denotes the routing of the newly arrived job to station m, thus increasing the class g queue length there by one.
(iii) As indicated in the introduction, we shall focus on the simple case of individual stations as multi-class M/M/1 queues. Hence, between successive decision epochs, each station m evolves (via the arrival of dedicated jobs and service completions) as follows: new jobs from classes in Dm arrive in independent Poisson streams with rates {
jm}j
Dm. A single server implements a simple priority policy um for scheduling the work at the station, that is, jobs are served pre-emptively according to a fixed ordering among the job classes in G
Dm. The service times for class j jobs are exponentially distributed with rate
jm, j
G
Dm. Since these rates are assumed m-dependent, we can follow Becker et al10 in using them to model varying capacities among the stations for processing jobs from different generic classes. For example,
jm=0 indicates that station m cannot process jobs from generic class j. We assume that all arrival and service processes are mutually independent.
(iv) A holding cost rate cj
0 is associated with each job class j
E. Our goal is to develop routing policies (ie rules for making decisions regarding the routing of generic jobs) to minimise the long-run average holding cost rate

where the expectation in (1) is taken in steady state. Note that we assume that the system is stable, in that a steady-state solution with finite queue lengths exists.
In the course of our discussion, we shall consider two classes of routing policies for the above problem. The class of static policies chooses action amg at a type g epoch with fixed probability pgm, where
m
Mpgm=1, g
G. These policies make no use of information regarding the current or past states of the system. The optimisation problem then concerns the best choice of the controller routing matrix P=(pgm)g
G,m
M. The reader should note that, with any P, all traffic arrives at each station m in independent Poisson streams. When this happens, a natural choice for the local scheduling rule um is the so-called c-mu rule, which establishes priorities between the job classes at station m according to the values {cj
jm}j
G
Dm, with the class having the largest associated index being given the highest priority. This rule minimises the holding cost rate at station m among all dynamic scheduling rules. For extensive discussion of the static routing problem, which results when local scheduling is according to the c-mu rule, the reader should consult Dacre et al.14 For our purposes, it will be sufficient for the reader to note the following facts:
1. In the special case in which generic jobs are stochastically indistinguishable (
gm=
m, g
G) at each station m and where the stations are homogeneous, the static routing problem is solved by the equal splitting policy (ESP) given by routing matrix P*, where

2. An iterative LP-based heuristic has been found to provide an excellent tool for solution of static routing problems quite generally.
Our second class of policies and of primary interest in the current work are the dynamic policies, which can in principle make routing decisions on the basis of the entire history of the system to date. However, the theory of stochastic dynamic programming (DP) (see, for example, Puterman15) asserts the existence of an optimal dynamic policy that is deterministic, stationary and Markov. Hence, an optimal decision at time t
+ is made with reference to N(t) only. That said, the combinatorial explosion means that DP is unlikely to be an effective tool of solution for problems of reasonable size. Further, the theoretical difficulties posed even by single-class systems with homogeneous stations suggest strongly that the quest for simple closed form optimal policies in a set-up as complex as ours is unlikely to be fruitful. Hence, the primary quest is for good heuristic policies. We describe how these may be developed in the next section.
Comment
As noted above, the c-mu rule is guaranteed to provide an optimal schedule in a static routing regime when the arrivals at each station form independent Poisson streams. While this is technically no longer the case when routing policies are dynamic, all our computations show that the degree of suboptimality in the local schedule is very small. Hence, for all practical purposes, in solving the resulting dynamic routing problem we are actually solving a joint routing/local scheduling problem that is fully dynamic. Also note that some applications may call for a choice of local scheduling policies um other than the c-mu rule. We may, for example, wish to give dedicated jobs a degree of priority over generic ones, as was the case in Ross and Yao.13
A heuristic dynamic policy for routing to multi-class M/M/1 queues
Here we develop dynamic routing heuristics by the application of a single policy improvement step to some given static policy P. A natural choice for the latter is an optimal static policy, denoted P* and we shall suppose that this is the case in what follows. In pursuing this approach we develop an idea proposed in the context of simple single-class systems by Krishnan16 and discussed by Tijms.17 In this section, we shall show that the dynamic routing policies, which result in our complex multi-class environment, are simple in structure and are easily computable. They are natural developments of the 'join the shortest queue' routing policy, extended in a way that is appropriate for a heterogeneous job population and a set of diverse service stations. In the final section, we shall demonstrate numerically how well these heuristics perform. We now develop our dynamic routing heuristics as follows.
The policy improvement step utilises the quantities
g(N, m, P*) defined as follows:
In a situation in which the system state is N and a generic job of class g has just arrived,
g(N, m, P*) is the difference in total expected costs over an infinite horizon between the policy that allocates the class g job to station m and which then uses P*, and the static policy that uses P* throughout.
In the situation described in the preceding paragraph, our heuristic
will route the newly arrived class g job to whichever station m has the smallest value of
g(N, m, P*). In the event of a tie, any of the qualifying stations are chosen. We write

where
(g, N) is used for the station chosen by policy
for the class g job when the system state is N. However, since under the static policy P* the processes at the M stations evolve independently, the policy
may be described more simply in terms of station-specific quantities
mg(Nm) defined as follows:
Consider station m in isolation, evolving under its local scheduling policy um with generic arrival rates {pgm*
g}g
G determined by P*. The quantity
mg(Nm) is defined to be the difference in total costs over an infinite horizon for station m alone between starting in state Nm+1mg and in state Nm. We use 1mg to denote a |G
Dm|-vector whose gth component is 1 with zeroes elsewhere. By a simple conditioning argument we deduce that

and so (2) may be re-expressed as

It should be clear from (3) that policy
will never route a class j job to station m when
jm=0 since any corresponding
mj is infinite. Hence, without loss of generality we can assume that
jm>0 for all j, m in what follows.
Under the routing heuristic given by (3), each station has an index dependent both upon its current state and the class of job to be allocated. The station chosen by the policy is the one with the smallest index value. The advantages of (3) over (2) for computation are clear from the reduction in dimensionality of the state variables concerned. The key quantities
mg(Nm) are readily available and may be expressed in terms of relative costs associated with station m (evolving under scheduling rule um and with generic arrival rates pjm*
jG), regarded as an undiscounted Markov decision process (MDP). Use 0m to denote the zero-state in which station m is empty and
m(P*) for the long-run average cost per unit time for station m under static policy P*. In fact, simple formulae for
m(P*) are readily available. See Ansell et al.18 We now have from MDP theory that

where Km(Nm) is defined as the total expected cost incurred by station m from initial state Nm until it first enters state 0m (ie, until it first becomes empty) and Tm(Nm) is the corresponding expected time. Suppose that local scheduling rule um at station m processes a job from class j(Nm)
G
Dm in state Nm. We then have

and

The solution of the recursions yields the key index values
mg (Nm) via (4).
We now assert that the indices
mg(Nm) are simple congestion measures, which are linear in the queue lengths Nm. The proof of Theorem 1 is long and is given in the appendix. The following discussion will be simplified, if in what follows, we extend the definition of
mj(Nm) to all j
G
Dm via the expression in (4).
Theorem 1 (Linear characterisation of
mj(Nm)) For each station m
M there exists a matrix
m = {
jkm}j,k
G
Dm and a vector
m = {
jm}j
G
Dm such that

In the light of Theorem 1, we may now re-express (3), which characterises our routing heuristic
, as

Theorem 2 contains further information on the matrix
m. Firstly it is symmetric and hence has only ½|G
Dm|(|G
Dm| + 1) distinct entries. Secondly, all its entries are non-negative, implying that each
mg(Nm) on the r.h.s. of (7) is indeed a congestion measure in the sense of being increasing in the queue lengths Nm.
Theorem 2 The matrices
m, m
M, which determine the indices
mj(Nm) are such that
jkm=
kjm for all j, k
G
Dm, m
M (symmetry)
jkm
0 for all j, k
G
Dm, m
M (non-negativity)
Proof
From the expression for
mj(Nm) in the statement of Theorem 1, it is straightforward to conclude that, for all j, k
G
Dm and all m
M,

by the symmetry between j and k in (9). In order to obtain (8) from (4) we have used the fact that Km(0m)=Tm(0m)=0. To obtain (9), a simple argument based on the evolution of station m from initial state 1mk+1mj similar to that used in (b) below yields

This establishes (a).
From (9), it is enough to prove (b) that, for all choices of j, k and m,

Consider station m in initial state 1mk+1mj, processed according to um until it reaches the empty state 0m. In an obvious way, the system evolution may be regarded as the (disjoint union of) two branching processes emanating, respectively, from the class k and the class j jobs present initially. Suppose that holding costs are reduced such that when processing is allocated to the k-branching process, only costs from jobs within that process are incurred, and similarly for the j-branching process. Under this cost reduction, the total expected cost incurred until the system empties is evidently Km(1mk)+Km(1mj), and inequality (10) follows. This completes the proof.

We obtain further simplification of the indices
mj(Nm) in an important special case. Following the earlier usage, we say that the jobs (both generic and dedicated) at station m are stochastically indistinguishable if
jm=
m, j
G
Dm. Theorem 3 asserts that in this case, the symmetric matrix
m has a special structure and just |G
Dm| distinct elements. Hence, significant computational savings are available in this case. It will simplify the discussion if we fix m
M and suppose that at station m, local scheduling rule um imposes the priorities 1
2

|G
Dm|, namely that jobs from class 1 have the highest priority, followed by jobs from class 2, and so on. We also write
jm for the arrival rate at station m for all job classes j, that is,
gm=pgm*
g, g
G.
Theorem 3 (The stochastically indistinguishable case) If the jobs at station m are stochastically indistinguishable, then the matrix
m is such that

and the |G
Dm| constants
jm, 1
j
|G
Dm|, are determined by the recursive scheme


Proof Fix j and k such that 1
k
j
|G
Dm|. We use (5) and (6) to develop expressions for Km(1mk+1mj), Km(1mk), Tm(1mk+1mj) and Tm(1mk). From these we can utilise (4) to infer the following expression for
mj(1mk):

From (14) we obtain

which establishes (11)). Now allow j to take any value in the range 1
j
|G
Dm|. By appeal to the symmetry of
m, established in Theorem 2(a), and from (11) we deduce that for any such j,

We now use (16) within (15) to obtain (12). Finally, we use (9) to infer that

and (13) follows easily from an analysis of the costs incurred in emptying station m from initial state 21|mG
Dm. This concludes the proof. 
Comments
1. Suppose that we are in the stochastically indistinguishable case and that stations are homogeneous. A generic job belonging to the class of lowest priority under the local scheduling rule arrives at the system. According to Theorem 3 the form of the station m index in this case is

Hence in this scenario, our routing heuristic will send the incoming job to the station with the smallest (total) number of jobs present.
2. The reader should note that Theorem 1 establishes the linearity of the key indices
mg(Nm) for any choice of initialising static policy P and any choice of priority policies for local scheduling. Plainly, these choices will impact the matrices
m and the vectors
m which determine the indices. Our computational experience has been that, once the choice of local scheduling rule has been made, the performance of the dynamic routing heuristic is insensitive to the choice of P. With regard to the latter, ESP is sometimes a convenient alternative to the optimal static policy P* when it does not coincide with it.
3. An advantage for computation of the routing heuristic of its linear index structure is that only sufficient evaluations of the
mj(Nm) are required to determine the coefficients {
jkm}j,k
G
Dm and {
jm}j
G
Dm. In our work such evaluations utilised (4) together with a value iteration approach to the computation of the required Km(Nm) and Tm(Nm) via (5) and (6), respectively. In most problems this represents a huge computational saving over full DP.
Extensions to more general models
In the above we have assumed that each individual station is a multi-class M/M/1 system operating under a simple priority policy for local scheduling. A range of extensions to more general scenarios is readily available. We now sketch some of the more significant ones:
1. A version of the above analysis is available for the multi-class M/M/1 case, but when jobs are served locally in a first-come-first-served fashion. With regard to (an appropriate form of) the expression on the r.h.s. of (4), while the costs Km now depend not only on the vector of queue lengths Nm, but also on the order in which jobs arrived, the differencing in (4) yields an index that is again linearly dependent on Nm only. This is plainly relevant to contexts where scheduling of the local workload is not possible. However, foregoing the opportunity to schedule locally when it is possible to do so may result in considerably increased costs, especially in moderate to heavy traffic. This is as reported in the context of static routing policies by Becker et al.10
2. Consider now the case where station m operates as a multi-class M/G/1 queue. Use Sjm to denote a generic class j service time at station m, which is now a random variable with general probability distribution, j
G
Dm. At each station, service times are assumed iid within each class. A single server implements a simple priority policy um, but this now operates non-preemptively, that is, a job once started must be processed through to completion. This is the local model considered by Becker et al10 in their work related to call centres and by Ross and Yao13 in their work on distributed computer systems.
A full information version of the resulting dynamic routing problem requires the system controller to have knowledge of the expired processing time of any job currently in service, in addition to the vector of queue lengths at each station. We suppose this information is available whenever a generic job arrives at the system. Use (j, t)m to denote the event that a class j job is being served at station m and has already received t units of service. When this is the case and Nm is the vector of class-specific queue lengths, we write Nm-
1mj+1m(j,t) for the state of station m. Use of Nm alone for the state will denote the queue length vector at a service completion epoch (ie, no new job has yet entered service). Our dynamic routing policy will now be based on indices
mg{Nm-
1mj+1m(j,t)} with
mg defined in an obvious manner, as above. However, an analysis based on suitable versions of (4),(5) and (6) and slight adjustments to the proof of Theorem 1 yields

Hence, an index that is linear in the queue lengths applies at service completion epochs. The coefficients {
jkm}j,k
G
Dm and {
jm}j
G
Dm in (17) may be computed via forms of the recursions (5) and (6) appropriate to the M/G/1 case. Now consider station m in general state Nm-
1mj+1m(j,t) with a generic job class of g newly arrived. By considering both the costs incurred at station m during the remaining service time of the class j job currently in service together with the (random) state of station m at the conclusion of that service, we can show that

This is the form of the station m index, which is appropriate for the general situation in which a new job arrives while some job is undergoing service. Note that in (18), E(Sjm- t|Sjm>t) is the expected remaining service time of the class j job currently in service. This expectation (and hence also the expression in (18)) has no dependence on t when service times are exponentially distributed. In general, large values of this quantity will inflate the index, as might be expected. Our heuristic will continue to route incoming generic jobs to whichever station has the smallest value of the appropriate index.
3. The analysis of the previous section continues to hold when each station is modelled by one of a collection of single-server multi-class systems, called Klimov networks. Under such models, the traffic at a station can be quite general in structure. We can, for example, provide for jobs which have a state which evolves under processing as a continuous time Markov process. The reader is referred to Ansell et al18 for descriptions, both of Klimov networks and of how the results of the current paper extend to them.
Numerical study
We conclude the paper with the results of a computational investigation of the comparative performance of our simple routing heuristic with competitor policies cited both within the paper and in the literature. We consider scenarios that are sufficiently simple to allow the computation of a fully optimal dynamic routing policy via DP, but sufficiently rich to yield insight. All these involve two homogeneous stations, each admitting two generic job classes. Since particular simplifications arise when the jobs are stochastically indistinguishable (see Theorem 3), we consider this special case first.
Stochastically indistinguishable jobs
We shall suppose that both stations are preemptive two-class M/M/1 systems with local scheduling according to the c-mu rule. We take
1=
2=1 for the completion rates of the exponential processing times of the two job classes. In this set-up, the c-mu rule gives higher priority to whichever class has the larger holding cost rate. We fix c2=1 and take c1
1, ensuring that class 1 always has higher priority. Theorem 3 applies and the matrix of coefficients determining the key indices
mg will have the form

Further, it is straightforward to show that
1
2 as c1
c2=1, and hence our routing heuristic, becomes 'join the shortest queue' in the single job class limit. We shall consider the following heuristics (all identified by single letter abbreviations) for such problems:
Equal splitting policy (E): Under E, each station receives traffic at rate (
1/2,
2/2). This is the optimal static policy whose implementation has no information requirements;
Round robin (R): Incoming jobs are sent alternately to the two stations. Each station now faces an arrival stream, which is a renewal process with iid gamma
(2,
1+
2) interarrival times. The only information requirement for the policy is the destination of the last arrival. Theoretical considerations suggest that R should offer a significant improvement in performance over E. See Hajek;19
Join the shortest queue (J): Incoming jobs are sent to whichever station has the smaller total number of jobs present and to either station in the event of a tie;
Routing heuristic (H): This is the heuristic developed in the paper by implementing a policy improvement step from the optimal static policy E. The information requirement is marginally greater than for J, in that the queue lengths for each job class at each station are needed;
Optimal dynamic routing policy (O): This is developed via DP. Although it has the same information requirements as H, it will in general be much more difficult to compute and considerably more complex in structure.
The long-run average holding cost rates incurred under the above policies were computed for problems with c1 ranging from 1.5 to 10 and
=
1+
2, taking the values 0.6 (light traffic), 1.0 (moderate traffic) and 1.5 (moderate-to-heavy traffic). For each (c1,
)-combination, 100 (
1,
2) pairs were generated randomly by sampling
1 from a U[
0.2,
-
0.2]
distribution then choosing
2=
-
1. For each generated problem, the cost rates CE (the long-run average cost rate under E), CR, CJ, CH and CO were computed. A simple closed form expression for CE is available. For the other policies, an appropriate form of value iteration was used to compute costs. Theoretical considerations indicate that CE
CH
CO. In fact, in all 1200 problems studied, it emerged that

and so costs decrease as the information requirements of the policies grow. In light of (19), we present our results in terms of the percentage cost degradation experienced as we move from right to left through inequalities in (19). For example, we write

and similarly for
(R, J),
(J, H) and
(H, O). In almost all cases, the samples of size 100 produced little within-sample variation in these quantities, and so in Table 1 we simply report median values.
Table 1 - Relative performance of routing policies E, R, J, H and O where
(a, b) is the median percentage cost degradation from policy b to policy a. Problems have two homogeneous stations with stochastically indistinguishable generic job classes.
From Table 1, note that in the main cost degradations increase with
, implying that the value of increased information grows with the traffic intensity. Note also that, while the medians of
(E, R) and
(R, J) both decrease as c1 increases, for fixed
, the values of
(J, H) increase with c1. Hence the relative cost advantage H enjoys over J grows as the job classes become more heterogeneous. Recall that J and H are approximately equal for c1 close to 1. Finally, and very significantly, note the uniformly strong performance of H as evidenced by the small median values of
(H, O).
Stochastically distinct job classes
We shall suppose, as before, that both stations are preemptive two-class M/M/1 systems under the c-mu rule and we now fix c2=
2=1. The class 1 parameters c1 and
1 are allowed to take a range of values such that c1
1
c2
2, guaranteeing that class 1 always enjoys higher priority. We take
=0.6, 1 and 1.5 again and compute median values of percentage cost degradations from samples of size 10 (with randomly drawn
1 and
2). The five routing policies under investigation are as above, but with R replaced by S, the optimal static policy. The reader should note that, with
1
2, E is no longer generally optimal in the class of static policies. Theoretical considerations indicate that CE
CS
CH
CO. In fact, for all 600 problems studied, it emerged that

and
(E, S),
(S, J),
(J, H) and
(H, O) are the percentage cost degradations of interest. In Table 2 are presented median values of these quantities for a collection of problems for which c1=
1-
1. Each problem set is then characterised by a choice of the pair (c1,
). Values of c1 closer to 1 correspond to greater similarity between the job classes. The relative performance of J with respect to H strengthens for such cases, but weakens as c1 increases to 10 and decreases to 0.1. As previously, the cost advantage of the dynamic policies tends to grow with
. We also note the uniformly strong performance of the routing heuristic H whose median level of cost suboptimality never exceeds 0.271%
.
Table 2 - Relative performance of routing policies E, S, J, H and O where
(a, b) is the median percentage cost degradation from policy b to policy a. Problems have two homogeneous stations and two job classes with c1=
1-
1 and c2=
2=1.
Conclusions
The paper has highlighted the practical importance of studying routing problems in a multi-class context in which traffic types may have widely differing cost and stochastic characteristics, and where service stations may have diverse capabilities. However, such problems present a significant challenge for analysis. We describe a policy improvement approach that results in the development of simply structured dynamic routing policies. The heuristics generalise the 'join the shortest queue' policy in a way appropriate to this complex environment. A numerical study provides evidence of very strong performance of the routing heuristics.
Future work will include an investigation of the implication of our results for the design of systems seeking to service a heterogeneous job population.
References
- Whitt W (1986). Deciding which queue to join: some counter-examples. Opns Res 34: 55–62.
- Liu Z and Righter R (1998). Optimal load balancing on distributed homogeneous unreliable processors. Opns Res 46: 563–573.
- Koole G (1996). On the pathwise optimal Bernoulli routing policy for homogeneous parallel servers. Math Opns Res 21: 469–476.
- Johri PK (1989). Optimality of the shortest line discipline with state dependent service times. Eur J Opl Res 41: 157–161.
- Houck DJ (1987). Comparison of policies for routing customers to parallel queueing systems. Opns Res 35: 306–310.
- Kleinrock L (2002). Creating a mathematical theory of computer networks. Opns Res 50: 125–131.
- Altman E (2000). Applications of Markov decision processes in communication networks: a survey. Rapport de Recherche 3984, INRIA.
- Foster I and Kesselman C (eds) (1998). The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufman: San Francisco.
- Braun TD, Siegel HJ, and Maciejewski AA. (2001). Heterogeneous computing: goals, methods and open problems. "PDPTA 2001: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, (ed. Arabnia HR), pp 1–12. CSREA: Athens".
- Becker KJ, Gaver DP, Glazebrook KD, Jacobs PA and Lawphongpanich S (2000). Allocation of tasks to specialized processors: a planning approach. Eur J Opl Res 126: 80–88. | Article |
- Schwartz B (1974). Queueing models with lane selection: a new class of problems. Opns Res 22: 331–339.
- Green L (1985). A queueing system with general-use and limited-use servers. Opns Res 33: 168–182.
- Ross KW and Yao DD (1991). Optimal load balancing and scheduling in a distributed computer system. J Assoc Comput Mach 38: 676–690. | Article |
- Dacre MJ, Glazebrook KD and Niño-Mora J (1999). The achievable region approach to the optimal control of stochastic systems (with discussion). J R Stat Soc B 61: 747–791. | Article |
- Puterman ML (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley: New York.
- Krishan KR (1987). Joining the right queue: a Markov decision rule. In: Proceedings of the 28th IEEE Conference on Decision and Control, pp 1863–1868.
- Tijms HC (1994). Stochastic Models: an Algorithmic Approach. Wiley: Chichester.
- Ansell PS, Dacre MJ, Glazebrook KD and Kirkbride C (2000). Optimal load balancing and scheduling in distributed multi-class service systems. Technical Report, Newcastle University.
- Hajek B (1985). Extremal splittings of point processes. Math Opns Res 10: 543–556.
Appendices
Appendix : Proof of Theorem 1
To minimise notational complexities, we fix m
M and, where possible, drop m from the notation. We shall suppose that job classes at station m are numbered according to the priority accorded them by local scheduling rule um, that is, that um imposes the class priorities 1
2

|G
Dm|. K(N) now stands for the total expected cost incurred in emptying station m from an initial state N when the station operates under um, while receiving generic traffic at rates {pgm*
g}g
G. The key to the proof is in demonstrating that K(N) is quadratic in N in the sense that

for some constants {Kkl}k
G
Dm,l
G
Dm and {Kk}k
G
Dm. As before, we use 1j to denote a |G
Dm|-vector whose jth component is 1, with zeroes elsewhere.
We shall prove (21) by an induction on
, the job class of highest priority under um represented in N. Firstly, consider the case
=|G
Dm|, that is, N = N1G
Dm for some positive integer N. As a preliminary, focus on the initial state 1|G
Dm|. We write
|G
Dm| for the random time it takes to empty the system from 1|G
Dm| and
(1|G
Dm|) for the corresponding random cost. It must then follow that if we now begin in state N1|G
Dm| and process jobs until the station enters state (N - 1)1|G
Dm| for the first time, then the time elapsed is
N|G
Dm| and the cost incurred is

where

The notation

in (23) indicates that the random variables concerned have the same probability distribution. To understand (22) and (23), imagine that (N-
1) of the |G
Dm|-jobs present at time 0 are laid on one side and play no part in the processing. These jobs will incur a cost (N – 1)c|G
Dm|
N|G
Dm|. Otherwise, the situation is stochastically identical to that beginning from 1|G
Dm| above. We now empty the system by considering successive first-time transitions N1|G
Dm|
(N -
1)1|G
Dm|
(N -
2)1|G
Dm|
1|G
Dm|
0. By repetition of the argument to (22) and (23), we can write the total cost incurred in emptying the system from N1|G
Dm| as

In (24), the random variables
j|G
Dm|, 1
j
N, are iid as are
j(1|G
Dm|), 1
j
N. The corresponding expectations are T(1|G
Dm) and K(1|G
Dm|), respectively. From (24) we deduce that the expected cost incurred in emptying the system from N1G
Dm is

which is quadratic in N. The inductive hypothesis is established for the case
=|G
Dm|.
We now suppose that the inductive hypothesis holds whenever initial state N is such that k+1

|G
Dm| and consider the case
=k. Hence, the initial state N is assumed to be

with N>0. As a preliminary, focus on an initial state 1k. We write
k for the random time it takes to empty the system of all jobs from class k (and hence also from classes 1, 2, ..., k-
1) for the first time and
(1k) for the corresponding random cost. At time
k, the (random) state of the system is written

and comprises jobs from classes k+1, k+2, ..., |G
Dm| which arrived at the system during [
0,
k). It must then follow that, if we now begin in state N in (25) and process jobs until the number of class k jobs drops to (N-
1) for the first time, then the time elapsed is
Nk and the cost incurred is

where

The reasoning that yields (27) and (28) is a simple development of the argument following (23). Additionally, the (random) state of the system at time
Nk may be written as

where

We now empty the system of class k jobs by considering successive transitions

By suitable repetition of the argument to (27) and (28) we can write the total cost incurred in emptying the system of class k jobs from state N in (25) as

In (31) and (32), the (Ñl(k + 1), Ñl(k + 2),
, Ñl|G
Dm|), 1
l
N, are iid as are the
lk, 1
l
N, and the
l(1k), 1
l
N. The respective means are written (
(k + 1),
(k + 2),
,
|G
Dm|),
k and &Kcirc;(1k). We also note that
l'N=l+1 Ñl'j and
lk are independent for all choices of j, l. From (31) and (32) we deduce that the expected cost incurred in emptying the system from N is

Consider the system state that is the argument of K in the last term on the r.h.s. in (33). The job class of highest index represented is k+1 and the inductive hypothesis applies. It follows straightforwardly that the final term on the r.h.s. of (33) is quadratic in N as, plainly, are the first three terms. Hence K(N) is quadratic in N for all choices of N.
A similar, but simpler, induction argument yields the conclusion that T(N) is linear in N for any choice of N. We may thus write that

for some constants {tj}j
G
Dm. Note that the absence of constant terms in (21) and (34) is accounted for by the fact that K(0)=T(0)=0. We now restore the station suffix m fully to the notation. We have established that Km(Nm) is quadratic and Tm(Nm) is linear in Nm. It now follows immediately from the expression on the r.h.s. of (4) that, for all j
G
Dm,
mj(Nm) must be linear in Nm. The result follows.
Acknowledgements
We express our appreciation to a referee, whose comments stimulated a range of improvements to the paper, and to the Engineering and Physical Sciences Research Council for supporting the work of the first and second authors through the award of grant GR/M09308. We thank Professor Isi Mitrani of the Department of Computing Science, Newcastle University, for his invaluable input regarding the modelling of computer systems.




