1. Introduction
Deterministic location–allocation problems (LAPs) are concerned with locating a set of facilities and allocating their capacity to satisfy the demand of a set of customers with known locations so that the total transportation cost is minimized. Supply centres such as plants and warehouses may constitute the facilities while retailers and dealers may be considered as customers. When the facility locations have to be selected from a given set of candidate locations, the corresponding LAP becomes a discrete optimization problem. LAP belongs to a difficult family of problems. Sherali and Nordai (1988) have shown that it is NP-hard even if all the customers are located on a straight-line. A continuous LAP is obtained when facilities can be located anywhere in the continuous space. The latter problem is known as the multi-facility Weber problem (MFWP) when the facilities are to be located in the two-dimensional Euclidean space (Wesolowsky, 1993). It becomes the so-called single-facility Weber problem if the objective is to determine an optimal location for a single facility. In some situations, facilities can have capacity constraints, which gives rise to the capacitated multi-facility Weber problem (CMFWP). As can be easily observed, in an optimal solution of the uncapacitated problem each customer is served from the nearest facility, which is not true for the more restricted CMFWP because of the capacity constraints. As a consequence, although there have been exact methods and a multitude of heuristics in the literature to solve the uncapacitated MFWP (Brimberg et al, 2000), the number of proposed solution methods is rather scarce for the CMFWP. Notice that in the CMFWP considered in this work, no fixed setup cost incurs when a facility is opened. Instead, the number of facilities to be opened is given. Besides, the demand of a customer can be satisfied from different facilities and hence the CMFWP is a multi-source problem. If, due to some additional considerations, every customer should be served by a single facility, the problem is formulated as a single-source CMFWP, which requires the use of additional binary variables.
The transportation cost is usually assumed to be directly proportional to the amount shipped and the distance between the facilities and customers over which the transportation occurs. Starting with the seminal work of Cooper (1972) different distance functions have been used in the literature to model the distances in CMFWPs. Among them we can mention the Euclidean, squared Euclidean, rectilinear, and ℓp distances. In this paper, we focus on the CMFWP with rectilinear distances (RCMFWP) the mathematical formulation of which is given in the next section.
The earliest attempt to solve the RCMFWP is due to Vaish and Shetty (1976) where the authors formulated the RCMFWP as a bilinear programming problem by replacing each quantity within absolute sign by the difference of two non-negative variables. This formulation was then used by Sherali and Shetty (1977) to develop a cutting plane algorithm, whose efficiency was improved later on by Shetty and Sherali (1980) via deeper cuts and Newton-based parametric programming techniques. They also extended the model to consider more general RCMFWP problems involving multiple products and product flow interactions among facilities. More recently, Sherali et al (1994) developed an efficient algorithm for the RCMFWP by applying the reformulation-linearization technique (RLT) (Sherali and Adams, 1999) on an equivalent mixed integer bilinear programming formulation of the RCMFWP. Their new formulation is based on the early results of Wendell and Hurter (1973), and Hansen et al (1980).
In this paper, we propose a new mixed integer linear programming (MILP) formulation that is more efficient than that of Sherali et al (1994). In Section 2 we explain the new formulation, which is followed in Section 3 by the description of the new heuristics based on this formulation. Section 4 contains the computational results where we compare the performance of the heuristics with those of existing methods in the literature both in terms of solution quality and computation time. Conclusions and directions for future research are given in the last section.
2. Discrete formulations of the RCMFWP
The mathematical formulation of the RCMFWP can be stated as




Here, n is the number of customers and m is the number of facilities to be located. dj and aj=(aj1,aj2)T represent, respectively, the demand and coordinates of customer j. The capacity of facility i is given by si and xi=(xi1,xi2)T denotes its unknown coordinates. The allocations wij are also unknown and represent the amount to be shipped from facility i to customer j with the unit shipment cost per unit distance being cij. This formulation assumes that the problem is balanced, that is, the total supply is equal to the total demand. If the total supply is larger than the total demand, the problem can be made balanced by adding a dummy customer with zero unit shipment cost. If the total supply is less than the total demand, there exists no feasible solution. Note that when allocations wij are known, the RCMFWP reduces to a pure location problem separable into m rectilinear single-facility location problems, each of which can be solved by the median location method (Love et al, 1988). On the other hand, when the locations of the facilities are given, the RCMFWP becomes the ordinary transportation problem. Although pure location and transportation subproblems are easy to solve when considered separately, the RCMFWP belongs to a difficult family.
As shown by Hansen et al (1980), the RCMFWP has always an optimal solution with facilities located within the convex hull of the set of customers and moreover the facility locations occur at the intersection points of vertical and horizontal lines drawn through the customer locations. This implies that all customer locations are candidate sites to locate facilities. Thus, the RCMFWP can be viewed as a problem of determining the best combination of intersection points with respect to the total transportation cost. We illustrate the idea by an example in which there are eight customers with coordinates {(15,8)T,(8,19)T,(22,46)T,(32,43)T,(13,28)T,(10,31)T,(49,21)T,(41,47)T} which are shown with "
" in Figure 1. Observe that although there is a total of 64 intersection points, only 34 of them lie within the convex hull of the customer locations.
Figure 1.
Intersection points and the convex hull of an eight-customer problem.
Full figure and legend (18K)The previous result is actually the starting point of the solution procedure developed by Sherali et al (1994) where the authors obtain a discrete formulation of the RCMFWP. By using new assignment variables yik, i=1,...,m,k=1,...,K with yik=1 if facility i is located at intersection point k and yik=0 otherwise, they give the following formulation


This discrete LAP is a mixed integer bilinear programming problem where (bk1,bk2)T are the coordinates of intersection point k. Note that it is now possible to calculate the rectilinear distance between the kth point and jth customer as (|bk1-aj1| + |bk2-aj2|). Sherali et al (1994) define new coefficients cijk=cij(|bk1-aj1| + |bk2-aj2|) to denote the cost of sending one unit from facility i located at point k to customer j and replace the objective function with the equivalent expression
i=1m
j=1n
k=1Kcijkwijyik. Notice that the new objective function is the sum of nonlinear terms each being the product of a zero-one variable with a continuous one. Therefore, the authors adopt a specialization of the general procedure proposed by Adams and Sherali (1993), which essentially uses the reformulation-linearization technique (RLT) proposed by Sherali and Adams (1999), and reformulate this problem as a mixed integer linear programming problem that possesses a tight linear programming relaxation. A cutting plane procedure is employed to strengthen the developed linear programming relaxation. In order to obtain a quick lower bound on the linear programming problem, a Lagrangean relaxation scheme is applied, which requires the solution of transportation and knapsack subproblems. The Lagrangean dual problem is solved using a modified version of the conjugate subgradient algorithm. This lower bounding technique is embedded within a branch-and-bound algorithm that implicitly enumerates the locational decision space. Furthermore, a heuristic method is proposed to derive upper bounds. This method enables the authors to solve problems of sizes ranging from (m,n)=(4,20) to (5,12) within one percent optimality and problems of size (5,20) within five percent optimality. Unfortunately, the whole scheme becomes computationally very intensive as the number of variables increases. For example, for a RCMFWP with five facilities and 20 customers the initial lower bounding problem has over 20 000 variables and approximately that many constraints. In the following, we propose two new formulations with a smaller number of variables, which are also based on the result of Hansen et al (1980). As will be observed, the second formulation is actually an aggregation of the first one.
Let the continuous variables uikj be the amount shipped from facility i located at point k to customer j, and binary variables yik determine whether facility i should be located at point k or not. The cost coefficient cikj represents the unit shipment cost from facility i located at point k to customer j. The first one of the new formulations referred to as the DLAP is given below.
DLAP:





Constraints (8) guarantee that the demand of each customer is satisfied. Constraints (9) force each facility i to be located at exactly one of the intersection points k=1,...,K. Constraints (10) ensure that the total amount shipped from facility i located at point k is equal to its capacity. Clearly, if there is no open facility at point k, then no shipment can originate there. Notice that equalities (8) and (10) can be replaced, respectively, with inequalities
and
; but this does not affect the optimal solution since the problem is balanced. This formulation allows the opening of more than one facility at the same point. Besides, it is a mixed integer linear programming model, and therefore an additional linearization effort with unfavourable consequences such as the increase in the number of variables and constraints is avoided. An instance of the DLAP with n customers, m facilities, and a maximum number of n2 intersection points within the convex hull results in mn2 binary variables yik and mn3 continuous variables uikj in the worst case. This instance can be solved to optimality by one of the existing commercial solvers that use standard methods of integer programming. Unfortunately, due to the hardness of the problem the required computation time may grow exponentially with any exact method.
As mentioned before, the formulation of the DLAP allows the opening of more than one facility with different capacities at a given point k. Moreover, the unit shipment cost does not only depend on the location of a facility, but also on the facility itself. This means that sending the same amount of goods from two different facilities located at the same point to the same customer may incur different costs. The formulation becomes more compact if these facilities are uniform and the unit shipment cost is only location-dependent. In this case, we redefine flow variables ukj as the amount shipped from point k to customer j, and cost coefficients ckj as the unit cost of shipment from point k to customer j. The latter is obtained by multiplying the unit shipment cost per unit distance that is independent of facility i with the rectilinear distance between point k and customer j. In fact, this second formulation can be obtained directly from DLAP by setting cikj=ckj for all i=1,...,m and using the aggregated flow variables ukj=
i=1muikj in the objective function as well as constraints (8) and (10).
3. New heuristics for the RCMFWP
Solving the RCMFWP to optimality is a difficult task even for moderate-sized instances. As a result, efficient and accurate heuristics are necessary. We propose three heuristic methods in this section. Although they are all based on the formulation of the DLAP, they are different in their philosophy. The first one applies Lagrangean relaxation and the subgradient optimization in order to determine good lower and upper bounds for the RCMFWP. The main motivation of the last two heuristics is that small instances of the RCMFWP can be solved efficiently by any commercial integer programming solver. As mentioned before, the size of the DLAP increases by the number of intersection points within the convex hull of customer locations. Therefore, eliminating intersection points that are unlikely to be an optimal facility location by using intelligent procedures can result in a smaller DLAP which can be solved efficiently. As a result, both of these two heuristics try to determine promising candidate sites to be used in the formulation of a smaller DLAP.
3.1. Lagrangean heuristic
Lagrangean relaxation (LR) is a technique commonly used to find lower (upper) bounds for an integer or mixed integer linear programming problem with a minimization (maximization) objective function (Fisher, 1981). It is based on the idea that in an integer programming problem there are some constraints that make the problem difficult to solve, and the removal of these "hard" constraints from the constraint set results in a problem which can be solved easily, sometimes even by inspection. It is well-known that for any MILP formulation the lower bound that can be obtained by LR is at least as good as the one provided by the linear programming (LP) relaxation. In fact, for the case of DLAP the lower bound of the LP relaxation is far from being satisfactory as stated in the following proposition.
Proposition 1
The LP relaxation of the DLAP has always zero optimal value.
Proof
Let
,
, and
be respectively the set of facilities, customers and intersection points. Clearly |
|=m, |
|=n, |
|=K. Denote the location of customer j
and intersection point k
as locj and lock, respectively. Let yik=
j:locj=lockdj/
j
dj for i
,k
. This choice of yik satisfies constraint (9) because
k
yik=
k
(
j:locj=lockdj/
j
dj)=1. The second equality follows from the fact that 

. The total available capacity at intersection point k
then becomes

since the problem is balanced. This means that the total demand of customers located at locj is fully satisfied from a facility fractionally opened at the same site. This solution, however, has a zero objective value. 
In order to solve the DLAP by LR, we relax the demand constraints (8) with Lagrange multipliers
j,j=1,...,n to obtain the relaxed problem RDLAP.
RDLAP:

Note that RDLAP decomposes by facility; that is, each facility can be considered separately. The formulation of the subproblem associated with facility i is
RDLAPi:





Notice that simple upper bounds uikj
dj are added to the formulation. Although they are redundant for the original problem DLAP, they may improve the value of the lower bound to the DLAP. An important characteristic of RDLAPi is that it does not have the integrality property, that is, the solution to RDLAPi is changed when the integrality restriction on yik is replaced by its linear relaxation 0
yik
1. This has the consequence that the maximum lower bound attainable from RDLAP will be greater than the value of the LP relaxation of the DLAP, which is zero. Another important characteristic of RDLAPi is that it can be solved in polynomial time.
Assume that the multiplier vector
is fixed. Facility i has to be located at one of the intersection points. In order to find the best point, we use a "greedy" inspection procedure where for each point k we determine those customers that are supplied from facility i when located at point k so that the shipment cost ZLRik=
j=1n (cikj-
j)uikj is minimized subject to
j=1nuikj=si and simple upper bounds uikj
dj j=1,...,n. This is equivalent to setting yik=1 for one specific candidate location k and 0 for the rest. This type of problems are known as continuous knapsack problems and can be solved in O(n) times (Martello and Toth, 1990). The approach we have adopted here is conceptually simpler, but computationally less efficient. It can be found in any standard textbook of linear programming (Bazaraa et al, 2004). A sorting effort is required at the beginning which results in O(nlog(n)) time complexity. In the first step of the procedure, coefficients
j=(cikj-
j) are calculated for each customer j. These coefficients are then sorted in non-decreasing order
j(1)
j(2)


j(n). Customers are denoted as j(1),j(2),...,j(n) when this order is respected. Starting with customer j(1) having the smallest coefficient
j(1), we determine the quantities that have to be sent to the customers in such a way that constraints (16) and (17) are not violated. The quantity shipped to the first customer in the list is found as the minimum of the remaining capacity of facility i and the demand of that customer. Initially, the remaining capacity, sirem, of facility i is equal to its capacity si. If the demand of this customer is totally satisfied, that is, dj(1)
sirem, we update the remaining capacity of facility i by setting sirem
sirem-dj(1). If the remaining capacity does not allow satisfying all the demand of the next customer in the list, we ship all the remaining capacity. In other words, for any customer j(l), the shipment uikj(l) is equal to min (sirem,dj(l)). This allocation scheme minimizes the objective function ZLRik to reach the minimum value ZLRik*. We repeat this procedure for each intersection point k=1,...,K. The optimal location of facility i occurs at a point where ZLRi is minimized. As a result, the optimal objective value ZLRi* is determined by setting ZLRi*=minkZLRik*. As soon as we solve subproblems RDLAPi to optimality, we can calculate the optimal objective value of the RDLAP as ZLR*=
i=1mZLRi*+
j=1n
jdj for fixed values of
j. The steps of the method used to solve the relaxed problem RDLAP are formally given below.
Algorithm 1
Solution of the RDLAP for fixed 

ZLR* is a lower bound on the optimal solution of the DLAP for any Lagrange multiplier vector 

n. To find the best lower bound the Lagrangean dual max
ZLR* has to be solved. This can be achieved by using the following subgradient algorithm.
Algorithm 2
Subgradient algorithm
- Initialization: Set
j=0. - Repeat steps 3, 4, and 5 for a predetermined number of iterations.
- With the current values of
j solve the relaxed problem RDLAP associated with DLAP. - Set ZLR*=
i=1mZLRi*+
j=1n
jdj. - Update the multipliers by setting
j=
j+
(dj-
i=1m
k=1Kuikj) where

is the step length.
At each iteration of the subgradient algorithm, we have to determine the step length
to update the multipliers. This requires an upper bound ZUB on the optimal objective value of the RDLAP, which can be calculated using a method that can find a feasible solution to the DLAP. We propose two approaches for this purpose. The first one is solving a transportation problem by using the unit shipment costs between the facility locations provided by the optimal solution to the relaxed problem RDLAP and customer locations. The second approach is based on the two-phase heuristic of Cooper (1972) in which starting from a given set of initial locations, allocation (transportation) and location subproblems are alternately solved until no improvement is possible. The optimal facility locations obtained by solving the RDLAP are used as the initial locations. The two-phase heuristic for the RCMFWP is given next.
Algorithm 3
Two-phase heuristic
- Locate the facilities at arbitrarily selected intersection points xi=(xi1,xi2)T,i=1,...,m.
- Calculate the rectilinear distance (|xi1-aj1| + |xi2-aj2|) between each customer–facility pair.
- Using the distances define cost coefficients
ij=cij(|xi1-aj1| + |xi2-aj2|) and solve the transportation problem {min
i=1m
j=1n
ijwij:
i=1mwij=dj,
j=1nwij=si,wij
0} to determine allocations wij. - Using allocations wij solve m rectilinear distance single-facility Weber problems by the median location method.
- Repeat Steps 2, 3, and 4 until either the facility locations xi=(xi1,xi2)T or the allocations wij remain unchanged.
The two-phase heuristic starts from an arbitrary initial location and ends up at a local minimum by alternatively solving transportation and rectilinear multi-facility location problems, which are respectively done at Steps 3 and 4 of Algorithm 3. As can be observed, the solution of the rectilinear multi-facility location problem at Step 4 reduces to the solution of m rectilinear single-facility location problems since the solution of the transportation problem at Step 3 provides not only the allocations wij, but also the subset of customers supplied by each facility. Subsequently, each facility is moved to the best location with respect to its customers. Using the new locations, the cost coefficients are calculated and the transportation problem is solved again. In order to avoid inferior local minima, the multi-start version of the heuristic can be employed, where each run is performed by selecting different initial locations for facilities. As a reminder we want to mention that the original version of the two-phase heuristic was proposed for the Euclidean distance CMFWP (Cooper, 1972), and the Euclidean distance single-facility location problems were solved by using Weiszfeld's algorithm (Weiszfeld, 1937) during the location phase. However, we use the median location method in each location phase since distances are not Euclidean but rectilinear.
The advantage of Lagrangean relaxation and the corresponding Lagrangean heuristic is that we can compute both lower and upper bounds for moderate-size instances of the DLAP which cannot be solved to optimality by the branch-and-bound method employed in general-purpose solvers within reasonable amount of computation time. On the other hand, there may exist a duality gap which measures the sharpness of the best lower bound by computing the difference between the optimal objective value of the DLAP and the best lower bound obtained by solving the Lagrangean dual problem. Indeed, computational results reported in the next section indicate that the duality gap is about four percent on average (see Tables 10 and 11). On the other hand, the quality or sharpness of the best (smallest) upper bound is dependent on the approach that is used in generating feasible solutions to the DLAP. Based on the experimental results, we find out that using the two-phase heuristic turns out to be a better strategy than solving the transportation problem.
Table 10 - Results of the LH when upper bounds are obtained by solving the transportation problem.
3.2. P-median heuristic
The p-median heuristic developed by Hansen et al (1998) for the Euclidean distance uncapacitated MFWP is based on the observation that in an optimal solution to this problem the facility locations are usually close to the customer locations. In the first step of this heuristic, which yields very good results, a p-median problem is solved by taking the customer locations as candidate sites for the facilities. Then, for each facility and the set of customers it serves, a Euclidean distance single-facility Weber problem is solved by applying Weiszfeld's algorithm (Weiszfeld, 1937). Notice that the solution of the p-median problem provides initial solutions for the single-facility Weber problems.
The same idea can also be adopted for the solution of the RCMFWP. In this approach rather than considering the intersection points within the convex hull of customer locations as candidate sites for the facilities, only customer locations are taken into account in the formulation of the DLAP. Also note that all customer locations are intersection points as well. This significantly reduces the size of the mixed integer linear program DLAP. There are at most n2 intersection points within the convex hull for an instance of the RCMFWP with n customers. Hence, the number of binary variables yik decreases from mn2 to mn resulting in an n-fold reduction. There is another instance-dependent factor which also affects the size of the DLAP. This is the degree of collinearity of customer locations. As the collinearity of customer locations increases, their proportion within the set of all intersection points increases as well. For such cases, the probability that optimal facility locations coincide with customer locations is higher and the solution of the reduced DLAP gives a better approximation to the optimal solution of the RCMFWP.
We first solve the reduced DLAP to optimality and obtain a solution in which optimal facility locations coincide with some of the customer locations. Using them as initial locations, we then apply the two-phase heuristic given in Algorithm 3. Notice that the reduced DLAP replaces the p-median problem in our case.
3.3. Cellular heuristic
Like the p-median heuristic, the third heuristic we propose also aims at obtaining a reduced DLAP formulation. However, rather than using the customer locations as candidate facility sites, we try to determine promising candidate sites by means of the two-phase heuristic, which are then used in the reduced DLAP formulation.
The selection of the promising candidate sites is crucial since it has a significant effect on the solution quality of the reduced DLAP. Gamal and Salhi (2003) propose a cellular heuristic to solve the Euclidean distance uncapacitated MFWP. In this approach, the rectangle covering all customer locations is divided into cells and the cells compete with each other with respect to their potential of containing the optimal facility locations. We adopt this idea and tailor it for the solution of the RCMFWP. First, we run the two-phase heuristic for a large number of times starting with initial facility locations chosen arbitrarily from the set of the intersection points. Then, we draw a rectangle large enough to cover all the locally optimal facility locations obtained as a result of the runs. The rectangle is divided into cells according to a predefined resolution. Given a cell and facility locations that reside within this cell, we compute the centroid of the facility coordinates that can be considered as the representative point for the cell. Empty cells that do not contain any facility locations do not have representative points. The representative points corresponding to the nonempty cells are subsequently used to formulate a reduced DLAP.
We illustrate our cellular heuristic in Figure 2. The customers have the same coordinate values as depicted in Figure 1 and are represented by "
". Locally optimal facility locations obtained by the two-phase runs are shown by "
". The representative points used in the formulation of the reduced DLAP are depicted by "
". They are the centroids of the facility locations within the same cell. We now describe the steps of the cellular heuristic more formally.
Assume that T two-phase runs are performed. Let xi(t)=(xi(t), yi(t))T: i=1,...,m; t=1,...,T be the (x,y) coordinates of facility i obtained in the tth two-phase run. We define the minimum and maximum of the x-coordinates obtained in T runs as xmin=mini,t{xi(t)} and xmax=maxi, t{xi(t)}, respectively. ymin=mini,t{yi(t)} and ymax=maxi,t{yi(t)} are defined similarly. Let Nx and Ny denote the number of cells (partitions) along the x-axis and y-axis, respectively. Also let Lx and Ly represent the width and height of the cells, which are given as Lx=(xmax-xmin)/Nx and Ly=(ymax-ymin)/Ny. Creating a large number of cells by choosing high values for Nx and Ny would require a large number of two-phase runs in order to reach a meaningful distribution of locally optimal locations over the cells. On the other hand, having only a few cells would result in the aggregation of different locations in one cell. In this case, the information conveyed by the facility locations of different two-phase runs would be lost. In the experiments, we set the number of the cells equal to 25 by setting Nx=Ny=N=5 initially.
The next step is to assign each facility location {xi(t): i=1, ..., m; t=1,..., T} found by the two-phase heuristic into the appropriate cell. To do so, each cell is defined by its bottom left corner with coordinates (xcx,ycy)T: 1
cx,cy
N where xcx=xmin+(cx-1)Lx and ycy=ymin+(cy-1)Ly. As a result, facility i with coordinates (xi(t),yi(t))T found in the tth two-phase run belongs to the cell with bottom left corner (xcx,ycy) if and only if xcx
xi(t)<xcx+1 and ycy
yi(t)<ycy+1. In order to handle the case in which xi(t)=xmax and/or yi(t)=ymax, we add a small positive number
to xN and/or yN so that each facility belongs to a cell of the grid.
Once each facility xi(t) is assigned to exactly one cell, we eliminate the empty cells from further consideration. For each nonempty cell p we compute the centroid of the facility locations which are found in that cell. The centroid of cell p designated as
p=(
p,
p) represents a candidate site for the facilities to be used in the formulation of the reduced DLAP. Notice that in this procedure we do not differentiate between facilities when we find the centroids. This is best explained by an example. Suppose that for a problem with five facilities, a cell contains the locations for facility 1 and 2 given by two different two-phase runs and no other run locates a facility in this cell. We average the coordinates of these two locations and the resulting centroid is considered as a candidate site not only for facilities 1 and 2 but for all five facilities. An alternative approach would be to find the centroids (and thus candidate sites) for each facility separately.
The advantage of the cellular heuristic is that the reduced DLAP formulated using the centroids of the nonempty cells is smaller in size than its counterpart formulated with all intersection points as candidate sites. Thus, it is possible to solve the reduced DLAP in a reasonable amount of time by a commercial solver. However, it must be kept in mind that the centroids associated with nonempty cells do not necessarily coincide with the intersection points within the convex hull. Therefore, the facility locations obtained by solving the reduced DLAP to optimality may not coincide exactly with intersection points. At this step, we resort again to the two-phase heuristic which eventually gives a solution for the original RCMFWP.
There are two implementation issues for the new cellular heuristic. The first one is on the number of the two-phase runs. A large number of runs would certainly increase the chance of obtaining good results at the expense of increased computational effort. On the other hand, generating a relatively small number of local optimal facility locations may produce inaccurate candidate sites. In case the generated local optimal solutions do not exhibit a clustering tendency, additional two-phase runs need to be carried out. The second issue is related to determining initial locations for the two-phase heuristic. If different two-phase runs are started with similar or the same initial locations, it is highly probable that the same solutions are produced for a number of times. To avoid this, initial locations have to be generated in such a way that intersection points are selected with approximately the same frequency.
4. Computational results
In this section, we assess the performance of the new methods both in terms of the solution quality and computation time on 33 test instances. The first eight instances are from the literature and we refer to them by the same labels as used in the original papers. Instances 16, 23, 26, 29, 30 are from Sherali et al (1994) and instances 8, 9, and 15 are from Sherali et al (2002). The selection of these instances is based on the availability of the input data (ie, xi, dj, si) in the respective papers. Instances I1–I20 are generated randomly as follows. The x- and y-coordinates of the customers are chosen independently from discrete uniform distributions in the interval [0,99]. Their demand also follow the same type of distribution, but in the interval [1,50]. The generation of the capacities is slightly more complicated. First, the total demand of the customers is averaged over the number of facilities in the instance, say cap. The capacity of each facility except one is then assigned to a value determined from a discrete uniform distribution in the range
. If the total assigned capacity does not exceed the total demand, the capacity of the last facility is found by subtracting the total assigned capacity from the total demand, which ensures that a balanced instance is obtained. If the total assigned capacity turns out to be more than the total demand, we generate a new set of capacity values for the facilities. The last five instances (I21–I25) have the same number of customers and their demands are also equal. They differ in the clustering tendency of the customer locations. Customers of I21 are uniformly distributed within a 99
99 square whereas customers for instances I22, I23, I24, and I25 are generated in such a fashion that there are two, three, four, and five clusters, respectively, within the same 99
99 square. This is done as follows. First, coordinates are determined for the centroids of each cluster. Then, using the x-coordinate (y-coordinate) of the centroids as the mean we generate the x-coordinate (y-coordinate) of the customers from a normal distribution with the standard deviation being equal to eight. An equal number of customers belong to each cluster except instance I23 that consists of three clusters. In this case, one of the clusters has one more customer than the others. The cluster centroids are outlined in Table 1.
For each instance, we give in Tables 2 and 3 the number of facilities to be located (m), the number of customers (n), the number of intersection points within the convex hull of customer locations (K), the optimal or best objective value and the CPU time used by the mixed integer programming solver Cplex 9.1 employed with the default settings on. We set a time limit of 4000 s. This means that if at the end of 4000 seconds Cplex 9.1 does not obtain a provably optimal solution, then we report the current best integer solution found thus far. An asterisk (*) next to the instance label in Tables 2 and 3 indicates that this instance has been solved to optimality.
For the existing benchmark instances we also provide the CPU times required by the exact procedure of Sherali et al (1994). The CPU times of instances 16, 23, 26, 29, 30 are those reported in Sherali et al (1994) using an IBM 3090 machine. For instances 8, 9, 15, the CPU times are taken from Sherali et al (2002) where the authors conduct the experiments using a Sun Ultra 1 workstation. New methods are coded in Microsoft Visual Basic 6.0 and run on a desktop computer equipped with a 3.2 GHz Pentium 4 processor having 2 GB of RAM. The speed of our hardware platform is nearly eight times as fast as an IBM 3090 machine and four times as fast as a Sun Ultra 1 workstation [http://www.netlib.org/benchmark/performance.ps accessed 01 February 2006]). To give an idea of the efficiency of the previous methods in terms of our platform we convert the reported CPU times using the above-mentioned factors and give them in Table 2.
The code for the transportation simplex method that is utilized in solving the transportation problem in the two-phase heuristic as well as in the Lagrangean heuristic (to compute the upper bounds) was adapted from the software available in the Operations Research Laboratory [http://www.orlab.org/software/or_prog/or1/index.html accessed 20 January 2006]. To find the convex hull of the customer locations, we implemented Graham's scan (Graham, 1972). The reduced DLAPs obtained during the application of the p-median heuristic as well as the cellular heuristic are solved by Cplex 9.1.
4.1. Two-phase heuristic
For each test instance, we perform 25 runs with different initial facility locations that are selected arbitrarily from the set of all intersection points (not necessarily within the convex hull). In Table 4 we present the results that are expressed as a percent deviation from the optimal/best objective value for all instances except I23, I24, and I25. The percent deviations are calculated as 100
(Z-Zbest)/Zbest where Zbest denotes the optimal/best objective value found by Cplex 9.1. Since Cplex does not yield a solution within the allowed time limit of 4000 s for the last three instances, the corresponding entries in Table 4 are objective values rather than percent deviations.
The third, fourth, and fifth columns of the table show, respectively, the best, average, and worst objective values attained in 25 runs. The sixth column contains for each instance the total CPU time of 25 runs. The average number of iterations for a single two-phase run is given in the last column. As can be observed by the worst percent deviations, these solutions can be significantly worse than the optimal/best one. On the other hand, we obtain the optimal solution for instances 9, 16, and 26.
We also carried out experiments where the two-phase heuristic is started with facility locations arbitrarily selected from the set of intersection points that lie within the convex hull of customer locations. The results were not better than those of Table 4 and the additional computations required to find the convex hull and to check whether a randomly selected intersection point lies within the convex hull is not worth the effort. The most significant benefit of the two-phase heuristic is that it is very fast and can be applied to any RCMFWP instance regardless of its size. Although the number of iterations per run and the computation time increase with the size of the instance, even for one of the largest instances in the set (instance I21), it takes only 18.52 s to perform 25 runs.
The two-phase heuristic is very sensitive to the initial locations of the facilities as pointed out by Cooper (1972). This is the reason why a large number of runs have to be performed in order to possibly obtain good solutions. To investigate the dependence of the solution quality on the selection of the initial locations, we carry out the following experiment. The optimum facility locations are randomly perturbed by a certain amount which is a percentage of the difference between the maximum and minimum coordinates of the customer locations. The new locations are then used as starting locations in the two-phase heuristic. Our aim is to see whether or not the two-phase heuristic is able to generate near-optimal solutions when it is started with facility locations that are close to optimal ones. Table 5 includes the results when 2.5, 5, 10, 25, and 50% of the coordinate differences are randomly added to or subtracted from the optimum facility locations. For each test instance, we report the average percent deviation from the optimal objective value obtained over 25 replications.
It can be seen that the solutions provided by the two-phase heuristic are fairly robust to the perturbations that are within 10% of optimal locations. Therefore, we conclude that if the initial locations of the facilities are selected close enough to optimal ones, the two-phase heuristic is able to find the optimal solution. To this end, we utilize the p-median heuristic and the cellular heuristic the details of which are explained previously. Recall that both of these heuristics make use of the solution of a reduced DLAP formulation. The heuristics differ only in the selection of the candidate facility sites used in the formulation.
4.2. P-median heuristic
Each test instance except the last five (ie, I21–I25) can be solved using the p-median heuristic where the reduced DLAP is formulated by considering only the customer locations as candidate facility sites. The reason why instances I21–I25 cannot be solved by Cplex 9.1 is simply the fact that the required solution time increases exponentially with the number of customers. Table 6 includes the percent deviations from the optimal objective values. The third column of the table provides the results obtained by solving the reduced DLAP while the fourth column indicates the results improved by running the two-phase heuristic as a post-processing step. The CPU time given in the last column is the total computation time spent for solving the reduced DLAP and application of the two-phase heuristic. However, we have to mention that the time required for the two-phase heuristic is negligible in all cases. Hence, the reported CPU times are essentially the time required to solve the p-median problem by Cplex 9.1.
The solutions attained are better than or equal to those found by Cplex 9.1, except for instances 29, 30, I4, and I6. The two-phase heuristic yields an improvement in 23 out of 28 test instances. Among the remaining five instances, optimal solutions for three instances are already found by solving the reduced DLAP. The results are quite promising since especially for large test instances except I20, the total CPU time is significantly smaller than the time limit set to 4000 s. Furthermore, for the instances solved by Sherali et al (1994, 2002) the computational effort of the p-median heuristic is considerably less than that reported in the literature (see Table 2).
4.3. Cellular heuristic
The first step of the cellular heuristic involves running the two-phase heuristic for T=25 times. Then we determine the largest rectangle covering the locally optimal facility locations obtained during these 25 runs and divide it into 25 cells as explained previously. The empty cells that do not contain any locations are eliminated from further consideration. The centroid of the facility locations in each nonempty cell is computed and used in the formulation of the reduced DLAP. Note that a centroid may not overlap with an intersection point within the convex hull. This does not pose any problem because a candidate site close to an intersection point that is likely to be an optimal facility location will be included in the optimal solution to the DLAP and further improved by the two-phase heuristic. Thus, the last step of the cellular heuristic is the application of the two-phase heuristic one more time where the facility locations are initialized with those provided by the solution of the reduced DLAP.
Table 7 contains the results in terms of the percent deviation from the optimal/best objective value except instances I23, I24, and I25 for which objective values are presented. The third column of the table gives the number of candidate sites in the reduced DLAP formulation. Columns 4 and 5 contain the results obtained by solving the reduced DLAP and by the subsequent improvement of the two-phase heuristic, respectively. As is the case with the p-median heuristic, the CPU time is the total running time of the reduced DLAP solution and the two-phase heuristic. Very good solutions are attained with a relatively low computational effort. The average accuracy of the solutions is -0.67%.
In order to check the dependence of the solution quality on the number of cells and the number of initial two-phase runs, we first increase the number of cells to 36 by dividing each axis into six equal intervals while utilizing the original 25 two-phase runs. Then, keeping the same number of cells we perform 100 two-phase runs. The results are presented in Table 8 with respect to each setting, that is, 25 runs/36 cells and 100 runs/36 cells. The last row indicates that the average accuracy of the solutions goes up from -0.67 to -0.75% when 25 runs are executed and 36 cells are formed. With 100 two-phase runs and 36 cells, the average accuracy further improves to -1.27%. However, the improvement comes at the expense of extra computational effort as indicated by the average CPU times. Increasing the number of cells from 25 to 36 enlarges the size of the reduced DLAP as shown by the average number of candidate sites. This is the reason why the average CPU time rises from 21.00 up to 56.03 s. Performing 100 two-phase runs rather than 25 further amplifies the computational effort to 93.04 s not only because of the increased number of runs but also due to the increase in the number of candidate sites, that is, nonempty cells.
Our experiments show that employing the two-phase heuristic a number of times is a very useful approach to discover those regions that contain promising candidate sites that allow DLAP formulations of smaller sizes. In our experiments almost half of the cells (and intersection points within these cells) on average are eliminated using the random two-phase results.
In Table 9 we make a comparison among the three heuristics' performances. The first column of this table contains the best objective value obtained for each instance by the heuristics. (For the cellular heuristic we use the version with 25 two-phase runs and 25 cells.) This value is taken as the reference point to calculate the relative deviations. Clearly, the relative deviation of the heuristic yielding the best objective value becomes zero percent. As can be seen, for some instances more than one heuristic may provide the best value. The best values marked by an asterisk (*) are smaller than the objective value of the best feasible solution found by Cplex 9.1 in 4000 s.
4.4. Lagrangean heuristic
In this subsection, we present the results obtained by solving the DLAP using the Lagrangean heuristic (LH). It is important to note that we are dealing with DLAP formulations that involve all the intersection points. The update of the Lagrange multipliers is performed via subgradient optimization. The initial value of the parameter
that is used in the update Equation (19) is set to two and halved every 30 iterations without any improvement in the best lower bound obtained. As explained earlier, two different alternatives are implemented in generating the upper bounds (feasible solutions): solving the transportation problem with the facility locations found from the relaxed problem and running the two-phase heuristic with the initial locations calculated by solving the relaxed problem.
To make a fair comparison with the other methods considered in the paper, the experiments are conducted with the time limit of 4000 s. It is implemented by finishing the iteration at which the time limit is reached and terminating the execution of the LH. With this strategy, the reported CPU times may be slightly larger than the time limit. The results in Tables 10 and 11 indicate that when upper bounds are generated using the two-phase heuristic, better results can be obtained as measured by the sharpness of the best upper bound (BUB). It is calculated as the percent deviation of the BUB from the best objective value obtained by either Cplex 9.1 or one of the three heuristics including all the versions of the cellular heuristic. In other words, for each instance the best values in column 2 of Tables 10 and 11 are the optimal/best objective values compiled from Tables 2, 3, 8, and 9. We note that the best lower bounds (BLBs) are always less than the corresponding best objective values, which implies the existence of a duality gap for all test instances. We measure the sharpness of the BLBs by the percent deviation 100
(Zbest-BLB)/Zbest. However, it is important to note that our primary objective is to find good solutions for the RCMFWP and we observe that LH coupled with the two-phase heuristic to generate the upper bounds can find very good solutions. On the average, the sharpness of the BUB is 1.03% with the maximum being 8.57% for the instance I17. Note that with LH it is even possible to find better solutions than the other three heuristics, which is the case for I25.
When the time limit restriction is dropped and the LH is terminated whenever the value of
decreases below 0.0005 or the number of iterations exceeds 800, better lower and upper bounds can be obtained. In this case, the average sharpness of the BLB and the average sharpness of the BUB obtained by the LH with the two-phase heuristic become 3.38 and 0.79%, respectively. However, the execution time on instances I21–I25 takes more than 20 000 s.
5. Conclusions
In this paper, we consider the rectilinear distance capacitated multi-facility Weber problem (RCMFWP), which is a continuous location–allocation problem with a nonconvex objective function. We first propose a new mixed integer linear programming (MILP) formulation. Although an optimal solution to the RCMFWP can be obtained by solving it with general-purpose MILP solvers, this approach becomes inefficient for even moderate-sized instances.
Three heuristic methods have been proposed and both their accuracy and efficiency are evaluated on 33 test instances eight of which are available in the literature. One of the heuristics is based on the Lagrangean relaxation of the new formulation. Although the best lower bounds calculated by the subgradient optimization algorithm are not sharp, that is, there exist duality gaps, the best upper bounds generated are quite promising. This shows that in terms of the solution quality the Lagrangean heuristic can be a choice for the solution method.
The other two heuristics have two common characteristics. First, they are motivated by the idea of reducing the size of the MILP formulation. The p-median heuristic achieves the reduction by taking into account only the customer locations thereby eliminating all other intersection points from the formulation. The cellular heuristic relies on the two-phase heuristic in discovering regions with promising intersection points and eliminates others belonging to the empty cells of a grid with a predefined resolution. The reduced formulation is obtained by using the centroids of the facility locations within the nonempty cells. The second common characteristic is that both heuristics employ the two-phase heuristic as a post-processing step where the optimal solution of the reduced problem is used as the initial facility locations of a final two-phase run.
As a future research direction, we consider other ways of reducing the number of intersection points by identifying the promising ones. This requires a more accurate clustering of the facility locations provided by the two-phase runs. Candidate sites can then be determined within the clusters and used in the reduced DLAP.
References
- Adams WP and Sherali HD (1993). Mixed-integer bilinear programming problems. Math Programming 59: 279–306. | Article | ISI |
- Bazaraa M, Jarvis JJ and Sherali HD (2004). Linear Programming and Network Flows, 3rd edn.. John Wiley and Sons: USA.
- Brimberg J, Hansen P, Mladenovi
N and Taillard ED (2000). Improvements and comparison of heuristics for solving the uncapacitated multisource Weber problem. Opns Res 48: 444–460. - Cooper L (1972). The transportation–location problem. Opns Res 20: 94–108.
- Fisher M (1981). The Lagrangean relaxation method for solving integer programming problems. Mngt Sci 27: 1–18.
- Gamal MDH and Salhi S (2003). A cellular heuristic for the multisource Weber problem. Comput Opl Res 30: 1609–1624. | Article |
- Graham RL (1972). An efficient algorithm for determining the convex hull of a finite planar set. Inform Process Lett 7: 175–180.
- Hansen P, Mladenovi
N and Taillard É (1998). Heuristic solution of the multisource Weber problem as a p-median problem. Opns Res Lett 22: 55–62. | Article | - Hansen P, Perreur J and Thisse F (1980). Location theory, dominance and convexity: some further results. Opns Res 28: 1241–1250.
- Love RF, Morris JG and Wesolowsy G (1988). Facilities Location Models and Methods. Elsevier Science Publishing Co.: New York.
- Martello S and Toth P (1990). Knapsack Problems: Algorithms and Computer Implementations. Wiley: Chichester, England.
- Sherali HD and Adams WP (1999). A Reformulation-Linearization Technique for Solving Discrete and Continuous Nonconvex Problems. Kluwer Academic Publishers: Netherlands.
- Sherali HD, Al-Loughani I and Subramanian S (2002). Global optimization procedures for the capacitated Euclidean, and ℓp distance multifacility location–allocation problems. Opns Res 50: 433–448.
- Sherali HD and Nordai FL (1988). NP-hard, capacitated, balanced p-median problems on a chain graph with a continuum of link demands. Math Opns Res 13: 32–49.
- Sherali HD and Shetty CM (1977). The rectilinear distance location–allocation problem. AIIE Trans 9: 136–143.
- Sherali HD, Ramachandran S and Kim S (1994). A localization and reformulation discrete programming approach for the rectilinear distance location–allocation problem. Discrete Appl Math 49: 357–378. | Article | ISI |
- Shetty CM and Sherali HD (1980). Rectilinear distance location–allocation problem: a simplex based algorithm. In: Extremal Methods and System Analysis. Lecture Notes in Economics and Mathematical Systems vol. 174. Springer: New York. pp 442–464.
- Vaish H and Shetty CM (1976). The bilinear programming problem. Naval Res Logist Quart 23: 303–309. | Article |
- Weiszfeld E (1937). Sur le point lequel la somme des distances de n points donné est minimum. Tôhoku Math J 43: 355–386.
- Wendell RE and Hurter AP (1973). Location theory, dominance and convexity. Opns Res 21: 314–320.
- Wesolowsky G (1993). The Weber problem: history and perspectives. Location Sci 1: 5–23.
Acknowledgements
We thank two anonymous referees for their helpful comments that significantly improved the exposition of the paper. We also gratefully acknowledge the partial support of Bo
aziçi University Research Fund Grant No. 03HA302.

stanbul, Turkey
