Improved linearized models for the graph partitioning problem under capacity constraints

We investigate a variant of the Graph Partitioning Problem with capacity constraints imposed on the clusters, giving rise to quadratic constraints in 0-1 formulations. Several compact linearized models of the problem are proposed and analysed: (a) a model featuring O(n³) binary variables, which results from the application of the standard Fortet linearization technique; (b) a more compact model featuring only O(n²) binary variables, obtained by linearization after reformulation of the quadratic constraints as bilinear constraints; (c) a strengthened version of the latter model, still featuring O(n²) variables. Computational experiments comparing the relative strength and efficiency of the various models on a series of test instances involving complete graphs with up to 50 nodes are reported and discussed.


Introduction
The graph partitioning problem is a fundamental problem in combinatorial optimization. The basic version of the problem, as defined in Garey and Johnson [6] (problem ND14), is as follows. Given an undirected graph G = (V, E) with node set V = {1, ..., n}, weights w_v ∈ Z+ for each node v ∈ V, lengths t_e ∈ Z+ for each edge e ∈ E and a positive integer K, find a partition of V into disjoint sets (or clusters) V_1, ..., V_k such that Σ_{v∈V_j} w_v ≤ K for j = 1, ..., k, minimizing the sum of the lengths of the edges whose endpoints are in different clusters (i.e. the k-cut defined by the partition). It was shown in [8] that the problem is NP-hard. In this paper, we consider a variant of the graph partitioning problem that we call graph partitioning under capacity constraints (GPCC), where the constraints on the weights of the clusters are replaced with constraints related to the edges incident to the nodes of each cluster. The lengths t_e for all e ∈ E will be called the link capacities in our problem. For any node subset U ⊆ V, we define the capacity of U as the sum of the link capacities of the edges incident to at least one node of U, i.e. the edges in E(U) ∪ δ(U), where E(U) is the set of edges with both end nodes in U and δ(U) is the set of edges with exactly one end node in U. In our problem, the capacity constraint bounds the capacity of each cluster by a given constant C. The objective function considered is the same as in the definition of Garey and Johnson [6], i.e. to minimize the total link capacity of the k-cut between the clusters. Note that, as in the definition of Garey and Johnson, the number of clusters k is not an input of our problem (it is part of the solution to the problem).
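To make the capacity definition concrete, here is a minimal Python sketch (the helper names `capacity` and `cut_value` and the edge-dictionary representation are ours, purely for illustration):

```python
def capacity(U, t):
    # Capacity of node subset U: total link capacity of the edges in
    # E(U) ∪ δ(U), i.e. edges with at least one endpoint in U.
    U = set(U)
    return sum(cap for (u, v), cap in t.items() if u in U or v in U)

def cut_value(clusters, t):
    # Total link capacity of the k-cut: edges whose endpoints lie in
    # different clusters.
    cluster_of = {v: j for j, cl in enumerate(clusters) for v in cl}
    return sum(cap for (u, v), cap in t.items()
               if cluster_of[u] != cluster_of[v])

# Tiny example on 4 nodes; t maps an edge (u, v), u < v, to its capacity.
t = {(1, 2): 3, (1, 3): 1, (2, 4): 2, (3, 4): 5}
print(capacity({1, 2}, t))             # edges (1,2), (1,3), (2,4): 3+1+2 = 6
print(cut_value([{1, 2}, {3, 4}], t))  # cut edges (1,3), (2,4): 1+2 = 3
```

Note that internal edges count toward the capacity of their own cluster only, while a cut edge counts toward the capacity of both clusters it touches.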
The GPCC problem has applications in the field of telecommunication network optimization; in particular, it turns out to be a relevant model for the optimum design of optical networks (see e.g. references [1, 7, 10]). In this application, the node set V corresponds to geographical sites and t_(u,v) to the traffic demand between locations u and v. For various technological reasons, network operators often want to partition the node set V into clusters on which a certain network topology is imposed. For instance, in SONET/SDH optical networks, a common requirement is that every cluster is connected by a local network forming a cycle. Local networks are then interconnected by a secondary federal network which has one access node in each local network. Access nodes carry all the traffic internal to their local network and all the traffic exiting it, but have a limited capacity. If we consider the traffic demand t_(u,v) as the capacity of the edge (u, v), then the capacity of a local network (cluster) with node set U ⊂ V agrees with our definition of capacity. As the topology and the capacity of local networks are imposed, the cost of these networks is almost fixed once the partition of V is determined (except the cost of the physical cables for building them). Thus, the objective of the problem can be focused on minimizing either the number of local networks (clusters) or the cost of the federal network. For the latter, an objective function often used is to minimize the total link capacity of the k-cut. The purpose of the present paper is to investigate and compare several 0-1 integer linear programming models for GPCC which can be qualified as compact, i.e.
featuring a polynomial number of variables and constraints (by contrast, the model underlying the column generation approach in [10], a large-scale set partitioning model with exponentially many columns, is essentially noncompact). Note that two main compact 0-1 models for graph partitioning, namely Node-Cluster models and Node-Node models, have been investigated in the literature, in which binary variables represent, respectively, relations of membership between nodes and clusters (Node-Cluster models) and relations between nodes belonging to the same cluster (Node-Node models). Existing works in the literature make use of these models in various ways, depending on which specific constraints are considered. This is the case of [9], [2], [3] and [4], which address variants of graph partitioning different from GPCC. In [9], the authors discuss the use of the Node-Node model for balanced graph partitioning and compare it with the SDP approach. In [2], the author considers several Node-Cluster models for balanced graph partitioning problems where the number of clusters and their size are constrained. In [3] and [4], the Node-Cluster model has also been investigated; in particular, a comparison of the quadratic and linearized forms of the Node-Cluster model has been discussed. Concerning GPCC, we can mention [7], [1] and [11]. In [1], the authors compared the performance of Node-Node models and Node-Cluster models applied to GPCC. It is concluded that for dense graphs, the Node-Cluster model outperforms the Node-Node model in branch-and-bound algorithms, in spite of a weaker continuous relaxation. However, for sparse graphs, we have shown in [11] that an improvement of the Node-Node model can help to outperform the Node-Cluster model when applied to GPCC. In the present paper, we aim at improving the best solution approach for GPCC presented in [1] and thus restrict ourselves to complete graphs and to models of Node-Cluster type. Section 2 discusses two basic
Node-Cluster models for GPCC, namely: a) a basic Node-Cluster model, denoted (NC), which is a quadratic 0-1 program involving O(n²) variables, O(n) quadratic constraints and O(n²) linear constraints aimed at breaking symmetry (a necessary ingredient for improving the efficiency of branch-and-bound procedures); b) a compact linear 0-1 model, denoted (L-NC), deduced from (NC) by applying the well-known standard linearization [5], and featuring O(n³) 0-1 variables and O(n²) constraints. The latter is the Node-Cluster model for GPCC used in [1].
Clearly, in view of the large number of variables, the latter model, though linear, cannot be expected to be practically useful for handling instances of GPCC with significantly more than, say, 50-60 nodes. As an attempt at overcoming this limitation, we investigate in Section 3 alternative compact linear models featuring only O(n²) variables; these are deduced by applying linearization after reformulating (NC) as a bilinear 0-1 programming problem, a technique close in spirit to the one proposed by Sherali and Smith in [13] which, as far as we know, has never been applied to the GPCC problem before. Our contributions in the context of the GPCC are twofold: a) we show how to exploit some special structures present in the GPCC problem to derive more compact models; this gives rise to a first linear O(n²) model, denoted (BL-NC); b) we show how to obtain relaxations stronger than those obtained by applying the standard approaches in [13], by proposing an efficient computation of improved bounds on the additional variables involved in the linearization; this gives rise to a strengthened version of the latter model, denoted (S-BL-NC).
Finally, the various compact linear 0-1 formulations are compared computationally in Section 4 on a series of test problems involving instances of complete graphs with up to 50 nodes. The comparison of (L-NC), (BL-NC) and (S-BL-NC) shows that (S-BL-NC) clearly outperforms the other two models, both in terms of strength of the relaxations and in terms of computation time.

Integer Programming Models
From now on, we consider the GPCC when G = K_n, the complete graph on n nodes. Hence, for every pair (u, v) of nodes, there is an edge (u, v) with capacity t_(u,v). Note that this is not restrictive, since the models can be applied to an arbitrary graph G = (V, E) by setting t_(u,v) = 0 for every pair (u, v) of nodes such that (u, v) ∉ E.

Node-Cluster Model [7]
We first present the model for GPCC given by Goldschmidt et al. in [7]. Note that the model was originally designed for the so-called k-SRAP problem, where the number of clusters in the partition is at most k, but we adapt it here to the case where the number of clusters is not constrained (we later show how to modify it back to the k-SRAP problem). Also, the model presented in [7] was a linearization, using a standard technique that we recall in Section 2.3, of the quadratic model we present here. Let x_ui = 1 if node u is assigned to cluster i and x_ui = 0 otherwise. Define T_u = Σ_{v≠u} t_(u,v) as the total capacity of the edges incident to node u. The total capacity outside the clusters is then equal to the total capacity minus the capacity inside the clusters, i.e. (1/2) Σ_{u∈V} T_u − Σ_{i=1}^n Σ_{u<v} t_(u,v) x_ui x_vi.
The model can be written as follows:

(NC)  min  (1/2) Σ_{u∈V} T_u − Σ_{i=1}^n Σ_{u<v} t_(u,v) x_ui x_vi
      s.t.  Σ_{u=1}^n T_u x_ui − Σ_{u<v} t_(u,v) x_ui x_vi ≤ C   for i = 1, ..., n,   (1)
            Σ_{i=1}^n x_ui = 1   for u = 1, ..., n,   (2)
            x_ui ∈ {0, 1}   for u, i = 1, ..., n.   (3)

The first constraints (1) are the capacity constraints of the clusters i = 1, ..., n.
The second family of constraints imposes that each node is assigned to exactly one cluster. The objective function minimizes the total capacity between the clusters.
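As a sanity check of the quadratic expressions, the following sketch (hypothetical helper names) verifies on a tiny instance that Σ_u T_u x_ui − Σ_{u<v} t_(u,v) x_ui x_vi coincides with the set-based capacity of cluster i: each internal edge is counted twice in Σ_u T_u x_ui and given back once by the quadratic term.

```python
def T(u, t):
    # Total capacity of the edges incident to node u.
    return sum(cap for e, cap in t.items() if u in e)

def quad_capacity(i, x, t, n):
    # Capacity of cluster i in (NC):
    #   sum_u T_u x_ui  -  sum_{u<v} t_(u,v) x_ui x_vi.
    xi = lambda u: x.get((u, i), 0)
    return (sum(T(u, t) * xi(u) for u in range(1, n + 1))
            - sum(cap * xi(u) * xi(v) for (u, v), cap in t.items()))

def set_capacity(members, t):
    # Set-based capacity: edges with at least one endpoint in the cluster.
    return sum(cap for (u, v), cap in t.items()
               if u in members or v in members)

n = 4
t = {(1, 2): 3, (1, 3): 1, (2, 3): 4, (2, 4): 2, (3, 4): 5}
# Assignment for the partition {1, 2} / {3, 4}, clusters indexed by
# their smallest node: x[(u, i)] = 1 iff node u is in cluster i.
x = {(1, 1): 1, (2, 1): 1, (3, 3): 1, (4, 3): 1}
for i, members in ((1, {1, 2}), (3, {3, 4})):
    assert quad_capacity(i, x, t, n) == set_capacity(members, t)
```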

The Node-Cluster Model with symmetry breaking
As noted in [10], the Node-Cluster model is highly symmetric (it is easy to see that the same partition has many representations in the model) and gives poor results in practice. Some constraints were proposed in [10] to remove part of this symmetry. In [1], the authors proposed two families of constraints that remove all the symmetry related to having several different representations of the same partition. They impose that, if the cluster indexed by i is not empty, then i is the smallest index of a node contained in it, by adding the constraints

x_ui = 0   for 1 ≤ u < i ≤ n,        x_ui ≤ x_ii   for 1 ≤ i < u ≤ n.

From now on, we shall omit the variables x_ui for 1 ≤ u < i ≤ n from our models, as they are equal to 0. Moreover, note that we can now model the k-SRAP problem by simply bounding the number of non-empty clusters (i.e. Σ_{i=1}^n x_ii ≤ k). In summary, the final Node-Cluster model with symmetry breaking, still denoted (NC), consists of the objective function, the capacity constraints, the assignment constraints (2), the 0-1 constraints (3) and the symmetry-breaking constraints above.

The linearized model (L-NC)

The linear model used in [1] is obtained by applying the classical linearization technique introduced by Fortet [5]. In this linearization, each product x_ui x_vi, for all (u, v) ∈ E and i = 1, ..., min(u, v), is replaced with a variable y_uvi, and the following constraints are added to (NC):

y_uvi ≥ x_ui + x_vi − 1,   (6)
y_uvi ≥ 0,   (7)
y_uvi ≤ x_ui,   y_uvi ≤ x_vi,   (8)

for all (u, v) ∈ E and i = 1, ..., min(u, v). The objective function and the capacity constraints can then be rewritten by substituting y_uvi for each product x_ui x_vi. As t_(u,v) ≥ 0 for each (u, v) ∈ E, and as the constant C in the capacity constraints is positive, it is clear that solutions that are optimal for the objective function and comply with the capacity constraints also maximize the values of the y_uvi for all (u, v) ∈ E and i = 1, ..., min(u, v). Hence, in the linearized model of (NC), the constraints (6) and (7), which bound y_uvi from below, can be omitted, leading to the following linearized model for (NC):

(L-NC)  min  (1/2) Σ_{u∈V} T_u − Σ_{i=1}^n Σ_{i≤u<v} t_(u,v) y_uvi
        s.t.: constraints (8), (2), (3)

Towards more compact linearized models with O(n²) variables

General principles
We can see that in (NC) the quadratic terms Σ_{u=i}^{n-1} Σ_{v=u+1}^n t_(u,v) x_ui x_vi, for i = 1, ..., n, are the same in both the objective function and the capacity constraints. They can be rewritten as Σ_{u=i}^{n-1} x_ui (Σ_{v=u+1}^n t_(u,v) x_vi) for i = 1, ..., n. The method introduced by Sherali and Smith in [13], applied here to linearize these quadratic terms, consists of two phases.
• In the first phase, the quadratic term is reformulated as a bilinear one: for each i and each u, the inner sum Σ_{v=u+1}^n t_(u,v) x_vi is denoted λ_ui, so that the term becomes Σ_{u=i}^{n-1} x_ui λ_ui.
• In the second phase, the latter is linearized by introducing z_ui = x_ui λ_ui and setting

λ_ui^min x_ui ≤ z_ui ≤ λ_ui^max x_ui,   (9)
λ_ui − λ_ui^max (1 − x_ui) ≤ z_ui,   (10)
z_ui ≤ λ_ui − λ_ui^min (1 − x_ui),   (11)

where λ_ui^min/max = min/max{Σ_{v=u+1}^n t_(u,v) x_vi : x ∈ X} and X is a suitable relaxation of the subprogram of (NC) which does not involve the quadratic constraints.

Proposition 3.1 In the application of the Sherali-Smith approach to (NC),
(i) λ_ui^min = 0 holds for all i = 1, ..., n and u = i + 1, ..., n;
(ii) for a given u = 1, ..., n, we have λ_ui^max = λ_ui′^max for all i, i′ ≤ u.
Proof. (i) It is easy to see that λ_ui^min = 0 for any X, since we can always take a solution in X such that x_vi = 0 for all v = u + 1, ..., n.
(ii) Given 1 < u ≤ n, let X be any relaxation of (NC), possibly including the capacity constraints, and let 1 ≤ i < i′ ≤ n. We have λ_ui^max = max{Σ_{v=u+1}^n t_(u,v) x_vi : x ∈ X} and λ_ui′^max = max{Σ_{v=u+1}^n t_(u,v) x_vi′ : x ∈ X}. As the constraints and the objective are fully separable and interchangeable in i and i′, we obtain λ_ui^max = λ_ui′^max for any relaxation X.
Hence, as λ_ui^max = λ_ui′^max for all i, i′ ≤ u (Proposition 3.1(ii)), let λ_u^max denote this common value.
Remark: λ_n^max = 0. Proof. The remark follows immediately from the formula λ_ni^max = max{Σ_{v=n+1}^n t_(n,v) x_vi : x ∈ X}, in which the sum is empty.
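Under the assumption that constraints (9)-(11) are the standard Sherali-Smith bounds recalled above, a quick numerical check confirms what the second phase is meant to achieve: with λ_ui^min = 0 (Proposition 3.1(i)), the constraints pin z_ui to the product x_ui λ_ui whenever x_ui is binary and 0 ≤ λ_ui ≤ λ_u^max.

```python
def z_interval(x, lam, lam_max):
    # Feasible z-values under:  0 <= z <= lam_max * x  and
    # lam - lam_max * (1 - x) <= z <= lam   (case lam_min = 0).
    lo = max(0.0, lam - lam_max * (1 - x))
    hi = min(lam_max * x, lam)
    return lo, hi

for lam_max in (5.0, 10.0):
    for lam in (0.0, 2.5, lam_max):
        for x in (0, 1):
            lo, hi = z_interval(x, lam, lam_max)
            assert lo == hi == x * lam  # z is forced to equal x * lam
```

When x = 0, the upper bound lam_max·x forces z = 0; when x = 1, the sandwich lam ≤ z ≤ lam collapses to z = lam, which is exactly the bilinear product.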
Taking into account Proposition 3.1 and setting h_ui = λ_ui − z_ui, we can rewrite the linearization constraints (9), (10) and (11) as follows:

Σ_{v=u+1}^n t_(u,v) x_vi = z_ui + h_ui,   (16)
0 ≤ z_ui ≤ λ_u^max x_ui,
0 ≤ h_ui ≤ λ_u^max (1 − x_ui).   (19)

We can then state the compact linearized (NC) model (called (p-BL-NC), for preliminary bilinear NC) as follows.
(p-BL-NC)  min  (1/2) Σ_{u∈V} T_u − Σ_{i=1}^n Σ_{u=i}^{n-1} z_ui
           s.t.: Σ_{u=i}^n T_u x_ui − Σ_{u=i}^{n-1} z_ui ≤ C for i = 1, ..., n, together with constraints (16), (19), (2), (3) and the bounds 0 ≤ z_ui ≤ λ_u^max x_ui.

Since optimal solutions of (p-BL-NC) maximize the variables z_ui as much as possible, for i = 1, ..., n and u = i, ..., n, the variables h_ui play the role of slack variables in (16). The latter is the only constraint involving h_ui apart from the bound constraint (19). Hence, without loss of generality, we can eliminate the variables h_ui from the model, leading to the equivalent formulation.
(g-BL-NC)  min  (1/2) Σ_{u∈V} T_u − Σ_{i=1}^n Σ_{u=i}^{n-1} z_ui
           s.t.: Σ_{u=i}^n T_u x_ui − Σ_{u=i}^{n-1} z_ui ≤ C for i = 1, ..., n, together with constraints (2), (3), the bounds 0 ≤ z_ui ≤ λ_u^max x_ui and the inequalities z_ui ≤ Σ_{v=u+1}^n t_(u,v) x_vi,   (17)

where (g-BL-NC) stands for generic bilinear NC. Note that the general forms of (p-BL-NC) and (g-BL-NC) have also been presented in [13]; in the general case, the latter is only a relaxation of the former. In spite of this, the authors of [13] show that the general form of (g-BL-NC) outperforms that of (p-BL-NC) in their numerical experiments. By the above arguments, in the case of (NC), we can deduce the following stronger theoretical result.

Proposition 3.2
The optimal values of the linear programming relaxation of (p-BL-NC) and (g-BL-NC) coincide.

First estimates of the upper-bound parameters and the model (BL-NC)
In this section, we discuss how to estimate the parameters λ_u^max, u = 1, ..., n, in (g-BL-NC). Recall that in the original method suggested in [13], λ_ui^max = max{Σ_{v=u+1}^n t_(u,v) x_vi : x ∈ X}, where X is a suitable relaxation of the subprogram of (NC) which does not contain the quadratic constraints. A first way of estimating λ_ui^max is to simply pick X = {0, 1}^n, which yields λ_ui^max = Σ_{v=u+1}^n t_(u,v) for all i = 1, ..., u. We then obtain the model (BL-NC): the model (g-BL-NC) with these values of λ_u^max, i.e. with constraints (17), (2), (3).

Proposition 3.3 The linear programming relaxation of (BL-NC) is weaker than the one of (L-NC).
Proof. In the linear programming relaxation of (L-NC), the contribution of node u to the capacity constraint and the objective of cluster i is expressed through Σ_{v=u+1}^n t_(u,v) y_uvi with y_uvi ≤ min(x_ui, x_vi), whereas in the relaxation of (BL-NC) it is expressed through z_ui, bounded only by z_ui ≤ min(λ_u^max x_ui, Σ_{v=u+1}^n t_(u,v) x_vi). As Σ_{v=u+1}^n t_(u,v) min(x_ui, x_vi) ≤ min(λ_u^max x_ui, Σ_{v=u+1}^n t_(u,v) x_vi) for x_ui, x_vi ∈ [0, 1], (L-NC) has a tighter capacity constraint and a tighter objective than (BL-NC).
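Recall that the Fortet constraints of (L-NC) bound each y_uvi by min(x_ui, x_vi). The following exhaustive check (plain Python, no solver) confirms that on binary points the full constraint set pins y to the product, and that the upper bounds alone already suffice when y is pushed upward:

```python
from itertools import product

for xu, xv in product((0, 1), repeat=2):
    # Full Fortet set: y <= xu, y <= xv, y >= xu + xv - 1, y >= 0.
    pinned = [y for y in (0, 1)
              if y <= xu and y <= xv and y >= xu + xv - 1]
    assert pinned == [xu * xv]          # y is pinned to the product
    # Upper bounds alone: the maximal feasible y is still the product,
    # which is why the lower bounds (6)-(7) can be dropped when the
    # model pushes y upward.
    assert max(y for y in (0, 1) if y <= xu and y <= xv) == xu * xv
```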

Improved estimates of upper-bound parameters and the model (S-BL-NC)
Clearly, to improve the quality of the n bounds λ_u^max (more precisely n − 1 of them, since λ_n^max = 0; see Proposition 3.1), we should take a set X more elaborate than {0, 1}^n. For each 1 ≤ u ≤ n − 1, by Proposition 3.1, we can estimate λ_u^max by fixing any 1 ≤ i ≤ u and maximizing Σ_{v=u+1}^n t_(u,v) x_vi over the variables x_vi, v = i, ..., n, subject to a subset X̂_u^i of the constraints of (NC) involving these variables. It is interesting to remark that, in view of the separable character of (NC), such a subset X̂_u^i contains at most one quadratic capacity constraint. The following proposition tells us how to choose X̂_u^i as tight as possible.

Proposition 3.4 The set X̂_u^i defined as {x : Σ_{v=i}^n T_v x_vi − Σ_{i≤v<v′≤n} t_(v,v′) x_vi x_v′i ≤ C and x_vi ∈ {0, 1} for v = i, ..., n} is as tight as possible for the estimation of λ_u^max.
Proof. We can see that all the constraints of (NC) are separable in i, the second index of the variables x_ui, except the constraints (2). But when i is fixed, the constraints (2), together with the constraints (3), reduce to x_vi ≤ 1 for v = i, ..., n, which is implied by the constraints x_vi ∈ {0, 1}.
The constraints in X̂_u^i can be linearized by the classical linearization; as the 0/1 constraints are kept, we obtain an equivalent set. Hence, denoting by λ̂_u^max the parameter estimated by maximizing Σ_{v=u+1}^n t_(u,v) x_vi over X̂_u^i, λ̂_u^max can be obtained by solving the following 0/1 linear program:

λ̂_u^max = max Σ_{v=u+1}^n t_(u,v) x_v
           s.t. Σ_{v=i}^n T_v x_v − Σ_{i≤v<v′≤n} t_(v,v′) y_vv′ ≤ C,
                y_vv′ ≤ x_v,  y_vv′ ≤ x_v′  for i ≤ v < v′ ≤ n,
                x_v ∈ {0, 1}  for v = i, ..., n,

where x_v denotes x_vi for v = i, ..., n, as i is fixed.
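The following brute-force sketch contrasts the two estimates on a toy instance (illustrative only: real instances would be handled by feeding the 0/1 linear program above to a MIP solver, and the exact content of X̂_u^i here reflects our reading of its definition):

```python
from itertools import product

def T(u, t):
    # Total capacity of the edges incident to node u.
    return sum(cap for e, cap in t.items() if u in e)

def simple_bound(u, t, n):
    # First estimate (BL-NC): X = {0, 1}^n, so the bound is sum_{v>u} t_(u,v).
    return sum(t.get((u, v), 0) for v in range(u + 1, n + 1))

def improved_bound(u, t, n, C):
    # Improved estimate (S-BL-NC, sketch): maximize sum_{v>u} t_(u,v) x_v
    # subject to the capacity constraint on the selected node set.  Nodes
    # v <= u would only consume capacity without contributing to the
    # objective, so the maximum sets them to 0 and we skip them.
    nodes = range(u + 1, n + 1)
    best = 0
    for bits in product((0, 1), repeat=n - u):
        S = {v for v, b in zip(nodes, bits) if b}
        cap = (sum(T(v, t) for v in S)
               - sum(c for (a, b), c in t.items() if a in S and b in S))
        if cap <= C:
            best = max(best, sum(t.get((u, v), 0) for v in S))
    return best

t = {(1, 2): 3, (1, 3): 4, (1, 4): 1, (2, 3): 2, (2, 4): 5, (3, 4): 6}
print(simple_bound(1, t, 4))        # 3 + 4 + 1 = 8
print(improved_bound(1, t, 4, 10))  # only {2} fits within C = 10, giving 3
```

On this instance the strengthened bound is 3 instead of 8, illustrating how a single capacity constraint can already tighten λ̂_u^max considerably.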
In the sequel, we denote by (S-BL-NC) the model deduced from (BL-NC) in Section 3.2 by substituting the bounds Σ_{v=u+1}^n t_(u,v) with λ̂_u^max. Finally, we establish the theoretical relationship between the continuous relaxations of (BL-NC) and (S-BL-NC) in the following proposition.
Proposition 3.5 The linear programming relaxation of (S-BL-NC) is stronger than the one of (BL-NC).
We will see in the next section that, in fact, in all the numerical experiments we have performed, the relaxation of (S-BL-NC) is strictly stronger.

Comparing linear programming relaxations

Table 4.1 reports, for each model and instance, the linear programming relaxation objective function value and the integrality gap. The numerical results presented in Table 4.1 confirm the theoretical result of Proposition 3.3: the linear programming relaxation of (BL-NC) is slightly weaker than the one of (L-NC). In spite of this, we can observe that solving the linear programming relaxation of (BL-NC) is ten to twenty times faster than solving the one of (L-NC). We will see in the next section that this turns out to be a major advantage in branch-and-bound algorithms for solving GPCC. The linear programming relaxation of (S-BL-NC) presents all the advantages:
• It is the strongest in all the tests; for most instances, the integrality gap is divided by a factor of 2 on average.
• The time for solving it is very short, nearly the same as for the linear programming relaxation of (BL-NC).
These good characteristics of (S-BL-NC) significantly reduce the time for solving GPCC by branch-and-bound algorithms, thus leading to a significant increase in the size of the instances solved to optimality as compared with the tests in [1].

Comparing exact solutions
We now present results on the computation of exact solutions for GPCC using, respectively, the models (L-NC), (BL-NC) and (S-BL-NC). We solve the three models using the CPLEX 12.3 solver. To ensure that the comparisons are not biased, we switch off the CPLEX presolve and deactivate all its generic MIP cuts. Thus, the algorithm used to solve the three models is in fact a branch-and-bound algorithm based on their linear relaxations. We set the CPU time limit to 7200 seconds. For each model and instance, we report in Table 4.2 the CPU time and the number of nodes in the branch-and-bound search tree. It can be seen from the table that (S-BL-NC) is the most efficient model for branch-and-bound algorithms. The two "compact" models (BL-NC) and (S-BL-NC) generate more nodes in the branch-and-bound trees than (L-NC), but as the time required for solving their linear relaxations is much smaller, these two models can explore more branch-and-bound nodes and reach exact optimal solutions more quickly than (L-NC). Note that we have included in the CPU time of (S-BL-NC) the additional time for computing the n bound parameters λ̂_u^max, u = 1, ..., n. Even so, the reduction in CPU time is important, by a factor of up to 20 for the larger instances. It is interesting to note that, although the (S-BL-NC) model turns out to be stronger than (L-NC) at the root node of the branch-and-bound tree, as observed in Section 4.1, the former generates more nodes than the latter in the branch-and-bound search trees. A possible explanation is that, once computed, the parameters λ̂_u^max remain fixed throughout the branch-and-bound process, and thus the linear relaxation of (S-BL-NC) may lose its advantage over the one of (L-NC) once a number of binary variables x_ui have been fixed.

Conclusions and perspectives
Several models for the graph partitioning problem with cluster capacity constraints have been discussed and compared, highlighting the superiority of models based on a bilinear reformulation of the quadratic constraints. Exact solutions of instances involving complete graphs with up to 50 nodes have been reported; for one of the largest instances, the proposed method yields the optimal solution within about 10 minutes, while the classical linearized model only gives a feasible 0/1 solution with a 36.3% residual gap after two hours. A key ingredient in achieving this efficiency is the computation of stronger bounds for the additional variables linearizing the bilinear expressions. In the experiments reported, these bounds are computed only once, at the root node of the branch-and-bound tree. An interesting direction for future investigation would be to recompute these bounds in the course of the branch-and-bound process, in an attempt at further reducing the number of nodes explored. This is left to future research.

Table 4.2. Exact solution comparisons. *: time limit exceeded, gap of the best known solution: 7.9%. **: time limit exceeded, gap of the best known solution: 36.3%.