2013-01-10

A good IP model may not be a good CP model

I have been experimenting with Google's or-tools, in the form of the Python interface (http://code.google.com/p/or-tools/). Much of what I know about modeling with that solver comes from the great blog of Håkan Kjellerstrand (http://www.hakank.org/constraint_programming_blog/).

In the process of converting a timetabling model I created years ago from AMPL (http://www.ampl.com/) to Google or-tools, I had a hunch that a word-for-word translation would not be optimal. The model I was working on is fairly complex but, at its core, it hides an assignment problem:

Given a set \(S\), a set \(T\), and a cost matrix \(C_{ij}\), \(i\in S\) and \(j \in T\), find the assignment of every element in \(S\) to one element in \(T\) minimizing the total cost. The paradigm is workers and tasks, where each task can be performed by any worker at a certain cost (the cost could represent money, duration, quality, etc.; any number of possible objectives).

The classical Integer Programming (IP) model for the simplest version of this problem uses binary variables \(x_{ij}, i\in S, j\in T\), where \(x_{ij}=1\) if and only if we assign element \(i\) to element \(j\). This leads to \begin{equation} \min \sum_i \sum_j c_{ij}x_{ij} \end{equation} Subject to: \begin{equation} \sum_j x_{ij} = 1 \quad \forall i \in S \end{equation} \begin{equation} \sum_i x_{ij} = 1 \quad \forall j \in T \end{equation} There are a multitude of variations, some of which I will consider later, with larger right-hand sides and inequalities, but this case illustrates perfectly what I want to highlight.
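For tiny instances, the optimum is easy to pin down by brute force, which is handy for sanity-checking any of the models below. A minimal pure-Python sketch, independent of any solver (the function name is mine, not from or-tools or AMPL):

```python
from itertools import permutations

def brute_force_assignment(C):
    """Enumerate all |S|! assignments of an n x n cost matrix C and
    return (minimum cost, assignment), where x[i] = j means element i
    of S is assigned to element j of T. Only viable for tiny n."""
    n = len(C)
    return min(
        (sum(C[i][p[i]] for i in range(n)), list(p))
        for p in permutations(range(n))
    )
```

On a 3x3 matrix this enumerates all 6 permutations; it is obviously hopeless beyond a dozen or so elements, which is exactly why we bother with solvers.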

The same model, in AMPL:

set S;
set T;
param cost {S, T} >= 0, default 1;
var x {S, T} binary;
minimize v:
   sum {i in S, j in T} cost[i,j] * x[i,j];

subject to Row_limit {i in S}:
   sum {j in T} x[i,j] = 1;
subject to Column_limit {j in T}:
   sum {i in S} x[i,j] = 1;

As a first step, I wrote this in the Python wrapper to Google or-tools, calling an IP solver, COIN-OR's CBC (https://projects.coin-or.org/Cbc), to do the real work. For the sake of the comparison, let us call this the IP Model solved with a real IP solver.

def realipsolve(C):
  from linear_solver import pywraplp
  solver = pywraplp.Solver('IPModel',
                           pywraplp.Solver.CBC_MIXED_INTEGER_PROGRAMMING)
  n = len(C)
  # x[i][j] == 1 iff element i of S is assigned to element j of T
  x = [[solver.IntVar(0, 1, 'x[%i,%i]' % (i, j)) for j in range(n)]
       for i in range(n)]
  for i in range(n):
      solver.Add(solver.Sum([x[i][j] for j in range(n)]) == 1)
  for j in range(n):
      solver.Add(solver.Sum([x[i][j] for i in range(n)]) == 1)
  v = solver.IntVar(0, sum(C[i][j] for i in range(n) for j in range(n)), 'v')
  solver.Add(v == solver.Sum([x[i][j] * C[i][j]
                              for i in range(n) for j in range(n)]))
  solver.Minimize(v)
  solver.Solve()

If we keep the same model but change the solver to Google's CP solver, the only change is in the first two lines:

from constraint_solver import pywrapcp
solver = pywrapcp.Solver('IPModel')

Let us call this the IP Model solved by a CP solver (sort of a fake IP).

What is arguably wrong with this model is the large number of binary variables. We know that a CP solver uses, deep down, a backtracking algorithm with domain reduction via propagation. (I have not looked at the source of this particular CP solver (I tend to have profoundly allergic reactions to reading C++ code) but I assume it follows this approach.) In such a case, pruning is important, but reducing the number of variables is often more important, and much more easily achieved from the modeler's perspective.
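To make that intuition concrete, here is a toy version of such a search in pure Python: depth-first backtracking over the assignment variables, with a set of used values acting as a crude stand-in for alldifferent propagation and the incumbent cost as a bound. This is only an illustration of the principle, not a claim about how or-tools implements it; the function name is mine.

```python
def backtrack_assign(C):
    """Toy backtracking search for the assignment problem on an n x n
    cost matrix C. x[i] is the value assigned to variable i; `used`
    prunes values already taken (alldifferent), and partial assignments
    costing at least the incumbent are pruned (bounding)."""
    n = len(C)
    best_cost = [float('inf')]
    best_x = [None]

    def extend(i, used, cost, x):
        if cost >= best_cost[0]:      # bound: no better solution down here
            return
        if i == n:                    # all variables assigned: new incumbent
            best_cost[0] = cost
            best_x[0] = x[:]
            return
        for j in range(n):
            if j not in used:         # skip values ruled out by alldifferent
                used.add(j)
                x.append(j)
                extend(i + 1, used, cost + C[i][j], x)
                x.pop()
                used.remove(j)

    extend(0, set(), 0, [])
    return best_cost[0], best_x[0]
```

Even this naive sketch makes the point about variable count: the search tree has one level per variable, so \(|S|\) levels here versus \(|S| \times |T|\) levels for the binary encoding.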

In this simple problem, reducing the number of variables is trivial. Instead of a two-dimensional array of binary variables, we can use a one-dimensional array of assignment variables of size \(|S|\), with domains in \(T\). With the alldifferent and element constraints, we obtain this combination, a CP Model with a CP solver:

def cpsolve(C):
  from constraint_solver import pywrapcp
  n = len(C)
  solver = pywrapcp.Solver('CPModel')
  # x[i] is the element of T assigned to element i of S
  x = [solver.IntVar(0, n - 1, 'x[%i]' % i) for i in range(n)]
  solver.Add(solver.AllDifferent(x))
  v = solver.IntVar(0, sum(C[i][j] for i in range(n) for j in range(n)), 'v')
  # xv[i] is the cost of the assignment chosen for i
  xv = [solver.IntVar(0, max(C[i]), 'xv[%i]' % i) for i in range(n)]
  for i in range(n):
      solver.Add(xv[i] == solver.Element(C[i], x[i]))
  solver.Add(v == solver.Sum([xv[i] for i in range(n)]))
  objective = solver.Minimize(v, 1)
  db = solver.Phase(x,
                    solver.INT_VAR_SIMPLE,
                    solver.ASSIGN_MIN_VALUE)
  solver.Solve(db, [objective])

The variable xv is not essential; we could combine the element and the sum into one constraint.

The reduction in the number of variables is notable, from \(|S| \times |T|\) to \(|S|\).
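The two encodings are equivalent: an assignment vector expands into exactly one feasible 0/1 matrix, and the element-based objective matches the double sum. A quick pure-Python check of that equivalence (the helper names are mine, for illustration only):

```python
def objective_binary(C, b):
    """Double-sum objective of the IP model over a 0/1 matrix b."""
    n = len(C)
    return sum(C[i][j] * b[i][j] for i in range(n) for j in range(n))

def objective_permutation(C, x):
    """Element-based objective of the CP model over an assignment vector x."""
    return sum(C[i][x[i]] for i in range(len(C)))

def to_binary(x):
    """Expand an assignment vector into its equivalent 0/1 matrix."""
    n = len(x)
    return [[1 if x[i] == j else 0 for j in range(n)] for i in range(n)]
```

For any assignment vector x, objective_binary(C, to_binary(x)) equals objective_permutation(C, x), so nothing is lost in the translation; only the variable count changes.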

Now, how do they compare in running time? I experimented with increasing sizes, with random cost matrices. Here is one such experiment (all times in seconds):

| n   | IP Model, IP solver | IP Model, CP solver | CP Model, CP solver |
|-----|---------------------|---------------------|---------------------|
| 10  | 0.01                | 0.07                | 0.00                |
| 11  | 0.01                | 0.32                | 0.01                |
| 12  | 0.01                | 0.60                | 0.01                |
| 13  | 0.01                | 1.98                | 0.02                |
| 14  | 0.01                | 5.83                | 0.02                |
| 15  | 0.02                | 35.16               | 0.06                |
| 16  | 0.02                | 47.35               | 0.07                |
| 17  | 0.02                | 133.21              | 0.14                |
| 18  | 0.02                | 434.83              | 0.33                |
| 19  | 0.03                | 1882.80             | 0.28                |
| TOT | 0.16                | 2542.15             | 0.93                |

Two obvious conclusions. First, comparing the last two columns, both using Google's CP solver, we notice that reducing the number of variables has, as expected, a huge impact on the running time of the solver. The literal translation of the AMPL IP model, with its binary variables, is decidedly the wrong thing to do. My original hunch was correct.

Less encouraging is the comparison between the first and last columns. It seems that even with a good CP model, the CP solver is no match for an IP solver on this simple problem. We can verify this by running both side by side on larger instances. Here is one such run (times in seconds):

| n  | IP Model, IP solver | CP Model, CP solver |
|----|---------------------|---------------------|
| 10 | 0.01                | 0.01                |
| 11 | 0.01                | 0.01                |
| 12 | 0.01                | 0.01                |
| 13 | 0.01                | 0.02                |
| 14 | 0.01                | 0.04                |
| 15 | 0.02                | 0.06                |
| 16 | 0.02                | 0.07                |
| 17 | 0.02                | 0.12                |
| 18 | 0.02                | 0.32                |
| 19 | 0.03                | 0.29                |
| 20 | 0.03                | 0.65                |
| 21 | 0.03                | 1.18                |
| 22 | 0.04                | 2.59                |
| 23 | 0.04                | 3.10                |
| 24 | 0.04                | 9.90                |
| 25 | 0.05                | 9.01                |
| 26 | 0.04                | 15.39               |
| 27 | 0.05                | 73.33               |
| 28 | 0.05                | 53.48               |
| 29 | 0.05                | 57.74               |
| 30 | 0.06                | 286.55              |
| 31 | 0.06                | 276.21              |
| 32 | 0.06                | 412.36              |
| 33 | 0.07                | 537.88              |
| 34 | 0.07                | 324.70              |
| 35 | 0.08                | 2004.02             |
| 36 | 0.08                | 6390.71             |

Caveat: I did not try to play with the search logic, choosing variables and values more intelligently. Or, rather, I did try to play a little, but did not see a noticeable difference in runtime.

Is this surprising? Maybe, maybe not. This problem is really trivial from an IP perspective: it is a bipartite matching of minimum cost. In fact, the constraint matrix of the assignment problem is totally unimodular, so even the plain LP relaxation yields integral optimal solutions. If the IP solver recognizes this structure, it may apply the Network Simplex method and solve gigantic instances in the blink of an eye. The problem, in a sense, is too pure, too simple, and IP solvers have had over fifty years to tune their solution techniques for such problems. Moreover, the problems I am truly interested in solving, even if they have an assignment problem at their core, have a slew of side constraints in addition.
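As an aside, were I redoing this today, a specialized routine such as SciPy's scipy.optimize.linear_sum_assignment (a Hungarian-style algorithm rather than Network Simplex, but equally tuned to this structure) would dispatch such instances immediately. A sketch, assuming SciPy is available; the wrapper name is mine:

```python
from scipy.optimize import linear_sum_assignment

def specialized_solve(C):
    """Solve the assignment problem on cost matrix C with SciPy's
    dedicated routine. Returns (total cost, assignment vector) where
    the i-th entry of the vector is the column assigned to row i."""
    rows, cols = linear_sum_assignment(C)
    return sum(C[i][j] for i, j in zip(rows, cols)), list(cols)
```

This is the kind of exploitation of pure structure that a general-purpose CP search, by design, does not get for free.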

What to do now? I could play with the search logic, that is, choose the next variable and the next value with more care. More interesting to me is to add more and more side constraints to see at what point, if any, the CP solver starts to dominate. My hunch is that, after a number of additional constraints, it will. More later …

Date: 2013-01-10 16:00:04 EST

Author: Serge Kruk

Org version 7.8.09 with Emacs version 24


1 comment:

  1. Hi,
    Your post is very useful to me in learning or-tools and CP.
    Can you let me know how to get the search tree diagram that you presented?

    Thanks,
    Satish
