# Victor: Minimum Cut

Problem Definition

In graph theory, a minimum cut of a graph is a cut whose cutset has the smallest number of elements (unweighted case), or smallest sum of weights possible (Definition taken from here).

Data set

DBLife is a prototype system that manages information for the database research community (see dblife.cs.wisc.edu). DBLife is developed by the Database group at University of Wisconsin-Madison and Yahoo! Research. It contains information about Databases papers, conferences, people and more.

Victor Model and Code

The following code shows the model specification and model instantiation for the Minimum Cut problem applied to the DBLife Web site data set.

```-- This deletes the model specification
DELETE MODEL SPECIFICATION cut;

-- This creates the model specification
CREATE MODEL SPECIFICATION cut (
model_type=(python) as (w),
data_item_type=(int,int,float8) as (x,y,weight),
objective=examples.CUTS.stochast_cuts.loss,
objective_agg=SUM,
);

-- This instantiates the model
CREATE MODEL INSTANCE dblife_cut_mi
EXAMPLES dblife_cut(x,y,score)
MODEL SPEC cut
INIT_FUNCTION examples.CUTS.stochast_cuts.init
STOP WHEN examples.CUTS.stochast_cuts.stopping_condition
;
```

This specification creates a python-type "cut" model which is stored in the database as a byte array. The data items are specified by 3 values: integer x, integer y, and float weight. We specify the loss function, and that the scores are going to be aggregated by the SUM aggregator. Finally, we define the gradient step for the model.

For instantiating the model, we specify how to initialize the model by giving it a function name. Also, we specify when we should stop refining the model.

In the code section below, you can see the loss, gradient, init, and stopping condition function that the user provides. Note that each of the functions are defined in a few lines of python.

```class GraphModel:
def __init__(self, nNodes, S, T, array):
nNodes = max([nNodes,max(S), max(T)])
self.nodes = array
for s in S: self.nodes[s] = 0.0
for t in T: self.nodes[t] = 1.0
self.terminals     = set(S).union(set(T))
self.stepsize      = 5e-2
self.steps         = 1

def c_grad(self, int x, int y, double w, double stepsize):
if self.nodes[x] == self.nodes[y]:
return

diff = math.copysign(stepsize*w, self.nodes[x] - self.nodes[y])

if x not in self.terminals: self.nodes[x] = clip(self.nodes[x] - diff)
if y not in self.terminals: self.nodes[y] = clip(self.nodes[y] + diff)

self.steps += 1
cdef double stepsize = 0.1/(1+self.steps)

def loss(self, d):
(x,y,w) = d
return w*abs(self.nodes[x] - self.nodes[y])

def stopping_condition(s, loss):
if not (s.has_key('state')):
s['state'] = 0
s['last_loss'] = loss.value
return False
else:
s['state'] += 1
print "\tlast_loss={} loss_value={} diff={}".format(s['last_loss'], loss.value, abs(s['last_loss'] - loss.value))
done = (s['state'] > 1000) or (abs(s['last_loss'] - loss.value) < 1e-3)
done = done and (s['state'] > 2)
s['last_loss'] = loss.value
return done

def init():
import src.globals.global_utils as gu
conn = gu.connect_db()
cur  = conn.cursor()

cur.execute("SELECT max(eid) FROM dblife_entities")
(nNodes,) = cur.fetchone()
conn.close()

S     = range(10)
T     = range(30,50)
array = numpy.array([0.5]*nNodes)
return (GraphModel(nNodes+1, S, T, array),array)

return m

def loss(m,data):
return m[0].loss(data)
```

Coming soon.

Running the Example

Open VICTOR_SQL/examples/CUTS/stochast_cuts_front.py and make sure that it uses the right paths for include.

Run the following commands to run the example:

```\$ cd VICTOR_SQL/examples/CUTs
\$ make
\$ ../../bin/victor_front.py cuts.spec
```