# Pivot Algorithm For Generating Self-avoiding Chain, Using Python and Cython

Nov 13, 2014

Pivot al­go­rithm is best Monte Carlo al­go­rithm known so far used for gen­er­at­ing canon­i­cal en­sem­ble of self-avoid­ing ran­dom walks (ﬁxed num­ber of steps). Originally it is for the ran­dom walk on a lat­tice, but it also can be mod­i­ﬁed for con­tin­u­ous ran­dom walk. Recently I en­coun­tered a prob­lem where I need to gen­er­ate self-avoid­ing chain con­ﬁg­u­ra­tions.

The sim­plest ver­sion of this al­go­rithm breaks into fol­low­ing steps:

• Prepare an ini­tial con­ﬁg­u­ra­tion of a $N$ steps walk on lat­tice (equivalent to a $N$ monomer chain)
• Randomly pick a site along the chain as pivot site
• Randomly pick a side (right to the pivot site or left to it), the chain on this side is used for the next step.
• Randomly ap­ply a ro­tate op­er­a­tion on the part of the chain we choose at the above step.
• After the ro­ta­tion, check the over­lap be­tween the ro­tated part of the chain and the rest part of the chain. Accept the new con­ﬁg­u­ra­tion if there is no over­lap and restart from 2th step. Reject the con­ﬁg­u­ra­tion and re­peat from 2th step if there are over­laps. For ran­dom walks on a 3D cu­bic lat­tice, there are only 9 dis­tinct ro­ta­tion op­er­a­tions.

Some ref­er­ences on Pivot al­go­rithm

• [Lal(1969)]: The orig­i­nal pa­per of pivot al­go­rithm.
• [Madras and Sokal(1988)]: The most cited pa­per on this ﬁeld. For the ﬁrst time, they showed pivot al­go­rithm is ex­trememly ef­ﬁ­cient con­tra­dicted to the in­tu­ition.
• [Clisby(2010)]: The au­thor de­vel­oped a new more ef­ﬁ­cient in­ple­men­ta­tion of pivot al­go­rithm and cal­cu­late the crit­i­cal ex­po­nent $\nu$, which is also the Flory ex­po­nent for poly­mer chain in bad sol­vent.

## Python Implementation Using Numpy and Scipy #

The im­ple­ment of this al­go­rithm in Python is very straight­for­ward. You can down­load raw ﬁle.

import numpy as np
import timeit
from scipy.spatial.distance import cdist

# define a dot product function used for the rotate operation
def v_dot(a):return lambda b: np.dot(a,b)

class lattice_SAW:
def __init__(self,N,l0):
self.N = N
self.l0 = l0
# initial configuration. Usually we just use a straight chain as inital configuration
self.init_state = np.dstack((np.arange(N),np.zeros(N),np.zeros(N)))
self.state = self.init_state.copy()

# define a rotation matrix
# 9 possible rotations: 3 axes * 3 possible rotate angles(90,180,270)
self.rotate_matrix = np.array([[[1,0,0],[0,0,-1],[0,1,0]],[[1,0,0],[0,-1,0],[0,0,-1]]
,[[1,0,0],[0,0,1],[0,-1,0]],[[0,0,1],[0,1,0],[-1,0,0]]
,[[-1,0,0],[0,1,0],[0,0,-1]],[[0,0,-1],[0,1,0],[-1,0,0]]
,[[0,-1,0],[1,0,0],[0,0,1]],[[-1,0,0],[0,-1,0],[0,0,1]]
,[[0,1,0],[-1,0,0],[0,0,1]]])

# define pivot algorithm process where t is the number of successful steps
def walk(self,t):
acpt = 0
# while loop until the number of successful step up to t
while acpt <= t:
pick_pivot = np.random.randint(1,self.N-1) # pick a pivot site
pick_side = np.random.choice([-1,1]) # pick a side

if pick_side == 1:
old_chain = self.state[0:pick_pivot+1]
temp_chain = self.state[pick_pivot+1:]
else:
old_chain = self.state[pick_pivot:]
temp_chain = self.state[0:pick_pivot]

# pick a symmetry operator
symtry_oprtr = self.rotate_matrix[np.random.randint(len(self.rotate_matrix))]
# new chain after symmetry operator
new_chain = np.apply_along_axis(v_dot(symtry_oprtr),1,temp_chain - self.state[pick_pivot]) + self.state[pick_pivot]

# use cdist function of scipy package to calculate the pair-pair distance between old_chain and new_chain
overlap = cdist(new_chain,old_chain)
overlap = overlap.flatten()

# determinte whether the new state is accepted or rejected
if len(np.nonzero(overlap)) != len(overlap):
continue
else:
if pick_side == 1:
self.state = np.concatenate((old_chain,new_chain),axis=0)
elif pick_side == -1:
self.state = np.concatenate((new_chain,old_chain),axis=0)
acpt += 1

# place the center of mass of the chain on the origin
self.state = self.l0*(self.state - np.int_(np.mean(self.state,axis=0)))

N = 100 # number of monomers(number of steps)
l0 = 1 # bond length(step length)
t = 1000 # number of pivot steps

chain = lattice_SAW(N,l0)

%timeit chain.walk(t)
1 loops, best of 3: 2.61 s per loop


Above code per­forms a 100 monomer chain with 1000 suc­cess­ful pivot steps. However even with numpy and the built-in func­tion cdist of scipy, the code is still too slow for large num­ber of ran­dom walk steps.

## Cython Implementation #

When come to the loops, Python can be very slow. In many com­plex sit­u­a­tions, even numpy and scipy is not that help­ful. For in­stance in this case, in or­der to de­ter­mine the over­laps, we need to have a nested loop over two sets of sites (monomers). In the above code, I use built-in func­tion cdist of scipy to do this, which is al­ready highly op­ti­mized. But ac­tu­ally we don’t have to com­plete the loops, be­cause we can stop the search if we en­counter one over­lap. However I can’t think of a nat­ural numpy or scipy way to do this ef­ﬁ­ciently due to the con­di­tional break. Here is where [Cython] can be ex­trememly use­ful. Cython can trans­late your python code to C and trans­late your C or C++ code to a Python mod­ule so you can di­rectly import your C/C++ code in Python. To do that, ﬁrst we just hand­write our pivot al­go­rithm us­ing plain C++ code.

#include <math.h>
using namespace std;
void c_lattice_SAW(double* chain, int N, double l0, int ve, int t){
... // pivot algorithm codes here
}


Name the ﬁle c_lattice_SAW.cpp. Here we de­ﬁne a func­tion called c_lattice_SAW. Where chain is the ar­ray stor­ing the co­or­di­nates of monomers, N is the num­ber of monomers, l0 is the bond length, t is the num­ber of suc­cess­ful steps.

• C++11 li­brary ran­dom is used here in or­der to use Mersenne twister RNG di­rectly.
• The C++ code in this case is not a com­plete pro­gram. It does­n’t have main func­tion.

The whole C++ code can be found in this gist. Beside our plain C code, we also need a header ﬁle c_lattice_SAW.h.

void c_lattice_SAW(double* chain, int N, double l0, int ve, int t);


If you don’t want to hand­write a C code, an­other way to use Cython is to write plain Cython pro­gram whose syn­tax is very much Python-like. But in that way, how to get high qual­ity ran­dom num­bers ef­ﬁ­ciently is a prob­lem. Usually there are sev­eral ways to get ran­dom num­bers in Cython

• Use Python mod­ule random.This method will be very slow if ran­dom num­ber is gen­er­ated in a big loop be­cause gen­er­ated C code must call a Python mod­ule every­time which is a lot of over­head time.

• Use numpy to gen­er­ate many ran­dom num­bers in ad­vance. This will re­quire large amount of mem­ory and also in many cases, the to­tal num­ber of ran­dom num­bers needed is not known be­fore the com­pu­ta­tion.

• Use C func­tion rand() from stan­dard li­brary stdlib in Cython. rand() is not a very good RNG. In a sci­en­tiﬁc com­pu­ta­tion like Monte Carlo sim­u­la­tion, this is not good way to get ran­dom num­bers.

• Wrap a good C-implemented RNG us­ing Cython. This can be a good way. Currently I have found two ways to do this: 1. [Use numpy randomkit] 2. [Use GSL li­brary]

• Handwrite C or C++ code us­ing random li­brary or other ex­ter­nal li­brary and use Cython to wrap the code. This will give the op­ti­mal per­for­mance, but comes with more com­pli­cated and less read­able code.

What I did in this post is the last method.

Now we need to make a .pyx ﬁle that will han­dle the C code in Cython and de­ﬁne a python func­tion to use our C code. Give the .pyx a dif­fer­ent name like lattice_SAW.pyx

import cython
import numpy as np
cimport numpy as np

cdef extern from "c_lattice_SAW.h":
void c_lattice_SAW(double* chain, int N, double l0, int ve, int t)

@cython.boundscheck(False)
@cython.wraparound(False)

def lattice_SAW(int N, double l0, int ve, int t):
cdef np.ndarray[double,ndim = 1,mode="c"] chain = np.zeros(N*3)
c_lattice_SAW(&chain,N,l0,ve,t)
return chain


Compile our C code to gen­er­ate a shared li­brary which can be im­ported into Python as a mod­ule. To do that, we use Python distutils pack­age. Make a ﬁle named setup.py.

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

import numpy

setup(
cmdclass = {'build_ext':build_ext},
ext_modules = [Extension("lattice_SAW",
sources = ["lattice_SAW.pyx","c_lattice_SAW.cpp"],
extra_compile_args=['-std=c++11','-stdlib=libc++'],
language="c++",
include_dirs = [numpy.get_include()])],
)


Instead of nor­mal ar­gu­ments, we also have extra_compile_args here. This is be­cause in the C++ code, we use li­brary random which is new in C++11. On Mac, -std=c++11 and -stdlib=libc++ need to be added to tell the com­pil­ers to sup­port C++11 and use libc++ as the stan­dard li­brary. On Linux sys­tem, just -std=c++11 is enough.

If cimport numpy is used, then the set­ting include_dirs = [numpy.get_include()])] need to be added

Then in ter­mi­nal we do,

On Linux

python setup.py build_ext --inplace


or on Mac OS

clang++ python setup.py build_ext --inplace


clang++ tell the python use clang com­piler not gcc be­cause ap­par­ently the ver­sion of gcc shipped with OS X does­n’t sup­port C++11.

If the com­pi­la­tion goes suc­cess­fully, then a .so li­brary ﬁle is gen­er­ated. Now we can im­port our mod­ule in Python in that work­ing di­rec­tory

import lattice_SAW
import numpy

%timeit lattice_SAW.lattice_SAW(100,1,1,1000)
100 loops, best of 3: 5.97 ms per loop


That is 437 times faster than our Numpy/Scipy way!