## Neural Networks Part 1: Feedingforward and Backpropagation

**Introduction**

Last time we looked at the perceptron model for classification, which is similar to logistic regression, in that it uses a nonlinear transformation of a linear combination of the input features to give an output class.

A neural network generalizes this idea by conceiving of a network of nodes which each take as an input a linear combination of the outputs of other nodes, then use some activation function to turn this into an output, such as:

- the sigmoidal logistic function, as in logistic regression,
- the binary step function seen in the perceptron,
- the identity function as seen in linear regression,
- the sigmoidal .

**A Feedforward Neural Network**

A neural network is just a directed graph. We will be looking at feedforward neural networks, which are acyclic.

Each node is called a *neuron.* There are three main kinds of neurons:

- Input neurons, one neuron for each feature, so if our input were vectors in , we would have 3 input neurons. These neurons have trivial activation functions and no incoming edges.
- Output neurons, one neuron for each output feature, so if our so if our output were vectors in , we would have 3 output neurons. These will typically have a nonlinear activation function, chosen so that the output range matches our training data. These neurons have no outgoing edges.
- Hidden neurons are in between input and output neurons and have both incoming and outgoing edges.

Since the graph is acyclic, we can think of it as being composed of layers, as in the following picture taken from wikipedia:

although note there can be many hidden layers. Each edge has a *weight, *a real number.

The neural network makes a prediction by *feeding forward* the input. To describe this we develop some notation. The discussion that follows is indebted to this very helpful page.

**Notation**

Let be a neural network and index the vertices of so that each is a vertex for .

For each let denote the set of indices of the *targets *of . That is the set of indices such for each there exists a directed edge from . We may also conflate with the set of indexed neurons if the context makes this unambiguous.

Similarly, for each let denote the set of indices of the *sources* of .

For each index , let denote the *activation function* of .

For each index , let be the *output* of . When the argument is understood, we may simply write .

For each directed edge define to be the *weight* of the edge.

Where the outputs are understood, for each index , define the *input* to be , the weighted sum of the outputs of the previous layer.

**Feeding Forward**

Feeding forward refers to how an input becomes an output or prediction.

As you might have guessed: feeding forward an input vector , the input and output to each input neuron is just the corresponding component of the vector .

From there the input to a neuron in the first hidden layer is simply the sum of the components of , weighted by the weights of the incoming edges to that hidden node.

You repeat this for each hidden node. Then the inputs to the hidden nodes are transformed via their activation functions to their outputs. The weighted sums of these outputs are then passed on as inputs to the next layer, and so on.

This is probably best understood through a simple example.

Here we have 4 neurons. The input is two dimensional, so we have two input neurons shown in blue. We have a single hidden layer with just one hidden neuron, shown in green. Finally we have a single output neuron shown in orange. Hence the output is one dimensional.

**Choosing the Weights: Backpropagation**

So we have seen how a neural network can make predictions, but how can we choose the weights so that the predictions are any good? The answer is the same as in logistic regression, we use gradient descent. In order to do this we need to use differentiable activation functions and choose an appropriate error function to minimize.

If is the input vector, and is the target vector, let be the prediction of the neural network. Then we define the error to be . This is the natural Euclidean norm of the difference between the target and the prediction. The factor of a half is there for aesthetical reasons, since we will be differentiating and gaining a factor of 2.

The trick to the algorithm is to figure out how the weights contribute to the error, and backpropagation is an elegant way of assigning credit/blame to each node and edge. We fix an input , and write for , and for each each edge we are seeking . Once we have found this, we set , where is the *learning rate*. So we are doing stochastic gradient descent as in the perceptron.

It is clear that an edge and its weight contribute to the error through the target ‘s input. So it makes sense to divide and conquer using the chain rule, that is

.

The first part of the product is worth giving a name, we denote it by and call it the *error* at the node.

The second part of the product is simple to calculate:

Returning to , we can calculate it recursively, assuming the errors for the neurons in have already been calculated.

Observe that can be thought of as a function of the variables for indexing any layer, and so using the chain rule we have

notice that if we assume that each layer is fully-connected to the next, then we could write the above succinctly as an inner product, and inner products can be calculated very efficiently, for instance using a GPU, so it is often faster to have more connections!

Together this gives

.

Since we have calculated the error at a node recursively, we had better deal with the base case – the output neurons. But this follows simply from the definition of to be , with the factor of 2 cancelling with the half as promised.

**Bias Neurons**

The final piece of the puzzle are called *bias **neurons*, which I had left out before to simplify the discussion. Each layer, except the output layer, has a bias neuron. A bias neuron has no inputs, and its output is always 1. Effectively it just adds the weight of its outgoing edges to the inputs of the next layer.

These behave just like regular neurons, only when updating their weights, we note that since their input is effectively constant, they have , and so we need only calculate the error at a bias neuron to update its weights.

**Next Time**

In the next installment I will show how to compactly represent the process of feeding forward and backpropagation using vector and matrix notation assuming the layers are fully connected. We will then use this to give a Python implementation.

## Enter the Perceptron

**Introduction**

Some time ago in my post on logistic regression, I talked about the problem where you have a matrix representing data points with features, and a vector of classifications.

In linear regression you would produce a vector so that , but the problem here is that our output could be any real number, not . The solution we investigated was using the logistic link function which turns any real number into something we interpret as a probability.

That was one solution, another even simpler one is the function defined . It is convenient in this case to think of the response as taking values in not .

Previously we added a column of ones to in order to create an intercept term, here we will write it separately so so that where is called the *bias*.

This post will take the book ‘The Elements of Statistical Learning’ as its guide (the section starting on p. 130), which is freely available online, but there is also a great blog post about it here.

**The Algorithm**

So define , we classify a point as being in the positive class if and negative if . Geometrically we have an affine hyperplane , and points are classified depending on whether they are ‘above’ or ‘below’ the hyperplane.

The unit normal to is given by , points above then mean points with positive projection onto this vector.

The signed distance of a point to is found by projecting the vector onto , where is the ‘origin’ of the hyperplane (since an affine hyperplane is a translation of a linear subspace, there is a point corresponding to the translation of the origin). So we get

,

notice this means that is proportional to the signed distance of from .

If is the set of misclassified points, then we can measure the error as the sum of the distances (up to a constant of proportionality), of the misclassified points to , that is

,

where is the response of , notice that this makes each summand negative, which is reversed by the minus sign outside.

If we fix we can differentiate with respect to and , giving

,

and

.

Just like in logistic regression, we can use these partial derivatives to search for a minimum of . The algorithm purposed by Frank Rosenblatt, the inventor of the perceptron, is to modify the hyperplane after looking at each individual misclassified item in as it comes, that is , where is the learning rate. This has the advantage of being much faster than computing the global gradient informed by all of the data, and is an example of stochastic gradient descent.

Of course by changing the hyperplane as we go we may change , and so what we actually do is iterate over each datapoint in a fixed order, and check if it is misclassified under the latest hyperplane, then update and move onto the next point. This is then very easy to code.

**The Code**

First let’s recycle some code we used for logistic regression to get some test data.

import numpy as np from scipy import linalg import random as rd import matplotlib.pyplot as plt def test_set(a,b, no_samples, intercept, slope, variance, offset): #First we grab some random x-values in the range [a,b]. X1 = [rd.uniform(a,b) for i in range(no_samples)] #Sorting them seems to make the plots work better later on. X1.sort() #Now we define our linear function that will underly our synthetic dataset: the true decision boundary. def g(x): return intercept + slope*x #We create two classes of points R,B. R points are given by g(x) plus a Gaussian noise term with positive mean, #B points are given by g(x) minus a Gaussian norm term with positive mean. R_turn=True X2=[] #This will hold the y-coordinates of our data. Y=[] #This will hold the classification of our data. for x in X1: x2=g(x) if R_turn: x2 += rd.gauss(offset, variance) Y.append(1) else: x2 -= rd.gauss(offset, variance) Y.append(-1) R_turn=not R_turn X2.append(x2) #Now we combine the input data into a single matrix. X1 = np.array([X1]).T X2 = np.array([X2]).T X = np.hstack((X1, X2)) #Now we return the input, output, and true decision boundary. return [ X,np.array(Y).T, map(g, X1)]

Fairly straightforward stuff I hope! If not then it should make sense when we plot it.

X,Y,g = test_set(a=0, b=5, no_samples=200, intercept=10, slope=0, variance=0.5, offset=1) fig, ax = plt.subplots() R = X[::2] #Even terms B = X[1::2] #Odd terms ax.scatter(R[:,0],R[:,1],marker='o', facecolor='red', label="Outcome 1") ax.scatter(B[:,0],B[:,1],marker='o', facecolor='blue', label="Outcome 0") ax.plot(X[:,0], g, color='green', label="True Decision Boundary") ax.legend(loc="upper left", prop={'size':8}) plt.show()

This will produce something like:

Now to implement the algorithm.

def perceptron(X,y, learning_rate=0.1, max_iter=1000): #Expecting X = n x m inputs, y = n x 1 outputs taking values in -1,1 m=X.shape[1] n=X.shape[0] weight = np.zeros((m)) bias = 0 n_iter=0 index = list(range(n)) while n_iter <= max_iter: rd.shuffle(index) for row in index: if (X[row,:].dot(weight)+bias)*y[row] <=0: #Misclassified point. weight += learning_rate*X[row,:]*y[row] bias += learning_rate*y[row] n_iter+=1 return {'weight':weight, 'bias':bias}

we shuffle the order we take the data points in at each step to get rid of any peculiarities in the order we update. Now let’s try it on our test data.

result= perceptron(X,Y,0.1) def decision_boundary(x, weight, bias): return (-bias - x*weight[0])/weight[1] fig, ax = plt.subplots() R = X[::2] #Even terms B = X[1::2] #Odd terms ax.scatter(R[:,0],R[:,1],marker='o', facecolor='red', label="Outcome 1") ax.scatter(B[:,0],B[:,1],marker='o', facecolor='blue', label="Outcome -1") ax.plot(X[:,0], g, color='green', label="True Decision Boundary") ax.plot(X[:,0], [decision_boundary(x, result['weight'], result['bias']) for x in X[:,0]], label='Predicted Decision Boundary') ax.legend(loc="upper left", prop={'size':8}) plt.show()

Producing:

Not bad! Get the code here.

## Snakes on a Manifold: Part 2

Welcome back. Last time we implemented the main components of the game. This time we will implement the main game loop. The job of the loop is to: get keyboard input, update the game state, update the display.

We will now be using pygame, so you might want to check out a tutorial before proceeding, but I think the best way to learn is just to dive right in. You will however need to install pygame before running the code. If you haven’t already, you should download the code from last time too. Finally if you want to skip to playing with the code yourself, here it is.

**The Code**

Ok here it is.

import pygame import snake as sk from pygame.locals import * class Game(object): def __init__(self, size, topology, speed=6): pygame.init() #Resolution in pixels. self.screen_resolution =600 self.topology=topology self.speed = speed #Controls how fast the snake moves. self.screen = pygame.display.set_mode((self.screen_resolution, self.screen_resolution)) pygame.display.set_caption('SNAKES ON A MANIFOLD!') self.board = sk.Board(size, topology) #Put the snake in the center. self.snake = sk.Snake([size/2, size/2], self.board.opposite_direction, self.board.get_next_coord) #Number of grid cells. self.size=size #Width in pixels of a grid cell. self.square_size=int(self.screen_resolution/self.size) #Now we name some colors in RGB for ease of use. self.black = (0,0,0) self.red = (255,0,0) self.green = (0,255,0) self.blue=(0,0,255) self.darkgrey=(40,40,40) self.background_colour=self.black self.boundary_colour=self.blue self.snake_colour = self.green self.apple_colour=self.red #Initialises screen to background color. self.screen.fill(self.background_colour) #Then we draw the boundary cells. self.draw_boundary() #This command updates the changes we have made to the screen. pygame.display.update() #Main game loop. self.main() def main(self): score=0 Quit=False first=True while True: # main game loop #Displays score in window caption. pygame.display.set_caption('Topology: '+self.board.topology+' | Score: '+ str(score)) #Gets keyboard input direction=None for event in pygame.event.get(): if event.type == QUIT: #User has closed window. Quit=True break if event.type==pygame.KEYDOWN: #User has pressed key. if event.key==K_q: Quit=True break if event.key==K_UP: direction='UP' if event.key==K_DOWN: direction='DOWN' if event.key==K_LEFT: direction='LEFT' if event.key==K_RIGHT: direction='RIGHT' if Quit: break #To see if anything has happened, we check where the head is. head_loc = self.snake.head_position #Now we find out what, apart from the head, is in the grid cell. head_on = self.board.grid[head_loc[0], head_loc[1]] if (head_on == 1) or (head_on==2): #The snake has hit the boundary or part of itself. Quit=True break if head_on==3: #The snake has found an apple. self.snake.grow() x,y=self.board.generate_apple() self.draw_apple(x,y) score+=1 self.update_board() if first: #When the game starts, the snake stands still waiting #for user input to start the game. x,y=self.board.generate_apple() first=False self.draw_apple(x,y) self.draw_snake() pygame.display.update() while True: got_direction=False for event in pygame.event.get(): if event.type == QUIT: Quit=True break if event.type==pygame.KEYDOWN: if event.key==K_q: Quit=True break if event.key==K_UP: direction='UP' got_direction=True break if event.key==K_DOWN: direction='DOWN' got_direction=True break if event.key==K_LEFT: direction='LEFT' got_direction=True break if event.key==K_RIGHT: direction='RIGHT' got_direction=True break if Quit: break elif got_direction: break if Quit: break self.snake.move(direction) self.draw_snake() pygame.display.update() #This pauses the game to slow it down, adjust to make faster/slower. pygame.time.Clock().tick(self.speed) #Waiting to quit after died. Quit=False restart=False while True: # main game loop #Gets keyboard input for event in pygame.event.get(): if event.type == QUIT: Quit=True break if event.type==pygame.KEYDOWN: if event.key==K_q: Quit=True break if event.key==K_r: restart=True break if Quit: break if restart: break if restart: G=Game(self.size, self.topology) pygame.quit() def draw_boundary(self): #Draws any boundary grid cells. for row in range(self.size): for col in range(self.size): if self.board.grid[row, col]==1: y=row*self.square_size x=col*self.square_size rect=pygame.Rect(x,y, self.square_size, self.square_size) pygame.draw.rect(self.screen,self.boundary_colour, rect) def draw_snake(self): #We add a snake colored square where the head has moved, #and add a background colored square where the tail was. old_tail_position=self.snake.segments[0].previous_position new_head_position=self.snake.head_position top=old_tail_position[0]*self.square_size left=old_tail_position[1]*self.square_size background_rect=pygame.Rect(left,top, self.square_size, self.square_size) pygame.draw.rect(self.screen, self.background_colour,background_rect) top=new_head_position[0]*self.square_size left=new_head_position[1]*self.square_size snake_rect=pygame.Rect(left,top, self.square_size, self.square_size) pygame.draw.rect(self.screen, self.snake_colour,snake_rect) def draw_apple(self, x, y): apple_rect=pygame.Rect(y*self.square_size,x*self.square_size, self.square_size, self.square_size) pygame.draw.rect(self.screen, self.apple_colour, apple_rect) def update_board(self): #Updates the numpy array of the board. old_tail_position = self.snake.segments[0].previous_position new_head_position=self.snake.head_position self.board.grid[old_tail_position[0], old_tail_position[1]]=0 self.board.grid[new_head_position[0], new_head_position[1]]=2

And we’re done! Now you can play a game by typing something like this.

G=Game(20, 'Mobius Strip')

Please let me know what you think. The code is here. If anyone is feeling ambitious, you could add a menu screen where you could select a speed and topology.

## Snakes on a Manifold: Part 1

Have you every wondered what it would be like to live on a Mobius Strip? I was playing around with some polygonal schemas for various topologies awhile back and I was reminded of the game ‘Snake’ which I used to play on my dad’s phone when I was a kid. I realized that the game was set on a torus. So for fun, and to illustrate a talk I was giving, I decided to implement this game in python using a module called pygame where you could play with a variety of topologies.

It looks like this:

In this post we will implement the underlying logic of the game, and in the next post we will implement the input/output part with pygame. If you just want the code, click here.

**The Code**

We will create a Snake class, and a Snake will be composed of Snake_Segment objects. Each Snake_Segment is responsible for its behind, the Snake_Segment immediately behind it, so we have a sort of linked list.

Let’s start with the Snake_Segment.

class Snake_Segment(object): def __init__(self, initial_position, behind=None): self.position=initial_position #Position on the board. self.previous_position = initial_position #Keep track of to tell tail to move there. self.direction=None #This is for the head segment. if behind == None: self.behind=self #The tail of the snake. else: self.behind=behind def move(self, target_coords): self.previous_position = self.position self.position = target_coords #Once the segment moves, it passes along the message to its behind, along with its #previous coordinates. if not self.behind==self: #Checks that it is not the tail of the snake. self.behind.move(self.previous_position)

Now we can build a Snake. Its actions are to move and grow.

class Snake(object): def __init__(self, initial_position, opposite_directions, get_next_coords): #It is important to keep track of opposite_directions #because unless the Snake always moves forwards. #The head is the segment at the end of self.segments. self.segments=[] self.segments.append(Snake_Segment(initial_position)) self.direction=None self.length=1 #The snake keeps track of the head's board coordinates. self.head_position=initial_position self.opposite_direction = opposite_directions #We pass the snake a function from the Board class to tell it where to move next. self.get_next_coords =get_next_coords def move(self, direction=None): if direction==None: #This is the default action, the snake keeps going in the same direction. direction = self.direction elif self.direction==None: #This happens at the beginning before its first movement. self.direction=direction elif direction == self.opposite_direction[self.direction]: #Do nothing in this case because the snake only moves forwards. direction = self.direction else: #The snake turns. self.direction=direction #Now we get new coordinates for the head based on the direction. target_coords, self.direction = self.get_next_coords(self.head_position, direction) #We then tell the head to move, and the command propagates down to the tail. self.segments[-1].move(target_coords) self.head_position = target_coords def grow(self): #We grow by adding a segment to the front of the snake, #becoming the new head. self.segments.append(Snake_Segment(self.head_position, self.segments[self.length-1])) self.length +=1

So now we have a Snake it needs somewhere to sliver, what we will call a Board. This is the most complicated part of the program because it is tasked with incorporating the topological information. This is handled by its get_next_coord method. Its other duty is to generate apples.

class Board(object): def __init__(self, size, topology): self.size =size #The board will be self.size x self.size squares. self.grid=[] self.topology = topology #A string like 'Disc', 'Torus', etc. self.init_topology() self.grid_init() self.opposite_direction = {'UP':'DOWN', 'DOWN':'UP', 'LEFT':'RIGHT', 'RIGHT':'LEFT'} def init_topology(self): #The topological information is contained in self.edges, which tells the board #which edges are boundary, which edges are identified, and with what orientations. #self.edges is a dictionary with keys 'LEFT', 'RIGHT' , 'UP', 'DOWN', #and having values of the form ('edge name or boundary', -1/0/1) where -1 represents #identification with reversed orientation, 1 identification with orientation preserved and #0 for the boundary case. if self.topology =='Disc': self.edges ={'LEFT':('BOUNDARY',0), 'RIGHT':('BOUNDARY',0), 'UP':('BOUNDARY',0), 'DOWN':('BOUNDARY',0)} elif self.topology=='Torus': self.edges ={'LEFT':('RIGHT',1), 'RIGHT':('LEFT',1), 'UP':('DOWN',1), 'DOWN':('UP',1)} elif self.topology=='Sphere': self.edges ={'LEFT':('UP',1), 'RIGHT':('DOWN',1), 'UP':('LEFT',1), 'DOWN':('RIGHT',1)} elif self.topology=='Mobius Strip': self.edges ={'LEFT':('BOUNDARY',0), 'RIGHT':('BOUNDARY',0), 'UP':('DOWN',-1), 'DOWN':('UP',-1)} elif self.topology=='Klein Bottle': self.edges ={'LEFT':('RIGHT',-1), 'RIGHT':('LEFT',-1), 'UP':('DOWN',1), 'DOWN':('UP',1)} elif self.topology=='Projective Plane': self.edges ={'LEFT':('RIGHT',-1), 'RIGHT':('LEFT',-1), 'UP':('DOWN',-1), 'DOWN':('UP',-1)} def set_grid(self, coords, value): self.grid[coords[0], coords[1]]=value def grid_init(self): grid=np.zeros(shape=(self.size, self.size)) #On the grid, 0s represent empty, 1s represent boundary, 2s represent snake and 3s apples. if self.edges['LEFT'][0]=='BOUNDARY': grid[::,0] = np.ones(shape=self.size) if self.edges['RIGHT'][0]=='BOUNDARY': grid[::,-1] = np.ones(shape=self.size) if self.edges['UP'][0]=='BOUNDARY': grid[0] = np.ones(shape=self.size) if self.edges['DOWN'][0]=='BOUNDARY': grid[-1] = np.ones(shape=self.size) self.grid=grid def generate_apple(self): x=random.randint(0, self.size-1) y=random.randint(0, self.size-1) if self.grid[x,y]==0: #Makes sure the square is not already occupied. self.grid[x,y]=3 return [x,y] else: return self.generate_apple() def get_next_coord(self, coord, direction): #First handles how to move in a direction without worrying about boundaries. if direction == 'UP': change_coord = [-1,0] elif direction =='DOWN': change_coord = [1,0] elif direction == 'LEFT': change_coord = [0,-1] elif direction == 'RIGHT': change_coord = [0,1] else: print 'Invalid direction' #Just in case. #Calculate new coordinates optimistically. potential_new_coords = [coord[i] + change_coord[i] for i in range(2)] #First check to see if we are trying to travel through an edge. check=False if -1 in potential_new_coords: check=True elif self.size in potential_new_coords: check=True if not check: #We are in the interior so no problems. return [potential_new_coords, direction] #Now we know we are traveling through some edge, #and the direction tells us which. edge, orientation = self.edges[direction] #If the snake emerges from the LEFT edge, it will then be moving in the RIGHT direction, #and so on. new_direction = self.opposite_direction[edge] #We automatically know one coordinate because we know the new edge. left_or_right=False if edge=='BOUNDARY': print 'Error, boundary case should not happen here.' elif edge == 'LEFT': left_or_right=True potential_new_coords[1]=0 elif edge =='RIGHT': left_or_right=True potential_new_coords[1]=self.size-1 elif edge == 'UP': potential_new_coords[0]=0 elif edge == 'DOWN': potential_new_coords[0]=self.size-1 else: print 'Edge invalid' #Now we figure out the other coordinate, which depends on either #the old row or column, depending on whether the new edge is opposite #or perpendicular. if edge == self.opposite_direction[direction]: opposite=True else: opposite=False #Find out whether we want the old row or old column. if left_or_right: if opposite: coord_component = coord[0] else: coord_component = coord[1] else: if opposite: coord_component = coord[1] else: coord_component = coord[0] #Now we find out whether the the orientation is flipped. if orientation==-1: coord_component = self.size-1-coord_component if left_or_right: potential_new_coords[0]=coord_component else: potential_new_coords[1]=coord_component return [potential_new_coords, new_direction]

And we’re done! Next time we will make this pretty using pygame. If you want to test it before then, you can print the board grid to the console.

The source code is here.

## Computing Homology

Last time we introduced abstract simplicial complexes (abs) and how to represent them in Python. Today we will look at the homology groups of an abs.

For an excellent primer on homology I recommend checking out this post , I will just quickly recap the basics.

**Homology: A quick reminder**

Let be an abs.

Write for the dimensional skeleton of , formed by restricting to those elements of with dimension .

If is a -simplex with vertices labelled we may write to denote this. To denote the -simplex consisting of the vertices of except the vertex we write .

For each dimension we write for the set of -dimensional simplices, and write for its cardinality.

For each dimension we write for the –*dimensional chain group* given by the group presentation . What this says is that the generators of the group are the -simplices, the group is abelian and each element is its own inverse.

An alternative way to think of this is as a vector space, where has basis and is over the field (mod 2 arithmetic). So every element can be represented as a binary vector of length , and addition is bitwise xor.

Now we define boundary maps . These are both group homomorphisms and linear maps. For a -simplex define:

, the sum of its -dimensional faces.

Since is a a basis for this determines a linear mapping.

Define called the –*dimensional cycle group* (subspace).

Define called the –*dimensional boundary group* (subspace).

Now we can define the –*dimensional homology group* .

This is both a quotient group and a quotient vector space.

Finally define the *Betti number* .

**Computation**

We are going to be interested in the Betti numbers as these can be thought of as counting the -dimensional holes in our space. For this just means the number of connected components.

In order to calculate the ranks of interest, we must construct the matrices representing the boundary mappings. In order to do this we must agree upon an ordering of the simplices. So we simply use the one generated by the data structure, which is why each simplex has an index attribute.

To find the rank of , which we have called , we simply count the number of -simplices. So we can add that feature to our SimplicialComplex class easily.

def chain_rank(self,p): return len(self.n_simplex_dict[p])

Now the matrix of will be have rows, one for each -simplex, and columns, one for each -simplex. Then the column of the matrix will contain the coefficients of the image of the simplex expressed in terms of our basis for .

So to find the column vector, we ask each of a simplex’s boundary faces for their index. These will be the positions of the 1s in the column.

def boundary_column(self,simplex): column_size = self.chain_rank(simplex.dimension-1) faces = simplex.faces column = [0 for i in range(column_size)] for face in faces: column[face.index]=1 return column

Now we simply combine these columns to form the boundary matrix. Note that for greater than the dimension of the simplicial complex, the trivial group, and similarly for negative . We don’t have to calculate anything in this case.

This also means that the kernel of is . This is handled by us having the empty set as a kind of -dimensional simplex.

def get_boundary_matrix(self,p, skeleton=None): if skeleton==None: skeleton=self.dimension if ((p>skeleton)or(p<0)): return np.asarray([]) mat=[self.boundary_column(simplex) for simplex in self.n_simplex_dict[p]] return np.asarray(mat, dtype=int).transpose()

The skeleton argument is used when you want to only consider simplices up to a certain dimension. For instance if is given by then is a representation of a triangle. Then the zeroth Betti number as there is one connected component. However the zeorth Betti number of is 3, one for each vertex.

Let’s test what we’ve got so far. In this post Jeremy Kun computes the homology of the tetrahedron with an ‘arm’, he provides a nice picture of it here. If we label the vertices of the tetrahedron , then this structure can be specified with the maximal simplices then restricting to the -skeleton, since we want there to be a hole in the tetrahedron.

Let’s see what the boundary matrices are:

def jeremy_example(): maximal_simplices = [[0,1,2,3],[0,4],[2,4]] S = SimplicialComplex(maximal_simplices) S.plot(direction='up', label_color='blue') for i in range(3): print "Boundary matrix of dimension ", i print S.get_boundary_matrix(i,skeleton=2) jeremy_example()

Which produced the following plot (of the 3-skeleton):

And outputted to the console:

**Boundary matrix of dimension 0**

** [[1 1 1 1 1]]**

** Boundary matrix of dimension 1**

** [[1 1 1 0 0 0 1 0]**

** [1 0 0 1 1 0 0 0]**

** [0 1 0 1 0 1 0 1]**

** [0 0 1 0 1 1 0 0]**

** [0 0 0 0 0 0 1 1]]**

** Boundary matrix of dimension 2**

** [[1 1 0 0]**

** [1 0 1 0]**

** [0 1 1 0]**

** [1 0 0 1]**

** [0 1 0 1]**

** [0 0 1 1]**

** [0 0 0 0]**

** [0 0 0 0]]**

If you check his post, you will see that the matrices are the same, up to a reordering of basis, and noticing that

So far so good. Now all that remains is to calculate the rank and nullity of each matrix. But this is easily done, we simply diagonalize the matrix and count the pivots. For the number of pivots is equal to and by the rank-nullity theorem.

The form we will put the matrix in is called Smith-Normal form I won’t go through the details of how the algorithm works, they are on the wiki page. If you have seen Gaussian elimination before this is basically that.

def reduce_matrix(self, matrix): #Returns [reduced_matrix, rank, nullity] if np.size(matrix)==0: return [matrix,0,0] m=matrix.shape[0] n=matrix.shape[1] def _reduce(x): #We recurse through the diagonal entries. #We move a 1 to the diagonal entry, then #knock out any other 1s in the same col/row. #The rank is the number of nonzero pivots, #so when we run out of nonzero diagonal entries, we will #know the rank. nonzero=False #Searching for a nonzero entry then moving it to the diagonal. for i in range(x,m): for j in range(x,n): if matrix[i,j]==1: matrix[[x,i],:]=matrix[[i,x],:] matrix[:,[x,j]]=matrix[:,[j,x]] nonzero=True break if nonzero: break #Knocking out other nonzero entries. if nonzero: for i in range(x+1,m): if matrix[i,x]==1: matrix[i,:] = np.logical_xor(matrix[x,:], matrix[i,:]) for i in range(x+1,n): if matrix[x,i]==1: matrix[:,i] = np.logical_xor(matrix[:,x], matrix[:,i]) #Proceeding to next diagonal entry. return _reduce(x+1) else: #Run out of nonzero entries so done. return x rank=_reduce(0) return [matrix, rank, n-rank] def boundary_rank(self,p, skeleton=None): if skeleton==None: skeleton=self.dimension if (p >= skeleton) or (p<0): #Trivial case. return 0 else: return self.reduce_matrix(self.get_boundary_matrix(p+1, skeleton))[1] def cycle_rank(self, p, skeleton=None): if skeleton==None: skeleton=self.dimension if p==0: #Mapping to trivial group so kernel is everything in this case. return self.chain_rank(0) elif (p>skeleton) or (p<0): #Trivial case. return 0 else: return self.reduce_matrix(self.get_boundary_matrix(p, skeleton))[2] def betti_number(self, p, skeleton=None): if skeleton==None: skeleton=self.dimension return self.cycle_rank(p,skeleton) - self.boundary_rank(p, skeleton)

Now let’s just test it out on the same example. We expect since there is one connected component, from the -dimensional hole made by the arm and for the ball inside the tetrahedron’s shell.

def jeremy_example(skeleton=2): maximal_simplices = [[0,1,2,3],[0,4],[2,4]] S = SimplicialComplex(maximal_simplices) S.plot(direction='up', label_color='blue') for i in range(4): print "Boundary matrix of dimension", i print S.get_boundary_matrix(i,skeleton) print "Reduced boundary matrix of dimension", i print S.reduce_matrix(S.get_boundary_matrix(i,skeleton))[0] print "Betti number",i,"=", S.betti_number(i,skeleton) jeremy_example()

The console outputs:

**Boundary matrix of dimension 0**

** [[1 1 1 1 1]]**

** Reduced boundary matrix of dimension 0**

** [[1 0 0 0 0]]**

** Betti number 0 = 1**

** Boundary matrix of dimension 1**

** [[1 1 1 0 0 0 1 0]**

** [1 0 0 1 1 0 0 0]**

** [0 1 0 1 0 1 0 1]**

** [0 0 1 0 1 1 0 0]**

** [0 0 0 0 0 0 1 1]]**

** Reduced boundary matrix of dimension 1**

** [[1 0 0 0 0 0 0 0]**

** [0 1 0 0 0 0 0 0]**

** [0 0 1 0 0 0 0 0]**

** [0 0 0 1 0 0 0 0]**

** [0 0 0 0 0 0 0 0]]**

** Betti number 1 = 1**

** Boundary matrix of dimension 2**

** [[1 1 0 0]**

** [1 0 1 0]**

** [0 1 1 0]**

** [1 0 0 1]**

** [0 1 0 1]**

** [0 0 1 1]**

** [0 0 0 0]**

** [0 0 0 0]]**

** Reduced boundary matrix of dimension 2**

** [[1 0 0 0]**

** [0 1 0 0]**

** [0 0 1 0]**

** [0 0 0 0]**

** [0 0 0 0]**

** [0 0 0 0]**

** [0 0 0 0]**

** [0 0 0 0]]**

** Betti number 2 = 1**

** Boundary matrix of dimension 3**

** []**

** Reduced boundary matrix of dimension 3**

** []**

** Betti number 3 = 0**

Just as expected. So we’re done. Get the code here.

## Representing an Abstract Simplicial Complex in Python

**What’s an abstract simplicial complex?**

So what is an abstract simplicial complex? It is a collection of finite sets that are closed under taking subsets. So if is an abstract simplicial complex (asc), and , then every subset of is also in .

These are meant to be combinatorial models of geometric simplicial complexes, which are topological spaces given by collections of simplices, like a point, a line segment, a triangle or a tetrahedron. An n-simplex is the convex hull of affinely independent points in .

These are important in topology and geometry because they are very easy to understand and compute with, but can be pieced together in clever ways to approximate more complicated objects.

**Some terminology**

Let be an abs.

**Definition: **the *vertices* of , written , are the set of all points in at least one element of . Writing this formally: .

**Definition: **An element of is called a simplex.

**Definition**: If is a simplex, and is a subset of , then we say that is a* face* of or that is a *coface* of and denote this by .

**Definition: **If then the *dimension *of is .

**Definition: **If and the only coface of is itself, then we call a *maximal simplex*.

Notice that an abs is completely determined by its set of maximal simplices.

**Representation in Python**

We want a convenient way to represent and generate these objects. Firstly we want to create one from a list of maximal simplices given as lists of integers. Secondly we want to map the (co)face relations, and enable us to easily iterate over simplices of a given dimension.

So stage one will be unwrapping the maximal simplices into simplices, placing them in a tree. Stage two will be filling in the face relations, representing the abs as a Hasse diagram, where each node represents a simplex.

**Stage 1**

In our data structure a node represents a simplex. So we create a new node class for the job.

import tigraphs2 as tig class Simplex(tig.BasicNode, object): def __init__(self, vertices, **kwargs): super(Simplex, self).__init__(**kwargs) self.vertices = vertices self.faces=[] self.cofaces=[] self.dimension = len(self.vertices)-1 #These are used for initializing the complex: self._cfaces =[] self._children={} self.__parent = None

Now we want to create a class for a complex, which will inherit from the tree class since the initial structure will be a tree.

class SimplicialComplex(tig.return_tree_class(), object): def __init__(self, maximal_simplices, Vertex=Simplex, **kwargs): super(SimplicialComplex, self).__init__(Vertex=Vertex, **kwargs) self.maximal_simplices = maximal_simplices self.dimension = max(map(len,self.maximal_simplices))-1 self.n_simplex_dict={} #Keeps track of simplexes by dimension. for i in range(-1,self.dimension+1): self.n_simplex_dict[i]=[] #Create the empty set 'root' of the structure. self.create_simplex([]) self.set_root(self.vertices[0]) self.vertices[0].label=str([]) #Initialize complex from maximal simplices for ms in self.maximal_simplices: self.add_maximal_simplex(ms)

Now we actually need to write the methods, starting with the create_simplex method:

def create_simplex(self, vertices): self.create_vertex(vertices) vertex = self.vertices[-1] self.n_simplex_dict[vertex.dimension].append(self.vertices[-1]) #It is convienient for the simplex to know its index for later on #when we create 'vectors' of simplices. index = len(self.n_simplex_dict[vertex.dimension])-1 vertex.index=index vertex.label=str(vertices)

The way we will form the initial tree is by building up from the empty set, so we need a way of building up sets.

def create_child(self, simplex, vertex): #This method creates a new simplex which is an immediate coface to the argument #simplex. It does so by appending the new vertex to the old simplex's vertices. #_cfaces keeps track of which of the simplex's cofaces have already been added. if vertex in simplex._cfaces: return simplex._cfaces.append(vertex) child_vertices=[v for v in simplex.vertices] child_vertices.append(vertex) self.create_simplex(child_vertices) child=self.vertices[-1] child._parent=simplex simplex._children[vertex]=child self.leaves.add(child) self.create_edge(ends=[simplex, child]) simplex.cofaces.append(child) child.faces.append(simplex)

We are now ready to write the add_maximal_simplex method. It is a recursive procedure, starting with the list representing the maximal simplex, say [0,1,2], we add create cofaces to the root [] with vertices [0], [1] and [2]. To each of these new simplices, we pass on a list of vertices still to be added. To [0] we pass [1,2], to [1] we pass [2], and to [2] we pass [].

In general at each stage we use each vertex remaining in turn to create a child, and pass to that child the vertices yet to be used, that is the vertices remaining greater in value.

def add_maximal_simplex(self,vertices, simplex=None, first=True): #We start at the root. if first: simplex = self.get_root() if len(vertices)>=1: #Condition to terminate recursion. for index in range(len(vertices)): vertex = vertices[index] self.create_child(simplex, vertex) self.add_maximal_simplex(simplex=simplex._children[vertex], vertices=vertices[index+1:], first=False)

That’s stage 1 complete. Let’s improve the default plot method for the graph class and see if the code is working.

def plot(self,margin=60): A = self.get_adjacency_matrix_as_list() g = ig.Graph.Adjacency(A, mode='undirected') for vertex in self.vertices: if vertex.label !=None: #print vertex.label index=self.vertices.index(vertex) g.vs[index]['label']=vertex.label #The layout is designed for trees, so we tell it to set the empty set as the root. layout = g.layout("rt", root=0) visual_style = {} visual_style["vertex_size"] = 20 visual_style["vertex_label_angle"]=3 visual_style["vertex_label_dist"] =1 #We want the root at the bottom of the plot. layout.rotate(180) ig.plot(g, layout=layout, margin=margin, **visual_style)

Great! Let’s try it out on the tetrahedron. This defined by the maximal simplex [0,1,2,3].

def test(n): maximal_simplices = [list(range(n))] Sn = SimplicialComplex(maximal_simplices) Sn.plot() test(4)

Producing:

Looking good! Notice that the levels of the tree correspond to simplices of similar dimension and that each simplex’s vertices are in numeric order.

Also notice that many of the (co)face relations are missing, so now we fill them in.

**Stage 2**

First we add a useful method for accessing a simplex by its list of vertices. This is why we have arranged it so that the vertices are always given in ascending order, so we know exactly what each simplex’s address is.

def get_simplex(self, address, first=True,simplex=None): #We recurse up from the root to the desired simplex. if first: simplex = self.get_root() if len(address)==0: return simplex if address[0] not in simplex._cfaces: return None return self.get_simplex(address=address[1:],first=False, simplex=simplex._children[address[0]])

Now we need to add in missing faces and cofaces. To do this we add a method to find a given simplex’s immediate faces, its boundary.

def _boundary(self,simplex): vertices = simplex.vertices n = len(vertices) boundary=[] #Each boundary simplex is found by discarding a single vertex. for index in range(n): boundary.append(vertices[:index]+vertices[index+1:]) return boundary

Next we add a simple method for creating a (co)face relation.

def add_face_coface(self, face, coface): #Remember that since the tree was constructed from the empty set up, #cofaces are the children in the graph. if coface._parent != face: self.create_edge([face, coface]) face.cofaces.append(coface) coface.faces.append(face)

Now we simply combine these method to add all of the relations. Recall that in a Hasse diagram, we demonstrate inclusions through transitivity where possible.

def update_adjacency_simplex(self, simplex): boundary_faces = self._boundary(simplex) boundary_faces = map(self.get_simplex, boundary_faces) for face in boundary_faces: self.add_face_coface(face,simplex) def update_adjacency(self): for i in range(self.dimension+1): for simplex in self.n_simplex_dict[i]: self.update_adjacency_simplex(simplex)

And we’re done! Phew! Now let’s see what it looks like.

def test(n): maximal_simplices = [list(range(n))] Sn = SimplicialComplex(maximal_simplices) Sn.plot() Sn.update_adjacency() Sn.plot() test(4)

Producing:

Lovely. You can download the code here, remember you will need igraph for plotting, and my tigraph module from here. Next time we will look at computing the homology of an abs.

## My First Genetic Algorithm

‘Genetic’ is one of those words that make things cool, some others are topological, chaos or chaotic, fractal, quantum and recursive. That was not intended to be an exhaustive list. So for a while I have wanted to have a go at making a genetic algorithm and I have finally gotten around to it.

**How they work**

First let me refer you to the wiki, then I’ll quickly talk about the things I think are important. The main idea is to use an search algorithm inspired by evolution to solve a problem.

There are four core ideas.

**Parametrization of Solution**

To begin with we assume that a solution takes the form of a binary vector of length . Each component of a solution vector can be thought of as a gene, which can be on or off (1 or 0 resp.).

The vectors I will call individuals, or creatures, or candidate solutions, or vectors. The population is a set of individuals, to begin with we initialize the population randomly or however you like.

We also assume that we have a function , this measures an individual’s fitness. Fitness is how well the solution, parametrized by the binary vector, solves your problem. So we will assume that bigger numbers mean better solutions.

Once we have a population, we assess their fitness. Now we want to create new individuals, hopefully with better fitness. We do this by combing different parent solutions to create child solutions. One simple way to do this is called *uniform crossover* , a function taking solutions to solutions is called a *genetic operator, *a binary genetic operator is a *crossover*. Uniform crossover simply means that the child’s ith gene is equal to the father’s or mothers with equal probability (alert! hetronormative language in use! please evacuate the building! ). But not everybody get’s to be a parent, ideally you want the chance that a creature has a child to depend in some way on its fitness.

One way of selecting parents is called tournament selection. The idea is that you grab k creatures from your population, and the one with the highest fitness is selected. The parameter k then controls selection pressure.

‘Mutation’ also happens to be one of the cool words. In this context it is an unary genetic operator, that we think of as perturbing a solution vector. There are many ways to do this, perhaps to the simplest is to flip each gene from on to off or vice versa with probability called the mutation rate.

The problem with this can be that it encourages genes to be on where there are few on and off when there are many on. If the number of on genes is part of what is interesting, say if your vector represented variable selection in a machine learning model, then you do not want the mutation to be biased towards a certain number of on genes.

This means that ideally you should use a kind of mutation suitable for your problem, but even if you don’t make the best choice, natural selection will combat biases in the mutation, more or less successfully.

Since I want the expected number of on genes to be the same after mutation, my approach is to, with probability pick an on gene at random and turn it off and with probability pick an off gene and turn it on, repeated times.

Ok I have talked enough about the theory, let’s make one! First we need a problem to solve.

**The Problem**

I will randomly choose an integer between and . Your job is to guess what is.

The information you are given is that after each guess , I will select random integers from the same range, write it as a binary vector, and then I will then tell you how often either:

1. The random integer is both *and* *coordinate-wise*; or

2. The random integer is both *not * *and* *not * *coordinate-wise*.

So I will tell you an integer from that describes the size of the subset for which are upper bounds and the size of the subset for which are lower bounds.

**The Code**

You can view it statically here, or download the notebook and play with it yourself, or just get the source – happy evolutions!