Home » Posts tagged 'Directed graph'

Tag Archives: Directed graph

Special Graphs and Dynamic Inheritance

Welcome back! Last time we introduced the Vertex, Edge, Graph, DirGraph and UnDirGraph classes, now we are going to have some fun and introduce the special graphs like linear, cycle, complete and so on. If you want to refresh your memory on the special graphs, we discussed them here.

There is also a wider programming lesson to be learned concerning ‘dynamic inheritance’. Each of these special graphs, take the linear graph for instance, might want to be directed or undirected, but how to model this? We want a general notion of a linear graph, that changes depending upon whether it is directed or undirected, meaning we would like a class that can somehow choose to inherit from different classes.

My solution for this problem is to wrap our class definition in a function, and so inheritance is decided at runtime! If you know of a better one, please let me know! Let’s see how it works for a linear graph.

#This is a wrapper to a class definition, deciding whether to inherit
#from DirGraph or UnDirGraph at runtime. It can be initialised by
#number of vertices or the number of edges.

def return_linear_class(directed=False):
    if directed:
        base=DirGraph
    else:
        base=UnDirGraph
    class Linear(base, object):
        def __init__(self, number_vertices=0, number_edges=0, **kwargs):
            super(Linear, self).__init__(**kwargs)
            self.linear_generate(number_vertices, number_edges)

        def linear_generate(self,number_vertices, number_edges):
            if (not number_edges ==0) and (not number_vertices==0):
                if not number_vertices == number_edges+1:
                    print 'Number of edges and vertices incompatible!'
                    return
                else:
                    self.number_vertices=number_vertices

            elif not number_edges==0:
                self.number_vertices = number_edges +1
            else:
                self.number_vertices = number_vertices

            self.create_vertices(self.number_vertices)
            for index in range(self.number_vertices -1):
                Source = self.vertices[index]
                Target = self.vertices[index+1]
                self.create_edge([Source, Target])
    return Linear

Now this gives us a class definition, but we still need to instantiate the class, which we also do with a function, this is straightforward. Notice that we use **kwargs to pass down any regular Graph or UnDirGraph or DirGraph arguments you might want to specify.

#This instantiates the Linear class.
def create_linear(directed=False, number_vertices=0, number_edges=0,**kwargs):
linear = return_linear_class(directed)(number_vertices, number_edges,**kwargs)
return linear

And that is that. Now just so you believe that it works we will make a directed and undirected linear graph with 10 vertices and plot the results.

testdir=create_linear(number_vertices=10)
testundir=create_linear(directed=True,number_vertices=10)
testdir.plot()
testundir.plot()

Producing the following images:
undirlinear10dirlinear10

Notice that the plotting in igraph is sometimes a bit random!

Next up is cycle graphs, we do the same inheritance trick as before, and also we use object composition to build the cycle by joining the ends of a linear graph.

#Class definition wrapper to dynamically inherti from DirGraph or UnDirGraph.
#Also has a composition from Linear, to create a cycle it joins the ends
#of a linear graph.
def return_cycle_class(directed=False):
    if directed:
        base = DirGraph
    else:
        base = UnDirGraph
    class Cycle(base, object):
        def __init__(self, number_vertices=0, number_edges=0, **kwargs):
            super(Cycle, self).__init__(**kwargs)
            if (not number_edges==0) and (not number_vertices==0):
                if not number_edges == number_vertices:
                    print 'Numbers of edges and vertices incompatible!'
                    return
            elif not number_edges ==0:
                number_vertices = number_edges
            Linear_part = create_linear()
            Linear_part.linear_generate(number_vertices, number_edges)

            self.vertices = Linear_part.vertices
            self.edges = Linear_part.edges
            self.cycle_generate(number_vertices)

        def cycle_generate(self, number_vertices):
            Target = self.vertices[0]
            Source = self.vertices[number_vertices-1]
            self.create_edge(ends=[Source, Target])
    return Cycle

To test it I will show you an undirected cycle on 10 vertices.

t=create_cycle(number_edges=9)
t.plot()

Producing:
cyclegraph10

as desired.

Next up is the complete graph, very easy to do, I only made it for undirected graphs, but it could easily be altered to dynamically inherit.

class Complete(UnDirGraph, object):
    def __init__(self,number_vertices=0, **kwargs):
        super(Complete, self).__init__(**kwargs)
        self.create_vertices(no_create=number_vertices)
        ends=[]
        for Source in self.vertices:
            for Target in self.vertices:
                if [Source,Target] not in ends:
                    if not Source == Target:
                        self.create_edge(ends=[Source,Target])
                        ends.append([Source,Target])
                        ends.append([Target, Source])

Now writing

t = Complete(number_vertices=10)
t.plot()

produces

complete10.

Now we create a Tree class, along with a subclass of nary-trees.

def return_tree_class(directed=False):
    if directed:
        base = DirGraph
    else:
        base = UnDirGraph

    class Tree(base, object):
        def __init__(self, **kwargs):
            super(Tree, self).__init__(**kwargs)

        def is_leaf(self, vertex):
            if self.get_degree(vertex) == 1:
                return True
            elif self.get_number_vertices() ==1:
                return True
            else:
                return False

        def set_root(self, vertex):
            if vertex in self.vertices:
                self.remove_vertex_label('Root')
                self.add_vertex_label(vertex, label='Root')

        def get_root(self):
            return self.get_vertex('Root')

        def get_leaves(self):
            leaves = filter(self.is_leaf,self.vertices)
            return leaves
    return Tree

def create_tree(directed=False, **kwargs):
    tree = return_tree_class(directed)(**kwargs)
    return tree

def return_nary_tree_class(directed=False):
    base = return_tree_class(directed)
    class NaryRootedTree(base,object):
        def __init__(self, N=0, **kwargs):
            super(NaryRootedTree, self).__init__(**kwargs)
            self.N = N

        def split_vertex(self, vertex):
            if self.is_leaf(vertex):
                children = [self.Vertex() for i in range(self.N)]
                vertex.children = children
                self.leaves.discard(vertex)
                for Child in children:
                    self.add_vertex(Child)
                    self.leaves.add(Child)
                    Child.parent=vertex
                    self.create_edge(ends=[vertex, Child])

        def fuse_vertex(self, vertex):
            try:
                children=vertex.children
            except AttributeError:
                return
            self.leaves.add(vertex)
            for child in children:
                self.fuse_vertex(child)
                self.remove_vertex(child)
                child.parent = None
                vertex.children = None

        def create_full_n_level(self, n):
            self.vertices =[]
            self.edges =set([])
            self.create_vertex()
            self.set_root(self.vertices[0])
            for level in range(n):
                leaves=self.find_leaves()
                for Leaf in leaves:
                    self.split_vertex(Leaf)

    return NaryRootedTree

def create_nary_tree(directed=False, N=0,**kwargs):
    nary_tree = return_nary_tree_class(directed)(N, **kwargs)
    return nary_tree

Let’s make a binary tree, since we are doing computer science after all.

t=create_nary_tree(N=2)
t.create_full_n_level(n=6)
t.plot()

Which looks like:

nary2to6

And that’s all folks! You find all the code for the graph stuff so far here. Let me know what you think!

Next time we will probably be looking at decision trees.

A First Taste of Graph Theory

1. Motivation

Welcome back. So far we’ve learnt how to clean data, and do some basic plotting. Now we really want to move on to some machine learning proper and make some models.

If you recall we were looking at the titanic data, and trying to split the data using sex, class, age etc, and the name of the game is to have these splits consist of mostly survivors or non-survivors, that way when we predict that a particular group will survive, because the majority in that group do, we will be mostly right.

Consider that if {51 \% } of women survived compared with {49 \% } for men, when we make the prediction, reducing these probabilities to classes, we are throwing away a lot of information.

What we have been groping towards is a kind of crude decision tree learning and this, with some more advanced variations, is what I want to implement.

But what is a decision tree? What is a tree? What is a graph? How are we to proceed if we know no graph theory? Well we could probably get away without any. But we’re better than that. So today I will give a primer on graph theory, will this eventual goal in mind. Once we know something about graphs, we can learn about making graphs with Python, and once we’ve done that, we can try them out on the titanic data.

2. The Definition of a Graph

Here is a picture of a graph:

The red filled circles we call ‘vertices’ or sometimes ‘nodes’. The lines connecting the vertices we call ‘edges’.

A graph is basically a way of showing relationships between things, for instance, imagine that the vertices above were people, and the edges represented the relationship of holding hands, awww.

If you have come across the idea of relations before, you may be thinking of reflexivity, transitivity and symmetry.

A ‘reflexive’ edge is what we call a loop, like this:

A ‘symmetric’ edge is what we call undirected, and it just looks like a line, here’s what directed edges look like:

The beginning vertex of a directed edge is called the source, and the ending vertex is called the target.

Transitivity – Homer taller than Bart and Bart taller than Lisa implies Homer taller than Lisa – you could visualise as a triangle:

transitive

Now that we have a visual idea of graphs, let’s define them formally.

Definition 1 An undirected graph {G} is a triple {(V,E, ends)}, where {V} is a set of vertices, {E} is a set of edges and {ends} is a mapping from {E} to subsets of {V} of size {1} or {2}.

And for directed graphs:

Definition 2 A directed graph {G =(V,E, source, target)}, where {V,E} are sets of vertices and edges respectively, and {source, target: E \rightarrow V.}

If you’ve studied maths, then you’ll know that one of the key questions to ask about a new mathematical object is: what are the morphisms? Meaning, what are the functions from graphs to graphs that preserve structure.

A graph homomorphism is a function {f: G \rightarrow H} between two graphs, taking vertices to vertices and edges, that ‘respects’ the structure of the graph, in symbols: {ends \circ f = f \circ ends}, the function commutes with the {ends} maps.

Notice that the notation suppresses the difference between the {ends} mapping in {G} and {H}, but we’re all grown-ups here, we get context.

As you can probably guess, for directed graphs we require {f \circ source = source \circ f} and {f \circ target = target \circ f}

A subgraph of a graph is a subset of its vertices, a subset of its edges, with the same mappings but that it is still a graph.

The degree of a vertex is the number of edges connected to it.

3. Types of Graphs

Now we’ll see some of the standard classes of graphs. These will all be simple graphs, meaning that there are no loops, finite numbers of vertices and edges, at most one edge between any two vertices and undirected.

3.1. Linear Graphs

These just look like a line:

3.2. Cyclic Graphs

We’ve already seen two of these, you walk around a cycle and come back to the beginning. We denote the cyclic graph with {n} vertices by {C_n}.

3.3. Complete Graphs

This is the situation where everything is connected to everything else, the most edges possible. This is denoted {K_n}

3.4. Trees

A tree is connected, meaning there is only one ‘piece’ of the graph, and has no cycles, meaning no parts that look a cyclic graph.

The degree 1 vertices are called leaves, and an n-ary tree is one where each non-leaf has n ‘children’.

If you want to learn more about trees, there is an excellent post about it here.

4. Representing Graphs

One very natural way to encode the layout of a graph is an adjacency matrix. If {|V|=n} then the matrix will be {n \times n}, where each row represents a vertex as a source, and the columns are vertices as targets, and the entry is the number of edges with that source and target.

So if there is a {2} in row {1} and column {5}, there are two edges from vertex {1} to vertex {5}. Of course the matrix will depend on how you choose to order the vertices. For example, the adjacency matrix of {C_3 = \left ( \begin{array}{ccc} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1& 0 \end{array} \right )}, and the graph looks like this:

Notice that if there are no loops, the diagonal will be {0}s, and if undirected then the matrix is symmetric about the main diagonal. Another neat property is that if {A} is the adjacency matrix of a simple graph {G}, then {A^n_{i,j}} is the number of paths of length {n} from vertex {i \mapsto j}. Prove it yourself!

The problem with adjacency matrices as a data structure, is that it takes a lot of work to change it when you add or delete a vertex. Another way of doing it is with an adjacency list. This is where you have a list of the vertices, with each vertex ‘pointing’ to a list of vertices that are its targets. This can easily be implemented using dictionaries in Python. This makes it easy to find adjacent vertices, but it doesn’t really treat edges with the respect they deserve. We’ll be implementing something where both edges and vertices are objects that know about who they are connected to, this won’t be as fast as an adjacency list, but it will allow us to keep track of a lot of information.

5. Next Time

This was just to give you an idea of graphs are, and some of terminology needed to discuss them. There is loads more stuff to learn about graphs, you can colour edges and vertices, talk about paths and walks on a graph, planar graphs… but that’s all for now. Next time: object orientated programming with Python.