Home » maths » graph theory » Finding Triangles in a Graph

Finding Triangles in a Graph

This begins a series of posts concerning a book on computational topology

by Edelsbruner and Harer.

It is divided into 3 parts: computational geometry, classical topology and persistent homology.

We will start at the beginning in part 1, chapter 1 on graphs. Much of computational geometry and topology seems to involve structures composed of triangles, triangulations.  And so most algorithms expect you to already have the triangles ‘in hand’, as it were. So today we will write a function to return all the triangles in a simple graph.

Complexity of triangle finding

Unfortunately this is no quick way to do this, for a graph $G$ with $n$ vertices and $m$ edges, there may be ${n \choose 3}$ triangles, and so to find them all, we can do no better than $O(n^3)$ time.

We can also phrase the complexity in terms of $m$, which may be more appealing when the graph is very sparse.

How many triangles are there in terms of $m$? One way to think is, from the bag of edges, I pick three, what is the probability $p$ that they form a triangle? The worst case scenario is when the graph is complete so we’ll assume that for our estimates.  We’ll think of drawing them in order, 1,2,3. Edge 1 we place no restrictions on, we look and it has ends $(a,b)$. Next edge 2 must have one of $a$ or $b$ in its ends, of which there are approximately $2n$ (approximately since edge 1 is being counted twice).

Now for the third edge there is only one choice. So we have approximately probability $p= \frac{2n}{m} . \frac{1}{m}$. We are allowed to be scruffy in our estimates because we will be thinking asymptotically, in which case the difference between $m$ and $m-1$ is of no consequence.

So then the probability of drawing a triangle is $\frac{N}{{m \choose 3}}$ where $N$ is the number of triangles, and ${m \choose 3}$ is the number of 3 element subsets of $m$.

Hence $N = p. {m \choose 3} \approx pm^3 \approx \frac{n}{m^2} m^3 = \frac{2\sqrt{m}}{m^2}m^3 =2m^{\frac{3}{2}}$, so the  number of triangles is $O(m^{\frac{3}{2}})$.

Now we can either iterate through the nodes or through the edges, we will try the edges since this will be better for a sparse graph, which often turn up in real life. The algorithm we are going to use is called ‘forward’, and I found it here and also here.

The Forward Algorithm

First I will present the pseudocode.

Input: list of vertices V=[1,...,n] in decreasing degree order.
Data: A(v) - admissible neighbors of v.
for i in V:
for j in Adj(i):
if i < j:
for k in A(i) intersect A(j):
output triangle (i,j,k)
add i to A(j)



So how does it work? It moves from the highest degreed vertices to the lowest, looks for its neighbors with lower degrees, if it shares any admissable neighbors with one, it returns a triangle, else its lower degree neighbor adds the higher degree one to its admissible neighbors.

What this means is that A(v) is a subset of Adj(v), and when v gets called in the outermost loop, A(v) consists of all the neighbors of v that have higher degrees.

Forward the Complexity

We’ll skip over sorting the list by degrees, which I’m sure you know can be done in $O(n \log n)$.

First the two outer layers of the loop hit every edge once, so that is $O(m)$. Now the inner part is the intersection of A(v) and A(w). We’ll implement these as sets, which Python implements in time linear in the combined length of the sets, excepting degenerate cases where the set members are non-hashable.

So how long are the lists? In the worst case a vertex is connected to every other vertex, and they all have equal degrees. In that case you have $n$ neighbors in A(v), and $n \approx \sqrt{m}$.
Combining these we get $O(m^{\frac{3}{2}})$ which is as good as the lower complexity bound we found, of course the order notation does hide some details. Now let’s do it in Python.

Python Implementation

First if you don’t have it already you’ll need my graph theory code, tigraphs.

import tigraphs as tig
def iterate_triangles(graph):
G=graph
V=G.vertices
V.sort(key=lambda x: G.get_degree(x), reverse=True)
#Sort the vertices based on degree size, in reverse order.
n=len(V)
A={}
for i in range(n):
V[i].index=i   #so we can easily find a vertex's index in the list
A[V[i]]=set()
for first_vertex in V:
for second_vertex in G.get_vertex_neighbors(first_vertex):
if first_vertex.index < second_vertex.index:
for third_vertex in A[first_vertex].intersection(A[second_vertex]):
yield [first_vertex.index, second_vertex.index, third_vertex.index]


Notice that I wrote it as an iterator function, so that we don’t have to create a big list of the triangles unless we want to, instead we can iterate over each in turn and save space.

Just for fun I decided to investigate the run time empirically, and it is fast.

import time
times ={}
for i in range(3,20):
print i
G=tig.Complete(i)
start= time.clock()
for j in range(10000):
count=0
for triangle in iterate_triangles(G):
count+=1
end= time.clock()
timer= (end-start)
times[i]=timer
print times

import matplotlib.pyplot as plt
no_vertices=range(3,20)
def foo(x):
x=float(x)
return x*(x-1)/2
no_edges = map(foo, no_vertices)
run_time=[times[i] for i in no_vertices]
plt.plot(no_edges, run_time)
plt.xlabel('Number of edges in Complete Graph')
plt.ylabel('Runtime in secs for 10000 triangle iterations')
plt.show()


Producing this plot.

As you can see it appears to be linear in $m$, better than the estimate of $m^{1.5}$. This is the scruffiness of our estimates coming back to haunt us for low values, but asymptotically we would see a curve developing.

The code for this is available here.