Topological Sort for problems using DAG

Given a DAG (Directed Acyclic Graph), in which we know the adjacencies represent the order in which to perform a task, and the

vertices are tasks, we want to place the vertices in a sequence. We must find a sequence that must satisfy

all dependencies of pre-requisites. This sequential arrangement of the vertices is called the topological sort of the DAG.

The first examples that come in my mind are of pre-requisite chains in courses offered at universities. Suppose, you were

given a challenge to plot a roadmap for everyone in your university about how they can complete their major and/or minor given

the courses they have already completed. Another one to which I was exposed recently was of finding out the complete sequence

of events that took place if only given a partial sequence and also to detect if the given information was accurate or not.

Accuracy of the given information can be detected by checking if at any point in the algorithm we detect a cycle. Since,

such problems are being expressed through DAGs, if there exists any cycle in the graph, it can be concluded that the graph will

not have a topological sort. The proof for this can be found here

Suppose we have the following graphs:

graph1 = { "x" : ["y"],
                "z" : ["y"],
                "y" : [],
                "a" : ["b"],
                "b" : ["c"],
                "c" : [] }

Python

graph2 = {"x" : ["y"], "y": ["x"]}

Here, you can notice how graph1 has a toposort but for graph2, it does not exist. This is because of the fact there

exists a cycle in the graph. We can also understand it using the proof of the statment I had mentioned above. “Topological

sort exists only for a DAG” and since graph2 is not a DAG (since, it is cyclic) it must not have a toposort.

We can find the toposort using a modified dfs algorithm or kahn’s algorithm.

Kahn’s algorithm is discussed in the link and depends and utilizes

on calculating the indegree of all the vertices and using Queue (although it can also be written using an array).

Here is my implementation using Modified DFS and an array as a (kind-of) stack:

def dfs_toposort(graph):
    L = []
    color = { u : "white" for u in graph }
    found_cycle = [False]
    
    for u in graph:
        if color[u] == "white":
            dfs_visit(graph, u, color, L, found_cycle)
        if found_cycle[0]:
            break
    
    if found_cycle[0]:
        L = []
    
    L.reverse()
    return L

def dfs_visit(graph, u, color, L, found_cycle):
    if found_cycle[0]:
        return
    color[u] = "gray"
    
    for v in graph[u]:
        if color[v] == "gray":
            found_cycle[0] = True
            return
        if color[v] == "white":
            dfs_visit(graph, v, color, L, found_cycle)
    
    color[u] = "black"
    L.append(u)

The function dfs_toposort returns an empty array if there exists a cycle in the graph.

Also, it is important to note here that the topological sort need not be unique. (Hence, for competitive programming

problems it might be easier to find problems that involve checking if a given graph is a DAG or not; or if a sequence

satisfying the pre-req chain exists or not by detecting cycles). This is quite evident once you realize that there might

be many 0-in-degree vertices that can lead the toposort result.

You can also see Erik Demaine’s lecture on this topic given for MIT 6.006

Leave a Reply

Scroll to Top