Tarjan's Algorithm

Tarjan's Algorithm is popular algorithm for finding the Strongly Connected Components (SCC) of a directed graph. In the below graph, the nodes in a blue envelope constitute a single connected component as any node u u u as a path to another node v v v and vice versa. Note that while we can reach the node f f f from the node e e e , the opposite is not true. Hence, f f f and e e e are not part of the same component. Thus, the following graph has 3 strongly connected components as highlighted. A single strongly connected component can be formally described as maximal set of vertices such that for any 2 vertices belonging to the set, say u u u and v v v , there exists a path from u u u to v v v and vice versa.

image

\\ Source: Strongly connected component from Wikipedia

In this tutorial we will discuss the Tarjan's Algorithm to find SCC.

Implementation​

The algorithm works by using a single DFS in the graph. To understand its working first let's define a few standard terms :

i n − t i m e ( u ) = in-time(u) = in − t im e ( u ) = time at which node u u u was reached for the first time or the in time of DFS for the node.

l o w − l i n k ( u ) = low-link(u) = l o w − l ink ( u ) = smallest time reachable of a node reachable from the DFS subtree of node u

We also need to ensure that we don't mix 2 different SCC's due to a cross-edge between them. To counter this issue, we will use a stack to store the nodes which have not been assigned to an SCC and have been visited so far.

Thus, whenever we find an SCC, we will pop the corresponding nodes from the stack.

We will call the first node we discovered of an SCC the route node for sake of simplicity. Thus, we will identify the SCC by its root node. To check when we have reached a root node, we just check if l o w ( u ) low(u) l o w ( u ) and i n t i m e ( u ) in_time(u) i n t ​ im e ( u ) are equal.

The pseudo-code for the algorithm can be found here.

The Time and Space Complexity of the algorithm is O ( V + E ) O(V + E) O ( V + E ) , where V V V represents the number of vertices and E E E represents the number of edges, same as that os DFS.

The implementation of above can be as follows (along with above graph as example) :

#include  using namespace std;  struct find_SCC   int timer = 0; // store the in-time for DFS search  vectorint> in_time; // stores the low-link value for every node  vectorint> low_link; // checks whether a node is on the stack or not  vectorbool> on_stack; // stack to store the currently available nodes  stackint> stk; // to store the final answer  vectorvectorint>> res;  // recursive function to do the DFS search on the graph void dfs(int u, vectorvectorint>> &graph)   // set the values for in_time and low_link, and put the node on stack  in_time[u] = low_link[u] = timer;  timer++;  stk.push(u);  on_stack[u] = true;  // DFS to neighbours of current node for (int v : graph[u])   // if the node is unvisited if (in_time[v] == -1)   dfs(v, graph); // update the low-link value for node u  low_link[u] = min(low_link[u], low_link[v]); // else if the node was visited before, and is still on the stack > else if (on_stack[v])   // update the low-link for node u  low_link[u] = min(low_link[u], in_time[v]); > >  // check if u is the root node for a SCC if (low_link[u] == in_time[u])   vectorint> SCC; // all the nodes above u in the stack are in SCC of u while (stk.top() != u)   int v = stk.top();  stk.pop();  SCC.push_back(v);  on_stack[v] = false; > // now removing u from stack and adding it to the SCC  stk.pop();  SCC.push_back(u);  on_stack[u] = false;  // adding the SCC to the answer  res.push_back(SCC); >  return; >  // takes input of graph as adjacency list and returns the SCC of graph as // vectors  vectorvectorint>> tarjans(vectorvectorint>> &adjacencyList)   int n = adjacencyList.size();  in_time.resize(n, -1);  low_link.resize(n, -1);  on_stack.resize(n, false);  for (int u = 0; u  n; u++)   if (in_time[u] == -1) dfs(u, adjacencyList); >  return res; > >;  int main()   // constructing adjacency list for example graph shown, mapping node a to 0, b // to 1 and so on.  vectorvectorint>> graph(8);  graph[0].push_back(1);  graph[1].push_back(2);  graph[1].push_back(4);  graph[1].push_back(5);  graph[2].push_back(3);  graph[2].push_back(6);  graph[3].push_back(2);  graph[3].push_back(7);  graph[4].push_back(0);  graph[4].push_back(5);  graph[5].push_back(6);  graph[6].push_back(5);  graph[7].push_back(3);  graph[7].push_back(6);  // using the tarjan's algo  find_SCC t = find_SCC();  vectorvectorint>> res = t.tarjans(graph);  // output the final result  cout  <"The Strongly Connected Components for the graph are:" < endl; for (vectorint> i : res)   for (int j : i)   cout  <(char)('a' + j)  <" "; >  cout < endl; >  /*  output of the above code is:  The Strongly Connected Components for the graph are:  f g  h d c  e b a  */  return 0; > 

We can further cluster all the nodes belonging to an SCC into one node, and represent all the edges from and to a constituent node of the SCC to this new node. The resulting graph is called the Condensation Graph

Using Tarjan's algorithm to find bridges in a Undirected Graph​

Here, we are searching in an undirected graph, and also we don't need to remove the cross-edges. Thus, the stack is no longer required. An edge between node u u u to v v v will be considered a bridge if there is no other way to reach v v v from u u u if the edge is removed. We can check for this by using the properties of the low-link time. If the low-link time of v v v is less than that of u u u , we can conclude that we can reach v v v from some other path, otherwise the edge between them is a bridge. Here is an example for same.

Example #1: 1192 - Critical Connections in a Network​

This problem directly asks us to find bridges in the given graph. The input is provided in form of pairs of nodes with an edge between them. Thus, we will first convert this to an adjacency list and apply the same algorithm discussed above.

class Solution   public: // initializing the variables int timer = 0;  vectorint> in_time, low_link;  vectorvectorint>> graph;  vectorvectorint>> res;  // recursive function to perform DFS void dfs(int u, int p = -1)   in_time[u] = low_link[u] = timer;  timer++;  for (int v : graph[u])   // we ignore the parent node if (v == p) continue; // if we discover a new node if (in_time[v] == -1)   dfs(v, u);  low_link[u] = min(low_link[u], low_link[v]); // check if the edge is a bridge if (low_link[v] > in_time[u])   res.push_back(u, v>); > > else   // if we visit a node which already has been visited  low_link[u] = min(low_link[u], in_time[v]); > >  return; >  vectorvectorint>> criticalConnections(int n, vectorvectorint>>& connections)   in_time.resize(n, -1);  low_link.resize(n, -1);  graph.resize(n);  // make an adjacency list from the input for (auto connection : connections)   graph[connection[0]].push_back(connection[1]);  graph[connection[1]].push_back(connection[0]); >  // as the entire graph is connected, just call dfs on 0 dfs(0);  return res; > >; 

Using Tarjan's algorithm to find articulation points in a Undirected Graph​

Similar to bridges, articulation points are vertices which if removed will disconnect the graph. Once again, we can modify the Tarjan's algorithm to find such vertices in a given undirected graph.

Example #2: 1568 - Minimum Number of Days to Disconnect Island​

Notice that the answer cannot be more than 2 2 2 . Any rectangular shape will have atleast 4 corner cells, and if we remove the vertical neighbour and horizontal neighbour cells of anyone one of these, we have disconnected the graph. Thus, we need to check when the answer can 0 0 0 or 1 1 1 , and we are done.

The answer will be 0 0 0 if the islands are initially disconnected. This can be checked using basic DFS. The answer will be 1 1 1 when the graph has atleast 1 articulation point. The condition for articulation points is similar to that for finding bridges, except we also check for presence of parents.

class Solution   public: // initializing the variables int n, m, cnt_islands = 0; int timer = 0;  vectorvectorint>> in_time, low_link;  // recursive function to perform DFS, return true if an articulation point is // detected bool dfs(pairint, int> u, vectorvectorint>>& grid,  pairint, int> p = -1, -1>)   int i = u.first, j = u.second;  in_time[i][j] = low_link[i][j] = timer++;  cnt_islands++; // variable to check for articulation point in DFS subtree of the node bool has_articulation_point = false; // variable to count number of children visited first by the node int cnt_children = 0;  // find all neighbours from the grid  vectorpairint, int>> neighbours; if ((i > 0) && (grid[i - 1][j])) neighbours.push_back(i - 1, j>); if ((i  (n - 1)) && (grid[i + 1][j])) neighbours.push_back(i + 1, j>); if ((j > 0) && (grid[i][j - 1])) neighbours.push_back(i, j - 1>); if ((j  (m - 1)) && (grid[i][j + 1])) neighbours.push_back(i, j + 1>);  for (auto v : neighbours)   // if the neighbour is a parent, ignore it if (v == p) continue; // if the neighbour was already visited else if (in_time[v.first][v.second] != -1)  low_link[i][j] = min(low_link[i][j], in_time[v.first][v.second]); // if the neighbour is a new node else   // if the subtree has an articulation point if (dfs(v, grid, u)) has_articulation_point = true; // update the low-link value  low_link[i][j] = min(low_link[i][j], low_link[v.first][v.second]); // if the point itself is an articulation point if ((low_link[v.first][v.second] >= in_time[i][j]) && (p != (pairint, int>)-1, -1>))  has_articulation_point = true;  cnt_children++; > >  // if it has an articulation point, return true if (((p == (pairint, int>)-1, -1>) && (cnt_children > 1)) ||  has_articulation_point) return true; // else return false return false; >  int minDays(vectorvectorint>>& grid)   n = grid.size();  m = grid[0].size(); int connected_comp = 0; bool has_articulation_point = false;  in_time.resize(n, vectorint>(m, -1));  low_link.resize(n, vectorint>(m, -1));  for (int i = 0; i  n; i++)   for (int j = 0; j  m; j++)   // ignore the cells of grid with water, or who have been visited if ((grid[i][j] == 0) || (in_time[i][j] != -1)) continue; // we already have found a connected component, and this will be a new // one, thus we directly return 0 if (connected_comp > 0) return 0; if (dfs(i, j>, grid)) has_articulation_point = true;  connected_comp++; > >  // if there are no cells with land if (cnt_islands == 0) return 0; // if there is only one cell with land else if (cnt_islands == 1) return 1; if (has_articulation_point) return 1; return 2; > >; 

Suggested Problems

Problem NameDifficultySolution Link
1489 - Find Critical and Pseudo-Critical Edges in Minimum Spanning TreeHard N/A