CS4245 Analysis of Algorithms

Bipartite Matching

Istvan Simon

The Marriage Problem and Matchings

Suppose that in a group of n single women and n single men who desire to get married, each participant indicates who among the opposite sex would be acceptable as a potential spouse. This situation could be represented by a bipartite graph in which the vertex classes are the set of n women and the set of n men, and a woman x is joined by an edge to a man y if they like each other. For example we could have the women Ann, Beth, Christina, Dorothy Evelyn, and Fiona, and the men Adam, Bob, Carl, Dan, Erik, and Frank. If Ann liked Adam and Bob, (and vice-versa), Beth liked Adam and Carl, Christina liked Dan, Erik and Frank, Dorothy liked Bob, Evelyn liked Adam and Dan, and Fiona liked Frank, we would have the following bipartite graph. In this situation could we marry everybody to someone they liked? This is one version of the Marriage Problem.
figure 1

Since polygamy and polyandry is not allowed, every woman can be married to at most one man, and every man to at most one woman. Therefore, a possible set of marriages can be represented as a subset M of the edges, no two of which are adjacent. Such a set of edges is called a matching in the Graph. In other words, a matching is a set of vertex-disjoint edges. A matching is called perfect if every vertex is incident to an edge of the matching. Thus the marriage problem can be stated in graph-theoretic terms as asking if a given bipartite graph G has a perfect matching. (We could think of a perfect matching as perfect because it maximizes marital bliss.)

In the example considered above indeed there is a perfect matching: we could marry Ann to Adam, Beth to Carl, Christina to Erik, Dorothy to Bob, Evelyn to Dan, and Fiona to Frank. But in some other situations a perfect matching may be impossible. If that is the case, we still may be interested to nonetheless extend the benefits of marriage to the largest number of people in the group. So we may want to find a maximum matching, that is one with maximum cardinality. This latter generalization of the Marriage Problem is called the Maximum Matching Problem. It has the advantage that it could be applied even if the number of women and men in the group were not the same. In this section we shall develop an efficient algorithm for both the Marriage Problem and the Maximum Matching Problem.
figure 2

Hall's Theorem

We remark first that if there exists a perfect matching then it must be true that for any subset of the women the number of men that they collectively like must be at least as large as the number of women in the subset. For if this is not true then clearly a perfect macthing is impossible. This is called Hall's condition, discovered by Philip Hall in 1931. Hall proved that this obvious necessary condition is also sufficient to insure a perfect matching.

Hall's Theorem: Given a bipartite graph G=(V,E) with vertex classes X and Y, G has a perfect matching if and only if for every subset S of X, |Adj(S)| >= |S|, where Adj(S) denotes the set of vertices adjacent to some vertex of S .

We will present a proof of Hall's Theorem later. For now we just remark that it does not directly give us an efficient algorithm, because to check Hall's condition would require examining all 2n subsets of X. In fact, it seems that because Hall's condition is both necessary and sufficient, it dooms any chance of finding an efficient algorithm. Though this is certainly plausible, it turns out to be false.

In Dijkstra's algorithm we succeeded in finding an efficient algorithm by following a strategy in which successive improvements to a partial solution lead to an optimal solution. Could we apply a similar approach for the marriage problem? Let us suppose that M is a matching, if M is not a maximum matching, how could we improve it by finding a larger one?

For example, we might want to try this in the case we already discussed. So suppose we have the matching Ann married to Bob, Beth to Adam, Christina to Dan, and Fiona to Frank. Dorothy, Evelyn, Carl and Erik are unmatched. At first sight it appears that we are stuck, because Dorothy only likes Bob, and he is already matched, and Evelyn likes only Adam and Dan, both of whom are matched. To make progress we must be willing to rearrange our existing matchings, in order to increase their number. But how?
figure 3

Let us start with a currently unmatched woman, say Dorothy. Now we could reason as follows: to match Dorothy we must marry her to Bob; but Bob is matched to Ann; maybe we could match Ann to someone else; well, we could match Ann to Adam instead, but Adam is already matched to Beth; so if we do that we must match Beth to someone else; we could match Beth to Carl. Carl is currently unmatched so we found a better matching! The new matching then is Dorothy to Bob, Ann to Adam, Beth to Carl, plus Christina to Dan, and Fiona to Frank who weren't affected by our rearrangement.

Amazingly, we found a way to improve on a matching by finding a path from an unmmatched woman to an unmatched man in which every second edge is in the current matching. Such a path is called an alternating path, relative to the matching M, or a-path for short, because it alternates between edges not in the current matching and edges in the current matching. The path we found is Dorothy, Bob, Ann, Adam, Beth, Carl. The edges Dorothy-Bob, Ann-Adam, and Beth-Carl on this path are not in the initial matching; the edges Bob-Ann, Adam-Beth are. (see Figure 4.)
figure 4

Can we always improve on a matching if we find an alternating path? The answer is yes, because an alternating path is a path from a woman to a man in a bipartite graph. Consequently it has odd length. Therefore we have always one more odd-numbered edge on an alternating path than the number of even-numbered edges. Since all odd-numbered edges are not in the matching, while all even-numbered ones are, we increase the number of matchings by one if we replace the even-numbered edges by the odd-numbered ones in the matching. The result is a new matching with one more edge than the one we started with. Thus, we have just proved the following result.

Lemma 1. Suppose that G=(V,E) is a bipartite graph with vertex classes X and Y, and M is a matching in G. If there exists an alternating path from an unmatched vertex x in X to an unmatched vertex y in Y, then there exists a matching M' with cardinality |M| + 1.

Lemma 1 gives us an algorithm to improve a matching to a better one.

Matching Algorithm: 

  G=(V,E) is a bipartite graph, with vertex classes X and Y; 

  M is the current matching; 

  spouse[y] is null, if y in Y is not currently matched; 
  otherwise it is x, where xy is an edge in M.

  color[v] is WHITE, if v is an unmatched vertex,
	 GREEN if it is matched,  in  the current matching;
  

  1. // Initialization

  	 for y in Y  spouse[y] = null;  // M is empty initially
	 for v in V color[v] = WHITE;   // everybody unmatched

  2.  do {

  3. 	   search for an a-path in G,
  
           x0, y0, ..., xi, yi , where
	       color[x0] = color[yi] = WHITE, and 
	       spouse[yj] = xj+1 for j = 0 ..  i-1;

  4.     if (there is no a-path in G)  halt; otherwise

         // Improve matching

  5.     for j = 0 .. i 
  6.		spouse[yj] = xj;
  7.     color[x0] = GREEN; color[yi] = GREEN;
     } 
 
Analysis of the Matching Algorithm
We now show that our algorithm solves the Marriage Problem, and the Maximum Matching Problem efficiently.

How long does the key step of searching for an a-path on line 3 take? We can use a modified Breadth-First-Search for searching for an a-path. We start by labeling all the WHITE vertices in X with 0. This takes O(n) steps. Now we label all unlabeled vertices adjacent to a vertex labeled 0 with 1. If any of them is WHITE, we found an alternating path. Otherwise we label their spouses 2. Next we label all unlabeled vertices adjacent to a vertex labeled 2 with 3. If any vertex so labeled is WHITE, we found an alternating path. Otherwise we label their spouses 4, and we continue in this manner until either we find an alternating-path or all vertices in Y reachable from the WHITE vertices in X get labeled, and none of them is WHITE. In this case, there is no alternating path and we halt. Each vertex in X labeled this way is considered at most once, and so this search takes time O(n+m).

It follows that the Matching Algorithm runs in time O(n2 + nm), since we do at most n searches.

The algorithm works by improving a matching repeatedly until we cannot further improve it. This leads to a maximal matching, that is one that cannot be further improved by this method. It is not immediately clear whether a maximal matching so obtained is also maximum. It turns out that this is the case, so that the Matching algorithm always determines a maximum matching. This claim is justified by the following stronger version of Hall's Theorem, proved in the next section.

Correctness of the Matching Algorithm

Theorem 2. Suppose that G=(V,E) is a bipartite graph with vertex classes X and Y, and that M is a matching in G. Let k >= 0 be the number of unmatched vertices in X. Then the following are equivalent:

  1. M is a maximum matching iff
  2. G has no alternating path relative to the matching M from an unmatched vertex in X to an unmatched vertex in Y iff
  3. There is a subset S of X such that |Adj(S)| = |S| - k.

Proof: We prove that (1) => (2) => (3) => (1).

Indeed, (1) => (2) is the contrapositive of Lemma 1, hence it holds.

To prove that (2) => (3) let S be the set of vertices in X reachable from the unmatched vertices of X, by a path that alternates edges not in M and edges in M. Then S has k unmatched vertices and some number j of matched vertices. So |S| = k+j. Since there is no a-path in G relative to the current matching M, all the vertices adjacent to the vertices of S must be matched. Furthermore, they must be matched to a vertex in S, because S includes all vertices of X reachable from the unmatched vertices of X by a path that alternates edges not in M and edges in M. Hence, their number must be j, that is |Adj(S)| = j = |S| - k.

Finally, (3) => (1) because clearly if |Adj(S)| = |S| - k, then at least k vertices in S must be unmatched by any matching. Since the number of unmatched vertices in X by the current matching M is k, it follows that M must be a maximum matching.

Note that Hall's theorem is an immediate corollary of Theorem 2, since M is a perfect matching iff k = 0.