Anish Athalye

Algorithms in the Real World: Committee Assignment

I recently had another chance to use a fancy algorithm to solve a real-world problem. These opportunities don’t come up all that often, but when they do, it’s pretty exciting!

Every year, MIT HKN has a bunch of eligible students who need to be matched to committees. As part of the assignment process, the officers decide how many spots are available on each committee, and then we have every eligible rank the committees. In the past, officers matched people manually, looking at the data and trying to give each person one of their 1st or 2nd choices. Unfortunately, this is time-consuming and unlikely to result in an optimal assignment if we’re trying to maximize overall happiness.

Example

To see how assignments can be suboptimal, we can go through an example.

CommitteeCapacity
Tutoring2
Outreach1
Social2
PersonTutoringOutreachSocial
Alice132
Bob132
Charlie123
Dave213
Eve213

In the above table, 1 means first choice, and 3 means third choice. We could imagine assigning people by going down the list and assigning each person to their highest-ranked committee that has available slots. This would result in assigning Alice to Tutoring (1st choice), Bob to Tutoring (1st choice), Charlie to Outreach (2nd choice), Dave to Social (3rd choice), and Eve to Social (3rd choice).

Not only is this algorithm unfair to the people at the bottom of the list, but it’s also suboptimal. If we want to minimize the sum of the rankings for committees we placed people on, we could go with Alice–Social, Bob–Social, Charlie–Tutoring, Dave–Outreach, Eve–Tutoring. This results in a “cost” of 8, which is optimal, rather than the cost of 10 we got with the first assignment, which was constructed greedily.

In the actual data set, there were 8 committees and 57 eligible members, so it wouldn’t have been feasible to manually find an optimal assignment.

Problem Statement

More formally: we have kk committees and nn people. Each committee jj has a capacity of cjc_j people, where we’re guaranteed that j=1kcj=n\sum_{j = 1}^{k} c_j = n. We know people’s preferences, which for any given person ii is a permutation of the committees, σi\sigma_i, mapping the highest ranked committee to 11 and the lowest-ranked committee to kk. Our goal is to find an assignment AA of people to committees that solves the following optimization problem:

arg minAi=1nσi(A(i))subject toj{1,2,,k},i=1nδA(i),j=cj\begin{aligned} & \argmin_{A} & & \sum_{i = 1}^{n} \sigma_i(A(i)) \\ & \text{subject to} & & \forall j \in \{1, 2, \ldots, k\}, \sum_{i = 1}^{n} \delta_{A(i), j} = c_j \end{aligned}

Above, δ\delta is the Kronecker delta. Essentially, we want to find the assignment that minimizes cost while satisfying our constraints of having a specific number of people on each committee.

Algorithm

It turns out that the committee assignment problem can be transformed into an instance of the assignment problem (which can also be thought of as finding a minimum weight matching in a bipartite graph).

In the assignment problem, you have nn people, nn jobs, and a cost matrix CC where CijC_{ij} is the cost of having person ii do job jj, and you want to find the minimum cost assignment such that each person is assigned to a unique job.

The committee assignment problem can be trivially transformed into an instance of the assignment problem, simply by making cjc_j copies of each committee jj, counting each as a separate “job”, and constructing an appropriate cost matrix from the σi\sigma_is.

Luckily, the assignment problem is well-studied in computer science, and there’s a known solution — the Hungarian algorithm solves this problem. There’s even an implementation of the algorithm built into SciPy. This makes solving the committee assignment problem really easy — it only requires a little bit of code to implement the transformation described above.

Results

Using this algorithmic approach to solve the committee assignment problem worked really well for us! We made some slight modifications to the process — one of the committees hand-picked their members, and then we used the algorithm on the remaining 52 members and 7 committees. When running the program, we decided to minimize the sum of the squares of the costs rather than minimizing just the sum.

With 52 people and 7 committees, our implementation ran in less than half a second and gave us an assignment with 37 people getting their first choice, 13 getting their second choice, and 2 getting their third choice.