
Theory Seminar
Robust-Sorting and Applications to Ulam-Median
This event is free and open to the publicAdd to Google Calendar
Sorting is one of the most basic primitives in many algorithms and data analysis tasks. Comparison-based sorting algorithms, like quick-sort and merge-sort, are known to be optimal when the outcome of each comparison is error-free. However, many real-world sorting applications operate in scenarios where the outcome of each comparison can be noisy. In this work, we explore settings where a bounded number of comparisons are potentially corrupted by erroneous agents, resulting in arbitrary, adversarial outcomes.
We model the sorting problem as a query-limited tournament graph where edges involving erroneous nodes may yield arbitrary results. Our primary contribution is a randomized algorithm inspired by quick-sort that, in expectation, produces an ordering close to the true total order while only querying $\tilde{O}(n)$ edges. We achieve a distance from the target order $\pi$ within $(3 +\epsilon)|B|$, where $B$ is the set of erroneous nodes, balancing the competing objectives of minimizing both query complexity and misalignment with $\pi$. Our algorithm needs to carefully balance two aspects — identify a pivot that partitions the vertex set evenly and ensure that this partition is “truthful” and yet query as few “triangles” in the graph $G$ as possible. Since the nodes in $B$ can potentially {\em hide} in an intricate manner, our algorithm requires several technical steps that ensure that progress is made in each recursive step.
Additionally, we demonstrate significant implications for the Ulam-$k$-Median problem. This is a classical clustering problem where the metric is defined on the set of permutations on a set of $d$ elements. Chakraborty, Das, and Krauthgamer gave a $(2-\varepsilon)$ FPT approximation algorithm for this problem, where the running time is super-linear in both $n$ and $d$. We give the first $(2-\varepsilon)$ FPT linear time approximation algorithm for this problem. Our main technical result gives a strengthening of the results in Chakraborty et
al. by showing that a good 1-median solution can be obtained from a constant-size random sample of the input. We use our robust sorting framework to find a good solution from such a random sample. We feel that the notion of robust sorting should have applications in several such settings.