Loading Events

Dissertation Defense

Finding and Tolerating Concurrency Bugs

Jie Yu
SHARE:

Shared-memory multi-threaded programming is inherently more difficult than single-threaded programming. The main source of complexity is that, the threads of an application can interleave in so many different ways. To ensure correctness, a programmer has to test all possible thread interleavings, which, however, is impractical. Many rare thread interleavings remain untested in production systems, and they are the root cause for a majority of the concurrency bugs.

Given that untested interleavings are the root cause of a majority of the concurrency bugs, we explore two possible ways to tackle concurrency bugs in this dissertation. We can either expose untested interleavings during testing to find concurrency bugs, or avoid untested interleavings during production runs to tolerate concurrency bugs. The key is an efficient and effective way to encode and remember tested interleavings.

We first discuss two hypotheses about concurrency bugs: the small scope hypothesis and the value independent hypothesis. Based on these two hypotheses, we define a set of interleaving patterns, called interleaving idioms, which we use to encode tested interleavings. We show that the idiom based interleaving encoding scheme is able to represent most of the concurrency bugs we have analyzed.

We then discuss a testing tool called Maple. It memoizes tested interleavings and actively seeks to expose untested interleavings. We show that Maple is able to expose concurrency bugs and expose interleavings faster than other conventional testing techniques.

Finally, we discuss two parallel runtime system designs which seek to avoid untested interleavings during production runs to tolerate concurrency bugs. Avoiding untested interleavings significantly improve correctness because most of the concurrency bugs are caused by untested interleavings. Also, the performance overhead for disallowing untested interleavings is modest as commonly occuring interleavings should have been tested in a well-tested program.

Sponsored by

S. Narayanasamy