#### Faculty Candidate Seminar

# What we can solve with a sublinear number of samples

Add to Google Calendar

Dr. Raskhodnikova is a Post-doctoral Fellow at the Weizmann Institute of Science in Rehobot, Israel; she received her PhD from MIT in 2003

Suppose we are given a list of numbers and we wish to determine

whether it is sorted. That problem obviously requires reading the

entire list. But Ergun, Kannan, Kumar, Rubinfeld and Viswanathan

showed in 1998 that if we know in advance that our list is either

sorted or far from sorted, we can perform the test by examining only a

small portion of the list. Testing if a list is sorted is thus an

example of a task that can be performed with a small number of samples

into the input.

As data of all types gets easier to obtain and cheaper to store,

datasets are becoming increasingly large, and we are faced with a

need to perform computational tasks on massive datasets. Can we

still compute something useful about a dataset when reading all of

it is prohibitively expensive? In other words, what can we solve

with a sublinear number of samples from the input? This is the main

question addressed by the newly emerging area, called Sublinear

Algorithms.

In this talk, we will give a few examples of specific problems that

can be solved with a sublinear number of samples, starting with the

sorting example and moving on to simple properties of images, edit

distance between strings, and approximating compressibility of a

string. We will also explain a framework called "property testing"

that was defined by Rubinfeld and Sudan in 1996 to formalize the kind

of approximation that sublinear algorithms are good at. We will

present a partial classification of what can and cannot be solved in

that model when an algorithm can query the input only in a sublinear

number of places.