Priority Sampling


June 30, 2011

Tamas Sarlos pointed out a much smarter strategy on how to obtain a sparse representation of a (possibly dense) vector: Priority Sampling by Duffield, Lund and Thorup, 2006. The idea is quite ingenious and (surprisingly so) essentially optimal, as Mario Szegedy showed. Here’s the algorithm (please read the previous blog on vector sparsification for some motivation):

This provides an estimator with the following properties:

Note that we assumed that all \(x_i \geq 0\). If not, simply apply the same algorithm to \(|x_i|\) and return signed versions of the estimate.