Data Streams
Content
Random subsets
- Hash functions
- Approximate moments
- Alon-Matias-Szegedy sketch
Heavy hitter detection
- Lossy counting
- Space saving
Randomized statistics
- Flajolet counter
- Bloom filter and logarithmic randomized techniques
- CountMin sketch
Realtime analytics
- Fault tolerance and scalability
- Interpolating sketches
Supplementary material
Slides in PDF and Keynote. If you want to extract the equations from the slides you can do so by using LaTeXit, simply by dragging the equation images into it.
- Muthu Muthukrishnan’s tutorial
- Alon Matias Szegedy sketch. The original paper
- Count-Min sketch site
- Bloom Filter survey by Broder & Mitzenmacher
- Metwally, Agrawal, El Abbadi space saver
- Berinde, Indyk, Cormode, Strauss proof of space optimal bounds for space saving is here
- Graham Cormode’s tutorial
- Flajolet-Martin paper
Videos
This is unedited video straight from a Lumix GF2 with a 14mm lens which should explain the sound and the exaggerated hands … But it should help as a supplement with the slides (YouTube typically makes the 1080i version available within 1 week of the upload).