SML: Data Streams

STATISTICS 241B, COMPUTER SCIENCE 281B

Content

  • Random subsets

    • Hash functions

    • Approximate moments

    • Alon-Matias-Szegedy sketch

  • Heavy hitter detection

    • Lossy counting

    • Space saving

  • Randomized statistics

    • Flajolet counter

    • Bloom filter and logarithmic randomized techniques

    • CountMin sketch

  • Realtime analytics

    • Fault tolerance and scalability

    • Interpolating sketches

Supplementary material

Slides in PDF and Keynote. If you want to extract the equations from the slides you can do so by using LaTeXit, simply by dragging the equation images into it.

Videos

This is unedited video straight from a Lumix GF2 with a 14mm lens which should explain the sound and the exaggerated hands … But it should help as a supplement with the slides (YouTube typically makes the 1080i version available within 1 week of the upload).