Systems

Introduction to Machine Learning - 10-701/15-781

Slides in Keynote and PDF.

Content

  • Hardware

    • Processor, RAM, buses, GPU, disk, SSD, network, switches, racks, data centers

    • Bandwidth, latency and faults

  • Basic parallelization paradigms

    • Trees, stars, rings, queues

    • Hashing (consistent, proportional)

    • Distributed hash tables and P2P

  • Storage

    • RAID

    • Google File System (GFS) / Hadoop Distributed File System (HDFS)

    • Distributed (key, value) storage

  • Processing

    • MapReduce

    • Dryad

    • S4 / stream processing

Supplementary material

Videos