Systems
Content
Hardware
- Processor, RAM, buses, GPU, disk, SSD, network, switches, racks, server centers
- Bandwidth, latency and faults
Basic parallelization paradigms
- Trees, stars, rings, queues
- Hashing (consistent, proportional)
- Distributed hash tables and P2P
Storage
- RAID
- Google File System / HadoopFS
- Distributed (key, value) storage
Processing
- MapReduce
- Dryad
- S4 / stream processing
Supplementary material
- Consistent hashing (Karger et al.) paper
- Stateless Proportional Caching (Chawla et al.) paper, slides
- Pastry P2P routing (Rowstron and Druschel) paper, site
- MapReduce (Dean and Ghemawat) site
- Google File System (Ghemawat, Gobioff, Leung) site
- Amazon Dynamo (DeCandia et al.) slides, paper, Werner Vogels’ DynamoDB post
- BigTable (Chang et al.) site
- Ceph file system (proportional hashing, file system) homepage, paper on distribution protocol
- CPUs: Sandy Bridge, ARM microarchitecture
- NVIDIA CUDA site
- ATI Stream Computing site
- Microsoft Dryad (Isard et al.) site
- Yahoo S4 (Neumeyer et al.) site, slides, paper
- Memcached site
- LinkedIn Voldemort (key, value) storage design description
- PNUTS distributed storage (Cooper et al.) paper
- SSDs (solid state drives) benchmarks
- All Things Distributed, Werner Vogels’ blog
Videos