Mining of Massive Datasets
Below is a draft, evolving version of the MMDS book. We have added Jure Leskovec as a coauthor. There are two new chapters, on mining large graphs and dimensionality reduction. A chapter on large-scale machine learning is expected soon.
There is a revised Chapter 2 that treats map-reduce programming in a manner closer to how it is used in practice, rather than how it was described in the original paper. Chapter 2 also has new material on algorithm design techniques for map-reduce.
Download the Latest Book (453 pages, approximately 2.7MB)
Download chapters of the book:
Preface and Table of Contents
Chapter 1 Data Mining
Chapter 2 Map-Reduce and the New Software Stack
Chapter 3 Finding Similar Items
Chapter 4 Mining Data Streams
Chapter 5 Link Analysis
Chapter 6 Frequent Itemsets
Chapter 7 Clustering
Chapter 8 Advertising on the Web
Chapter 9 Recommendation Systems
Chapter 10 Mining Social-Network Graphs
Chapter 11 Dimensionality Reduction
Index
The following materials are equivalent to the published book, with errata corrected through July 4, 2012. This version has been frozen while we revise the book. The evolving book can be downloaded as Version 1.3 above.
Download the Book as Published (340 pages, approximately 2MB)
Download chapters of the book:
Preface and Table of Contents
Chapter 1 Data Mining
Chapter 2 Large-Scale File Systems and Map-Reduce
Chapter 3 Finding Similar Items
Chapter 4 Mining Data Streams
Chapter 5 Link Analysis
Chapter 6 Frequent Itemsets
Chapter 7 Clustering
Chapter 8 Advertising on the Web
Chapter 9 Recommendation Systems
Index
Students who want to use the Gradiance system for self-study can register at www.gradiance.com/services. Then, use the class token 1EDD8A1D to join the "omnibus class" for the MMDS book. See The Student Guide for more information.