Mining of Massive Datasets
The book has a new Web site: www.mmds.org.
This page will no longer be maintained. Your browser should be automatically redirected
to the new site in 10 seconds.
The book has now been published by Cambridge University Press.
The publisher is offering a 20% discount to anyone who buys the hardcopy
here.
By agreement with the publisher, you can still download it free from this page.
Cambridge Press does, however, retain copyright on the work, and we expect that you will obtain their
permission and acknowledge
our authorship if you republish parts or all of it. We are sorry to have to mention this
point, but we have evidence that other items we have published on the Web have been
appropriated and republished under other names. It is easy to detect such misuse,
by the way, as you will learn in Chapter 3.
--- Jure Leskovec, Anand Rajaraman (@anand_raj), and Jeff Ullman
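The remark above about detecting republished text points to the similar-items techniques of Chapter 3. As a rough illustration only, the sketch below compares documents by the Jaccard similarity of their character shingle sets. The function names, the shingle length `k=5`, and the sample strings are assumptions chosen for this example, not code from the book.

```python
def shingles(text, k=5):
    """Return the set of k-character shingles of text.

    Case and runs of whitespace are normalized first, so trivial
    reformatting does not hide a copy. k=5 is an assumed default.
    """
    normalized = " ".join(text.lower().split())
    if len(normalized) < k:
        # Text shorter than k yields at most one (short) shingle.
        return {normalized} if normalized else set()
    return {normalized[i:i + k] for i in range(len(normalized) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets."""
    if not a and not b:
        return 1.0  # convention: two empty sets are identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical sample texts, for illustration only.
    original = "It is easy to detect such misuse"
    near_copy = "It is easy to detect this misuse"
    unrelated = "Clustering groups points by distance"

    s_orig = shingles(original)
    print("near copy:", jaccard(s_orig, shingles(near_copy)))
    print("unrelated:", jaccard(s_orig, shingles(unrelated)))
```

A near copy scores much higher than an unrelated text. Comparing every pair of documents this way does not scale to large collections, which is the problem that the chapter's minhashing and locality-sensitive hashing techniques address.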
Contents
- The Original Book.
- The Latest Version of the Book.
- Support Materials, including Gradiance automated homeworks for the book, slides,
and the errata sheet.
Download Version 2.1
The following is the second edition of the book, which we expect to be published soon. We have added Jure Leskovec
as a coauthor. There are three new chapters, on mining large graphs, dimensionality reduction, and machine learning.
There is a revised Chapter 2 that treats map-reduce programming
in a manner closer to how it is used in practice, rather than how it was described in the
original paper. Chapter 2 also has new material on algorithm design techniques for map-reduce.
Version 2.1 adds Section 10.5 on finding overlapping communities in social graphs.
Download the Latest Book (511 pages, approximately 3MB)
Download chapters of the book:
Preface and Table of Contents
Chapter 1 Data Mining
Chapter 2 Map-Reduce and the New Software Stack
Chapter 3 Finding Similar Items
Chapter 4 Mining Data Streams
Chapter 5 Link Analysis
Chapter 6 Frequent Itemsets
Chapter 7 Clustering
Chapter 8 Advertising on the Web
Chapter 9 Recommendation Systems
Chapter 10 Mining Social-Network Graphs
Chapter 11 Dimensionality Reduction
Chapter 12 Large-Scale Machine Learning
Index
Download Version 1.0
The following materials are equivalent to the published book, with errata corrected to
July 4, 2012. This version has been frozen while we revise the book. The evolving book can
be downloaded as Version 2.1 above.
Download the Book as Published (340 pages, approximately 2MB)
Download chapters of the book:
Preface and Table of Contents
Chapter 1 Data Mining
Chapter 2 Large-Scale File Systems and Map-Reduce
Chapter 3 Finding Similar Items
Chapter 4 Mining Data Streams
Chapter 5 Link Analysis
Chapter 6 Frequent Itemsets
Chapter 7 Clustering
Chapter 8 Advertising on the Web
Chapter 9 Recommendation Systems
Index
Gradiance Support
If you are an instructor interested in using the Gradiance Automated
Homework System with this book, start
by creating an account for yourself at
www.gradiance.com/services.
Then, email your chosen login and the request to become an instructor for the MMDS book
to support@gradiance.com.
You will then be able to create a class using these materials.
Manuals explaining the use of the system are at
www.gradiance.com/info.html.
Students who want to use the Gradiance system for self-study can register at
www.gradiance.com/services.
Then, use the class token 1EDD8A1D to join the "omnibus class" for the MMDS book.
See The Student Guide for more information.
Other Stuff
- Jure's Materials from the most recent CS246.
- Slides and Course Material from old CS345A. As with the
book, you are welcome to use these as you like, but please preserve our authorship.
- The Errata Sheet for the hardcopy version.
We shall endeavor to keep the downloads
up to date. Note that the pagination is different on the version we maintain, but you
can check whether your download is up to date from the hardcopy errata.
Please report errata to ullman a t gmail.com.