MMDS: Beta Version of Third Edition
      You are welcome to download components of what will become the third edition of Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, and Jeff Ullman. Please be advised that the new materials have not been reviewed, even by all of the authors, and we would appreciate your telling us of any errors you find (email to ullman at gmail dot com). Here is a table of the new materials and the major changes so far.
      Chapter      | Link    | Major Changes
      1            | Ch. 1   | A revised discussion of the relationship between data mining, machine learning, and statistics in Section 1.1
      2            | Ch. 2   | Spark and TensorFlow added to Section 2.4 on workflow systems
      3            | Ch. 3   | More efficient method for minhashing in Section 3.3
      10           | Ch. 10  | More detail on finding overlapping communities in Section 10.5, approximate simrank and application to community-finding in Section 10.6, and more efficient methods for parallel transitive closure in Section 10.8
      12           | Ch. 12  | New section on decision trees
      13           | Ch. 13  | New chapter on deep learning
      Entire Book  | Book    |

      Gradiance Support

      If you are an instructor interested in using the Gradiance Automated Homework System with this book, start by creating an account for yourself at Then, email your chosen login and the request to become an instructor for the MMDS book to You will then be able to create a class using these materials. Manuals explaining the use of the system are at

      Students who want to use the Gradiance system for self-study can register at Then, use the class token 1EDD8A1D to join the "omnibus class" for the MMDS book. See The Student Guide for more information.