HOGWILD!

There are two Hogwild! releases. The newest is an alpha release that has been rewritten with a focus on performance and maintainability. The Hogwild!-Experiments release is the original Hogwild! used to perform the experiments in the Hogwild! paper; it behaves slightly differently from the current alpha release.

Alpha Source code and Datasets

Because the example datasets are relatively large, the release is available both with and without them.

  • (Alpha) Source Code Only (4.6MB)
  • (Alpha) Code and Data (894MB)

(Version: 03a; last updated: 11/7/2012)

Tested with the following compilers:

  • g++ (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3)
  • clang version 3.1 (tags/RELEASE_31/final)

It can be built on either Linux or OS X (tested on OS X 10.7).

Paper Version: Code and Experiments

This is the original version of Hogwild! used to conduct the experiments for the paper. The Hogwild! source code is small, but the data used in our experiments is sizable. For your convenience, we provide two packages: one with source code only, and another with both the source code and several of the prepared datasets we used:

  • Source Code Only (17KB)
  • Code and Data (910MB)

(last updated: 11/21/2011)

Below is a list of the datasets we experimented with and their links. ([+] indicates that a dataset is included in the "code and data" package. We leave out datasets that are too large.)

After downloading and unpacking, follow the instructions in the README. As a quick guide, these steps run the experiments and generate the graphs:

  1. Run "make", which will build binaries in the directory "bin".
  2. If you don't have a "data" directory, download the code-and-data package of HOGWILD!, which includes the prepared datasets.
  3. Open "experiments/experiments_settings.py" to change data paths and other parameters as necessary.
  4. Run "python experiments/build_experiments.py" to generate command lines based on your settings in "experiments/experiments_settings.py".
  5. Run these command lines, which log to the directory "output".
  6. Run "python experiments/produce_graphs.py" to generate experiment graphs based on the logs in "output". The graphs are stored as PDF files in the directory "graphs". (Note that matplotlib is required.)
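The steps above can be sketched as a small checklist script. This is only an illustration and is not part of the HOGWILD! package: the function names `plan` and `run` are hypothetical, and because the form in which "experiments/build_experiments.py" emits its command lines is not described here, step 5 is left as a manual step rather than guessed at. By default the script only prints what it would do.

```python
import subprocess
from pathlib import Path

# Fixed commands taken from steps 1, 4, and 6 above.
BUILD = ["make"]
GENERATE = ["python", "experiments/build_experiments.py"]
GRAPHS = ["python", "experiments/produce_graphs.py"]


def plan(root="."):
    """Return the ordered steps as (description, command) pairs.

    A command of None marks a manual step. Steps 2 and 5 are manual:
    fetching data is a download, and the generated command lines depend
    on your settings (their format is not assumed here).
    """
    if (Path(root) / "data").is_dir():
        data_note = "data/ found"
    else:
        data_note = "data/ is missing: download the code-and-data package"
    return [
        ("build binaries into bin/", BUILD),
        (data_note, None),
        ("generate command lines from experiments_settings.py", GENERATE),
        ("run the generated command lines (they log to output/)", None),
        ("produce PDF graphs in graphs/ (requires matplotlib)", GRAPHS),
    ]


def run(root=".", dry_run=True, start=0):
    """Print each step; when dry_run is False, also execute its command.

    In execute mode the function stops at the first manual step and
    returns the index to resume from, so graphs are never produced
    before the experiments have actually been run.
    """
    steps = plan(root)
    for index in range(start, len(steps)):
        description, command = steps[index]
        print(f"# step {index + 1}: {description}")
        if command is None:
            if not dry_run:
                return index + 1  # hand control back for the manual step
            continue
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, cwd=root, check=True)
    return len(steps)


if __name__ == "__main__":
    run(dry_run=True)  # print the checklist without executing anything
```

To actually execute the steps, call `run(dry_run=False)` from the unpacked source directory, complete the manual step it stops at, and pass the returned index back as `start` to continue.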

Publications

Feng Niu, Benjamin Recht, Christopher Ré, and Stephen J. Wright. HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent. Published at NIPS 2011.