Jellyfish
Ben Recht and I have released some software for large-scale matrix completion. If your algorithm is faster on billion-entry matrices, send it to us so we can learn how to go faster. The key idea is to adjust how stochastic gradient descent samples the data so as to increase the opportunities for parallelism. The algorithm is essentially buzzword-complete: a large-scale parallel stochastic gradient algorithm for nonconvex relaxations.
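To give a flavor of how the sampling scheme creates parallelism, here is a minimal sketch (not the released Jellyfish code, and not tuned for performance). It assumes the observed entries have been pre-partitioned into a p-by-p grid of (row-block, column-block) buckets; within one "rotation" the selected blocks touch disjoint rows of the factor L and disjoint rows of the factor R, so their updates can run concurrently without locks. The function and variable names (`run_epoch`, `sgd_on_block`, `blocked`) are illustrative, not from the paper or the released software.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def sgd_on_block(L, R, entries, lam, step):
    """Plain SGD over the observed entries of one (row-block, col-block) pair.

    L and R are the low-rank factors (rows of L index matrix rows, rows of R
    index matrix columns); entries is a list of (i, j, value) triples.
    """
    for i, j, v in entries:
        err = L[i] @ R[j] - v
        gi = err * R[j] + lam * L[i]   # gradient w.r.t. L[i]
        gj = err * L[i] + lam * R[j]   # gradient w.r.t. R[j]
        L[i] -= step * gi
        R[j] -= step * gj

def run_epoch(L, R, blocked, p, lam=1e-2, step=1e-2):
    """One pass over the data.

    blocked[a][b] holds the observed (i, j, value) triples whose row index
    falls in row-block a and whose column index falls in column-block b.
    """
    for r in range(p):
        # Blocks (a, (a + r) % p) for a = 0..p-1 share no rows of L and no
        # rows of R, so the p calls below are conflict-free and can run in
        # parallel without any locking.
        with ThreadPoolExecutor(max_workers=p) as pool:
            futures = [
                pool.submit(sgd_on_block, L, R, blocked[a][(a + r) % p], lam, step)
                for a in range(p)
            ]
            for f in futures:
                f.result()
```

The point of the rotation schedule is exactly this conflict-freedom: reordering which entries the stochastic gradient visits lets all updates within a rotation proceed independently, which is what the real implementation exploits on multicore hardware.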
Source Code
Publications
Benjamin Recht and Christopher Ré, Parallel Stochastic Gradient Algorithms for Large-Scale Matrix Completion. Available on Optimization Online.