Aditya Parameswaran

I am an assistant professor of Computer Science at the University of Illinois (UIUC). My research interests are broadly in simplifying and improving data analytics, i.e., helping users make better use of their data.

My work involves building real data analytics systems with principled foundations, designing algorithms (with formal guarantees) for the systems, as well as mining data obtained from such systems.

Biographical Sketch

Aditya Parameswaran is an Assistant Professor in Computer Science at the University of Illinois (UIUC). He spent the 2013-14 year visiting MIT CSAIL and Microsoft Research New England, after completing his Ph.D. at Stanford University, where he was advised by Prof. Hector Garcia-Molina. He is broadly interested in data analytics, with research results in human computation, visual analytics, information extraction and integration, and recommender systems.

Aditya is a recipient of the Arthur Samuel award for the best dissertation in CS at Stanford (2014), the SIGMOD Jim Gray dissertation award (2014), the SIGKDD dissertation award runner-up (2014), the Key Scientific Challenges Award from Yahoo! Research (2010), three best-of-conference citations (VLDB 2010, KDD 2012, and ICDE 2014), the Terry Groswith graduate fellowship at Stanford (2007), and the Gold Medal in Computer Science at IIT Bombay (2007).

News

  • November 10, 2014: Three new paper acceptances in the last month! Our Datahub paper was accepted at CIDR; our rapid approximate visualization generation paper was accepted at VLDB; and our paper on generalized confidence intervals for crowdsourced workers was accepted at ICDE!
  • October 10, 2014: Thrilled to be a part of the new NIH BD2K (Big Data 2 Knowledge) center for revolutionizing genomic data analysis. Thank you, NIH, for the support!
  • September 2, 2014: Finally, we can talk about our exciting new project, titled Datahub (i.e., GitHub for Data), on collaborative data science and version management. The ambitious goal is to eliminate the pain points of data bookkeeping while doing collaborative data science.
  • September 1, 2014: Our paper on pricing for crowdsourcing tasks has been accepted for presentation at VLDB 2015! The paper studies a simple but important problem: if you have a batch of tasks and a deadline, how should you vary the price to meet the deadline?
  • August 25, 2014: Pleasantly surprised to be selected as the KDD dissertation award runner-up, having already been given the SIGMOD dissertation award! Feel truly lucky to have two communities - SIGMOD and KDD - supporting my work!
  • August 24, 2014: Had a blast being a keynote speaker at KDD IDEA 2014 - a big thank you to the organizers for inviting me! If this year was any indication, IDEA is going to flourish as a workshop for many years!
  • August 20, 2014: Our paper on optimally learning maximum-likelihood worker accuracies has been accepted as a work-in-progress paper for HCOMP 2014! The paper tackles the problem of worker quality estimation in a way EM-based algorithms cannot - by providing optimality guarantees.
  • August 15, 2014: Started at Illinois; exciting times ahead!

Synergistic Activities

I am currently serving on or have served on the Program Committees of: VLDB 2013-14, WWW 2014, SIGMOD 2014-15, WSDM 2015, SOCC 2014, HCOMP 2014, ICDE 2014, and EDBT 2014.

Visual Analytics

Automatically recommending visualizations or visual summaries on very large volumes of data

Approximate Analytics

Interactive querying of large datasets at a slight cost in the accuracy of query results

Crowd-Powered Analytics

Using crowdsourcing to process and make sense of large volumes of data

Information Extraction

Extracting information from the web, integrating it with existing information, and surfacing this information to users

Recommendation Systems

Building scalable recommendation systems that take into account contextual information

Selected Projects

DataSift: A Crowd-Powered Search Engine

DataSift is a crowd-powered search engine that is useful for long or complex queries that traditional search engines have trouble with, or for queries that contain rich media, such as images or videos.


SeeDB: Automatic Visualization Recommendation

SeeDB automates the task of finding the right visualization for a query, significantly simplifying the laborious task of identifying appropriate visualizations.


Crowd Algorithms

Our work has developed a number of algorithms for gathering, processing, and understanding data obtained from humans (or crowds), while minimizing cost, latency, and error.


NeedleTail: A System for Browsing

NeedleTail is a system tuned for returning a small number (a "screenful") of query results near-instantly on extremely large datasets.