Installation Guide for Elementary

SIGMOD Paper of Elementary

This page contains the following information:

Installation Guide
User Guide
More Options (Using Greenplum)
Future Improvements

1. Installation Guide

Elementary relies on C++ and PostgreSQL. This section explains how to install the prerequisites.

Install C++

We require gcc/g++ 4.7.2 or newer. We have successfully tested our code with gcc/g++ 4.7.2 on RHEL 5 and Mac OS X 10.7.
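
The compiler requirement can be checked before building with a small shell helper. The sketch below is not part of Elementary: `version_ge` and `check_gxx` are hypothetical names, and it assumes that `g++ -dumpversion` prints a dotted version number (some newer releases print only the major version, which the comparison treats as N.0.0).

```shell
# Return success (0) if dotted version $1 is >= dotted version $2.
# Missing components are treated as 0, so "11" compares as "11.0.0".
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        k = (n > m) ? n : m
        for (i = 1; i <= k; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Hypothetical helper: check the g++ given as $1 (default: g++ on PATH)
# against the 4.7.2 minimum stated in this guide.
check_gxx() {
    gxx="${1:-g++}"
    found=$("$gxx" -dumpversion 2>/dev/null) || {
        echo "g++ not found: $gxx" >&2
        return 2
    }
    if version_ge "$found" 4.7.2; then
        echo "g++ $found is new enough"
    else
        echo "g++ $found is older than 4.7.2" >&2
        return 1
    fi
}
```

After sourcing these definitions, `check_gxx` checks the default compiler, and `check_gxx PATH_TO_CPP` checks a specific installation.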

Install and Configure PostgreSQL

We have tested our code on PostgreSQL 9.2.3. If you don't have PostgreSQL 9.2.3 on your machine, please download and install it. Let PG_DIST be the path where you unpack the source distribution. Let PG_PATH be the location where you want to install PostgreSQL. Run the following commands:
```
cd PG_DIST
./configure --prefix=PG_PATH
gmake && gmake install
cd PG_PATH/bin
./initdb -D PG_PATH/data
./postgres -D PG_PATH/data &
./createdb test
./psql test
```
Here, initdb initializes a directory to store databases; postgres launches the PostgreSQL server daemon; createdb creates a new database (named 'test' in our example); and psql opens the PostgreSQL interactive console, where you can issue SQL queries. Type '\h' for help and '\q' to quit.
Create a PostgreSQL superuser named, say, "postgres" by running the following command:
```
PG_PATH/bin/createuser -s -P postgres
```
You will be prompted for a password. Henceforth, let's assume the password is "strongPasswoRd".
Create a database named, say, "bugs" by running the following command:
```
PG_PATH/bin/createdb bugs
```

Compile Elementary

Download Elementary source from the download page. Please follow the instructions below to install it.

After unpacking, go to Elementary0.3 (let's call this $ELE_HOME), where you will see a Makefile. First build the dependencies using:
```
$ CC="PATH_TO_CC" make dep
```
where CC and CPP point to the installed locations of gcc and g++, respectively. On a Mac, you also need to specify BUILD_PREFIX="--build=x86_64-apple-darwin10.0.0".
Then build Elementary with:
```
$ CC="PATH_TO_CC" CPP="PATH_TO_CPP" PG_PATH="PATH_TO_POSTGRES_INSTALLATION" make
```
where PG_PATH is the directory where PostgreSQL was installed in the previous step. This will produce a binary called "ele" in the same folder.
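You then need to add ./lib/urcu/lib/ to the LD_LIBRARY_PATH environment variable so that the bundled libraries are found at run time. The path is relative, so run this from $ELE_HOME: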
```
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./lib/urcu/lib/
```
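
Because `./lib/urcu/lib/` is a relative path, the export above only takes effect when `ele` is launched from $ELE_HOME. A minimal sketch of a more robust variant follows. The helper name `append_lib_path` is hypothetical, and the usage assumes that ELE_HOME holds the absolute path of the unpacked source directory. The helper also avoids creating an empty entry when LD_LIBRARY_PATH was previously unset.

```shell
# Print search path $1 with directory $2 appended. No empty entry is created
# when $1 is empty (an empty entry would mean "current directory"), and the
# directory is not added twice.
append_lib_path() {
    current="$1"
    dir="${2%/}"                                   # drop one trailing slash
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;    # already present
        *) printf '%s\n' "${current:+$current:}$dir" ;;
    esac
}

# Usage, assuming ELE_HOME is the absolute path of the source directory:
# LD_LIBRARY_PATH=$(append_lib_path "$LD_LIBRARY_PATH" "$ELE_HOME/lib/urcu/lib")
# export LD_LIBRARY_PATH
```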

Configuration

Go to $ELE_HOME, where you should see a configuration file named "bugs_config". This file contains the parameters that configure Elementary's database connection, memory use, and storage. The listing below shows the default values. (Lines starting with # are comments added here for explanation only; do not use # in the actual "bugs_config" file.)

# Database username
PostgreSQL_uname=postgres

# The password for the database user
PostgreSQL_password=strongPasswoRd

# The hostname
PostgreSQL_host=localhost

# The port on which PostgreSQL is listening
PostgreSQL_port=5433

# The database name
PostgreSQL_dbname=bugs

# This parameter controls the amount of memory that will be consumed by the OpenBUGS Interface. 
# A conservative setting would be [Available_Memory_For_Elementary]/400.
Rows_per_fetch=2000000

# EXP_STORAGE: MM (main memory) or FILE. Default is MM.
EXP_STORAGE=MM
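
Two details of the listing above are easy to get wrong: the real file must not contain # comments, and Rows_per_fetch should be sized from the memory budget. A minimal sketch follows, with hypothetical helper names `strip_config_comments` and `rows_per_fetch`. It assumes the memory budget in the rule of thumb is expressed in bytes, which this guide does not state explicitly.

```shell
# Remove comment lines (starting with #) and blank lines from an annotated
# config read on stdin, leaving only the key=value lines.
strip_config_comments() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$'
}

# Conservative Rows_per_fetch for a memory budget given in bytes.
# Assumption: the "/400" rule of thumb refers to bytes.
rows_per_fetch() {
    echo $(( $1 / 400 ))
}
```

For example, `strip_config_comments < bugs_config.annotated > bugs_config` turns a commented copy (the file name is hypothetical) into a file Elementary can read. Under the bytes assumption, `rows_per_fetch 800000000` prints 2000000, which matches the default shown above.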

2. User Guide

We use a simple example to illustrate how to use the OpenBUGS model specification interface for Elementary. The input/output formats and command options are compatible with the OpenBUGS language specification.

Input

The input consists of the standard model file, data file, and inits file, as specified in OpenBUGS. The locations of these files are specified in the "bugs_config" file, as shown below. As in OpenBUGS, you can also set monitors on the variables whose results you want to observe. Monitors must be given as comma-separated values with no spaces between them. The default is empty, which monitors all variables.

#The model file. Default is test.model
Bugs_Model=test.model

#The data file. Default is test.data
Bugs_Data=test.data

#The inits file. Default is test.inits
Bugs_Inits=test.inits

#Monitors on variables.
Bugs_Monitors=alpha,beta
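
Since Bugs_Monitors must be a comma-separated list with no spaces, a small check before launching can catch formatting mistakes. The following is a sketch only: `valid_monitors` is a hypothetical helper, and rejecting empty items (such as a doubled comma) is an assumption about what Elementary accepts rather than documented behavior.

```shell
# Succeed if $1 is empty (monitor all variables) or a comma-separated list
# containing no whitespace and no empty items.
valid_monitors() {
    case "$1" in
        "") return 0 ;;                  # empty: all variables are monitored
        *[[:space:]]*) return 1 ;;       # spaces are not allowed
        ,*|*,|*,,*) return 1 ;;          # leading, trailing, or doubled comma
        *) return 0 ;;
    esac
}
```

For instance, `valid_monitors "alpha,beta"` succeeds, while `valid_monitors "alpha, beta"` fails because of the space.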

Inference

To run inference for the specified model, run the following command from $ELE_HOME. Here, --work_dir specifies TEMPORARY_DATA_DIR, the directory Elementary uses to store temporary data; --app "bugs" invokes the OpenBUGS interface; and --nepoch specifies NUMBER_OF_UPDATES, the number of updates to the model (as defined in OpenBUGS).

./ele --work_dir "TEMPORARY_DATA_DIR" --app "bugs" --nepoch "NUMBER_OF_UPDATES"
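
For repeated runs, it can help to wrap the command above in a small function that validates its arguments and creates the working directory first. This is only a sketch: `run_bugs` and the `ELE_BIN` override are hypothetical names introduced here, and it assumes that NUMBER_OF_UPDATES must be a positive integer and that creating the working directory in advance is harmless. Only the three flags shown in this section are used.

```shell
# Run the OpenBUGS interface of Elementary for $1 updates with work dir $2.
# ELE_BIN (hypothetical override) defaults to ./ele, so call this from $ELE_HOME.
run_bugs() {
    nepoch="$1"
    work_dir="$2"
    case "$nepoch" in
        ""|0|*[!0-9]*)                   # empty, zero, or not all digits
            echo "nepoch must be a positive integer" >&2
            return 2 ;;
    esac
    [ -n "$work_dir" ] || { echo "work dir is required" >&2; return 2; }
    mkdir -p "$work_dir" || return 1
    "${ELE_BIN:-./ele}" --work_dir "$work_dir" --app "bugs" --nepoch "$nepoch"
}
```

A call such as `run_bugs 1000 /tmp/ele_work` (the directory name is only an example) then runs 1000 updates.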

Example inference output for the Pumps model: [figure not reproduced]

3. More Options

We also provide a version that uses Greenplum. For this version, in addition to the PostgreSQL installation described above, you need to install Greenplum. We have tested our code on Greenplum 4.2. If you don't have Greenplum installed on your machine, please download and install it. After installation, create a database named "bugs" in Greenplum. Then add the Greenplum-specific parameters to "bugs_config" as shown below:

# Database username
Greenplum_uname=greenplum

# The password for the database user
Greenplum_password=greenplumPassWOrd

# The hostname
Greenplum_host=localhost

# The port on which Greenplum is listening
Greenplum_port=5432

# The database name
Greenplum_dbname=bugs
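
Before running inference against Greenplum, it can be worth confirming that the credentials in "bugs_config" actually open a connection. The sketch below builds a libpq connection string from the host, port, dbname, and uname entries for a given prefix. `config_get` and `conninfo_from_config` are hypothetical helpers, and the approach assumes that the file holds plain key=value lines and that the `psql` client is available to connect to the server.

```shell
# Print the value of key $2 from key=value file $1 (last occurrence wins).
config_get() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Print a libpq connection string for prefix $2 (PostgreSQL or Greenplum).
conninfo_from_config() {
    printf 'host=%s port=%s dbname=%s user=%s\n' \
        "$(config_get "$1" "${2}_host")" \
        "$(config_get "$1" "${2}_port")" \
        "$(config_get "$1" "${2}_dbname")" \
        "$(config_get "$1" "${2}_uname")"
}

# Usage (psql reads the password from the PGPASSWORD environment variable):
# PGPASSWORD=$(config_get bugs_config Greenplum_password) \
#     psql -c 'SELECT 1' "$(conninfo_from_config bugs_config Greenplum)"
```

The same helpers work for the PostgreSQL entries by passing `PostgreSQL` as the prefix.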

To use Greenplum while doing inference, please specify the command as follows.

./ele --work_dir "TEMPORARY_DATA_DIR" --app "bugs" --bugs_hybrid --nepoch "NUMBER_OF_UPDATES"

4. Currently Unsupported Features of the OpenBUGS Language Specification

Data Transformations
Wishart and Generalized F distribution
Truncation
Rectangular format for data
Comments are not yet supported in the model, data, or inits file.
Scalar Functions: cut, density, deviance, gammap, integral, post.p.value, prior.p.value, replicate.post(s), replicate.prior(s), solution, cumulative.
Vector Functions: interp.lin, inverse, logdet, eigen.vals, ode, prod, p.valueM, rank, ranked, replicate.postM, sort

Also, when sampling from distributions with very high variance, our results currently differ from those of OpenBUGS. We are working on this.
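
Some of the restrictions listed above can be detected mechanically before a run. The following sketch scans a model file for # comments and for calls to a subset of the unsupported functions. `find_unsupported` is a hypothetical helper; the pattern is a plain text match rather than a real parser of the OpenBUGS language, and it assumes that # only ever appears as a comment marker.

```shell
# Print the lines of file $1 (with line numbers) that contain a # comment or
# a call to one of a subset of the unsupported functions.
# Exit status is 0 if at least one such line is found.
find_unsupported() {
    funcs='cut|density|deviance|gammap|integral|solution|cumulative'
    funcs="$funcs|interp\.lin|inverse|logdet|eigen\.vals|ode|prod|rank|ranked|sort"
    grep -n -E -e '#' -e "(^|[^A-Za-z0-9_.])($funcs)[[:space:]]*\(" "$1"
}
```

Running `find_unsupported test.model` lists the offending lines, if any. Items such as data transformations, truncation, and the rectangular data format are not covered by this check.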
