Getting started

The following instructions are for Unix and OS X. Windows support is coming soon.

From Binary

Download the latest version of OptiML (0.3.4-alpha)

Extract the distribution:

tar -xzf optiml-0.3.4-alpha.tgz

Add OptiML to your path:

cd optiml-0.3.4-alpha
export PATH=$PWD/bin:$PATH

Run the OptiML REPL:

optiml

The first time you run the REPL, it will take a couple of minutes to download OptiML's dependencies. It will also take a short period to load OptiML inside the REPL. Because OptiML is an embedded DSL inside Scala, the OptiML REPL is the Scala REPL with OptiML types and operations pre-loaded. To start playing, create a new DenseVector or new DenseMatrix:

val m = DenseMatrix.rand(10,10)
val v = DenseVector.rand(10)
v+v

Example applications are distributed with OptiML in the apps folder. Before running an application, we first need to compile the apps:

sbt compile

Next, we can either run an interpreted version of our application directly, or stage it, which runs DSL compilation and generates code for different targets. The interpreter version is a pure Scala library that can be used to develop and prototype before switching to the Delite version for high performance. To run the interpreter version of logistic regression, run:

delitec LogRegInterpreter [input training data file] [input label file]

The logistic regression application is located in apps/OptiML/src/LogReg.scala. It expects two inputs: a whitespace-delimited matrix of floating-point values (one row per line) for the training data, and a vector with one entry per line (either 0 or 1, representing the classification of the corresponding data element) for the labels.
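To make the input format described above concrete, here is a minimal shell sketch that writes a tiny training file and a matching label file. The file names train.dat and labels.dat and the values themselves are made up for illustration, and the row-count check assumes each data row needs exactly one label.

```shell
# Hypothetical file names and toy values, for illustration only.
train_file=train.dat
label_file=labels.dat

# Training data: one sample per line, features separated by whitespace.
cat > "$train_file" <<'EOF'
0.5 1.2 3.4
1.0 0.7 2.2
2.3 1.8 0.1
0.9 2.6 1.5
EOF

# Labels: one 0/1 value per line, in the same order as the rows above.
cat > "$label_file" <<'EOF'
0
1
1
0
EOF

# Sanity check: both files should describe the same number of samples.
train_rows=$(wc -l < "$train_file" | tr -d ' ')
label_rows=$(wc -l < "$label_file" | tr -d ' ')
if [ "$train_rows" -eq "$label_rows" ]; then
  echo "ok: $train_rows samples"
else
  echo "mismatch: $train_rows rows vs $label_rows labels" >&2
fi
```

With files like these in place, the two paths can be passed as the [input training data file] and [input label file] arguments.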

To run the Delite version of the application, we first stage it using delitec, and then run it using delite:

delitec LogRegCompiler
delite LogRegCompiler [input training data file] [input label file] -t [num threads]

You should notice that this version runs significantly faster than the interpreter version. To see the code that OptiML generated, look inside the generated/ folder. With default arguments, we only attempt to generate parallel Scala kernels, which are located in generated/scala/kernels. Run

delitec --help
delite --help

to see a list of options you can use when compiling and running the high-performance Delite version of your application. For example, to generate C++ kernels in addition to Scala kernels for logistic regression, we would use:

delitec LogRegCompiler --cpp
delite LogRegCompiler [input training data file] [input label file] --cpp [num threads]

The generated C++ kernels will be located in generated/cpp/kernels. In order to use the C++ and CUDA code generators, the configuration files in config/delite/ must have correct paths.
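To get a quick sense of what was generated, one option is to count the files in each kernel directory. The following is a minimal shell sketch, assuming it is run from the directory that contains generated/. The count_files helper is hypothetical and not part of Delite.

```shell
# count_files is a hypothetical helper (not part of Delite): it prints the
# number of regular files under a directory, or 0 if the directory is absent.
count_files() {
  if [ -d "$1" ]; then
    find "$1" -type f | wc -l | tr -d ' '
  else
    echo 0
  fi
}

# Kernel directories named in the text above; the cpp directory is only
# expected to be populated when the application was staged with --cpp.
for target in scala cpp; do
  echo "$target kernels: $(count_files "generated/$target/kernels")"
done
```

A count of 0 for a target suggests that no kernels were generated for it, or that the script was run from a different directory.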

To learn the language basics, click here. For a more detailed example, try Deep Learning with OptiML.

Good luck and happy hacking! Contact us at optiml@googlegroups.com if you have any questions or run into any problems.

From Source

Delite requires Java 1.7 or later, Python 2.7, git and SBT 0.13. Instructions for installing these can be found at their respective websites. Be sure to increase the default heap size for your sbt executable script; a command that works well on most machines is:

java -Xmx4g -XX:MaxPermSize=1g -XX:ReservedCodeCacheSize=128m -jar `dirname $0`/sbt-launch.jar "$@"
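Before going further, it can help to confirm that the prerequisite tools listed above are on your PATH. The sketch below only checks for presence, not versions. The find_missing helper is hypothetical, and the executable names (java, python, git, sbt) are assumptions that may differ on your system, for example if Python 2.7 is installed as python2.

```shell
# find_missing is a hypothetical helper: it prints the name of every given
# tool that cannot be found on PATH, one per line.
find_missing() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Executable names are assumptions; adjust them to match your installation.
missing=$(find_missing java python git sbt)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All prerequisite tools found; versions still need to be checked by hand."
fi
```

Versions can then be checked individually, for instance with java -version and git --version.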

To get started, you first need to download the source for virtualization-lms-core, Delite and Forge. These three libraries provide 1) staging support to embed DSLs in Scala, 2) a parallel and heterogeneous compiler, and 3) the meta DSL that Delite DSLs are implemented in, respectively. Delite uses git for version control (see here for installation instructions).

These three repositories have been packaged into a single repository called hyperdsl using git submodules. First, make a copy of the hyperdsl repository and its submodules on your machine:

git clone https://github.com/stanford-ppl/hyperdsl.git
cd hyperdsl 
git submodule update --init

Next, we need to compile each project and make it available to OptiML as a dependency. Compiling requires a few environment variables to be set; the proper defaults can be found in init-env.sh. From the hyperdsl directory, run:

source init-env.sh
sbt compile

We're almost done. Now we need to run Forge to generate the most recent OptiML version:

forge/bin/update ppl.dsl.forge.dsls.optiml.OptiMLDSLRunner OptiML

The OptiML distribution is now located in the directory published/OptiML. We can run applications from there, or copy this directory somewhere else (e.g. ~/OptiML). For now, we'll assume we want to run from within the current directory tree (see the Forge instructions for more detailed information on this setup).

Now we are ready to write "hello world" in OptiML. Existing OptiML applications are located in apps/src inside the published OptiML directory. It is easiest to add your application here, but you can add it anywhere you want as long as you update the SBT configuration to point to your new location. For now, let's open up a file in apps/src called 'HelloWorld.scala' and copy this text in:

import optiml.compiler._
import optiml.library._
import optiml.shared._

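// Interpreter entry point: runs the application as a pure Scala library.
object HelloWorldInterpreter extends OptiMLApplicationInterpreter with HelloWorld
// Compiler entry point: stages the application with Delite.
object HelloWorldCompiler extends OptiMLApplicationCompiler with HelloWorld
// Application logic shared by both entry points.
trait HelloWorld extends OptiMLApplication {
  def main() = println("hello world")
}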

OptiML comes in two versions: an interpreter, which is a pure Scala library implementation, and a compiler, which is the Delite implementation. All Delite applications have three phases: compiling (Scala), staging (DSL compilation), and executing (runtime). The first step, running the Scala compiler, checks your application for syntactic and type correctness. Staging is where the main compilation happens: the DSL builds an IR of the program, optimizes it, and generates an execution graph along with code for multiple targets (e.g. Scala and CUDA). Finally, the Delite runtime reads in the execution graph and generated kernels and executes the application.

Since Delite compilation can take a while (but produces high performance code), the recommended strategy is to use the OptiML interpreter while developing and prototyping, and then run the Delite version to scale out to larger datasets or heterogeneous hardware.

So without further ado, let's get to it. Starting from the published OptiML directory, first compile the application and then run the interpreter version:

sbt compile
bin/delitec HelloWorldInterpreter

If all went well, you should see hello world printed to the console!

Now let's run the Delite version. Delite scripts require the JAVA_HOME environment variable to be set. If it is not already, you will need to set it:

export JAVA_HOME=[absolute path of Java installation] 
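A quick way to sanity-check the value is to confirm that it points at a directory containing an executable bin/java. This is a minimal sketch of such a check; is_java_home is a hypothetical helper, and the test it performs is an assumption about what a usable setting looks like, not a description of what the Delite scripts themselves verify.

```shell
# is_java_home is a hypothetical helper: it succeeds if the given directory
# is non-empty and contains an executable bin/java.
is_java_home() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if is_java_home "${JAVA_HOME:-}"; then
  echo "JAVA_HOME looks valid: $JAVA_HOME"
else
  echo "JAVA_HOME is unset or does not contain bin/java" >&2
fi
```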

Now we can run the compiler version by changing the argument to delitec and then running the staged application with delite:

bin/delitec HelloWorldCompiler
bin/delite HelloWorldCompiler

In this case delitec generates new code for the application rather than interpreting the application directly, and delite runs the generated code. You should see output like the following:

Delite Runtime executing with the following arguments:
HelloWorldRunner.deg
Delite Runtime executing with 1 CPU thread(s) and 0 GPU(s)
Beginning Execution Run 1
hello world
[METRICS]: Latest time for component all: 0.032000s

Congratulations! You just ran your first OptiML application. The generated execution graph will be in published/OptiML/HelloWorldRunner.deg. The folder published/OptiML/generated/ contains the code that OptiML generated. Take some time to look at the generated DEG file and the generated kernels to see what went on behind the scenes.

Delite is capable of using CUDA to execute applications on the GPU as well as BLAS to accelerate certain matrix operations on the CPU. These features require some additional setup in order to work with your environment. Click here for instructions.

To learn the language basics, click here. For a more detailed example, try Deep Learning with OptiML.