Skip to content

OCCAM: Object Culling and Concretization for Assurance Maximization

License

Notifications You must be signed in to change notification settings

sunlifeng/OCCAM

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

72 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Occam Toolchain (BSD)

This package includes OCCAM software, available from: https://github.com/ashish-gehani/OCCAM/

The following terms apply to OCCAM:

Redistribution and use in source and binary forms, with or without 
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice, 
  this list of conditions and the following disclaimer in the documentation 
  and/or other materials provided with the distribution.

* Neither the name of SRI International nor the names of its contributors may 
  be used to endorse or promote products derived from this software without
  specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS 
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED 
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A 
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED 
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR 
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF 
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Installation / Getting Started

Instructions for installing the OCCAM toolchain and building a bitcode version of the FreeBSD base distribution. The following instructions have been tested with FreeBSD 10.0 amd64.

Prerequisites:

1. FreeBSD

The ideal way to start is with a clean install of FreeBSD 10.0 with sources installed. The simplest way to do this is to install from the Release 10.0 ISO image and and on the "Distribution Select" screen select just the following:

  [*] ports   Ports tree
  [*] src     System source code

If you are on an existing system that has either an old version of the source tree or is missing source, you can follow the instructions in the FreeBSD Handbook Chapter 24 to get the relevant sources.

2. Necessary ports

Upgrade the ports collection (as 'root'):

   su -
   portsnap fetch
   portsnap extract
   cd /usr/ports/ports-mgmt/portupgrade
   make -DBATCH install clean
   portupgrade -a --batch

Install the following ports using the BSD port tree:

  bash git subversion python27 protobuf py-protobuf sudo wget

(See the FreeBSD Handbook Chapter 5 for instructions.) The quick way to do this is:

   su -
   cd /usr/ports
   cd shells/bash && make -DBATCH install clean && \
     cd ../../devel/git && make -DBATCH install clean && \
     cd ../../devel/subversion && make -DBATCH install clean && \
     cd ../../devel/protobuf && make -DBATCH install clean && \
     cd ../../devel/py-protobuf && make -DBATCH install clean && \
     cd ../../security/sudo && make -DBATCH install clean && \
     cd ../../ftp/wget && make -DBATCH install clean

(Python27 is installed as a prerequisite of Subversion.)

You will need to do the following to make Python work:

   su -
   cd /usr/local/bin
   ln -s python2 python

Below we assume the shell being used is bash, that is:

   chsh -s /usr/local/bin/bash

has been run. If you want to use another shell, replace bash-isms like 'export' with the appropriate equivalent.

We suggest installing 'sudo' and setting up your account as a sudoer, to make installing programs easier. You can do this, or modify the commands that use 'sudo' below.

3. LLVM and Clang 3.5

Install LLVM and Clang version 3.5. (These instructions adapted from http://llvm.org/docs/GettingStarted.html) Decide where you want to install LLVM. If you have 'root' access, you can use the default '/usr/local', though any location is fine. You may then wish to add this to your shell startup (in '~/.profile' for bash):

  export LLVM_HOME=/usr/local

Get LLVM and Clang version 3.5:

  svn co http://llvm.org/svn/llvm-project/llvm/branches/release_35 llvm
  cd llvm/tools
  svn co http://llvm.org/svn/llvm-project/cfe/branches/release_35 clang
  cd ../projects
  svn co http://llvm.org/svn/llvm-project/compiler-rt/branches/release_35 compiler-rt
  cd ../..

Now finish the build and install:

  cd llvm
  mkdir build
  cd build
  ../configure --prefix=$LLVM_HOME --enable-assertions \
      --enable-targets=host-only --enable-optimized
  gmake -j8
  sudo gmake install

Note that the FreeBSD 10.0 base includes Clang 3.3 (but does not include the complete LLVM framework).

4. Checkout out the code:

   git clone https://github.com/ashish-gehani/OCCAM.git
   cd OCCAM
   export OCCAM_SRC=`pwd`/OCCAM

Installing OCCAM

With the prerequisites satisfied, we can install OCCAM. Decide where you want to install the tool. Below we assume it will be in your home directory. We recommend adding the below exports to your shell startup.

  export CC=clang
  export CXX=clang++
  export CPP='clang -E'
  export OCCAM_HOME=~/occam

Also, adjust the shell PATH so the LLVM tools built in step 3 above (that are in $LLVM_HOME/bin) will be picked up ahead of the system ones (typically in /usr/bin).

  export PATH=${LLVM_HOME}/bin:${PATH}
  cd $OCCAM_SRC

Edit the Makefile and uncomment the lines exporting LLVM_HOME and OCCAM_HOME at the top of the file (if they are not set in your environment). Change the paths to your choices. Now build and install the tool.

  gmake
  gmake install		#Note: 'sudo' is used for non-root Python setup

If you make modifications to the Python scripts and want to install the new ones without rebuilding the C++ library, use "gmake install-occam".

Create an alias to the 'occam' toolchain wrapper. For bash, add the following to to your shell startup:

  alias occam=$OCCAM_HOME/bin/occam

Mount the /proc filesystem

Part of the parallel build driver in the toolchain reads from the /proc filesystem. This isn't mounted by default on FreeBSD. We'll need to mount it before using the tool. You can mount it with the following command:

  sudo mount -t procfs proc /proc

Alternatively, you can add this filesystem to '/etc/fstab':

proc            /proc           procfs  rw      0       0

buildworld

To build a lot of useful libraries and executables in a hurry, we'll first run the FreeBSD 'buildworld' process using the OCCAM toolchain. For details of the 'buildworld' process, see Chapter 24.6 of the FreeBSD Handbook.

First, we need to patch one of the 'buildworld' make files to prevent it from removing the OCCAM toolchain from the path after it compiles fresh versions of Clang and the rest of the FreeBSD toolchain. We'll also need to tell 'make' to use Clang (which our toolchain will masquerade as) by changing '/etc/src.conf'. While we're at it, we'll turn off some parts of the build to reduce how much time it takes.

  cd /usr/src
  sudo patch -p1 < $OCCAM_SRC/patches/Makefile.inc1-occam.patch
  sudo cp $OCCAM_SRC/patches/src.conf /etc/src.conf

Before starting the build, you can choose to record logs from the OCCAM tool by setting the following variables:

  export OCCAM_LOGFILE={absolute path to log location}
  export OCCAM_LOGLEVEL={INFO, WARNING, or ERROR}

Now kick off the build. You will want to use -jN for some N, as this process takes a long time without the OCCAM toolchain, and even longer with (i.e. hours).

  mkdir $OBJ_DIR {wherever you want to build everything}
  MAKEOBJDIRPREFIX=$OBJ_DIR ; export MAKEOBJDIRPREFIX
  occam make buildworld

This should go smoothly, but occasionally IOErrors related to reading the /proc file system may occur. It should be safe to continue the build in this case:

  occam make -DNO_CLEAN buildworld

At this point, you have a few options. We have tested option 1 below, but the others should also work.

Option 1) Actually install your new "world"

To be really safe, we'll build a new kernel with this source as well. You may be able to skip this step, along with 'mergemaster', but we haven't tried that. For more information, use 'man mergemaster'. The "-iU" options try to make this process as automatic as possible.

!!! You'll need non-ssh access to the box for this.

If you don't want to rebuild the kernel, only perform the third step of the below five:

     cd /usr/src
     make buildkernel
     sudo shutdown now              # drops into single user mode
     make installkernel             # don't use -j
     mergemaster -p

Switching to single user mode results in a new shell, where environment variables and aliases must be set again:

     export OCCAM_HOME=~/occam
     export OCCAM_LOGFILE=/tmp/occam.log
     export OCCAM_LOGLEVEL=INFO
     export MAKEOBJDIRPREFIX=$OBJ_DIR
     alias occam=$OCCAM_HOME/bin/occam

The new world can now be installed:

     cd /usr/src
     occam make installworld       # don't use -j
     mergemaster -iU
     shutdown -r now

When the machine comes back up, you should have a full FreeBSD system with bitcode versions of all of the core libraries and executables.

Option 2) Install your new "world" somewhere else, chroot into it, and set up a

development environment.

Choose somewhere to install everything to $WORLD_DIR.

     occam make installworld DESTDIR=$WORLD_DIR

!Note, you cannot (according to the Handbook) use -j when using installworld. chroot into $WORLD_DIR and reinstall LLVM, Clang, and OCCAM.

Option 3) Extract all the libraries and store them somewhere else.

Choose a folder to store all of your libraries, $LIB_DIR. Extract all the libraries you built into it.

     find /usr/obj/usr/src/lib -name "*.bc.a" -exec cp {} $LIB_DIR \;

You can also extract manifests, partially built applications (pre-library linking), bitcode-compiled applications, and native applications compiled from the bitcode versions. The manifests may contain absolute paths to libraries that may need changing, however.

 find /usr/obj/usr/src/usr.* -name "*.manifest" -exec cp {} $DEST \;
 find /usr/obj/usr/src/usr.* -name "*_main.bc" -exec cp {} $DEST \;
 find /usr/obj/usr/src/usr.* -name "*.bc" -exec cp {} $DEST \;

When trying to use the OCCAM toolchain with these libraries, you'll need to add $LIB_DIR to $OCCAM_LIB_PATH (described further below). When specializing or using "occam2 build" with a manifest, you'll need to add $LIB_DIR to the "search" entry in the manifest.

Building applications and libraries

Once you have your base system installed and ready to go, you should be ready to compile and install other libraries or applications.

First, set the following environmental variables. (You may want to set these in

   CC=clang
   CXX=clang++
   CPP='clang -E'

You should now be able to compile most well-behaved projects with:

   occam ./configure <options>
   occam make
   occam make install

Occam has a few configuration options, set via environmental variables.

  • OCCAM_LOGFILE Absolute path of file to log to
  • OCCAM_LOGLEVEL Logging detail, may be one of INFO, WARNING, or ERROR
  • OCCAM_LIB_PATH Additional directories to include in the search path for finding bitcode versions of libraries. Not necessary if libraries are installed next to their native versions (e.g., /path/to/libs:/morelibs)
  • OCCAM_PROTECT_PATH Set/unset flag. If set, OCCAM will remove all non-OCCAM items from the path and intercept calls to common executables so that they can be logged at level INFO. Can be useful for debugging a build that isn't working properly.

WLLVM installation

Occam leverages WLLVM bitcode generation for building bitcode equivalents for libraries and executables. Install WLLVM:

    https://github.com/SRI-CSL/whole-program-llvm/blob/master/wllvm

Make sure the WLLVM tools are added to the PATH variable.

To compile as much code as possible to bitcode, you should choose any available configure options for building or using static libraries rather than shared ones.

Special notes for Apache

For apache, you'll first need to compile and install libpcre, libapr, and libapr-util. The configurations I used for each of these are:

  • libpcre: built from pcre-8.31.tar.gz
     occam ./configure --enable-shared=no
  • libapr: built from apr-1.4.6.tar.gz
     occam ./configure --enable-shared=no
  • libapr-util:
     occam ./configure --with-apr=/usr/local/apr --disable-util-dso --with-crypto

You may be able to drop --with-crypto, which could fix the configure issue for apache that I describe below. For apache, I used the following configure:

    occam ./configure --prefix=$SOMEPREFIX --enable-modules=none   \
    	  --enable-mods-static=most --enable-static-support        \
	  --with-apr=/usr/local/apr --with-apr-util=/usr/local/apr \
	  --with-pcre=/usr/local

You then need to patch build/config_vars.mk because something in the configure script seems to be detecting crypto options wrong. I'll debug this, but the workaround for now is to fix the following variables:

   CRYPT_LIBS = -lcrypt
   SSL_LIBS = -lssl -lcrypto -lcrypt -lpthread

To run apache using command line options (to make it easy to previrtualize), you'll need to set up a basic ServerRoot directory somewhere with the following configuration.

   $SERVERROOT
   |- conf
      |- mime.types       (an empty text file)
   |- logs		  (an empty directory)
   |- www
      |- index.html	  (something to serve...)
   |- httpd.conf	  (an empty text file)

The following set of arguments should then work to get apache serving files:

   -d $SERVERROOT
   -f httpd.conf
   -C 'Listen 8181'
   -C 'LogLevel debug'
   -C 'User apache' -C 'Group apache'
   -C 'ServerName localhost'
   -C 'ErrorLog /home/moore/root/error_log'
   -C 'DocumentRoot /home/moore/root/www' 
   -C 'AddType text/plain .html' 
   -k start

Using manifest files

Once you have a manifest file and a bundle, you can use it to build a final bitcode version of an application or specialize it. To build from the manifest, just execute:

   occam build $MANIFEST_FILE $NAME_FOR_EXECUTABLE

To specialize the application, add a "args": ["arg1", "arg2", ...] entry to the manifest file and call the previrt tool:

   occam previrt --work-dir=$WORKDIR $MANIFEST_FILE

As before, you can set OCCAM configuration variables to control logging.

TODO items

-- General -------------------------------------------------------------------

[ ] Dead code elimination on all of the repositories to clean up unused passes,
    targets, #ifdef 0-ed regions, etc, etc, etc

-- Toolchain -------------------------------------------------------------------

[ ] Save object files off when bundling. The bundling process uses llvm-ld which
    does not link ELF files into the resulting bitcode. We should save off the
    necessary ELF object files from the inputs or library to link in later at
    build or previrtualization time. (This causes the manifest build for at
    least clang during the buildworld process, and probably others)
[ ] Link in said object bundles during occam-build or occam-previrt

-- Previrt -------------------------------------------------------------------

[ ] Debug previrtualized version of httpd. The previrtualized version of httpd
    does not start correctly. It seems likely this is due to interactions with
    pthreads (possibly some incorrectly eliminated symbols?).
[ ] Figure out how to avoid needing to relink in libc, etc. to get the symbols
    we need for the C runtime. (and C++/pthreads?)
[ ] Add support for better call-graphs. Either improve Gregory's call graph,
    choose a better one that's already in llvm, or port over the DSA call-graph
    stuff in poolalloc.
[ ] Support for previrtualizing un-named functions
[ ] Support for previrtualizing internal function pointers accross module 
    boundaries for call backs.
[ ] To avoid creating multiple clones of the same specialized function, we check
    if it already exists by looking it up in the module by name. This check
    should probably be more robust. This is related to support for un-named
    functions because we need to make sure we're using unique specializations
    for the different functions.
[ ] For ease of understanding, give created globals for strings, etc., names so
    they don't show up as unnamed in the symbol table.
[ ] Extract statistics about the previrtualization process.

About

OCCAM: Object Culling and Concretization for Assurance Maximization

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C++ 96.6%
  • Protocol Buffer 2.2%
  • Makefile 1.1%
  • Other 0.1%