@rdegraci
Created June 29, 2017 13:36
apt-get update
apt-get install -y \
git \
autoconf \
build-essential \
language-pack-en \
libarchive-dev \
libblas-dev \
libboost-all-dev \
libcap-dev \
libcrypto++-dev \
libcurl4-openssl-dev \
libffi-dev \
libmagic-dev \
libfreetype6-dev \
libgoogle-perftools-dev \
liblapack-dev \
liblzma-dev \
libpng12-dev \
libpq-dev \
libpython-dev \
libsasl2-dev \
libssh2-1-dev \
libtool \
libyaml-cpp-dev \
python-virtualenv \
unzip \
valgrind \
uuid-dev \
libxml++2.6-dev
ssh-keygen -N "" -f /home/vagrant/.ssh/id_rsa
# Docker
apt-get update
apt-get install -y \
linux-image-extra-$(uname -r) \
linux-image-extra-virtual
apt-get update
apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
apt-key fingerprint 0EBFCD88
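# The fingerprint command above only prints the key; it does not fail if the key is wrong. A minimal sketch of an automated check follows, assuming the expected value is the fingerprint published in Docker's Ubuntu install documentation. The function name has_docker_fingerprint is hypothetical.

```shell
# Expected fingerprint of Docker's apt signing key (assumption: taken from
# Docker's install documentation; confirm it there before relying on it).
EXPECTED_FPR="9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88"

# has_docker_fingerprint: read `apt-key fingerprint` output on stdin and
# succeed only if the expected fingerprint appears in it.
has_docker_fingerprint() {
    # Squeeze runs of spaces first, because gpg pads the middle of the
    # fingerprint with an extra space.
    tr -s ' ' | grep -qF "$EXPECTED_FPR"
}
```

# Possible usage: apt-key fingerprint 0EBFCD88 | has_docker_fingerprint || echo "unexpected Docker key" >&2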
apt-get update
add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
apt-get update
apt-get install -y docker-ce
apt-cache madison docker-ce
docker run hello-world
###### Then
# You will first need a GitHub account with SSH keys set up, because the repo uses SSH paths in its submodule configuration. You can test that the keys are set up correctly by running the following command and checking that the output includes "successfully authenticated":
ssh -T git@github.com
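# The check above can be scripted. This is a rough sketch with a hypothetical helper name (github_ssh_ok). It tests the message text rather than the exit status, because GitHub closes the session without a shell and ssh then exits non-zero even when authentication worked.

```shell
# github_ssh_ok: read the output of `ssh -T git@github.com` on stdin and
# succeed if it contains the "successfully authenticated" message.
github_ssh_ok() {
    grep -q "successfully authenticated"
}

# Usage (not executed here, since it needs network access and SSH keys):
#   ssh -T git@github.com 2>&1 | github_ssh_ok && echo "SSH keys OK"
```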
# NOTE The master branch is bleeding edge, and the demos or documentation may be slightly out of sync with the code at any given time. To avoid this, build the Community Edition from the latest tagged release, which is tracked by the release_latest branch.
# NOTE Occasionally, build ordering issues may creep into the build. They don't affect the viability of the build, but they may cause make to fail. In that case, it is acceptable to repeat the make -k compile step, which may complete successfully on a second pass. (The build order is regression tested, but those regression tests are run less frequently than other tests.)
# NOTE Occasionally, tests may fail spuriously, especially under high machine load when running time-sensitive tests, or because of network issues when accessing external resources. Repeating the make -k test step may allow them to pass. It is OK to use MLDB if the tests don't all pass; all code tagged for release has passed regression tests in the stable testing environment.
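# Repeating a make step by hand, as the two notes above suggest, can be wrapped in a small helper. This is only a sketch: the retry function and its attempt limit are assumptions, not part of the MLDB build system.

```shell
# retry MAX CMD...: run CMD until it succeeds, at most MAX times.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    _retry_max=$1
    shift
    _retry_attempt=1
    while ! "$@"; do
        if [ "$_retry_attempt" -ge "$_retry_max" ]; then
            echo "giving up after $_retry_attempt attempts: $*" >&2
            return 1
        fi
        _retry_attempt=$((_retry_attempt + 1))
        echo "retrying (attempt $_retry_attempt of $_retry_max): $*" >&2
    done
    return 0
}
```

# Possible usage: retry 2 make -k compile && retry 2 make -k test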
# Build output lands in the build directory and there is no make clean target: you can just rm -rf build. You can speed up recompilation after deleting your build directory by using ccache, which can be installed with apt-get install ccache. You can then create a file at the top of the repo directory called local.mk with the following contents:
COMPILER_CACHE:=ccache
# N.B. To get the most out of ccache, set the cache size to something like 10 GB with ccache -M 10G, if you have the disk space.
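# The local.mk step can be scripted once the repo has been cloned. The sketch below uses a hypothetical helper, write_local_mk, which refuses to overwrite an existing local.mk so that other settings are not lost.

```shell
# write_local_mk REPO_DIR: create REPO_DIR/local.mk with the ccache setting,
# unless a local.mk already exists there.
write_local_mk() {
    _repo_dir=$1
    if [ -e "$_repo_dir/local.mk" ]; then
        echo "local.mk already exists in $_repo_dir; leaving it alone" >&2
        return 0
    fi
    printf 'COMPILER_CACHE:=ccache\n' > "$_repo_dir/local.mk"
}
```

# Possible usage, from the directory containing the clone: write_local_mk mldb && ccache -M 10G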
# To avoid building MLDB for all supported architectures and save time, check sample.local.mk
# To have a faster build, you can use clang instead of gcc. Simply add toolchain=clang at the end of your make command.
# To run a single test, simply specify its name as the target. For python and javascript, include the extension (.py and .js). For C++, omit it.
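# The extension rule above can be captured in a helper that maps a test file to its make target. This is a sketch under two assumptions that are not confirmed by the notes: that the target is the file's base name, and that C++ test sources end in .cc or .cpp. The file names in the usage line are hypothetical.

```shell
# test_target FILE: print the make target for a test source file.
# Python and JavaScript tests keep their extension; C++ tests drop it.
test_target() {
    _name=$(basename "$1")
    case "$_name" in
        *.py|*.js) printf '%s\n' "$_name" ;;
        *.cc)      printf '%s\n' "${_name%.cc}" ;;
        *.cpp)     printf '%s\n' "${_name%.cpp}" ;;
        *)         printf '%s\n' "$_name" ;;
    esac
}
```

# Possible usage: make "$(test_target testing/example_test.cc)"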
# The build steps follow. Because local.mk lives at the top of the repo, create it after cloning and before running make:
git clone git@github.com:mldbai/mldb.git
cd mldb
git checkout release_latest
git submodule update --init --recursive
make dependencies
make -k compile
make -k test
# To speed things up, consider using the -j option in make to leverage multiple cores: make -j8 compile.
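# Rather than hard-coding -j8, the job count can follow the machine's core count. A small sketch, where build_jobs is a hypothetical helper name:

```shell
# build_jobs: print the number of parallel make jobs to use.
# Prefers nproc, then getconf, and falls back to 1 if neither works.
build_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
    fi
}
```

# Possible usage: make -j"$(build_jobs)" -k compile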
# Building a Docker image
# Add your user to the docker group; otherwise you'll need sudo to build the Docker image:
sudo usermod -a -G docker `whoami`
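# A group change made with usermod only applies to new login sessions, so it helps to check whether the current shell already has it. A minimal sketch, with in_group as a hypothetical helper name:

```shell
# in_group GROUP: succeed if the current process's groups include GROUP.
in_group() {
    id -nG | tr ' ' '\n' | grep -qxF "$1"
}
```

# Possible usage: in_group docker || echo "log out and back in before building the image" >&2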
# To build a development Docker image just run the following command from the top level of this repo:
nice make -j16 -k docker_mldb DOCKER_ALLOW_DIRTY=1
# The final lines of output will give you a docker hash for this image, and the image is also tagged as <username>_latest where <username> is your Unix username on the box.
# To run a development Docker image you just built, follow the Docker instructions from http://mldb.ai/doc/#builtin/Running.md.html except where the tag there is latest just substitute <username>_latest and where the container name there is mldb just substitute something unique to you (e.g. <username> is a good candidate!).
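# The <username>_latest tag can be computed rather than typed. A small sketch, with dev_image_tag as a hypothetical helper name:

```shell
# dev_image_tag: print the tag given to a development image built by the
# current user, i.e. <username>_latest.
dev_image_tag() {
    printf '%s_latest\n' "$(id -un)"
}
```

# Possible usage: docker images | grep "$(dev_image_tag)"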
# Docker images built this way will show the internal/experimental entities in the documentation. For external releases, the flag RUN_STRIP=-s is passed, which, as a side effect, hides the internal entities in the documentation.
##### THEN
Step 1 - Launch an MLDB container with a mapped directory
Note: the following procedure is meant to be run as a regular user; running the MLDB container as root is not recommended. See the official Docker documentation for more information on running containers from regular user accounts.
First, create an empty directory on the host machine by running the following command, where </absolute/path/to/mldb_data> needs to be replaced by the absolute path on your local machine where you want your MLDB working directory to be:
mkdir </absolute/path/to/mldb_data>
You can now execute the following command, where <mldbport> is a port of your choice to be used in the next section (e.g. 8080).
docker run --rm=true \
-v </absolute/path/to/mldb_data>:/mldb_data \
-e MLDB_IDS="`id`" \
-p 127.0.0.1:<mldbport>:80 \
quay.io/mldb/mldb:latest
Once the container is booted, the path /mldb_data inside the container is mapped to </absolute/path/to/mldb_data> on the host machine, so MLDB will be able to access files at </absolute/path/to/mldb_data>/file.ext via the URL file:///mldb_data/file.ext. Read more about URLs here.
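The placeholders in the command above can be filled in by a small helper that prints the command for review instead of running it. This is a sketch only: mldb_run_cmd is a hypothetical name, the default port of 8080 is an assumption, and paths containing spaces are not quoted in the printed output.

```shell
# mldb_run_cmd DATA_DIR [PORT]: print the docker run command for the given
# host data directory and local port, without executing it.
mldb_run_cmd() {
    _data_dir=$1
    _port=${2:-8080}
    case "$_data_dir" in
        /*) ;;  # absolute path, as the -v mapping requires
        *)  echo "data directory must be an absolute path: $_data_dir" >&2
            return 1 ;;
    esac
    printf 'docker run --rm=true -v %s:/mldb_data -e MLDB_IDS="%s" -p 127.0.0.1:%s:80 quay.io/mldb/mldb:latest\n' \
        "$_data_dir" "$(id)" "$_port"
}
```

Possible usage: mldb_run_cmd "$HOME/mldb_data" 8080, then copy the printed line into the terminal once it looks right.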
# To run without needing a tunnel, use the following and connect to port 8080. This is a security risk, because -p 8080:80 publishes the port on all of the host's network interfaces:
docker run --rm=true \
-v </absolute/path/to/mldb_data>:/mldb_data \
-e MLDB_IDS="`id`" \
-p 8080:80 \
quay.io/mldb/mldb:latest
# But if you want security:
Step 2 - Establish a tunnel (for remote servers)
For security reasons, the instructions above will cause MLDB to only accept connections local to the host it was launched on. If you are not running MLDB on your workstation, you need to establish an SSH tunnel which forwards <localport> (e.g. 8080 again) from your workstation to <mldbport> on the remote host.
This command will do this in a terminal on OSX and Linux, or on Windows using Git Bash, MinGW or Cygwin:
ssh -f -o ExitOnForwardFailure=yes <user>@<remotehost> -L <localport>:127.0.0.1:<mldbport> -N
To do this with PuTTY on Windows, see the PuTTY Documentation and Tutorial.
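The tunnel command can be assembled from its four placeholders with basic port validation. A minimal sketch, where valid_port and tunnel_cmd are hypothetical helper names and the user and host in the usage line are placeholders:

```shell
# valid_port N: succeed if N is an integer between 1 and 65535.
valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# tunnel_cmd USER HOST LOCALPORT MLDBPORT: print the ssh tunnel command,
# without executing it.
tunnel_cmd() {
    if ! valid_port "$3" || ! valid_port "$4"; then
        echo "ports must be integers between 1 and 65535" >&2
        return 1
    fi
    printf 'ssh -f -o ExitOnForwardFailure=yes %s@%s -L %s:127.0.0.1:%s -N\n' \
        "$1" "$2" "$3" "$4"
}
```

Possible usage: tunnel_cmd alice remote.example.com 8080 8080 prints the command to run on the workstation.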
Step 3 - Activate MLDB
When the line "MLDB Ready" appears in the console output, point your browser to http://localhost:<localport>/ and follow the instructions shown there.
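Watching for the "MLDB Ready" line can be automated by filtering the container's log stream. The sketch below assumes the container was started with a name, shown as the placeholder <container>; wait_for_ready is a hypothetical helper name.

```shell
# wait_for_ready: copy stdin to stdout until a line containing "MLDB Ready"
# is seen, then return success. Returns failure if input ends first.
wait_for_ready() {
    while IFS= read -r _line; do
        printf '%s\n' "$_line"
        case "$_line" in
            *"MLDB Ready"*) return 0 ;;
        esac
    done
    return 1
}

# Usage (not executed here, since it needs a running container):
#   docker logs -f <container> 2>&1 | wait_for_ready && echo "MLDB is up"
```

Reading from docker logs rather than from docker run itself means that closing the pipe after the match only ends the log follower, not the container.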