We want to use `pdftk`, a command-line binary, to generate PDF files.

https://www.pdflabs.com/tools/pdftk-server/

We don't have access to `apt-get` at runtime on Modulus, so we can't just `apt-get install pdftk`. So, can we compile from source and build a binary that we can ship alongside the Meteor source code?

To avoid weird toolchain/cross-compilation issues, let's do the build in the same environment it will run in. Handily, Modulus supplies the `Dockerfile`s for the images we run inside of:

* http://blog.modulus.io/open-sourcing-our-docker-images
* https://github.com/onmodulus/docker-images
* https://github.com/onmodulus/docker-run-base

To get going, first set up a lightweight VM (Tiny Core Linux on VirtualBox) that can act as the Docker host, since OS X can't run the Docker daemon natively.

http://boot2docker.io/

`brew install boot2docker`

Follow the brew output to get the VM and the Docker daemon up and running. Also install the Docker client, which can run natively on OS X since it just talks to the daemon.

`brew install docker`

Once everything is live, let's build the Modulus images. There are multiple images involved, from general to specific:

* https://github.com/phusion/baseimage-docker
* https://github.com/onmodulus/docker-base
* https://github.com/onmodulus/docker-run-base
* https://github.com/onmodulus/docker-run-node

`baseimage-docker` is in the public registry and will be pulled in automatically. For all of the onmodulus images, we'll need to build and register them locally. Clone each onmodulus repo, `cd` in, and, in the order above, run

`docker build -t <tag> .`

where `<tag>` is the name to register the image under. This builds the image and can take a while. The `-t` flag tags the output and registers it locally so that we can run it and build descendant images from it. After each step you can check which images you have via `docker images`.

Once all images are built, we need to set up the environment with the assumptions that Modulus makes.
Notably, we need to mount a filesystem as described:

```
/mnt
The volume mounted at /mnt requires the follow subdirectories to be created by the host system and accessible by the mop user/group.

/mnt/tmp
Temporary storage. The TEMP_DIR environment variable is defined to here.

/mnt/home
The mop user's home directory. The HOME environment variable is defined to here.

/mnt/log
Application stdout/stderr is placed in this directory with the filename app.log.

/mnt/app
The application itself is placed in this directory.

/mnt/notifications
Crash and other notifications, generated by supervisor, are placed here.

/mnt/app-storage
Persistent storage is mounted here. It's also mounted to /app-storage at runtime.

/mnt/supervisor.conf
The supervisor daemon is run with this configuration file.
```

You can put this directory wherever you like; just adjust the `-v` argument to specify where the volume lives when running `docker`. I punted on permissions setup and just made everything `777`.

```
egoldblum@Ethans-MacBook-Pro(15:56:31):~$ ls -l host-folder/
total 0
drwxrwxrwx  2 egoldblum  staff   68 Jul 23 12:19 app/
drwxrwxrwx  2 egoldblum  staff   68 Jul 23 12:19 home/
drwxrwxrwx  3 egoldblum  staff  102 Jul 23 12:23 log/
-rwxrwxrwx  1 egoldblum  staff    0 Jul 23 12:19 supervisor.conf*
drwxrwxrwx  2 egoldblum  staff   68 Jul 23 15:19 tmp/
```

Run the image, allocating a tty, mounting the volume to match where you created it locally, mapping port 80 inside to 8080 outside, and dropping into a bash shell. `--rm` removes the container automatically once it exits. Note that `-p` takes `host:container`, so mapping 80 inside to 8080 outside is `-p 8080:80`.

`docker run --rm -v ~/host-folder:/mnt -p 8080:80 -t -i onmodulus/docker-run-node:0.0.1 /sbin/my_init -- bash -l`

If it worked, you should be sitting at a bash prompt as root inside the container. Setup is done, let's build stuff.
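For reference, the host-side volume described above can be created in one go. This is only a minimal sketch: the `HOST_DIR` variable and its `~/host-folder` default are my own naming (matching the `-v` argument used above), and it creates every subdirectory the Modulus docs list, which is a couple more than the `ls` output shows.

```shell
# Hypothetical variable name; defaults to the folder used in the docker run command above.
HOST_DIR="${HOST_DIR:-$HOME/host-folder}"

# Subdirectories listed in the Modulus docs quoted above.
for d in tmp home log app notifications app-storage; do
  mkdir -p "$HOST_DIR/$d"
done

# supervisor.conf is a plain file, not a directory; an empty one was enough here.
touch "$HOST_DIR/supervisor.conf"

# Punt on permissions, same as above: make everything world-accessible.
chmod -R 777 "$HOST_DIR"
```

Proper ownership for the `mop` user/group would be the cleaner route, but for a throwaway build container `777` keeps things moving.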
Let's see what packages we have installed:

`apt list --installed`

https://gist.github.com/egoldblum/9ec942849ea5424f52aa

There's a lot, but not everything we need to build `pdftk` from source according to http://packages.ubuntu.com/trusty/pdftk

Get the source while we're at it:

`wget https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk-2.02-src.zip`

`unzip pdftk-2.02-src.zip`

Install the dependencies for building:

`apt-get install libgcj14`

`apt-get install gcj-jdk`

Let's check our toolchain versions:

```
root@791899f6f4bc:/# gcj --version | head -1
gcj (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4
root@791899f6f4bc:/# gcc --version | head -1
gcc (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4
```

Looks like we're using 4.8, so let's make the Makefile aware of our toolchain. We're in an Ubuntu image, so use `Makefile.Debian`:

`export VERSUFF=-4.8`

Go build it:

`make -f Makefile.Debian`

After a while...

```
root@791899f6f4bc:/pdftk-2.02-dist/pdftk# file pdftk
pdftk: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=ef4e19fbf402e8d8cd32bdae1ade018f3ff551b5, not stripped
root@791899f6f4bc:/pdftk-2.02-dist/pdftk# ./pdftk --version

pdftk 2.02 a Handy Tool for Manipulating PDF Documents
Copyright (c) 2003-13 Steward and Lee, LLC - Please Visit: www.pdftk.com
This is free software; see the source code for copying conditions. There is
NO warranty, not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```

We didn't adjust any compile/link flags yet, so let's see what libraries this thing is using:

```
root@791899f6f4bc:/pdftk-2.02-dist/pdftk# ldd pdftk
	linux-vdso.so.1 =>  (0x00007ffc839b1000)
	libgcj.so.14 => /usr/lib/x86_64-linux-gnu/libgcj.so.14 (0x00007fa8389ac000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fa8386a8000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fa838492000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa8380cd000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa837eaf000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fa837ca7000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa837aa3000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fa83788a000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fa83bafc000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fa837584000)
```

That's a lot of them, including `libgcj.so.14`, which we had to install manually. This means the resulting binary won't run on the unmodified Modulus image, since that dynamic library doesn't exist there.

Since all of these are dynamically linked, the resulting binary is quite reasonable at about 4 MB.

```
root@791899f6f4bc:/pdftk-2.02-dist/pdftk# ls -lah pdftk
-rwxr-xr-x 1 root root 3.9M Jul 23 20:27 pdftk
```

So, can we figure out a way to statically compile and link all dependencies into one (fat) binary that we can execute on Modulus?

Let's add some flags to `Makefile.Debian` to instruct all of the tools to build/link statically.

```
export CPPFLAGS= -DPATH_DELIM=0x2f -DASK_ABOUT_WARNINGS=false -DUNBLOCK_SIGNALS -fdollars-in-identifiers -static
export CXXFLAGS= -Wall -Wextra -Weffc++ -O2 -static
export GCJFLAGS= -fsource=1.3 -O2 -static-libgcj
export GCJHFLAGS= -force
export LDLIBS= -lgcj
```

Most importantly, `-static-libgcj` tells `gcj` to use a static version of `libgcj`.

https://gcc.gnu.org/wiki/Statically_linking_libgcj

Clean and build again, and `make` complains. Uh-oh.
```
root@791899f6f4bc:/pdftk-2.02-dist/pdftk# make clean -f Makefile.Debian > /dev/null
root@791899f6f4bc:/pdftk-2.02-dist/pdftk# make -f Makefile.Debian
make -f Makefile -iC /pdftk-2.02-dist/pdftk/../java all
make[1]: Entering directory `/pdftk-2.02-dist/java'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/pdftk-2.02-dist/java'
g++-4.8 -Wall -Wextra -Weffc++ -O2 -static attachments.o report.o passwords.o pdftk.o /pdftk-2.02-dist/pdftk/../java/java_lib.o -lgcj -o pdftk
/usr/bin/ld: cannot find -lgcj
collect2: error: ld returned 1 exit status
make: *** [pdftk] Error 1
```

Where is the `libgcj.a` static archive to link against? Turns out that `gcj` doesn't ship with it, since static linking can be buggy/error-prone. True on Ubuntu and Red Hat at least.

https://bugzilla.redhat.com/show_bug.cgi?id=1004507#c1

```
Statically linking gcj doesn't really work, which is why we are intentionally not shipping libgcj.a. If you want to compile/link programs that don't depend on particular libgcj.so version, use -findirect-dispatch (both for compilation and linking).
```

```
If you don't want the executable to depend on libgcj, you can prepend -static-libgcj to the gcj command-line, but that won't work with the stock gcj package on Ubuntu Lucid, because libgcj.a was not included in the package. However, if you compile your own GCC (and enable Java), that will support -static-libgcj .
```

So we don't have an archive to link against.

Some German guy apparently got this working with an older toolchain and a compiler that includes `libgcj.a`:

http://dokupuppylinux.info/programs:pdf_manipulation#pdftk_141_statically_linked

We may be able to compile `libgcj` into a static archive ourselves.

To be continued??
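
In the meantime, a quick way to confirm which libraries the dynamically linked build is missing on a stock image is to copy the binary in and filter the `ldd` output for unresolved entries. A minimal sketch; the `missing_libs` helper name is mine, and it assumes the usual glibc `ldd` format where unresolved libraries show up as `name => not found`.

```shell
# Hypothetical helper: reads ldd output on stdin and prints the name of
# every shared library that the loader could not resolve.
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Intended use inside an unmodified container, with the binary copied in:
#   ldd ./pdftk | missing_libs
```

An empty result would mean every dependency resolved; based on the `ldd` listing earlier, `libgcj.so.14` is the one I'd expect to show up.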