"Open Code is the common platform of the public administration for the exchange of open source software" - and now we're contributing to it, too!
Our first contribution is a set of two small scripts (one in Perl5, one in Python3) that help to reduce image sizes in Kubernetes environments.
First of all - why are small images a good idea? There's the obvious answer - they need less space on your storage.
Another neat advantage is that they can be fetched a bit faster - that means if one of your workers fails and the workload needs to be redistributed, the other nodes might be able to start useful tasks earlier, which means better availability! This difference can be quite pronounced - a minimal single-layer image of e.g. 40MB is up and running much quicker than a stack of ten layers with a total size of a gigabyte or so.
Another good reason is to reduce the attack surface by adding another layer of defense: if the image running your code doesn't have a /bin/sh, an injected shell command has no shell to execute it.
As for the reduction itself: we've looked at a few different ways of doing this; the simplest (so far) seems to be to create a sort of "chroot" directory during the build process with all the required files (but no more!), and continue with another build step that starts with an empty layer ("FROM scratch") and copies just that directory as content.
Here's a snippet from a Dockerfile:
...
RUN mkdir /image-root/
RUN cp -a ... /image-root/
FROM scratch
LABEL ...
COPY --from=0 /image-root/ /
CMD ["...", "...", ...]
There are a few things to note here, though.
For small, mostly self-contained applications (an executable built by SBCL, stuff compiled via a C compiler, Go binaries, etc.) it's not that hard - just use one of the scripts we provide on OpenCode and pass the target directory and the binary, and all dependencies visible to the dynamic linker will be copied over, including the versioning symlinks. Copy configuration data into the destination as well, and Bob's your uncle.
For applications that load libraries in user code, i.e. during actual runtime, you might get away with specifying the shared objects on the script command line as well - then they get copied, analyzed for their dependencies, and so on (recursively).
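To make the "analyzed for their dependencies, and so on" idea concrete, here is a minimal sketch in Python. It is not the code of the scripts on OpenCode; the names parse_ldd_output, run_ldd and collect_dependencies are made up for this illustration, and it assumes the output format of glibc's ldd. Copying the files and their symlinks is left out.

```python
import re
import subprocess

# Matches "libc.so.6 => /lib64/libc.so.6 (0x...)" as well as
# "/lib64/ld-linux-x86-64.so.2 (0x...)"; skips entries without a path.
_LDD_LINE = re.compile(r"^\s*(?:\S+\s+=>\s+)?(/\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def parse_ldd_output(text):
    """Return the absolute library paths named in ldd output, in order."""
    paths = []
    for line in text.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


def run_ldd(path):
    """Run the system's ldd; return "" for files it cannot analyze."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else ""


def collect_dependencies(files, ldd=run_ldd):
    """Worklist walk: every file found is analyzed for dependencies itself."""
    seen = []
    todo = list(files)
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.append(current)
        todo.extend(parse_ldd_output(ldd(current)))
    return seen
```

Calling collect_dependencies with the binary plus any shared objects that are only loaded at runtime would yield the list of files to copy. Only run ldd on binaries you trust, as it may execute code from the file it inspects.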
And then there's the, well, not complicated, but more strenuous stuff.
For dynamic, interpreted languages (Perl, Python, etc.), the easiest way is to copy the library tree (be it /usr/share/perl, ~/.local/lib/python3.x, or whatever) to the target - but that's probably not minimal.
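If the whole library tree is too much, one option for Python is to ask the interpreter which module files it actually loaded. The following is only a sketch of that idea and not part of the scripts on OpenCode; loaded_module_files is a hypothetical helper name.

```python
import sys


def loaded_module_files(modules=None):
    """Return the sorted file paths of all modules imported so far."""
    modules = sys.modules if modules is None else modules
    files = set()
    # Copy the values first: the module table can change while iterating.
    for module in list(modules.values()):
        path = getattr(module, "__file__", None)
        if path:  # built-in modules have no file on disk
            files.add(path)
    return sorted(files)
```

Call it after the application has done its imports. Modules that are imported lazily, on code paths that did not run, will be missing from the list, so treat the result as a starting point.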
And if the application itself requires external tools (because it runs md5sum, grep, cut, head, git, or whatever), and maybe even runs them through a shell, you might as well not bother - the effort to get that to work will be greater than the benefit of the "small" (ha!) image.
That said, we've got a Java application running, by copying a lot of stuff over:
FROM your-base-image...
RUN mkdir -p /image-root/
COPY my.jar /image-root/
RUN yum install -y java-17-openjdk perl rsync ...
# Version-independent path (via symlinks)
ENV JH=/usr/lib/jvm/jre/
RUN rsync -vaR \
    /usr/lib/locale/C.utf8/ \
    /lib64 \
    /image-root/
RUN env LD_LIBRARY_PATH=${JH}/lib:${JH}/lib/server \
    .../copy-loaded-libs.perl /image-root/ \
    ${JH}/bin/java \
    ${JH}/lib/libmanagement_ext.so ${JH}/lib/libmanagement.so \
    ${JH}/lib/jvm.cfg \
    ${JH}/conf/security/java.security \
    ${JH}/lib/security/default.policy \
    ${JH}/lib/modules \
    ${JH}/lib/tzdb.dat \
    ${JH}/lib/libnio.so ${JH}/lib/libsystemconf.so \
    ${JH}/lib/libzip.so ${JH}/lib/server/*.so \
    ${JH}/lib/libextnet.so ${JH}/lib/libjli.so \
    ${JH}/lib/libjimage.so \
    /usr/lib64/libnss3.so /usr/lib64/libnss_files-2.28.so
# Last stage, actual image
FROM scratch
COPY --from=0 /image-root/ /
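# Exec-form CMD expands no variables, and ENV from the first stage is not set here,
# so the path to the java binary copied above is spelled out
CMD ["/usr/lib/jvm/jre/bin/java", ..., "-jar", "/my.jar"]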
The difference is nice - 80MB versus 174MB for the (default) UBI-8-OpenJDK-17 image.
Whether the savings (storage, RAM, fewer false positives in security checks, ...) outweigh the additional maintenance burden is up to you!
For your own Java applications and other runtime environments, you'll have to measure yourself; if you're interested, check out the project page!
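For a first rough number before building the final stage, you can sum up the file sizes below the staging directory. This is a minimal sketch, with tree_size_bytes being a made-up helper name; the size reported by your registry will differ, because layers are stored compressed and hard links are counted more than once here.

```python
import os


def tree_size_bytes(root):
    """Sum the sizes of all files below root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat: a symlink counts with its own size, not its target's
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total
```

Run inside the build stage, tree_size_bytes("/image-root") / 1e6 gives the approximate size in megabytes.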