Packing Python scripts with pyInstaller
June 12, 2017
python
docker
Utility to display HTTP response headers
Let’s build something akin to curl -I <url>, but way more complicated and with completely unnecessary steps involved. The weather report for today: sunny, with a chance of learning something new.
The app
To get started, create a new directory to store your project files. We will be using the requests library as well as pyInstaller, and the easiest way to install both is with pip via a requirements.txt file.
So create a file called requirements.txt with the following content:
requests==2.12.4
pyinstaller==3.2.1
We also need some code to perform the request and show us the response headers. Create a file called run.py with the following content:
import requests
import sys
# we specifically include this package, otherwise pyInstaller
# will not automatically identify it and will omit it
from multiprocessing import Queue
# use google.com as a default url, otherwise select what the user
# supplied as a command-line argument
url = sys.argv[1] if len(sys.argv) > 1 else 'https://www.google.com'
# perform a HEAD request
head = requests.request('HEAD', url)
# and show headers
print(head.headers)
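As a hedged variation on run.py, the sketch below pulls the two decisions the script makes (which URL to use, and how to show the headers) into small helpers, so the output reads more like curl -I with one header per line. The names DEFAULT_URL, pick_url and format_headers are made up for this illustration and are not part of the original script. The helpers accept any mapping of header names to values, which should include the headers object returned by requests.

```python
# Hypothetical constant: the same fallback URL the original script uses.
DEFAULT_URL = 'https://www.google.com'


def pick_url(argv):
    """Return the first command-line argument, or the default URL."""
    return argv[1] if len(argv) > 1 else DEFAULT_URL


def format_headers(headers):
    """Render a mapping of headers as 'Name: value' lines, one per header."""
    return '\n'.join('{}: {}'.format(name, value)
                     for name, value in headers.items())
```

Inside run.py, the last lines would then become something like print(format_headers(requests.request('HEAD', pick_url(sys.argv)).headers)).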
Multi-stage Dockerfile
Multi-stage builds are a really cool feature of Docker (available since version 17.05) that allow us to leave the intermediate build images behind and produce a final, trimmed-down, production-ready image.
The idea here is to separate the building and the running of our code. While we are building, we can be a bit sloppy and include more packages than we really need. But when we are running, we should be as optimized as possible.
Before Docker 17.05, we would have to use the Builder pattern, which would require at least two Dockerfiles.
Create a Dockerfile with the following content:
# ---[ Packer stage ]---
FROM python:3.5 as packer
COPY requirements.txt /requirements.txt
RUN pip install -r /requirements.txt
WORKDIR /app
COPY . /app
# build with pyInstaller (it had no support for Python 3.6 at the time of this writing, hence python:3.5)
RUN pyinstaller run.py
# ---[ Runtime stage ]---
FROM busybox:1.26.2-glibc
WORKDIR /app
COPY --from=packer /app/dist/run/ /app
# use `ldd` to find which libraries are called
COPY --from=packer /lib/x86_64-linux-gnu/libdl.so.2 /lib/x86_64-linux-gnu/libdl.so.2
COPY --from=packer /lib/x86_64-linux-gnu/libz.so.1 /lib/x86_64-linux-gnu/libz.so.1
COPY --from=packer /lib/x86_64-linux-gnu/libc.so.6 /lib/x86_64-linux-gnu/libc.so.6
# this one was called from a python .so, which `ldd` did not pick up
COPY --from=packer /lib/x86_64-linux-gnu/libutil.so.1 /lib/x86_64-linux-gnu/libutil.so.1
ENTRYPOINT ["/app/run"]
The real magic here is the --from=packer flag. It tells Docker to use the filesystem from the packer stage and copy the /app/dist/run/ folder from that filesystem to the current working directory (/app) in the new filesystem. If you paid close attention, you will have noticed that we included an extra clause in the first stage, FROM ... as packer. If we omitted it, we could still reference that stage by its index with --from=0.
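The ldd step mentioned in the Dockerfile comments can be partly automated. The sketch below is not part of the original build: parse_ldd is a hypothetical helper, and SAMPLE only imitates the usual "name => path (address)" layout that ldd prints on glibc systems. It extracts the resolved library paths and prints a candidate COPY --from=packer line for each. As the Dockerfile notes, a library loaded by a Python extension module will not appear when ldd is run on the main executable, so treat the result as a starting point.

```python
def parse_ldd(output):
    """Return the resolved library paths found in ldd-style output."""
    paths = []
    for line in output.splitlines():
        # Lines of interest look like: "libz.so.1 => /lib/.../libz.so.1 (0x...)"
        if '=>' not in line:
            continue
        target = line.split('=>', 1)[1].split()
        # Skip entries without a resolved path, e.g. "not found".
        if target and target[0].startswith('/'):
            paths.append(target[0])
    return paths


# Illustrative input only; real addresses and paths will differ.
SAMPLE = """\
    linux-vdso.so.1 (0x00007ffd0b5f2000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3c1a200000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f3c1a000000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3c19c00000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f3c1a600000)
"""

for path in parse_ldd(SAMPLE):
    print('COPY --from=packer {0} {0}'.format(path))
```

The dynamic loader line has no "=>" and is skipped on purpose, since the glibc flavour of the busybox image is assumed to ship its own.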
Your project directory should now look like this:
$ ls -l
-rw-r--r-- 1 user group 796 Jun 5 18:23 Dockerfile
-rw-r--r-- 1 user group 35 Jun 5 11:50 requirements.txt
-rw-r--r-- 1 user group 385 Jun 5 11:09 run.py
Build and run
Building the final image
With everything else in place, it’s time to actually build some images.
docker build -t jango/headers .
You can of course change jango/headers to a tag of your preference.
The neat thing about having separate build and runtime images is the size of the final runtime image.
$ docker images
REPOSITORY      TAG       IMAGE ID       CREATED         SIZE
jango/headers   latest    f1d3bbfa9b78   2 minutes ago   31.1MB
<none>          <none>    d6e0e6f5ca78   2 minutes ago   739MB
As you can see, the build image is almost 24x larger!
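That figure follows directly from the two SIZE values in the listing above. A quick check, using nothing but those two numbers:

```python
build_mb = 739.0    # size of the untagged build-stage image, from the listing
runtime_mb = 31.1   # size of jango/headers, from the listing

ratio = build_mb / runtime_mb
print('{:.1f}x'.format(ratio))  # prints "23.8x"
```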
Running the image
docker run jango/headers
Running the image without passing any arguments will fetch google.com and output its response headers.
{
'Cache-Control': 'private, max-age=0',
'X-Frame-Options': 'SAMEORIGIN',
'Server': 'gws',
'Expires': '-1',
'Set-Cookie': '...'
}
To fetch some other URL, simply pass it as an argument.
docker run --rm jango/headers https://yahoo.com
We also specified --rm this time, to automatically remove the container once it has finished.
Makefile
Let’s make use of the make command to make our lives easier. Create a file called Makefile with the following content:
DOCKER_TAG=jango/headers

# note: recipe lines must be indented with a tab, not spaces
build:
	docker build -t ${DOCKER_TAG} .

run:
	@docker run --rm ${DOCKER_TAG} ${URL}
Building with make
make build
Running with make
# to fetch google.com
make run
# to fetch yahoo.com
make run URL=https://yahoo.com
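To see why make run works both with and without URL, it helps to look at the command line each invocation expands to. The sketch below mimics that expansion in Python. docker_run_command is a hypothetical helper written for this illustration, and it only builds the argument list rather than invoking Docker. An unset URL expands to nothing in the Makefile, so the container falls back to its default, google.com.

```python
def docker_run_command(tag, url=''):
    """Build the argv that the Makefile's run target would execute."""
    command = ['docker', 'run', '--rm', tag]
    # An empty or unset URL contributes no argument, as in the Makefile.
    if url:
        command.append(url)
    return command


print(' '.join(docker_run_command('jango/headers')))
print(' '.join(docker_run_command('jango/headers', 'https://yahoo.com')))
```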
Update #1 (2017-06-15): clarified introduction text