# Adventures in Python Core Dumping After watching Bryan Cantrill's presentation on [Running Aground: Debugging Docker in Production][aground] I got all excited (and strangely nostalgic) about the possibility of core-dumping server-side Python apps whenever they go awry. This would *theoretically* allow me to fully inspect the state of the program at the point it exploded, rather than relying solely on the information of a stack trace. I decided to try exploring a core dump on my own by writing a simple Python script that generated one. ## Initial Setup Doing this required a bit of setup on my Ubuntu 14.04 server. First, I had to `apt-get install python2.7-dbg` to install a version of Python with debug symbols, so that `gdb` could actually make sense of the core dump. It seems Ubuntu comes pre-configured with a Python debugging extension for `gdb` built-in, so I didn't have to do any extra configuration here, which was great. I also had to add the following line to `/etc/security/limits.conf` to actually enable core dump files to be created: ```python # * soft core 100000 ``` After that, I created a file called `explode.py` in my home directory: ```python import os def my_exploding_func(): my_local_var = 'hi' os.abort() my_exploding_func() ``` Then I ran the script: ```console $ python2.7-dbg explode.py Aborted (core dumped) ``` This created a `core` file in my home directory. ## Exploring The Stack I opened the core dump in `gdb`: ```console $ gdb /usr/bin/python2.7-dbg core GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1 Copyright (C) 2014 Free Software Foundation, Inc. ... warning: core file may not match specified executable file. [New LWP 10020] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `/usr/bin/python2.7-dbg ./explode.py'. Program terminated with signal SIGABRT, Aborted. #0 0x00007f996aff7cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) ``` Now I could use all of `gdb`'s [Python debugging extension commands][pydebug]. For example, running `py-bt` gave me: ```console (gdb) py-bt #4 Frame 0x7f996bf28240, for file ./explode.py, line 7, in my_exploding_func (my_local_var='hi') os.abort() #7 Frame 0x7f996bf28060, for file ./explode.py, line 9, in () my_exploding_func() ``` I could also use `py-locals` to show me the values of local variables in the current stack frame, and `py-up` and `py-down` to traverse the stack. This was all pretty awesome, and will be very useful if my Python programs actually segfault. But it'd be cool if I could actually get all this rich information any time one of my servers returned a 500. That's a bit of a different situation since Python servers don't usually segfault when they return a 500--instead, they catch exceptions, return an error code, and continue running. For now I'm going to ignore the "continue running" part; there are [ways to core dump without killing a process][dump_nokill], but right now I'm more interested in figuring out how to get information about handled exceptions. ## Obtaining Information About Handled Exceptions Let's assume we have a script called `explode2.py`: ```python import os def my_exploding_func(): a = 1 call_nonexistent_func() try: my_exploding_func() except Exception, e: os.abort() ``` The thing about the core dump generated from this script is that running `py-bt` only gives us the stack trace from the point that we called `os.abort()`, which is pretty useless: ```console (gdb) py-bt #4 Frame 0x7f3767430450, for file ./explode3.py, line 12, in () os.abort() ``` What we really want is a way to introspect the exception that was currently being handled at the time that `os.abort()` was called. There isn't a particularly easy way to do this with the Python debugging extension for `gdb`, but one nice thing about `gdb` is that its extensions are *written in Python*. This means we can write our *own* extension that gives us easy access to the information we need. Doing this took some research. It looks like the latest version of the Python debugging extension for `gdb` is in a file in the CPython codebase called [`libpython.py`][libpython], but this is actually a much newer version than the one that ships with Ubuntu 14.04. I had to use `strace` to find the actual version on my system, which was at `/usr/lib/debug/usr/bin/python2.7-dbg-gdb.py`. After poring through the code and consulting the CPython source code and documentation on [extending gdb using Python][gdbpy], I wrote my first `gdb` extension, which is in the attached [`py_exc_print.py`](#file-py_exc_print-py) file. It adds a `py-exc-print` command that gives us what we need: ```console (gdb) source py_exc_print.py (gdb) py-exc-print Traceback (most recent call last): Frame 0x7f3767430450, for file ./explode2.py, line 12, in () Frame 0x7f37673f3060, for file ./explode2.py, line 7, in my_exploding_func (a=1) exceptions.NameError("global name 'call_nonexistent_func' is not defined",) ``` Note that it's more useful than a standard stack trace, as the values of local variables are included in the printout. But more work on the extension needs to be done in order to make those locals easily introspectable. ## Conclusion Thus concludes my first foray into Python core dumping. Some open questions: * I'm not sure how feasible core dumping on every uncaught exception actually is. For instance, how big do core files become in production environments? * Are there privacy risks involved in core dumping? Depending on the retention policy, it essentially means that [data in use][] could inadvertently become [data at rest][]. * In order for the core dump to be useful, a debug build of the Python interpreter needs to be used. How is performance impacted by this? As the aforementioned Bryan Cantrill talk mentions, we should be able to inspect core dumps from production environments: yet is it feasible to run a *debug* build of Python in a *production* environment? [aground]: https://www.youtube.com/watch?v=sYQ8j02wbCY [pydebug]: http://fedoraproject.org/wiki/Features/EasierPythonDebugging#New_gdb_commands [libpython]: https://hg.python.org/cpython/file/2.7/Tools/gdb/libpython.py [gdbpy]: https://sourceware.org/gdb/onlinedocs/gdb/Python.html [dump_nokill]: http://unix.stackexchange.com/a/11191 [data in use]: https://en.wikipedia.org/wiki/Data_in_use [data at rest]: https://en.wikipedia.org/wiki/Data_at_rest