Revisions

  1. @toolness toolness revised this gist Jan 17, 2016. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion adventures-in-python-core-dumping.md
    Original file line number Diff line number Diff line change
    @@ -24,7 +24,10 @@ do any extra configuration here, which was great.
    I also had to add the following line to `/etc/security/limits.conf` to
    actually enable core dump files to be created:

    ```
    <!-- This isn't actually Python, but the syntax highlighting fits. -->

    ```python
    #<domain> <type> <item> <value>
    * soft core 100000
    ```
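As a quick sanity check that the `limits.conf` change took effect, a script can read its own core-size limit with the standard `resource` module. This is a minimal sketch and not part of the original gist: `core_limit_description` is a hypothetical helper name, and it assumes a Unix system and Python 3.

```python
import resource


def core_limit_description():
    """Describe the soft core-file size limit of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    if soft == resource.RLIM_INFINITY:
        return 'unlimited'
    if soft == 0:
        # A soft limit of zero means no core file will be written.
        return 'disabled'
    return '%d bytes' % soft


print('core dumps: ' + core_limit_description())
```

Note that `limits.conf` takes the `core` value in kilobytes while `getrlimit` reports bytes, so the two numbers will likely differ by a factor of 1024. The limit is applied at login, so a new session is probably needed before the change shows up.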

  2. @toolness toolness revised this gist Jan 17, 2016. 1 changed file with 6 additions and 6 deletions.
    12 changes: 6 additions & 6 deletions adventures-in-python-core-dumping.md
    @@ -42,7 +42,7 @@ my_exploding_func()

    Then I ran the script:

    ```bash
    ```console
    $ python2.7-dbg explode.py
    Aborted (core dumped)
    ```
    @@ -53,7 +53,7 @@ This created a `core` file in my home directory.

    I opened the core dump in `gdb`:

    ```bash
    ```console
    $ gdb /usr/bin/python2.7-dbg core
    GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
    Copyright (C) 2014 Free Software Foundation, Inc.
    @@ -75,7 +75,7 @@ Now I could use all of `gdb`'s
    [Python debugging extension commands][pydebug]. For example, running
    `py-bt` gave me:

    ```
    ```console
    (gdb) py-bt
    #4 Frame 0x7f996bf28240, for file ./explode.py, line 7, in my_exploding_func (my_local_var='hi')
    os.abort()
    @@ -120,7 +120,7 @@ The thing about the core dump generated from this script is that running
    `py-bt` only gives us the stack trace from the point that we called
    `os.abort()`, which is pretty useless:

    ```
    ```console
    (gdb) py-bt
    #4 Frame 0x7f3767430450, for file ./explode3.py, line 12, in <module> ()
    os.abort()
    @@ -147,7 +147,7 @@ documentation on [extending gdb using Python][gdbpy], I wrote my first
    [`py_exc_print.py`](#file-py_exc_print-py) file. It adds a `py-exc-print`
    command that gives us what we need:

    ```
    ```console
    (gdb) source py_exc_print.py
    (gdb) py-exc-print
    Traceback (most recent call last):
    @@ -163,7 +163,7 @@ introspectable.

    ## Conclusion

    Thus conclues my first foray into Python core dumping.
    Thus concludes my first foray into Python core dumping.

    Some open questions:

  3. @toolness toolness revised this gist Jan 17, 2016. 1 changed file with 7 additions and 4 deletions.
    11 changes: 7 additions & 4 deletions adventures-in-python-core-dumping.md
    @@ -104,6 +104,8 @@ handled exceptions.
    Let's assume we have a script called `explode2.py`:

    ```python
    import os

    def my_exploding_func():
        a = 1
        call_nonexistent_func()
    @@ -139,10 +141,11 @@ version than the one that ships with Ubuntu 14.04. I had to use `strace`
    to find the actual version on my system, which was at
    `/usr/lib/debug/usr/bin/python2.7-dbg-gdb.py`.

    After poring through the code and consulting the documentation on
    [extending gdb using Python][gdbpy], I wrote my first `gdb` extension,
    which is in the attached [`py_exc_print.py`](#file-py_exc_print-py) file.
    It adds a `py-exc-print` command that gives us what we need:
    After poring through the code and consulting the CPython source code and
    documentation on [extending gdb using Python][gdbpy], I wrote my first
    `gdb` extension, which is in the attached
    [`py_exc_print.py`](#file-py_exc_print-py) file. It adds a `py-exc-print`
    command that gives us what we need:

    ```
    (gdb) source py_exc_print.py
  4. @toolness toolness revised this gist Jan 17, 2016. 1 changed file with 8 additions and 2 deletions.
    10 changes: 8 additions & 2 deletions adventures-in-python-core-dumping.md
    @@ -145,22 +145,28 @@ which is in the attached [`py_exc_print.py`](#file-py_exc_print-py) file.
    It adds a `py-exc-print` command that gives us what we need:

    ```
    (gdb) source ext_pretty_print.py
    (gdb) source py_exc_print.py
    (gdb) py-exc-print
    Traceback (most recent call last):
    Frame 0x7f3767430450, for file ./explode2.py, line 12, in <module> ()
    Frame 0x7f37673f3060, for file ./explode2.py, line 7, in my_exploding_func (a=1)
    exceptions.NameError("global name 'call_nonexistent_func' is not defined",)
    ```

    Note that it's more useful than a standard stack trace, as the values of
    local variables are included in the printout. But more work on the
    extension needs to be done in order to make those locals easily
    introspectable.

    ## Conclusion

    Thus conclues my first foray into Python core dumping.

    Some open questions:

    * I'm not sure how feasible core dumping on every uncaught exception
    actually is. How big do core files become in production environments?
    actually is. For instance, how big do core files become in production
    environments?

    * Are there privacy risks involved in core dumping? Depending on the
    retention policy, it essentially means that [data in use][] could
  5. @toolness toolness revised this gist Jan 17, 2016. 1 changed file with 23 additions and 0 deletions.
    23 changes: 23 additions & 0 deletions adventures-in-python-core-dumping.md
    @@ -1,3 +1,5 @@
    # Adventures in Python Core Dumping

    After watching Bryan Cantrill's presentation on
    [Running Aground: Debugging Docker in Production][aground] I got all
    excited (and strangely nostalgic) about the possibility of
    @@ -151,11 +153,32 @@ Traceback (most recent call last):
    exceptions.NameError("global name 'call_nonexistent_func' is not defined",)
    ```

    ## Conclusion

    Thus conclues my first foray into Python core dumping.

    Some open questions:

    * I'm not sure how feasible core dumping on every uncaught exception
    actually is. How big do core files become in production environments?

    * Are there privacy risks involved in core dumping? Depending on the
    retention policy, it essentially means that [data in use][] could
    inadvertently become [data at rest][].

    * In order for the core dump to be useful, a debug build of the Python
    interpreter needs to be used. How is performance impacted by this?
    As the aforementioned Bryan Cantrill talk mentions, we should be
    able to inspect core dumps from production environments: yet is it
    feasible to run a *debug* build of Python in a *production* environment?

    [aground]: https://www.youtube.com/watch?v=sYQ8j02wbCY
    [pydebug]: http://fedoraproject.org/wiki/Features/EasierPythonDebugging#New_gdb_commands
    [libpython]: https://hg.python.org/cpython/file/2.7/Tools/gdb/libpython.py
    [gdbpy]: https://sourceware.org/gdb/onlinedocs/gdb/Python.html
    [dump_nokill]: http://unix.stackexchange.com/a/11191
    [data in use]: https://en.wikipedia.org/wiki/Data_in_use
    [data at rest]: https://en.wikipedia.org/wiki/Data_at_rest

    <!--
  6. @toolness toolness revised this gist Jan 17, 2016. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions adventures-in-python-core-dumping.md
    @@ -139,8 +139,8 @@ to find the actual version on my system, which was at

    After poring through the code and consulting the documentation on
    [extending gdb using Python][gdbpy], I wrote my first `gdb` extension,
    which is in the attached `py_exc_print.py` file. It adds a `py-exc-print`
    command that gives us what we need:
    which is in the attached [`py_exc_print.py`](#file-py_exc_print-py) file.
    It adds a `py-exc-print` command that gives us what we need:

    ```
    (gdb) source ext_pretty_print.py
  7. @toolness toolness revised this gist Jan 17, 2016. 2 changed files with 97 additions and 3 deletions.
    13 changes: 10 additions & 3 deletions adventures-in-python-core-dumping.md
    @@ -7,8 +7,11 @@ at the point it exploded, rather than relying solely on the information
    of a stack trace.

    I decided to try exploring a core dump on my own by writing a simple
    Python script that generated one. Doing this required a bit of
    setup on my Ubuntu 14.04 server.
    Python script that generated one.

    ## Initial Setup

    Doing this required a bit of setup on my Ubuntu 14.04 server.

    First, I had to `apt-get install python2.7-dbg` to install a version
    of Python with debug symbols, so that `gdb` could actually
    @@ -42,7 +45,11 @@ $ python2.7-dbg explode.py
    Aborted (core dumped)
    ```

    This created a `core` file in my home directory, which I opened in `gdb`:
    This created a `core` file in my home directory.

    ## Exploring The Stack

    I opened the core dump in `gdb`:

    ```bash
    $ gdb /usr/bin/python2.7-dbg core
    87 changes: 87 additions & 0 deletions py_exc_print.py
    @@ -0,0 +1,87 @@
    # Note that when we're loaded into gdb via `source py_exc_print.py`, we
    # seem to be loaded into the same namespace as the Python debugging
    # extension, which is some version of the following file by David Malcolm:
    #
    # https://hg.python.org/cpython/file/2.7/Tools/gdb/libpython.py

    def pm_sys_exc_info():
        '''Just like sys.exc_info(), but post-mortem!'''

        # The _PyThreadState_Current global is defined in:
        # https://hg.python.org/cpython/file/tip/Python/pystate.c
        val = gdb.lookup_symbol('_PyThreadState_Current')[0].value()

        # The PyThreadState type is defined in:
        # https://hg.python.org/cpython/file/tip/Include/pystate.h
        return [PyTracebackObjectPtr.from_pyobject_ptr(val[name])
                for name in ['exc_type', 'exc_value', 'exc_traceback']]

    def pm_traceback_print_exc():
        '''Kinda like traceback.print_exc(), but post-mortem, and no args!'''

        exc_type, exc_value, exc_traceback = pm_sys_exc_info()

        sys.stdout.write('Traceback (most recent call last):\n')

        while not exc_traceback.is_null():
            frame = exc_traceback.get_frame()
            sys.stdout.write(' %s\n' % frame.get_truncated_repr(MAX_OUTPUT_LEN))
            exc_traceback = exc_traceback.get_next()

        exc_value.write_repr(sys.stdout, set())
        sys.stdout.write('\n')

    class PyTracebackObjectPtr(PyObjectPtr):
        '''
        Class wrapping a gdb.Value that's a (PyTracebackObject*) within the
        inferior process.
        '''

        # PyTracebackObject is defined in:
        # https://hg.python.org/cpython/file/tip/Include/traceback.h
        _typename = 'PyTracebackObject'

        def __init__(self, gdbval, cast_to=None):
            PyObjectPtr.__init__(self, gdbval, cast_to)
            self._py_tb_obj = gdbval.cast(self.get_gdb_type()).dereference()

        def _get_struct_elem(self, name):
            return self.__class__.from_pyobject_ptr(self._py_tb_obj[name])

        def get_frame(self):
            return self._get_struct_elem('tb_frame')

        def get_next(self):
            return self._get_struct_elem('tb_next')

        @classmethod
        def subclass_from_type(cls, t):
            '''
            This is called from the from_pyobject_ptr class method we've
            inherited. We override its default implementation to be
            aware of traceback objects.
            '''

            try:
                tp_name = t.field('tp_name').string()
                if tp_name == 'traceback':
                    return PyTracebackObjectPtr
            except RuntimeError:
                pass

            return PyObjectPtr.subclass_from_type(t)

    class PyExcPrint(gdb.Command):
        '''
        Display a (sort of) Python-style traceback of the exception currently
        being handled.
        '''

        def __init__(self):
            gdb.Command.__init__(self, 'py-exc-print', gdb.COMMAND_STACK,
                                 gdb.COMPLETE_NONE)

        def invoke(self, args, from_tty):
            pm_traceback_print_exc()

    PyExcPrint()
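The `py-exc-print` command walks the traceback's `tb_next` chain post-mortem, from inside `gdb`. For comparison, here is a minimal sketch of the same walk inside a live interpreter, using only `sys.exc_info()`. It is not part of the gist: `format_exc_with_locals` is a hypothetical name, the output format only approximates what the extension prints, and it targets Python 3 rather than the gist's Python 2.7.

```python
import sys


def format_exc_with_locals():
    """Return lines describing the exception currently being handled.

    Walks the traceback's tb_next chain, outermost frame first, and
    includes each function frame's local variables.
    """
    _exc_type, exc_value, tb = sys.exc_info()
    lines = ['Traceback (most recent call last):']
    while tb is not None:
        frame = tb.tb_frame
        code = frame.f_code
        # At module level f_locals is the globals dict; skip it to keep
        # the output short.
        if frame.f_locals is frame.f_globals:
            local_vars = {}
        else:
            local_vars = dict(frame.f_locals)
        lines.append('  file %s, line %d, in %s %r' % (
            code.co_filename, tb.tb_lineno, code.co_name, local_vars))
        tb = tb.tb_next
    lines.append(repr(exc_value))
    return lines


def my_exploding_func():
    a = 1
    call_nonexistent_func()


try:
    my_exploding_func()
except Exception:
    print('\n'.join(format_exc_with_locals()))
```

The difference is that this version needs the process to still be alive and cooperating, whereas the `gdb` extension reads the same structures out of a core file after the fact.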
  8. @toolness toolness revised this gist Jan 17, 2016. 1 changed file with 12 additions and 1 deletion.
    13 changes: 12 additions & 1 deletion adventures-in-python-core-dumping.md
    @@ -67,6 +67,7 @@ Now I could use all of `gdb`'s
    `py-bt` gave me:
    ```
    (gdb) py-bt
    #4 Frame 0x7f996bf28240, for file ./explode.py, line 7, in my_exploding_func (my_local_var='hi')
    os.abort()
    #7 Frame 0x7f996bf28060, for file ./explode.py, line 9, in <module> ()
    @@ -79,7 +80,15 @@ stack.

    This was all pretty awesome, and will be very useful if my Python programs
    actually segfault. But it'd be cool if I could actually get all this rich
    information any time one of my servers returned a 500.
    information any time one of my servers returned a 500. That's a bit
    of a different situation since Python servers don't usually segfault
    when they return a 500--instead, they catch exceptions, return an
    error code, and continue running.

    For now I'm going to ignore the "continue running" part; there are
    [ways to core dump without killing a process][dump_nokill], but right
    now I'm more interested in figuring out how to get information about
    handled exceptions.

    ## Obtaining Information About Handled Exceptions

    @@ -101,6 +110,7 @@ The thing about the core dump generated from this script is that running
    `os.abort()`, which is pretty useless:

    ```
    (gdb) py-bt
    #4 Frame 0x7f3767430450, for file ./explode3.py, line 12, in <module> ()
    os.abort()
    ```
    @@ -138,6 +148,7 @@ exceptions.NameError("global name 'call_nonexistent_func' is not defined",)
    [pydebug]: http://fedoraproject.org/wiki/Features/EasierPythonDebugging#New_gdb_commands
    [libpython]: https://hg.python.org/cpython/file/2.7/Tools/gdb/libpython.py
    [gdbpy]: https://sourceware.org/gdb/onlinedocs/gdb/Python.html
    [dump_nokill]: http://unix.stackexchange.com/a/11191

    <!--
  9. @toolness toolness revised this gist Jan 17, 2016. 1 changed file with 57 additions and 3 deletions.
    60 changes: 57 additions & 3 deletions adventures-in-python-core-dumping.md
    @@ -13,7 +13,7 @@ setup on my Ubuntu 14.04 server.
    First, I had to `apt-get install python2.7-dbg` to install a version
    of Python with debug symbols, so that `gdb` could actually
    make sense of the core dump. It seems Ubuntu comes pre-configured with
    Python debugging extensions for `gdb` built-in, so I didn't have to
    a Python debugging extension for `gdb` built-in, so I didn't have to
    do any extra configuration here, which was great.

    I also had to add the following line to `/etc/security/limits.conf` to
    @@ -79,11 +79,65 @@ stack.

    This was all pretty awesome, and will be very useful if my Python programs
    actually segfault. But it'd be cool if I could actually get all this rich
    information any time one of my servers returned a 500. I'll document
    my continuing adventures in an upcoming post.
    information any time one of my servers returned a 500.

    ## Obtaining Information About Handled Exceptions

    Let's assume we have a script called `explode2.py`:

    ```python
    def my_exploding_func():
        a = 1
        call_nonexistent_func()

    try:
        my_exploding_func()
    except Exception, e:
        os.abort()
    ```

    The thing about the core dump generated from this script is that running
    `py-bt` only gives us the stack trace from the point that we called
    `os.abort()`, which is pretty useless:

    ```
    #4 Frame 0x7f3767430450, for file ./explode3.py, line 12, in <module> ()
    os.abort()
    ```

    What we really want is a way to introspect the exception that was currently
    being handled at the time that `os.abort()` was called.

    There isn't a particularly easy way to do this with the Python debugging
    extension for `gdb`, but one nice thing about `gdb` is that its extensions
    are *written in Python*. This means we can write our *own* extension
    that gives us easy access to the information we need.

    Doing this took some research. It looks like the latest version of the
    Python debugging extension for `gdb` is in a file in the CPython codebase
    called [`libpython.py`][libpython], but this is actually a much newer
    version than the one that ships with Ubuntu 14.04. I had to use `strace`
    to find the actual version on my system, which was at
    `/usr/lib/debug/usr/bin/python2.7-dbg-gdb.py`.

    After poring through the code and consulting the documentation on
    [extending gdb using Python][gdbpy], I wrote my first `gdb` extension,
    which is in the attached `py_exc_print.py` file. It adds a `py-exc-print`
    command that gives us what we need:

    ```
    (gdb) source ext_pretty_print.py
    (gdb) py-exc-print
    Traceback (most recent call last):
    Frame 0x7f3767430450, for file ./explode2.py, line 12, in <module> ()
    Frame 0x7f37673f3060, for file ./explode2.py, line 7, in my_exploding_func (a=1)
    exceptions.NameError("global name 'call_nonexistent_func' is not defined",)
    ```

    [aground]: https://www.youtube.com/watch?v=sYQ8j02wbCY
    [pydebug]: http://fedoraproject.org/wiki/Features/EasierPythonDebugging#New_gdb_commands
    [libpython]: https://hg.python.org/cpython/file/2.7/Tools/gdb/libpython.py
    [gdbpy]: https://sourceware.org/gdb/onlinedocs/gdb/Python.html

    <!--
  10. @toolness toolness revised this gist Jan 16, 2016. 1 changed file with 1 addition and 13 deletions.
    14 changes: 1 addition & 13 deletions adventures-in-python-core-dumping.md
    @@ -48,19 +48,7 @@ This created a `core` file in my home directory, which I opened in `gdb`:
    $ gdb /usr/bin/python2.7-dbg core
    GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
    Copyright (C) 2014 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from /usr/bin/python2.7-dbg...done.
    ...

    warning: core file may not match specified executable file.
    [New LWP 10020]
  11. @toolness toolness revised this gist Jan 16, 2016. 1 changed file with 4 additions and 3 deletions.
    7 changes: 4 additions & 3 deletions adventures-in-python-core-dumping.md
    @@ -74,8 +74,9 @@ Program terminated with signal SIGABRT, Aborted.
    (gdb)
    ```
    Now I could use all of `gdb`'s [Python debugging extensions][pydebug]. For
    example, running `py-bt` gave me:
    Now I could use all of `gdb`'s
    [Python debugging extension commands][pydebug]. For example, running
    `py-bt` gave me:
    ```
    #4 Frame 0x7f996bf28240, for file ./explode.py, line 7, in my_exploding_func (my_local_var='hi')
    @@ -94,7 +95,7 @@ information any time one of my servers returned a 500. I'll document
    my continuing adventures in an upcoming post.

    [aground]: https://www.youtube.com/watch?v=sYQ8j02wbCY
    [pydebug]: https://wiki.python.org/moin/DebuggingWithGdb
    [pydebug]: http://fedoraproject.org/wiki/Features/EasierPythonDebugging#New_gdb_commands

    <!--
  12. @toolness toolness created this gist Jan 16, 2016.
    107 changes: 107 additions & 0 deletions adventures-in-python-core-dumping.md
    @@ -0,0 +1,107 @@
    After watching Bryan Cantrill's presentation on
    [Running Aground: Debugging Docker in Production][aground] I got all
    excited (and strangely nostalgic) about the possibility of
    core-dumping server-side Python apps whenever they go awry. This would
    *theoretically* allow me to fully inspect the state of the program
    at the point it exploded, rather than relying solely on the information
    of a stack trace.

    I decided to try exploring a core dump on my own by writing a simple
    Python script that generated one. Doing this required a bit of
    setup on my Ubuntu 14.04 server.

    First, I had to `apt-get install python2.7-dbg` to install a version
    of Python with debug symbols, so that `gdb` could actually
    make sense of the core dump. It seems Ubuntu comes pre-configured with
    Python debugging extensions for `gdb` built-in, so I didn't have to
    do any extra configuration here, which was great.

    I also had to add the following line to `/etc/security/limits.conf` to
    actually enable core dump files to be created:

    ```
    * soft core 100000
    ```

    After that, I created a file called `explode.py` in my home directory:

    ```python
    import os

    def my_exploding_func():
        my_local_var = 'hi'
        os.abort()

    my_exploding_func()
    ```

    Then I ran the script:

    ```bash
    $ python2.7-dbg explode.py
    Aborted (core dumped)
    ```

    This created a `core` file in my home directory, which I opened in `gdb`:

    ```bash
    $ gdb /usr/bin/python2.7-dbg core
    GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
    Copyright (C) 2014 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from /usr/bin/python2.7-dbg...done.

    warning: core file may not match specified executable file.
    [New LWP 10020]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    Core was generated by `/usr/bin/python2.7-dbg ./explode.py'.
    Program terminated with signal SIGABRT, Aborted.
    #0 0x00007f996aff7cc9 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
    56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
    (gdb)
    ```
    Now I could use all of `gdb`'s [Python debugging extensions][pydebug]. For
    example, running `py-bt` gave me:
    ```
    #4 Frame 0x7f996bf28240, for file ./explode.py, line 7, in my_exploding_func (my_local_var='hi')
    os.abort()
    #7 Frame 0x7f996bf28060, for file ./explode.py, line 9, in <module> ()
    my_exploding_func()
    ```

    I could also use `py-locals` to show me the values of local variables
    in the current stack frame, and `py-up` and `py-down` to traverse the
    stack.

    This was all pretty awesome, and will be very useful if my Python programs
    actually segfault. But it'd be cool if I could actually get all this rich
    information any time one of my servers returned a 500. I'll document
    my continuing adventures in an upcoming post.

    [aground]: https://www.youtube.com/watch?v=sYQ8j02wbCY
    [pydebug]: https://wiki.python.org/moin/DebuggingWithGdb

    <!--
    I found a few Stack Overflow questions regarding this concept, but they
    didn't have very useful replies:
    * http://stackoverflow.com/q/141802/2422398
    * http://stackoverflow.com/q/30630102/2422398
    -->