Extracting useful info from FreeBSD malloc tracing

The FreeBSD version of the functions in the malloc() and free() family includes hooks that can dump “utrace” entries for the ktrace utility. The tracing features can be enabled by setting the “U” malloc option.

The manpage of malloc says (among many other useful things):

TUNING
     Once, when the first call is made to one of these memory allocation rou-
     tines, various flags will be set or reset, which affects the workings of
     this allocator implementation.

     The ``name'' of the file referenced by the symbolic link named
     /etc/malloc.conf, the value of the environment variable MALLOC_OPTIONS,
     and the string pointed to by the global variable _malloc_options will be
     interpreted, in that order, from left to right as flags.

     Each flag is a single letter, optionally prefixed by a non-negative base
     10 integer repetition count.  For example, ``3N'' is equivalent to
     ``NNN''.  Some flags control parameter magnitudes, where uppercase
     increases the magnitude, and lowercase decreases the magnitude.  Other
     flags control boolean parameters, where uppercase indicates that a behav-
     ior is set, or on, and lowercase means that a behavior is not set, or
     off. [...]

     U       Generate ``utrace'' entries for ktrace(1), for all operations.
             Consult the source for details on this option.

This means that when you run a program under ktrace you can gather lines of malloc() and free() trace info like the following:

$ env MALLOC_OPTIONS='U' ktrace ./a.out
$ kdump -tu -f ktrace.out
 11981 a.out    USER  malloc_init()
 11981 a.out    USER  0x8101080 = malloc(100)
 11981 a.out    USER  0x8103300 = malloc(600)
 11981 a.out    USER  free(0x8103300)

This short trace was collected by running a compiled version of the following C program:

#include <stdlib.h>

int
main(void)
{
        void *p;

        p = calloc(1, 100);
#ifndef LEAK
        free(p);
#endif
        posix_memalign(&p, 16, 600);
        free(p);
        return 0;
}

With a program of this size, it’s quite easy to spot the memory leak; even without ktrace info. For larger programs, there maybe thousands of trace lines to analyze. A single invocation of the find utility may generate hundreds of thousands of ktrace records:

$ env MALLOC_OPTIONS='U' ktrace find $HOME/mail > /dev/null
$ kdump -tu -f ktrace.out | wc -l
  244813

Trying to match the allocation and free calls in a trace with so many entries can be tedious, error prone and very boring, so I wrote a little Python script to help me do it.

The script can read a kdump trace from its standard input, or from a list of files. When run without arguments it tries to read a stream of trace messages from standard input. It then tries to match the allocation records with the free records, using the memory address of each trace line. If there are no memory leaks, it does not print anything and it exits:

$ kdump -T -f ktrace.out | ./alloctrace.py
$

The script is doing its work silently, searching for patterns in the output of kdump, and keeping a record of the malloc/free “utrace” information it can find. You can see what it’s doing by bumping the script’s verbosity level a bit:

$ kdump -T -f ktrace.out | ./alloctrace.py -v
# Bumping verbosity level to 1
# analyzing trace from <stdin>
trace: allocated 100 bytes at 0x8101080
trace: freed 100 bytes at 0x8101080
trace: allocated 600 bytes at 0x8101300
trace: freed 600 bytes at 0x8101300
$

When the script reaches the end of a trace and it detects that there are some allocation records that haven’t been matched by a free record in the trace, it prints a list of all the “leaked” memory areas:

$ kdump -T -f ktrace.out | ./alloctrace.py
--- memory leaks ---
leaked 100 bytes at 0x8101080
  utrace record:
   12146 a.out    1224066908.225938 USER  0x8101080 = malloc(100)
total leaked bytes = 100
$

By bumping the verbosity level, it also prints the utrace records as it reads them from the trace:

$ kdump -T -f ktrace.out | ./alloctrace.py -v 
# Bumping verbosity level to 1
# analyzing trace from <stdin>
trace: allocated 100 bytes at 0x8101080
trace: allocated 600 bytes at 0x8103300
trace: freed 600 bytes at 0x8103300
--- memory leaks ---
leaked 100 bytes at 0x8101080
  utrace record:
   12146 a.out    1224066908.225938 USER  0x8101080 = malloc(100)
total leaked bytes = 100
$ kdump -T -f ktrace.out | ./alloctrace.py -vv
# Bumping verbosity level to 1
# Bumping verbosity level to 2
# analyzing trace from <stdin>
# precompiling alloc-re
# precompiling free-re
# initializing address allocation map
# initializing address cache
trace: allocated 100 bytes at 0x8101080
#   utrace record:
#    12146 a.out    1224066908.225938 USER  0x8101080 = malloc(100)
trace: allocated 600 bytes at 0x8103300
#   utrace record:
#    12146 a.out    1224066908.226173 USER  0x8103300 = malloc(600)
trace: freed 600 bytes at 0x8103300
#   utrace record:
#    12146 a.out    1224066908.226349 USER  free(0x8103300)
--- memory leaks ---
leaked 100 bytes at 0x8101080
  utrace record:
   12146 a.out    1224066908.225938 USER  0x8101080 = malloc(100)
total leaked bytes = 100
$

Being able to pinpoint the time that a buffer was allocated may be helpful when trying to locate the place in kdump output where the leaked memory was allocated:

$ kdump -T -f ktrace.out | ./alloctrace.py
--- memory leaks ---
leaked 100 bytes at 0x8101080
  utrace record:
   12146 a.out    1224066908.225938 USER  0x8101080 = malloc(100)
total leaked bytes = 100
$ kdump -T -f ktrace.out | fgrep -n 1224066908.225938
272: 12146 a.out    1224066908.225938 USER  0x8101080 = malloc(100)
$

In this case, we know that we can start looking near line 272 of the trace. This is the place where the leaked buffer was allocated. By skimming through nearby trace output, we may be able to understand a bit more about the location of the leak in the source too :-)

A short Python script cannot really replace more advanced memory debugging tools, like Valgrind, dmalloc, or a combination of libumem and mdb (my favorite Solaris tools), but having something that can quickly run through a huge ktrace log and show a quick summary of memory leaks is nice. What a script like this can do is to serve as a first-level check that a program isn’t leaking memory like mad.

The script itself

The script source is available online at:

http://people.freebsd.org/~keramida/alloctrace.py.txt

Its source is small enough that I am including it here too, for print versions of this article:

#!/usr/bin/env python

# A short Python script to analyze kdump output for the `utrace'
# debugging output of FreeBSD malloc.

# Copyright (c) 2008 Giorgos Keramidas <keramida@FreeBSD.org>
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.

import errno
import getopt
import re
import sys
import os

# -------------------- script options ----------------------------------

verbose = None                  # Run in `quiet' mode by default.

# -------------------- useful functions --------------------------------

def ignoremsg(msg=None):
    """Print nothing.  Ignore the message string passed to us.
    This is useful as a `callback' that uses the same function stub
    as message() but prints nothing."""
    return None

def message(msg=None):
    """Print an optional `msg' message string to our standard error
    stream.  Note that a newline is appended automatically to the
    message string, as if it was displayed with print(), and no
    exceptions are captured by message()."""
    if msg:
        s = "%s\n" % (msg)
        sys.stderr.write(s)
        sys.stderr.flush()
    return None

def error(code, msg=None):
    """Print `msg' as an optional error message string, and die with an
    error of `code'."""
    if msg:
        s = '%s: error: %s' % (progname, msg)
        message(s)
    sys.exit(code)

def warning(msg=None):
    """Print `msg' as an optional warning string."""
    if msg:
        s = '%s: warning: %s' % (progname, msg)
        message(s)
    return None

def debug(level=1, msg=None):
    """Print an optional debugging message `msg', when the verbosity
    level `verbose' of the script exceeds or is equal to `level'
    (default=1)."""
    if verbose and verbose >= level and msg:
        message("# " + msg)

# -------------------- main script body --------------------------------

# Precompiled regular expressions, to match the `utrace' output format
# of the kdump utility, and extract bits of the allocation or memory
# freeing trace entries.  The output of `kdump' looks like this:
#
# 42246 a.out    1224038211.153343 USER  malloc_init()
# 42246 a.out    1224038211.153434 USER  0x8101080 = malloc(100)
# 42246 a.out    1224038211.153453 USER  0x8103300 = malloc(600)
# 42246 a.out    1224038211.153462 USER  free(0x8103300)
allocre = None
freere = None

def loadtrace(fp):
    """Read `kdump' output from the open file passed as `fp', and try to
    detect memory leaks by matching the allocation entries with the
    memory freeing trace entries."""

    global allocre, freere
    if not allocre:
        try:
            debug(2, "precompiling alloc-re")
            allocre = re.compile(r'^.*\s+USER\s+.*(0x[0-9a-fA-F]+)\s+=\s+malloc\((\d+)\)\s*$')
        except Exception, inst:
            error(1, "invalid regular expression: %s" % inst)
    if not freere:
        try:
            debug(2, "precompiling free-re")
            freere = re.compile(r'^.*\s+USER\s+free\((0x[0-9a-fA-F]+)\)\s*$')
        except Exception, inst:
            error(1, "invalid regular expression: %s" % inst)

    if verbose and verbose > 0:
        log = message
    else:
        log = ignoremsg

    debug(2, "initializing address allocation map")
    amap = {}                   # Empty allocation map
    debug(2, "initializing address cache")
    acache = {}
    for l in fp.readlines():
        l = l.rstrip('\r\n')
        debug(4, "line \"%s\"" % l)
        m = allocre.match(l)
        if m:
            debug(3, "alloc line: %s" % l)
            addr = int(m.group(1), 0)
            size = int(m.group(2), 0)
            if addr in amap:
                error(1, "memory at 0x%x allocated already" % addr)
            amap[addr] = (size, l)
            acache[addr] = True
            log("trace: allocated %d bytes at 0x%x" % (size, addr))
            debug(2, "  utrace record:")
            debug(2, "  %s" % l)
        m = freere.match(l)
        if m:
            debug(3, "free line: %s" % l)
            addr = int(m.group(1), 0)
            if addr in amap:
                (size, msg) = amap.pop(addr)
                log("trace: freed %d bytes at 0x%x" % (size, addr))
                debug(2, "  utrace record:")
                debug(2, "  %s" % l)
            else:
                if addr in acache:
                    log("warning: double-free at 0x%x" % addr)
                else:
                    log("warning: free of non-malloced region at 0x%x" % addr)
                debug(2, "  utrace record:")
                debug(2, "  %s" % l)

    leaks = amap.keys()
    if len(leaks) == 0:
        debug(2, "no leaks found; address map empty")
        return None
    message("--- memory leaks ---");
    leakedbytes = 0
    for addr in leaks:
        (size, msg) = amap[addr]
        message("leaked %d bytes at 0x%x" % (size, addr))
        message("  utrace record:")
        message("  %s" % msg)
        leakedbytes += size
    message("total leaked bytes = %d" % leakedbytes)
    return leakedbytes

if __name__ == '__main__':
    try:
        opts, args = getopt.getopt(sys.argv[1:], 'v')
    except getopt.GetoptError, err:
        usage()

    # Go through the list of command-line options, and tune the globals
    # which affect the behavior of the script accordingly.
    for o, v in opts:
        if o == '-v':
            verbose = verbose and (verbose + 1) or 1
            debug(1, "Bumping verbosity level to %d" % verbose)
        else:
            assert False, "Unhandled option `%s'" % o

    try:
        if len(args) == 0:
            debug(1, "analyzing trace from <stdin>")
            loadtrace(sys.stdin)
        else:
            for fname in args:
                try:
                    debug(1, "analyzing trace `%s'")
                    loadtrace(file(fname))
                except IOError, inst:
                    if inst.errno == errno.ENOENT:
                        print "%s: %s" % (fname, str(inst))
                    else:
                        raise
    except KeyboardInterrupt, inst:
        message("Interrupted.")
        sys.exit(0)

4 thoughts on “Extracting useful info from FreeBSD malloc tracing

  1. Pingback: Parsing utrace(2) entries · DragonFly BSD Digest

  2. Pingback: Ne perdez pas la mémoire at FreeBSD-fr: Les nouvelles du géant en français

Comments are closed.