Miguel Ventura's blog : Scripting WinDBG with PyKd

WinDBG + Python = Love

Why WinDBG?

WinDBG is the debugger for Microsoft Windows. If you don't believe me, just ask any of Microsoft's support escalation engineers. It is extremely powerful but has a very steep learning curve, so it's usually used only when Visual Studio isn't up to the job.

Why Automation?

Often you might find yourself doing repetitive work in the debugger console: iterating over the items of a linked list, printing some property of all .NET objects of type X, or just chewing through data from some memory region. Sometimes you just want to follow a lot of pointers until you get to the stuff that matters, and that means you'll spend a lot of time copy-pasting memory addresses and playing Where's Waldo with cryptic data.

The problem is that most WinDBG commands weren't designed with automation in mind. There are some commands that ease automation: it is easy to iterate over a linked list, and there's a .foreach command that can parse the output of some other command in a somewhat limited fashion. However, the mechanisms that exist are neither easy to use nor easy to learn, and they aren't even sufficient.

It would be awesome if WinDBG provided a well-structured scripting language to help with these kinds of tasks.

PyKd to the rescue!

PyKd is a WinDBG extension that allows you to integrate Python scripts with WinDBG. Python is awesome for parsing strings and lists, and it's extremely easy to learn and to write. The idea of running Python in WinDBG is so good that I believe Microsoft should integrate this extension with the debugger! (this will probably never happen, but it would be awesome if it did).

Unfortunately, PyKd's documentation is currently mostly in Russian, but this may very well change in the near future.

Installing and Using

PyKd installation is very straightforward if you use the installer. However, I already had Python installed, so I took only the zip file. Because of this I stumbled upon some problems.

Usage is very straightforward, especially if everything you want to do can be achieved by issuing debugger commands and parsing their output. You can use !pycmd to get a Python shell right in your WinDBG session and dbgClient.dbgCommand to issue debugger commands from Python:

.load pykd.pyd
!pycmd
Python 2.6.5 (r265:79096, Mar 19 2010, 18:02:59) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>from pykd import dbgClient
>>>loaded_modules = dbgClient.dbgCommand("lm")

Alternatively you can use !py <file> to run a script from <file>.
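
Once the output of a command is in a Python string, ordinary string handling takes over. Here is a minimal sketch of what that could look like for the lm output captured above. The function name parse_lm_output and the sample text are made up for illustration (the sample is not captured from a real session), and the code deliberately doesn't import pykd, so it can be tried outside the debugger:

```python
def parse_lm_output(text):
    """Parse the text of WinDBG's `lm` command into a list of dicts."""
    modules = []
    for line in text.splitlines():
        parts = line.split()
        # Module rows have at least: start address, end address, module name.
        if len(parts) < 3:
            continue
        try:
            # 64-bit addresses are printed as 00000000`77750000, so the
            # backtick has to go before the value can be parsed as hex.
            start = int(parts[0].replace("`", ""), 16)
            end = int(parts[1].replace("`", ""), 16)
        except ValueError:
            # Header line or anything else that doesn't start with addresses.
            continue
        modules.append({"name": parts[2], "start": start, "end": end})
    return modules


# Illustrative sample only, not output captured from a real session.
SAMPLE_LM = """start             end                 module name
00000000`77750000 00000000`778f9000   ntdll      (pdb symbols)
000007fe`fd8a0000 000007fe`fd90b000   KERNELBASE   (deferred)
"""

if __name__ == "__main__":
    for mod in parse_lm_output(SAMPLE_LM):
        print("%-12s %016x - %016x" % (mod["name"], mod["start"], mod["end"]))
```

Inside the debugger you would pass the loaded_modules string from the session above instead of SAMPLE_LM.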

How cool is that?

It's just awesome! Here are some basic examples of how PyKd can save your bacon:

Listing all Namespaces of WebServiceMethod

You could use !dumpheap -stat WebServiceMethod to get the MT of WebServiceMethod and then !dumpheap -mt <mt> -short to list all WebServiceMethod object addresses. The output of !dumpheap -short can be easily parsed by .foreach, but then you'd have to run !dumpobj on each object to get the address of the namespace property and finally run !dumpobj on that address. The output of !dumpobj isn't suitable for .foreach, but it's no trouble for a Python script.

import pykd

def dump_soapclientmethod():
    # get the addresses of all SoapClientMethod objects (the MT comes from !dumpheap -stat)
    output = pykd.dbgCommand("!dumpheap -mt 0000064283abea38 -short")
    # drop blank lines, otherwise the trailing newline yields an empty address
    soapcliaddrs = [line.strip() for line in output.split("\n") if line.strip()]

    print("### found %d soap client addresses" % len(soapcliaddrs))

    for addr in soapcliaddrs:
        # dumpobj to get object properties
        do = pykd.dbgCommand("!do %s" % addr)
        # get the line for the 'action' field
        actionline = [line for line in do.split("\n") if 'action' in line]
        if not actionline:
            # no such field in this dump, nothing to print
            continue
        # line ends with "<address> action" and we want the <address>
        actionaddr = actionline[0].split()[-2]
        # get the string stored at the retrieved <address>
        doaction = pykd.dbgCommand("!do -nofields %s" % actionaddr)
        # the last non-blank line of the output is "String: <value>"
        lines = [line for line in doaction.split("\n") if line.strip()]
        print("%s -> %s" % (actionaddr, lines[-1]))

The code above produces output like this:

### found 125 soap client addresses
00000001c1755b48 -> String: http://schemas.microsoft.com/sharepoint/soap/List
00000001e2085640 -> String: http://schemas.microsoft.com/sharepoint/soap/Copy
0000000240fb35c8 -> String: http://schemas.microsoft.com/sharepoint/soap/List
00000002419c4158 -> String: http://schemas.microsoft.com/sharepoint/soap/Copy
...
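
One thing to watch for in the script above: the test 'action' in line matches any line that merely contains the word, so a field whose name includes "action" would match too. A slightly stricter lookup compares the last column exactly. The sketch below is only an illustration. The helper name field_value is made up, the sample !dumpobj text is hand-written rather than captured, and the transactionId field is hypothetical, included only to show the mismatch:

```python
def field_value(dumpobj_text, field_name):
    """Return the Value column for `field_name` in !dumpobj output, or None."""
    for line in dumpobj_text.splitlines():
        parts = line.split()
        # Field rows end with "<value> <name>"; compare the name exactly
        # instead of searching for it anywhere in the line.
        if len(parts) >= 2 and parts[-1] == field_name:
            return parts[-2]
    return None


# Hand-written sample in the shape of !dumpobj output. The addresses and the
# "transactionId" field are hypothetical and exist only for this example.
SAMPLE_DO = """Name: System.Web.Services.Protocols.SoapClientMethod
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
0000064278436728  4000001        8        System.String  0 instance 00000001c1755b90 transactionId
0000064278436728  4000002       10        System.String  0 instance 00000001c1755b48 action
"""

if __name__ == "__main__":
    print(field_value(SAMPLE_DO, "action"))
```

With this sample, the substring test would pick the transactionId row first, while the exact comparison returns the address stored in action.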

Listing dynamically generated assemblies for XML Serialization

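import re
import pykd

def dump_assemblies():
    stat = pykd.dbgCommand("!dumpheap -stat -type Microsoft.Xml.Serialization.GeneratedAssembly.ArrayOfObjectSerializer").split("\n")
    # keep only the statistics rows, which start with a method table address
    # (this skips the header, the totals and blank lines)
    rows = [line for line in stat if re.match(r"\s*[0-9a-fA-F]{8,16}\s", line)]
    # with -stat each row is one type (one MT), not one object
    print("### found %d types" % len(rows))

    for row in rows:
        mt = row.split()[0]
        mtdump = pykd.dbgCommand("!dumpmt -md " + mt).split("\n")
        modaddr = [line for line in mtdump if line.startswith("Module:")][0].split()[1]
        # split into lines so the filter below sees lines, not single characters
        mddump = pykd.dbgCommand("!dumpmodule " + modaddr).split("\n")
        assembly = [line for line in mddump if line.startswith("Assembly:")]
        print(assembly[0])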
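
The output of !dumpheap -stat mixes header and total lines with the rows you actually care about, so it pays to pick the rows out explicitly before taking the first column as an MT. Below is a minimal sketch of such a parser. The name parse_dumpheap_stat is made up, and the sample text, including its addresses and counts, is hand-written for illustration rather than taken from a real dump:

```python
import re

# A statistics row: method table address, object count, total size, class name.
STAT_ROW = re.compile(r"^\s*([0-9a-fA-F]{8,16})\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_dumpheap_stat(text):
    """Return (mt, count, total_size, class_name) tuples from -stat output."""
    rows = []
    for line in text.splitlines():
        match = STAT_ROW.match(line)
        if match:  # header, totals and blank lines simply don't match
            mt, count, size, name = match.groups()
            rows.append((mt, int(count), int(size), name))
    return rows


# Hand-written sample; the addresses and numbers are illustrative only.
SAMPLE_STAT = """total 3 objects
Statistics:
              MT    Count    TotalSize Class Name
0000064283ac1f10        2           48 Microsoft.Xml.Serialization.GeneratedAssembly.ArrayOfObjectSerializer
0000064283ac2a58        1           24 Microsoft.Xml.Serialization.GeneratedAssembly.ArrayOfObjectSerializer
Total 3 objects
"""

if __name__ == "__main__":
    for mt, count, size, name in parse_dumpheap_stat(SAMPLE_STAT):
        print("%s %5d %8d %s" % (mt, count, size, name))
```

Each tuple's first element can then be handed to !dumpmt -md, as in dump_assemblies above.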

Conclusion

With PyKd you can finally script WinDBG in a decent programming language and write some code to automate those repetitive tasks that make you feel you're wasting time rather than debugging.