In response to my queries about how to catch a hanging application, James posted this wonderful little snippet of code:
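import signal, pdb, sys
from twisted.internet import task

def timeout(signum, frame):  # runs only if the alarm is not re-armed in time
    sys.stderr.write("SIGALRM timeout, breaking into debugger.\n")
    pdb.set_trace()

signal.signal(signal.SIGALRM, timeout)  # register the handler with the OS
task.LoopingCall(signal.alarm, 10).start(1)  # re-arm a 10 s alarm every second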
This works because each call to signal.alarm replaces any alarm that is still pending. The reactor calls alarm(10) once a second, so under normal operation the alarm is always pushed back before it can fire. If the Twisted reactor loop hangs, nothing re-arms the alarm, and 10 seconds later the operating system delivers SIGALRM and your registered handler runs.
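To watch the re-arming behaviour without Twisted, here is a minimal standard-library sketch of the same idea. It is not James's code: the names run_demo, healthy_loop, hung_loop and on_timeout are made up for illustration, the timeout is cut to 1 second so the demo finishes quickly, and it assumes a Unix system with the code running in the main thread.

```python
import signal
import time

TIMEOUT = 1  # seconds; shortened from the post's 10 so the demo finishes quickly


def run_demo():
    """Re-arm an alarm from a healthy loop, then let a stalled loop trip it."""
    fired = []

    def on_timeout(signum, frame):
        # Record which function was executing when the alarm went off.
        fired.append(frame.f_code.co_name)

    def healthy_loop():
        for _ in range(5):
            signal.alarm(TIMEOUT)  # replaces the pending alarm, if any
            time.sleep(0.1)

    def hung_loop():
        signal.alarm(TIMEOUT)  # last heartbeat before the "hang"
        deadline = time.monotonic() + 3 * TIMEOUT
        while not fired and time.monotonic() < deadline:
            time.sleep(0.05)  # stands in for a reactor that stopped iterating

    previous = signal.signal(signal.SIGALRM, on_timeout)
    try:
        healthy_loop()
        assert not fired  # the alarm kept being pushed back
        hung_loop()
    finally:
        signal.alarm(0)  # cancel anything still pending
        signal.signal(signal.SIGALRM, previous)
    return fired
```

Calling run_demo() returns the names of the functions that were interrupted. The healthy loop never appears in that list because each alarm call pushed the deadline back before it expired.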
So far it hasn't actually let me track down the failing area of the code, because the alarm handler seems to show up in the database thread instead of the reactor thread. Still, it's a very cool little hack using a module I've hitherto ignored in most of my programming (I've always considered signal too platform-dependent to use). I may have to rethink that, at least for debugging work.
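One possible workaround for the handler landing in an unhelpful place, which I have not tried against the real application, is to have the handler print the stack of every thread instead of opening pdb on a single frame. The sketch below is an assumption on my part rather than anything from James's snippet: format_all_stacks and dump_on_timeout are hypothetical names, and sys._current_frames is a CPython-specific function.

```python
import sys
import threading
import traceback


def format_all_stacks():
    """Return a text dump of the current stack of every Python thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for ident, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s):\n" % (names.get(ident, "?"), ident))
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


def dump_on_timeout(signum, frame):
    # Hypothetical alternative to the pdb handler: show what every thread is doing.
    sys.stderr.write("SIGALRM timeout, dumping all thread stacks.\n")
    sys.stderr.write(format_all_stacks())
```

Registering dump_on_timeout with signal.signal(signal.SIGALRM, dump_on_timeout) in place of the pdb handler should show the reactor thread's stack alongside the database thread's, whichever frame the signal happens to interrupt.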