Been playing with my experimental profiler (Coldshot) all day.

At this point it can load an OpenGLContext run with 207MB of trace data and produce a basic textual summary (both cProfile-style calls/timing and file:line level timings) in around 4s. That's still quite slow, as the profiler records around 4MB/second of data, so multi-GB traces seem ...
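
For a rough sense of scale, here is a back-of-the-envelope sketch using only the figures quoted above. It assumes load time grows linearly with trace size, which is a guess rather than something measured, and the 2000MB trace is a hypothetical example, not a real run.

```python
# Figures quoted in the post (all approximate).
TRACE_MB = 207        # size of the OpenGLContext trace
LOAD_SECONDS = 4      # time to load it and print the textual summary
RECORD_MB_PER_S = 4   # rate at which the profiler writes trace data

# Throughput of the loader, and how long a profiled run produced this trace.
load_mb_per_s = TRACE_MB / LOAD_SECONDS
run_seconds = TRACE_MB / RECORD_MB_PER_S


def estimated_load_seconds(trace_mb):
    """Estimate load time for a trace of trace_mb megabytes.

    Assumption: load time scales linearly with trace size.
    """
    return trace_mb / load_mb_per_s


print(f"load throughput: {load_mb_per_s:.2f} MB/s")
print(f"profiled run behind the trace: {run_seconds:.2f} s")
print(f"hypothetical 2000MB trace: {estimated_load_seconds(2000):.1f} s to load")
```

If the linear assumption holds, loading only gets painful once traces reach many gigabytes, but anything that grows faster than linearly (sorting, building indexes, running out of RAM) would change that picture.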

