Cython for the last couple of days...
Posted in Snaking.
I've been writing accelerator modules for PyOpenGL for the last two days. I think I've got to a reasonably comfortable point. I've got a Numpy FormatHandler and an ArrayDatatype written. With some judicious use of cdef'd typed self variables there's a lot of overhead eliminated from the wrapper operation. To make it really fast, however, I'd have to make almost the entirety of the wrapping code Cython and would have to link into ctypes at quite a few points... at which point I could have just generated a Cython wrapper.
Cython is pleasant to work with. I've settled into a pattern of dumping Python code in, running the compiler, running a test suite to be sure everything's working, then looking at the generated code to optimize the hot-spots with typed references and the like.
I still haven't figured out how to make the type-dispatching code algorithmically faster. I've eliminated most of the Pythonic overhead, but it's still doing a getattr for the __class__ reference and then a dictionary lookup for the handler.
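For readers who haven't seen this kind of dispatch, here is a minimal sketch in plain Python rather than Cython. It is not PyOpenGL's actual code: `HandlerRegistry`, `ListHandler`, `BytesHandler` and `array_size` are hypothetical names, and the subclass fallback that caches its result is an assumption added for illustration. The fast path is the two steps mentioned above: fetch `__class__`, then do one dictionary lookup.

```python
# Illustrative sketch only; every name here is hypothetical, not PyOpenGL's API.

class ListHandler:
    """Stand-in handler for Python lists."""
    def array_size(self, value):
        return len(value)


class BytesHandler:
    """Stand-in handler for bytes objects."""
    def array_size(self, value):
        return len(value)


class HandlerRegistry:
    """Maps a value's class to the handler object that knows how to treat it."""

    def __init__(self):
        self._handlers = {}

    def register(self, cls, handler):
        self._handlers[cls] = handler

    def lookup(self, value):
        cls = value.__class__               # attribute fetch for the class
        try:
            return self._handlers[cls]      # dictionary lookup for the handler
        except KeyError:
            # Slow path (an assumption of this sketch): search base classes,
            # then cache the hit so the next lookup takes the fast path.
            for base in cls.__mro__[1:]:
                handler = self._handlers.get(base)
                if handler is not None:
                    self._handlers[cls] = handler
                    return handler
            raise TypeError("no handler registered for %r" % (cls,))


registry = HandlerRegistry()
list_handler = ListHandler()
bytes_handler = BytesHandler()
registry.register(list, list_handler)
registry.register(bytes, bytes_handler)
```

Even with the wrapper overhead gone, each call still pays for the class fetch and the hash lookup, which is why the remaining cost is algorithmic rather than Pythonic.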
Comments
Ove Lampe on 04/29/2009 2:50 a.m.
Nice work. Is this available anytime soon? Do you have any metrics or profiling results that can quantify the resulting optimization?
Mike C. Fletcher on 04/29/2009 9:28 p.m.
It's in bzr head currently, but it needs a *lot* more testing to be released.
Regarding metrics/profiling, it's not all that impressive.
With OpenGLContext I see the OpenGL __call__ portion of the application go from ~6% of total runtime down to ~2.5% of total runtime. That is, on the order of a 2x improvement. Then again, OpenGLContext is by no means OpenGL-limited, so it's probably not a good test-suite for this kind of optimization.
Still, don't expect a 10x speedup or anything. We're still going through ctypes for all the actual calls and we're still doing a *lot* of work in the wrapper (even if it is in Cython).