Playing with the OBJ file loader for OpenGLContext this evening (briefly). Forgot to check in a number of fixes on Friday (oops). Anyway, as of now CVS OpenGLContext with CVS PyOpenGL can load OBJs from the internet via the (now somewhat inappropriately named) bin/vrml_view.py script. Still haven't found any samples where there are .MTL files (and/or textures) connected to the files. OpenGLContext should be able to process those links, but the functionality is still untested.
Anyway, with a little magnolia object that gets compiled to triangle arrays of ~3000 vertices, my workstation (with a rather outdated GeForce 7600 GS) easily renders at 100fps (the capped render rate for OpenGLContext) with around 10% CPU usage (using VBO support). Obviously textures, colours and the like would alter the speed, but it doesn't seem slow enough to worry about at this point.
As mentioned, I'd like to get something put together that's GPGPU-ish so that I can be sure we're really supporting the GPGPU operations. If you have PyOpenGL code that you feel provides a great sample of what people want to do with GPGPU in Python, let me know.
Things I'm curious about:
- would it be useful to create an array wrapper object that tries to translate your numpy-like operations into a recipe to be executed on the GPU, or would you rather write the GL-level code yourself to have complete control? Or is it that both approaches would be needed?
- would you want to handle "streaming" data for very large data-sets yourself, or have the system choose the largest available window and stream the data automatically? What kind of feedback do you need for this kind of thing, what kind of resumability is important (if at all)?
- would you rather have the system kick out a low-level recipe to run on dozens of machines, or work interactively to let you play with the numbers, or both?
- would you want the results to be computed as-needed, or to explicitly trigger their generation with a command?
- would you want to integrate the operations with the IPython cluster operations, or are those generally different types of operation? i.e. would you need to scatter/gather your data-sets across many machines? If so, how automatic would you need it to be?
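To make the first and fourth questions a bit more concrete, here is a minimal sketch of the "array wrapper that records a recipe" idea. Everything in it is hypothetical (the `LazyArray` class and its `source`, `recipe` and `evaluate` methods are invented for illustration and are not part of PyOpenGL or OpenGLContext). Arithmetic only builds up an expression string that looks like something you could paste into a shader; nothing is computed until `evaluate()` is called, and that call runs on the CPU here as a stand-in for a real GPU pass.

```python
import operator


class LazyArray:
    """Hypothetical wrapper: records arithmetic as a recipe, computes on demand."""

    def __init__(self, expr, inputs, fn):
        self.expr = expr      # shader-style expression text
        self.inputs = inputs  # name -> list of values (the "uploaded" data)
        self.fn = fn          # CPU stand-in for the compiled recipe

    @classmethod
    def source(cls, name, values):
        """Wrap raw data under a name that the recipe can refer to."""
        return cls(name, {name: list(values)}, lambda env: env[name])

    def _binary(self, other, symbol, op):
        if isinstance(other, LazyArray):
            inputs = {**self.inputs, **other.inputs}
            rhs_expr, rhs_fn = other.expr, other.fn
        else:
            const = float(other)
            inputs = dict(self.inputs)
            rhs_expr, rhs_fn = repr(const), (lambda env: const)
        lhs_fn = self.fn
        return LazyArray(
            "(%s %s %s)" % (self.expr, symbol, rhs_expr),
            inputs,
            lambda env: op(lhs_fn(env), rhs_fn(env)),
        )

    def __add__(self, other):
        return self._binary(other, "+", operator.add)

    def __sub__(self, other):
        return self._binary(other, "-", operator.sub)

    def __mul__(self, other):
        return self._binary(other, "*", operator.mul)

    def recipe(self):
        """Return the recorded expression; nothing has been computed yet."""
        return self.expr

    def evaluate(self):
        """Explicit trigger: apply the recipe element by element (on the CPU)."""
        length = min(len(v) for v in self.inputs.values())
        return [
            self.fn({name: vals[i] for name, vals in self.inputs.items()})
            for i in range(length)
        ]


a = LazyArray.source("a", [1.0, 2.0, 3.0])
b = LazyArray.source("b", [10.0, 20.0, 30.0])
c = (a + b) * 2          # only records the recipe
print(c.recipe())
print(c.evaluate())      # computation happens here
```

The "write the GL-level code yourself" alternative would skip the wrapper entirely; the sketch is only meant to show what the explicit-trigger version of the wrapper approach might feel like.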


I wrote my own OBJ loader a while back and I've still got the test files for it.
It's nice to try models from different programs, so that you can see their quirks.
Last I tried, cgkit was the only Python module that could load the models I was using.
Another major performance problem is reusing textures. Say one image is referenced by 10 different materials: you don't want to upload that image 10 times into 10 separate textures.
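A minimal sketch of that kind of reuse, assuming textures can be keyed by image path. The `TextureCache` class and the `fake_upload` function are hypothetical names made up for this example; in a real loader the upload callable would create and fill a GL texture (for instance with `glGenTextures` and `glTexImage2D`) and return its id.

```python
class TextureCache:
    """Hypothetical cache: upload each distinct image once, then share the id."""

    def __init__(self, upload):
        self._upload = upload   # callable(image_key) -> texture id
        self._textures = {}     # image_key -> texture id

    def get(self, image_key):
        # Only the first request for a given image pays for the upload.
        if image_key not in self._textures:
            self._textures[image_key] = self._upload(image_key)
        return self._textures[image_key]


uploads = []


def fake_upload(path):
    """Stand-in for a real GL upload; just records the call and invents an id."""
    uploads.append(path)
    return len(uploads)


cache = TextureCache(fake_upload)
ids = [cache.get("bark.png") for _ in range(10)]   # 10 materials, one image
print(len(uploads), ids)
```

Keying on the path is the simplest choice; two different paths that point at identical image data would still be uploaded twice under this assumption.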
------
gpgpu
I think portable texture upload/download is kind of tricky code to get right for gpgpu uses. That's the minimum you need: upload your data, do some calculations on it, then download it again. I have a little script that tries out the various methods and times them for you. You can get a 10x speed difference or more depending on the card, the driver and the method used.
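The script itself isn't shown here, but the general shape of such a harness can be sketched as follows. This is an assumption about how one might structure it, not the actual script: `time_methods` is a hypothetical helper, and the two "methods" are plain in-memory copies standing in for real upload/download paths, which would need a GL context.

```python
import time


def time_methods(methods, payload, repeats=5):
    """Time each round-trip callable; return (name, best_seconds), fastest first."""
    results = []
    for name, roundtrip in methods.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            out = roundtrip(payload)
            elapsed = time.perf_counter() - start
            # A fast method that corrupts the data is useless, so check it.
            if out != payload:
                raise ValueError("%s returned different data" % name)
            best = min(best, elapsed)
        results.append((name, best))
    return sorted(results, key=lambda item: item[1])


# Stand-ins for real upload -> compute -> download paths (hypothetical).
def via_bytearray(data):
    return bytes(bytearray(data))


def via_memoryview(data):
    return memoryview(data).tobytes()


payload = bytes(range(256)) * 1024
ranking = time_methods(
    {"bytearray": via_bytearray, "memoryview": via_memoryview}, payload
)
for name, seconds in ranking:
    print("%-12s %.6f s" % (name, seconds))
```

Taking the best of several repeats rather than the mean is a design choice that reduces noise from one-off stalls such as driver warm-up.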
I assume you have seen PyGPU, given your mention of translating numpy operations into GLSL primitives?
Haven't looked at the PyGPU stuff, though it is now on my "should look into that" list.
Prashant Saxena
OpenGLContext would likely choke on a 100MB file, since it converts everything to the extremely generic VRML97 format and then does an enormous amount of processing to turn it into low-level primitives again.
If you've got complex scenegraphs, straightforward frustum culling would be the first step (given the level you're working at, though, I'd guess you've already got that). There are lots of other ways to reduce the amount you need to render, depending on what you need to get done.
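For anyone who hasn't got that step yet, a minimal sketch of bounding-sphere frustum culling looks something like this. It is not OpenGLContext's implementation; the function names and the object tuples are assumptions for illustration. Each plane is `(a, b, c, d)` with a unit-length normal pointing into the visible volume, so a point is inside when `a*x + b*y + c*z + d >= 0`.

```python
def sphere_in_frustum(planes, center, radius):
    """True unless the sphere lies entirely behind at least one plane."""
    x, y, z = center
    for a, b, c, d in planes:
        # Signed distance from the sphere centre to the plane
        # (valid as a distance only because normals are unit length).
        if a * x + b * y + c * z + d < -radius:
            return False
    return True


def cull(planes, objects):
    """Keep the names of objects whose bounding spheres may be visible."""
    return [
        name
        for name, center, radius in objects
        if sphere_in_frustum(planes, center, radius)
    ]


# An axis-aligned box from -10 to 10 stands in for a real view frustum.
planes = [
    (1, 0, 0, 10), (-1, 0, 0, 10),
    (0, 1, 0, 10), (0, -1, 0, 10),
    (0, 0, 1, 10), (0, 0, -1, 10),
]
objects = [
    ("near", (0.0, 0.0, 0.0), 1.0),    # fully inside
    ("edge", (10.5, 0.0, 0.0), 1.0),   # straddles a plane, so kept
    ("far", (50.0, 0.0, 0.0), 1.0),    # fully outside
]
visible = cull(planes, objects)
print(visible)
```

The test is conservative: a sphere near a corner can pass every plane check while still being outside, which costs a little wasted rendering but never drops a visible object.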