Oops, forgot the pre-coffee work (Most of the day disappears in my shoddy memory)


Implemented crude throttling (twice) for Cinemon during the first 7 hours of the day. The first attempt was based on the idea that I could take the difference between the time I scheduled an event to occur and the time it actually occurred within Twisted.
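The idea was roughly this shape (a sketch only, not the actual code; the probe interval and lag threshold here are made-up values, and a real version would throttle the scanner rather than just printing):

```python
# Sketch: measure how late the reactor actually runs a scheduled call.
import time
from twisted.internet import reactor

CHECK_INTERVAL = 1.0   # seconds between probes (made-up value)
LAG_THRESHOLD = 0.5    # seconds of lag considered "backed up" (made-up value)

def measure_lag(scheduled_for):
    lag = time.time() - scheduled_for
    if lag > LAG_THRESHOLD:
        # the reactor is running behind; this is where throttling would kick in
        print('reactor is %.3fs behind schedule' % lag)
    # schedule the next probe, remembering when it *should* fire
    reactor.callLater(CHECK_INTERVAL, measure_lag, time.time() + CHECK_INTERVAL)

reactor.callLater(CHECK_INTERVAL, measure_lag, time.time() + CHECK_INTERVAL)
```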

That approach worked... sort-of. The problem was that it only caught a few of the extreme situations where operations were getting backed up in the Twisted queue. The nice thing about it was that it was very lightweight (basically no overhead), but in the end it wasn't actually producing any better responsiveness from the application.

So, I re-implemented it using the actual metric I wanted to optimise: response time from the web front-end. I now query the front page (which is actually the login page) every 30 seconds, slow down the group scanning if it takes more than about 3/4 of a second to return, and speed back up to full speed when it drops below 1/10 of a second.
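The feedback loop is roughly this shape (again just a sketch, using the old-style twisted.web.client.getPage for brevity; the URL, thresholds, and the scan_delay_multiplier hook are hypothetical placeholders, not Cinemon's real names):

```python
# Sketch: poll the front page and adjust scan speed from the observed latency.
import time
from twisted.internet import reactor, task
from twisted.web.client import getPage

FRONT_PAGE_URL = 'http://localhost:8080/'   # the login page (placeholder URL)
SLOW_THRESHOLD = 0.75   # seconds: back off above this
FAST_THRESHOLD = 0.10   # seconds: run flat-out below this

scan_delay_multiplier = 1.0  # consulted by the group-scanning loop (placeholder)

def check_responsiveness():
    """Request the front page and adjust scan speed based on how long it takes."""
    start = time.time()
    d = getPage(FRONT_PAGE_URL)
    def adjust(_body):
        global scan_delay_multiplier
        elapsed = time.time() - start
        if elapsed > SLOW_THRESHOLD:
            # front-end is sluggish, slow the scanner down (capped back-off)
            scan_delay_multiplier = min(scan_delay_multiplier * 2, 16.0)
        elif elapsed < FAST_THRESHOLD:
            # front-end is snappy, scan at full speed again
            scan_delay_multiplier = 1.0
    d.addCallback(adjust)
    d.addErrback(lambda failure: None)  # treat request errors as "no adjustment"
    return d

# poll every 30 seconds
task.LoopingCall(check_responsiveness).start(30.0)
```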

The algorithm isn't quite right yet. The oscillation between fast and slow means that about 1/4 of the time the web front-end is just as slow as it originally was, while the rest of the time it's lightning fast but the CPU is only running at about 60% utilisation (i.e. we should be scanning more aggressively). That should all just be a matter of tweaking the parameters (I hope).

[Update] Looking at it in the morning, I realise what the fluctuations are. They're the 20-minute rediscovery spikes, where the system runs flat-out for 5 minutes or so re-scanning all the CMTSs.

[Second Update] And that's also where the huge drops in UI responsiveness are occurring. Guess I should break up the discovery operation so that it lets other operations run as it processes its results.
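Roughly the pattern I have in mind (a sketch under the assumption that the results can be processed one at a time; process_result and the chunk size are hypothetical stand-ins, not Cinemon's actual code): handle a slice of the discovery results, then hand control back to the reactor before the next slice so web requests can be serviced in between.

```python
# Sketch: process discovery results in chunks without hogging the reactor.
from twisted.internet import reactor, defer

CHUNK_SIZE = 25  # results handled per reactor turn (made-up value)

def process_in_chunks(results, process_result):
    """Process an iterable of discovery results a chunk at a time."""
    done = defer.Deferred()
    iterator = iter(results)
    def do_chunk():
        for _ in range(CHUNK_SIZE):
            try:
                item = next(iterator)
            except StopIteration:
                done.callback(None)
                return
            process_result(item)
        # yield to the reactor before taking the next chunk
        reactor.callLater(0, do_chunk)
    reactor.callLater(0, do_chunk)
    return done
```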
