Multi-threading with PyPgSQL can be a serious PITA (Another few hours tracking down hangs due to a missing commit...)


Full-featured databases tend to have low-level locking mechanisms such that if a particular thread or client has modified a row in one transaction, then attempts to reference that row in other transactions are blocked. When your database interface (e.g. PyPgSQL) doesn't allow for sharing transactions between threads, you can (when a problem occurs) wind up with a transaction in one thread blocked on a commit in another that won't happen until the second thread's operation completes.

Problem with this is (at least for me), that PyPgSQL doesn't (at least by default) have the ability to specify a timeout value for a PyPgSQL query, so deadlocks are very easy to create, and basically (AFAIK) impossible to guard against without writing a new asynchronous API for PyPgSQL. That makes it necessary to be very careful when writing database code (which slows things down).

So, moral of the story, if you are finding yourself with a deadlock in a multi-threaded PyPgSQL application, look for a missing commit/rollback somewhere in the application.

Comments

Comments are closed.

Pingbacks

Pingbacks are closed.