Networking fun this week.
The big nasty that bit me was that a .bind(('', port)) on a multicast socket will receive every packet sent to any local socket listening on that port, even if this socket has not joined the group in question. So if you have two capture processes, each trying to listen to a different group on the same port number (say, two multicast video streams where every stream goes out on <group>:8000), each will receive its own data and the other's, and as a result both will get garbage.
The key issue here is that with a multicast socket, the bind call also controls which group you are going to listen to (which is not what you're used to in non-multicast-land, and seems a bit pointless given that you're explicitly joining groups). The interface on which to listen is specified separately, as part of the IP_ADD_MEMBERSHIP request; IP_MULTICAST_IF only selects the interface for outgoing packets. Unfortunately, once you bind to a group address, the useful behavior of binding to ('', port) is lost; that is, you can't listen to multiple groups you are interested in with a single socket. If multiple processes need to read separate groups on the same port, you need one socket per group in the reading processes, each bound to its own (group, port).
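Here is a minimal sketch of the one-socket-per-group setup described above, assuming Python 3 on Linux (binding to a multicast address is not accepted on every platform). The helper names make_mreq and open_group_socket, and the example addresses in the usage note, are made up for illustration.

```python
import socket
import struct


def make_mreq(group, interface_ip="0.0.0.0"):
    """Pack the ip_mreq structure: group address, then local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface_ip))


def open_group_socket(group, port, interface_ip="0.0.0.0"):
    """Open a UDP socket that should only see traffic for one multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Let several processes bind to the same port number.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind to the group address rather than '' so that datagrams addressed
    # to other groups on the same port are filtered out.
    sock.bind((group, port))
    # Join the group; interface_ip picks the interface to listen on
    # ("0.0.0.0" leaves the choice to the OS).
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    make_mreq(group, interface_ip))
    return sock
```

Each capture process would then call something like open_group_socket("239.1.1.1", 8000) with its own group address, instead of every process binding to ('', 8000).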
I'm going to have to go back to my PyZeroConf fork and make sure that we're handling this stuff correctly. I expect that this is the root cause of the various bugs we've seen where multicast would not be received when bound to a specific local-interface-ip, but would be when bound to the default ip.
Another thing that bit me this week was one of those annoying things you run into with Python every once in a while. The device I'm running on is rather under-powered (think a lower-power 586-class device), but it does okay for the most part. However, when you add capturing a couple of seconds of (38Mbps) video to its select loop, it pretty much falls over. To get reasonable performance, use a non-blocking socket with a tight recv() loop that just pulls data into a queue until it exhausts the socket (luckily we have lots of RAM), and push the data out to disk only during network lulls (again, with non-blocking writes). You can use select() when you run out of both read and write items, in order to back off the processor when there's no actual data.
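The loop described above might look roughly like this. It is a sketch, assuming Python 3 (where an empty non-blocking socket raises BlockingIOError); the names drain and capture, the should_stop callback and the timeout value are all invented for the example, and the disk write is a plain file write rather than a true non-blocking one, to keep it short.

```python
import collections
import select
import socket


def drain(sock, queue, bufsize=65536):
    """Pull every datagram currently waiting on a non-blocking socket into queue.

    Returns the number of datagrams read.
    """
    count = 0
    while True:
        try:
            data = sock.recv(bufsize)
        except BlockingIOError:
            return count  # socket exhausted for now
        queue.append(data)
        count += 1


def capture(sock, out, should_stop, idle_timeout=0.5):
    """Read greedily, write one chunk per lull, sleep in select() when idle."""
    sock.setblocking(False)
    pending = collections.deque()
    while not should_stop():
        if drain(sock, pending):
            continue  # data is arriving: keep reading, don't touch the disk
        if pending:
            out.write(pending.popleft())  # network lull: flush one chunk
            continue
        # Nothing to read and nothing to write: back off the processor.
        select.select([sock], [], [], idle_timeout)
```

Writing only one chunk per pass means the socket gets checked again between every disk write, which is the point of the exercise on a slow box.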
Lastly, I came across a bit of a WTF moment when doing a recv() call on a socket (same task as above). MPEG-TS streams are composed of 188-byte packets. Every packet I receive is a 188-byte packet, so I set my receive maximum to 188*2... and the whole data file would be ridiculously corrupt. Randomly poking around to see if something else was coming in on the port (see above, but no, nothing in this case), I increased the receive size to something unreasonable and, lo and behold, the stream is perfect, but all packets are still 188 bytes. (I'm referring here to the argument to recv() declaring the maximum message size, not the OS buffer, which is much larger than 188 bytes.) Would like to spend some time investigating why that happened some day.
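One possibility worth ruling out (an assumption, not something confirmed above) is that the sender packs several 188-byte TS packets into each UDP datagram; seven per datagram, 1316 bytes, is a common arrangement. In that case a recv() with a maximum smaller than the datagram returns only the first part and, on Linux, silently throws the rest away. The sketch below, in Python 3, reads with a generous maximum and splits the result into TS packets; recv_ts_packets is a made-up helper name.

```python
import socket

TS_PACKET = 188  # size of one MPEG-TS packet in bytes


def recv_ts_packets(sock, bufsize=65536):
    """Read one whole datagram and split it into 188-byte TS packets.

    bufsize is deliberately far larger than any UDP datagram we expect,
    so nothing is truncated. Trailing bytes that do not make up a full
    TS packet are dropped.
    """
    data = sock.recv(bufsize)
    usable = len(data) - len(data) % TS_PACKET
    return [data[i:i + TS_PACKET] for i in range(0, usable, TS_PACKET)]
```

Logging len(data) for a few datagrams would show quickly whether the network-level packets really are 188 bytes or a multiple of it.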