Though necessary, hardware is almost always problematic (at least, I only notice the hardware when there's a problem...).

Wednesday evening one of the (5) fans in the workstation/server at home died. No big deal, you say, the machine has 4 other fans...

I cracked open the case a few hours later to see which fan to replace and the internal cages for the drives were too hot to touch. One of the other fans had died without my noticing, leaving just the CPU, PSU and one case fan that sits right next to the CPU.

The two failed fans were both drive/card coolers. Still, computers are fairly robust, maybe they hadn't noticed they'd been cooked... but on the next reboot I got an ominous message that the drive was getting CRC failures on its DMA memory. Oh well, serves me right for not checking the fans every once in a while. The drive is working in PIO mode, but it's just a secondary drive, I'll just pop down to Canada Computers and pick up some replacement parts.

This morning I sat down to fix the problem. Pop in the fans, plug them into the motherboard... and nothing. They give a bit of a jump when the power is switched on, but nothing more. Half an hour of plugging various fans into various ports, searching through the BIOS for any suggestion that the case fans might be thermostatically controlled... no luck.

Well, sub-optimal, but I'll install the new drive anyway... duh! It's an SATA drive and I don't have the little adapter for the power cable. Batting 0 for 3 here. Grr, thinks I. Fine, another day of running on a limping server, I'll just boot it up and get off to Linux Caffe to work.

Except it doesn't boot. Gets through BIOS and then just sits there blinking a cursor at me. Argh, thinks I. Must be that the secondary disk (which is only used as a boot disk and for old-documents storage) failed... so I'll just boot up a live CD, chroot into my system, and install GRUB on the existing SATA drive's unused boot partition... except the Gentoo LiveCD can't execute /bin/bash for chrooting. Close everything down in disgust.

Okay, boot to Knoppix... weird, Knoppix is seeing all the disks. Browse for a few minutes, yes, seems to have everything there. Oh well, still don't trust that fried drive, let's install GRUB on the main disk anyway... except halfway through the process I lost the internet connection and couldn't remember the sequence of commands to start up the chroot (you have to mount two "special" filesystems so that they can be seen in the chroot by GRUB).
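For reference, the usual chroot sequence looks something like the sketch below. It assumes the two "special" filesystems are /proc and /dev (the usual pair for this job) and that the installed system's root partition is already mounted at /mnt/gentoo; that mount point and the function name chroot_commands are illustrative, not taken from the actual session. It only builds and prints the commands rather than running them, since running them needs root on the live CD.

```python
import os
import shlex


def chroot_commands(root="/mnt/gentoo", shell="/bin/bash"):
    """Return, in order, the commands to enter a chroot rooted at `root`.

    Assumes the installed system's root partition is already mounted at
    `root` (a hypothetical mount point chosen for illustration).
    """
    return [
        # Special filesystem 1: a fresh procfs inside the chroot.
        ["mount", "-t", "proc", "proc", os.path.join(root, "proc")],
        # Special filesystem 2: bind the live system's /dev so that GRUB
        # can see the disk device nodes from inside the chroot.
        ["mount", "-o", "bind", "/dev", os.path.join(root, "dev")],
        # Finally, switch into the installed system.
        ["chroot", root, shell],
    ]


if __name__ == "__main__":
    # Print the sequence so it can be copied down before the network drops.
    for argv in chroot_commands():
        print(shlex.join(argv))
```

Printing the list and jotting it down before starting would have covered the lost-connection case; the commands themselves can then be typed by hand as root.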

Somewhere in here Soni told me I had to take a break and a shower. So off I go to the shower, where it hits me. I wasn't seeing a disk failure, I was just seeing an old BIOS problem where if you reboot the machine too soon after a shutdown it doesn't see the disks. Sure enough, having left the machine powered down during the shower it booted up perfectly well from the suspect disk.

So, 2.5 hours basically wasted and still in exactly the same state as when I started. Hardware, when it makes its presence known, is problematic.

