My brother specializes in database analysis for non-profits. As such, he’s usually very involved in the organizations’ data centers.
One such organization he worked for many years ago had received a grant that covered a major upgrade to their data center. As part of the upgrade, they had to increase the cooling capacity of the HVAC system in their computer room.
They selected a contractor and had the HVAC upgrade installed. They then proceeded with the upgrade of their computer systems. Everything was working fine. Until the next day.
They came into the computer room and found that it was over 80℉!
The obvious cause was that the HVAC system hadn’t been sized correctly or was malfunctioning.
The organization’s policies required them to get 3 quotes before selecting a contractor to do any repair work.
So they found 3 contractors. One of the contractors quoted $20,000 to completely replace the existing HVAC system. Another contractor quoted around $10,000 to upgrade the existing HVAC system.
The third contractor came in, looked around the room, picked up a box that was lying on the floor, placed it over the thermostat that was controlling the existing HVAC system, taped it to the wall, and said “No charge”.
Apparently the new HVAC system that had been installed was blowing cold air directly on the thermostat. So, when the system detected that the room was getting warm, it would turn the A/C on. It would immediately detect that the room was cool enough, and turn off the A/C. Clearly the A/C wasn’t running long enough to cool the room at all.
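To see why a thermostat sitting in the cold air stream leaves the room hot, here is a toy simulation of that on/off loop. It is only a sketch: the function name `simulate`, the heat gain, cooling rate, supply-air temperature, and setpoint are all made-up numbers for illustration, not anything measured in that computer room.

```python
def simulate(sensor_in_draft, minutes=600, setpoint=72.0):
    """Toy model of an on/off thermostat; returns the final room temperature in °F.

    All constants are illustrative assumptions, not real HVAC figures.
    """
    heat_gain = 0.20    # °F per minute added by the equipment (assumed)
    cooling = 0.35      # °F per minute removed while the A/C runs (assumed)
    supply_air = 55.0   # temperature of the air leaving the vent (assumed)
    band = 1.0          # thermostat switches at setpoint +/- band

    room = setpoint
    ac_on = False
    for _ in range(minutes):
        # A sensor in the draft sees the cold supply air whenever the A/C runs;
        # a shielded sensor always sees the actual room temperature.
        reading = supply_air if (ac_on and sensor_in_draft) else room

        if reading > setpoint + band:
            ac_on = True
        elif reading < setpoint - band:
            ac_on = False

        room += heat_gain - (cooling if ac_on else 0.0)
    return room


if __name__ == "__main__":
    print("sensor in draft:", round(simulate(True), 1))
    print("sensor shielded:", round(simulate(False), 1))
```

With the sensor in the draft, the A/C switches off one step after it switches on, so it runs only about half the time and the room keeps creeping upward. With the sensor shielded (the box taped over the thermostat), the same A/C holds the room near the setpoint.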
Rarely have I had the SMART capabilities of a hard drive actually tell me that the drive was going to fail.
Recently I had an encounter with SMART errors in a totally different way.
Basically, my ReadyNAS NV+ storage server was telling me that “Disk 1” was having problems and might fail soon … but all my tests indicated that the drives were fine.
After a lot of hassle, and going back and forth with Netgear support, I finally figured out the problem.
It started a few weeks ago with an email the ReadyNAS sent:
Reallocated sector count has increased in the last day.
Disk 1: Previous count: 671 Current count: 677
Growing SMART errors indicate a disk that may fail soon. If the
errors continue to increase, you should be prepared to replace the
disk.
The odd thing was that the SMART information available via the ReadyNAS web interface did not show any reallocated sectors on any of the 4 drives currently installed.
For quite some time I haven’t been happy with the level of data protection on my servers … a while ago I ran mirrored (RAID 1) IDE (PATA) drives on my system using an Arco Duplidisk adapter. It seemed adequate, but after I upgraded my servers to the Dell PowerEdge systems, it didn’t seem to work quite right. It was reporting failed drives when there were none.
So, after a fair bit of research, I decided to get a NAS (Network Attached Storage) device. My criteria were that it had to a) support various RAID levels (1 & 5 at least), b) have hot-swappable drives, and c) support NFS (the network file system used on Linux).
The device I decided on is a Netgear ReadyNAS NV+. The model I got came with 2 x 500 GB drives, with bays for two more. It wasn’t cheap, but I think it will be worth it in the long run.
It supports various RAID levels … RAID 1 (mirroring, where the data on one drive is completely duplicated on the other), RAID 5 (where data and parity information are striped across three or more drives … if any one of the drives fails, its contents can be reconstructed on the fly from the remaining drives), and its own X-RAID … which is an eXpandable and adaptive RAID variation … which will use RAID 1 if you only have two drives, and RAID 5 when you add more.
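The RAID 5 recovery trick comes down to XOR parity. The snippet below is a minimal sketch of a single stripe on a three-drive array, with a helper I've called `xor_blocks` (my own name, not anything from the ReadyNAS). A real RAID 5 implementation rotates the parity block across the drives and works on much larger blocks, so treat this as an illustration of the idea only.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# One stripe: two data blocks plus the parity computed from them.
drive_a = b"hello wo"
drive_b = b"rld!1234"
parity = xor_blocks(drive_a, drive_b)

# Pretend drive_b has died: rebuild its block from the survivor and the parity.
rebuilt_b = xor_blocks(drive_a, parity)
print(rebuilt_b == drive_b)
```

The same call works no matter which of the three blocks goes missing, because XOR-ing any two of them yields the third.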
Although there were a few hiccups, I’m not displeased.