- War Stories from the Front
- Shake It Up, Baby
- Water, Water, Everywhere
- Don't Take the Word 'Uninterruptible' Literally
- Technician Doesn't Necessarily Equal Data Center Manager
- Reduce Vulnerability To Increase Survivability
Don't Take the Word 'Uninterruptible' Literally
Let’s look at another firsthand account of what can happen if standards and procedures are not in place or are not strictly followed—this time in Dallas, Texas:
"Our [telephone company] had an operations and monitoring center in Dallas with more than 100 seats. It supported numerous widely dispersed customers throughout the territory. This center sustained a serious power problem due to an overload on the uninterruptible power system (UPS). This condition occurred due to several factors.
- Many noncritical pieces of equipment, such as radios, heaters, and other nonessential machines, were plugged into the special orange power outlets that were designated for UPS mission-critical equipment, including the PBX and ACD (Automatic Call Distribution unit). Many of the 'butt-in-seat' folks who made up the call center staff had heard a rumor that 'orange power' was somehow superior, so they moved their devices to these outlets under their desks.
- As the load on the UPS continued to climb to near its capacity, the heat in the room where the UPS resided also climbed. There was no temperature alarm in the UPS room.
- In addition, since so many people used the room, there was inadequate space and airflow.
- Another contributing factor was poor management of storage. What little airflow might have been available was further blocked because someone was using the UPS room as storage for boxes, paper, and anything else that needed to be tossed aside.
A rectifier board in the UPS finally failed due to thermal overload. The resulting disruption put the call center out of business for several days. There was no alternative power source for the PBX and the ACD, although some of the network tools came back online after some juggling of power plugs by the technical staff. No single technologist, however, had been assigned responsibility specifically for the UPS; therefore, no alternative power source was ever considered for the PBX/ACD. It stands to reason that this is also why nobody ever noticed the numerous other items that were erroneously running on the UPS. I became the HVAC/power/UPS person after that ordeal. It was fortunate in this case that there was not a fire, particularly with the number of combustibles in the UPS room. In addition to blocking airflow, these materials easily could have ignited.
[We had] more than a few customers who were not on the happy side even before this incident. After the repairs were made and the center recovered, several of them left us for other companies, and the company lost business due to the disaster."
The lessons in the account above should be obvious:
- Follow manufacturers’ specifications (including airflow instructions) for the environment that your equipment requires.
- Don't assume that because UPS systems hardly ever have trouble, they're invulnerable. They're not. I've had firsthand contact with many customers who sustained UPS problems; believe me, that's the last place where you want to have issues. A simple load and temperature watchdog, sketched after this list, gives early warning before a failure like the one above.
- In this instance, nobody was responsible for the UPS. That’s asking for trouble, because general oversight and maintenance schedules go out the window. Every major piece of equipment should have a person designated as responsible for that equipment, even if that person is a vendor who comes in occasionally to inspect the equipment and perform preventive maintenance.
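The load creep and the missing temperature alarm in this story are exactly the kind of conditions a routine check can catch. Below is a minimal sketch, assuming Network UPS Tools (NUT) is installed and the UPS is reachable through its `upsc` utility; the device name `ups1@localhost`, the thresholds, and the print-based alerting are placeholder assumptions to adapt to your own environment, not details from the account above.

```python
#!/usr/bin/env python3
"""Minimal UPS load/temperature watchdog sketch (assumes NUT's `upsc` is available)."""
import subprocess
import sys

UPS_NAME = "ups1@localhost"   # hypothetical NUT device name
MAX_LOAD_PCT = 75             # alert well before the UPS nears capacity
MAX_TEMP_C = 40               # alert if the UPS reports a high temperature

def read_ups_vars(ups: str) -> dict:
    """Run `upsc` and parse its "key: value" output into a dict."""
    result = subprocess.run(["upsc", ups], capture_output=True, text=True, check=True)
    data = {}
    for line in result.stdout.splitlines():
        key, _, value = line.partition(":")
        data[key.strip()] = value.strip()
    return data

def main() -> int:
    data = read_ups_vars(UPS_NAME)
    problems = []
    load = float(data.get("ups.load", 0))       # standard NUT variable: load in percent
    if load > MAX_LOAD_PCT:
        problems.append(f"UPS load {load:.0f}% exceeds {MAX_LOAD_PCT}%")
    if "ups.temperature" in data:               # not every UPS reports temperature
        temp = float(data["ups.temperature"])
        if temp > MAX_TEMP_C:
            problems.append(f"UPS temperature {temp:.0f}C exceeds {MAX_TEMP_C}C")
    for problem in problems:
        print(f"ALERT: {problem}")              # replace with paging or email in practice
    return 1 if problems else 0

if __name__ == "__main__":
    sys.exit(main())
```

Run from cron or a monitoring agent every few minutes. The particular tool matters far less than the last lesson above: someone must own the check, or it will never be looked at until the rectifier board has already cooked.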