Data centres across Asia are growing at an unprecedented pace, with the region poised to account for 40% of the world’s capacity.
Power failures, however, remain a major risk for data centres, especially as the industry's use of renewable energy increases. According to Uptime Institute’s 2024 report, power problems make up 52% of major data centre outages. Over half of operators say their most recent severe outage cost them more than US$100,000, while 16% reported losses above US$1 million.
While uninterruptible power supply (UPS) systems and lithium-ion batteries play a critical role in keeping operations running during outages, these backup systems come with risks. These include power supply inconsistencies, thermal runaway fires, and generator failures that can shut down facilities for hours or even days.
Marsh Asia’s Regional High-Tech Expert, Fred Chuan, and Communications, Media and Technology Industry Leader, Larry Liu, share practical steps to help operators manage these costly risks.
Fred:
Consistency and reliability of energy sources start with the facility’s design. Proper storage, monitoring, and maintenance of batteries and UPS systems can mean the difference between seamless operations and unexpected outages. Here are some recommendations:
An electrical outage occurred at a data centre due to issues within its UPS system. The main UPS initially experienced power problems and automatically transferred the load to a redundant UPS unit to maintain continuous power. However, the primary transfer switch failed, triggering the Static Transfer Switch (STS) to shift the load to the redundant UPS. When the primary UPS recovered, the STS attempted to switch the load back, but unstable utility power prevented the primary UPS from delivering full power.
This instability caused the STS to rapidly toggle the load between the two UPS units. To prevent damage from this repeated switching, the STS was ultimately locked out. Consequently, the data centre experienced an 11-hour power outage, resulting in significant downtime and operational disruption.
1. Primary UPS fault → transfer to redundant UPS
2. STS priority is to switch back to primary UPS
3. Primary UPS recovers but cannot support full load
4. STS rapidly toggles load between primary and redundant UPS units
5. STS locked out → outage
This incident underlines that redundancy alone is not a safeguard. Proper assessments, regular inspections, and rigorous testing are critical to reveal underlying faults and mitigate risks associated with UPS power outages.
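To make the failure sequence concrete, here is a minimal Python sketch of a transfer switch that always prefers the primary UPS and locks itself out after too many transfers in a short period. It is a toy model, not the logic of any real STS: the class name, the lockout threshold (`max_transfers`), the look-back `window`, and the one-step-per-reading timing are all assumptions made for illustration, and the primary UPS being unable to carry full load is simplified to a true/false health reading.

```python
from dataclasses import dataclass, field


@dataclass
class StaticTransferSwitch:
    """Toy STS that prefers the primary UPS (hypothetical logic)."""
    max_transfers: int = 4   # assumed lockout threshold, not from the incident
    window: int = 10         # assumed look-back window, in time steps
    source: str = "primary"
    locked_out: bool = False
    transfer_times: list = field(default_factory=list)

    def step(self, t: int, primary_ok: bool) -> str:
        """Advance one time step and return the source feeding the load."""
        if self.locked_out:
            return "none"  # the load stays unpowered until someone resets the switch
        wanted = "primary" if primary_ok else "redundant"
        if wanted != self.source:
            # A transfer happens whenever the preferred source changes.
            self.source = wanted
            self.transfer_times.append(t)
            recent = [x for x in self.transfer_times if t - x < self.window]
            if len(recent) >= self.max_transfers:
                # Too many transfers in the window: lock out to protect the switch.
                self.locked_out = True
                return "none"
        return self.source


def simulate(primary_health):
    """Feed a sequence of primary UPS health readings through a fresh switch."""
    sts = StaticTransferSwitch()
    return [sts.step(t, ok) for t, ok in enumerate(primary_health)]


# Primary UPS keeps recovering and failing again, as in the case above.
print(simulate([True, False, True, False, True, False, True]))
# A single clean fault followed by a stable recovery.
print(simulate([True, False, False, False, True, True]))
```

With these assumed settings, the unstable input ends in a lockout, while the single fault and stable recovery does not. This mirrors the point of the case: both UPS units were present, yet the switching logic between them became the point of failure.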
Fred:
Cooling systems are an essential part of data centres. Without them, servers overheat, exposing data centres to equipment and component failures and massive downtime. The good news is that there are practical steps to keep cooling systems resilient.
In October 2023, a major data centre cooling system failure caused overheating that disrupted operations for two leading banks in Singapore. Their online banking apps were unavailable for close to 14 hours. The outage affected 2.5 million payments and ATM transactions and resulted in 810,000 failed login attempts.
The root cause was traced to a contractor error during a planned upgrade. The contractor incorrectly closed valves in the chilled water system, causing temperatures to rise beyond safe limits. Although both banks activated their disaster recovery and business continuity plans, technical issues at their backup data centres — including network misconfiguration and connectivity problems — prevented full recovery.
This outage exceeded regulatory limits on unscheduled downtime for critical systems, resulting in significant penalties and restrictions on IT changes.
Larry:
Even with robust precautions forming the first line of defence, failures of transformers, switches, UPS systems or on-site batteries can still occur. This is why insurance plays an essential role as the last line of defence. Here’s what we recommend for operators managing risks associated with lithium-ion battery and UPS systems.
Property Damage and Business Interruption (PDBI) covering:
PDBI provides operators the financial resources to repair or replace damaged equipment and cover the costs of business interruption.
Marsh is the trusted broker for more than 80% of the world’s largest cloud service and data centre providers. With deep industry knowledge and experience across Asia, we help data centre operators design safer facilities and transfer risk to keep their business running even during unexpected scenarios.
Get in touch to learn how we can help you protect your critical infrastructure.