Electrically coupled UPS technology advances offer a safer path and more uptime
A power outage in a data centre that results in service loss can propel data centre operators into the headlines.
In 2020, two separate power outages at different London data centres, each resulting in loss of services, did just that.
One outage lasted more than 14 hours before power and services were fully restored. Hundreds of service provider customers were directly impacted and thousands of their customers were without network-based services for a whole working day.
In the first of these outages, the operator identified the problem as originating with a faulty static UPS. A few days later, in a second incident at a facility run by a different operator, unconfirmed reports said the operator had attributed a small fire to a faulty UPS.
For any data centre operator a serious outage is initially viewed through the impact on its customers. Customers want to know:
How quickly was the problem spotted?
How did the operator react?
How quickly did it communicate with its customers?
How quickly was the issue identified?
How fast and effective were the remediation measures?
How quickly were services restored?
Were effective processes in place and were the protocols followed?
The initial stage of any major outage occurs when boards light up, social media goes into overdrive and journalists start asking questions.
Once the initial crisis has been addressed, the next step for the operator is to identify what went wrong and why. A thorough review ensues.
Don’t waste a crisis
All data centres, and especially commercial data centres, live or die by their uptime. If the equipment designed to protect power provision and backup was itself the cause of an incident, the investigation goes in a particular direction.
For a UPS to be identified as the cause of an outage raises many questions. If the root cause was a technical failure, this in turn can raise serious long-term concerns. Whenever an outage occurs, the opportunity must be grasped to ensure it doesn’t recur.
If a UPS failed and caused an incident, the questions asked will include:
Was it due to a service issue?
Is there a fundamental design issue?
Was the age of the particular unit a factor?
For an operator it is critical that there is no repeat of any failure.
Advances in reliability and uptime
For data centre management teams and the engineers who report to them, approaches to electrical engineering in large data centre environments are changing.
Operators are no longer constrained by fixed power topologies which were intended not to change once they were designed, deployed and commissioned.
As data centres become more important and uptime becomes even more vital, it seems odd that some designers and developers feel unable to take advantage of advances in UPS reliability, availability and safety that deliver high uptime levels.
There are real alternatives to paralleling multiple static UPSs throughout data centre power chains. Using many lower-power static UPSs introduces multiple points of failure into the data centre. Operators often end up with dozens of units connected in a chain, and within each static UPS there is a high count of inherently unreliable components such as power capacitors.
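The point about multiple points of failure can be illustrated with simple arithmetic. The sketch below is not drawn from any vendor or incident data. It assumes every unit faults independently with the same probability over a given period, and it ignores any redundancy in the power chain, which would reduce the chance that a single fault turns into an outage. The function name and the example figures are hypothetical.

```python
def prob_any_unit_faults(per_unit_fault_prob: float, unit_count: int) -> float:
    """Probability that at least one of `unit_count` units faults.

    Assumption: faults are independent and equally likely for each unit.
    Redundancy and common-cause failures are deliberately not modelled.
    """
    if not 0.0 <= per_unit_fault_prob <= 1.0:
        raise ValueError("per_unit_fault_prob must be between 0 and 1")
    if unit_count < 0:
        raise ValueError("unit_count must not be negative")
    # P(at least one fault) = 1 - P(no unit faults)
    return 1.0 - (1.0 - per_unit_fault_prob) ** unit_count


if __name__ == "__main__":
    # Illustrative per-unit figure only; not a measured failure rate.
    assumed_fault_prob = 0.01
    for units in (1, 12, 48):
        chance = prob_any_unit_faults(assumed_fault_prob, units)
        print(f"{units:>2} units: {chance:.1%} chance of at least one fault")
```

Under these assumptions the chance of seeing at least one fault grows with every unit added, which is why unit count is worth weighing alongside per-unit reliability.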
Any thorough review of an outage where the fault came from the UPS must explore alternative approaches to power provision and backup. This should include asking whether electrically coupled UPS technology such as Piller’s UB-V can provide more reliable and efficient power protection at scale than traditional static UPS technologies.
Whether the root cause of the recent high-profile outages was a battery fault, a faulty capacitor or a worn-out fan is something that is unlikely ever to make it into the public domain.
What is clear is that, as rising capacity demands mean more power will be needed, UPS selection becomes even more vital. With so many ageing static UPS units in operation, these outages are unlikely to be the last we hear of.
Lower risk
Data centre operations are about managing risk. Reducing power outages that put service at risk is the first duty of the M&E teams. Piller’s UB-V series is changing the way engineers view the provision of conditioned power in large-scale data centres across the world.
If you are concerned about outages, looking at alternatives to the static UPS is a good place to start.