Data centre operators are trained to anticipate upheaval due to fires, floods, power outages and other catastrophic events. The novel coronavirus, however, is sending people in charge of mission-critical facilities into uncharted territory.
"Data centres and IT teams are typically very good at planning. We plan for normal operations, we plan for the future, we plan for abnormal events ... [but] very few people have planned for the type of pandemic that we're facing now," said Fred Dickerman, senior vice president for management services at Uptime Institute.
The data centre advisory organisation just released a report aimed at helping operators of critical infrastructure facilities respond to the impact of Covid-19 as well as prepare for future epidemics by refining strategies and procedures. The free report, "Covid-19: Minimising critical facility risk," details recommendations and possible next steps.
"This is a very difficult but necessary subject," said Andy Lawrence, Uptime Institute's executive director, in a webcast put on by the organisation to discuss the industry's response to Covid-19 so far and what needs to happen going forward.
"It’s clear, I think, that this pandemic is going to last many months, possibly longer," Lawrence said. "A lot of what we talk about today will form the foundation for, probably, a class of resilience planning that we're all going to have to invest more time and effort into. Hopefully this is the first stage of that."
As a result of the pandemic, tech dependency is increasing: The number of teleworkers is skyrocketing, online retail is surging, business-to-business communications are taking place digitally, and social interactions are moving online.
"All of that is going to drive more network use and more use of the essential infrastructures that we're responsible for maintaining," Dickerman said.
"In the midst of all that, those of us in the data centre industry are facing the same health challenges that the general population is facing," Dickerman said. Maintaining the health and safety of data centre staff is the top priority – and it's essential if companies want to ensure data centre resiliency.
Here are some recommendations to get started:
Adapt existing response plans to Covid-19
Many companies' disaster recovery plans may not include pandemic preparedness, but that doesn't mean they have to start from scratch.
Instead, try to adapt an existing emergency plan written for a scenario that would make it difficult for staff to reach a data centre site, such as a hurricane, Dickerman said. "If you don't have a [pandemic-specific plan] in your reference library, you can look for a plan that can be adapted to the current situation," he said.
Create tiered Covid-19 response plans
"We don't want people going from normal activity to 100 per cent isolation in one step. It's not necessary and may not be cost effective," Dickerman said.
Best practices call for a three- to five-level contingency plan, according to Uptime Institute, with tiers of escalation ranging from taking reasonable precautions up to worst-case scenarios such as lights-out operation or even a complete site shutdown.
A plan should clearly identify the actions to be taken at each level and the circumstances that would trigger implementation of the next level.
For example, with respect to staff availability, companies should develop a staffing threat matrix for different staff absenteeism scenarios – such as less than 25 per cent absenteeism or 25-50 per cent absenteeism. Each tier should summarise the business impact, the effect on data centre operations, and the impact on service levels.
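As a rough illustration of how such a staffing threat matrix might be encoded so that escalation triggers are unambiguous, here is a minimal Python sketch. The two lower absenteeism bands come from the examples above; the third band, the handling of exact boundary values, the action text, and the names `Tier`, `TIERS` and `tier_for` are all hypothetical and are not taken from the Uptime Institute report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    level: int
    max_absenteeism: float  # exclusive upper bound, as a fraction of staff absent
    actions: str            # illustrative summary only, not from the report


# Assumption: bands below 50 per cent mirror the article's examples;
# the final band and every action description are placeholders.
TIERS = [
    Tier(1, 0.25, "Reasonable precautions; normal staffing"),
    Tier(2, 0.50, "Defer nonessential maintenance; split shift teams"),
    Tier(3, float("inf"), "Call in key reserves; prepare for lights-out operation"),
]


def tier_for(absent: int, total: int) -> Tier:
    """Return the response tier triggered by the current absenteeism rate."""
    if total <= 0 or not 0 <= absent <= total:
        raise ValueError("absent must be between 0 and total, and total must be positive")
    rate = absent / total
    for tier in TIERS:
        # Assumption: a rate exactly on a boundary escalates to the next tier.
        if rate < tier.max_absenteeism:
            return tier
    raise AssertionError("unreachable: the last tier has no upper bound")


if __name__ == "__main__":
    current = tier_for(absent=6, total=20)
    print(f"Level {current.level}: {current.actions}")
```

With 6 of 20 staff absent (30 per cent), the sketch selects the second tier. A real plan would attach the business impact, operational impact and service-level impact to each tier rather than a single line of text.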
Clean data centres more often, more deeply
When it comes to implementing a Covid-19 response plan, the first thing companies should consider is site cleaning.
"Data centres tend to be fairly clean, but we're going to have to up our game to limit the risk to the staff and to be in compliance with the requirements that we have to keep the site available," Dickerman said.
"Your data centre cleaning crews are going to become a more important part of your operation. You're going to want to reevaluate the frequency of your regular cleaning and your deep cleaning. You're going to want to review the materials and procedures that those cleaning crews use."
Establish workplace health protocols
New Covid-19 response procedures should include specific actions. For example: distributing tissues to people as they enter the data centre; using non-contact thermometers as part of access procedures; having a shift-change process that enables staff turnover at a distance; creating a cleaning checklist for incoming staff.
"The objective, of course, is not just to take the right actions but also to shift the way that your people are thinking about the things that they do during the day. [It's about] getting them in the habit and in the mindset of being prepared and avoiding possible contamination or infection," Dickerman said.
Safeguard supply chains; stress test VPNs
Safeguards recommended by Uptime Institute include planning for supply chain disruptions on items such as cabling and server racks, topping off fuel tanks, and deferring nonessential maintenance when possible.
Also, stress-test your VPNs, Dickerman added. "Stress-testing your VPNs is not just for remote work and for having your administrative people work remotely.
"If you do come into a worst-case scenario where you're not able to staff the data centre, you're going to want to be able to access your building management system and other operating systems remotely, safely and securely, and that means, in most cases, having a good VPN."
Restrict travel; create reserve data centre teams
To reduce the risk of transmitting an infection from one site to another, it's important to make sure staff don't travel between data centre sites, Dickerman said.
At higher levels of response escalation, consider setting up two shift teams so that one team can be at home, self-isolating, for two weeks while the other team works.
Designate key reserves who self-isolate and only come to the data centre if absolutely needed. "Evaluate your key personnel and [identify] who is critical to your operation, and designate alternates for those people. And in this current situation, establish a rule that those key personnel and their alternates don't come into contact with each other," Dickerman said.
Make the case that your data centre is critical
Another benefit of having a comprehensive Covid-19 response plan is that it can help establish that data centre operations are critical. Government and health authorities might limit travel in an effort to contain the spread of coronavirus, and a solid plan could help a company make the argument for an exception.
"Have a good plan that documents the criticality of your sites and the consequences to an area, or to the people you serve, of an outage, and also shows the steps that you're taking to keep your staff safe. That will go a long way to convincing authorities that your data centres are, in fact, critical and that those kinds of exceptions should be granted," Dickerman said.
Enlist management to ensure plans are carried out
Companies need to have a plan for both internal and external communications. The plans should cover modes of communication and frequency, for starters.
In the big picture, it's very important that management be actively involved in the execution of all plans and able to shift tactics as circumstances change, Dickerman said.
"Only management can make sure that all these things that you established as procedures and practices are actually implemented consistently and over time," he said. "People tend to relax after they've been in a crisis situation for 10 days or 20 days, and you want to avoid that if at all possible and make sure that the procedures keep on being applied for as long as necessary."
There are many more specific action items in the full report, "Covid-19: Minimising critical facility risk." Going forward, Uptime Institute plans to release a regular bulletin with updates on Covid-19.