Survey: Outages, staffing challenge data centres

Data centre operators are working to increase IT infrastructure reliability, keep key talent from being poached, and stay ahead of environmental regulations, Uptime Institute reports.

Data centres are working to improve the resiliency of their physical infrastructure, avoid increasingly expensive outages, and recruit skilled staff in a competitive labour market. Meanwhile, many aren’t tracking critical environmental metrics, even as they face looming sustainability requirements.

These are some of the highlights of Uptime Institute’s 12th annual Global Data Centre Survey, which tracks trends in capacity, tech adoption, and staffing.

Server refresh cycles getting longer

The lifespan of servers is increasing, according to Uptime, and it often exceeds the vendor-recommended three to five years. In Uptime’s 2015 survey, 34 per cent of respondents said they kept their servers in operation for five years or longer. In 2022, that climbed to 52 per cent.

There are multiple reasons for the increase, according to Uptime. Semiconductor availability is one factor. Component shortages resulted in higher prices and increased delivery times, and “smaller organisations with less buying power were often required to hold off on nonessential upgrades,” the research firm reported.

The trend may also reflect a slowdown in server power-efficiency gains. New IT hardware usually improves data centre efficiency, but Uptime suggests that those gains are shrinking, which weakens the incentive to upgrade.

“Generational changes, particularly in Intel-powered servers, which make up most of the market, are delivering much lower performance and energy improvements than before,” Uptime wrote. “Supply of more efficient servers using alternative (AMD and ARM-based) processors is still limited.”

Cost of data centre outages climbing

There are some encouraging metrics around data centre outages, but Uptime cautions that they can be misinterpreted.

In the big picture, Uptime has tracked a steady improvement in the number of outages per site. In 2022, 60 per cent of operators surveyed said they had an outage in the past three years, down from 69 per cent in 2021 and 78 per cent in 2020.

Another positive: Fewer managers reported serious or severe data centre outages. Historically, outages deemed serious/severe account for about 20 per cent of all outages, according to Uptime. In 2022, that fell to 14 per cent.

Despite fewer outages per site and less frequent severe outages, the overall number of outages globally grew year-over-year. On the plus side, the number of outages didn’t grow as fast as the global data centre footprint.

While metrics around outages can be tricky to interpret, one trend is clear, according to Uptime: Outages are becoming more expensive. In particular, the number of outages costing more than $1 million is on the rise.

When asked about the cost of their most recent outage, 25 per cent of respondents said the outage cost more than $1 million in direct and indirect costs combined, a significant increase from 2021, when 15 per cent reported million-dollar outages. Another 45 per cent of 2022 respondents said their most recent outage cost between $100,000 and $1 million, compared to 47 per cent in 2021.

From the report: “Why is the cost of outages increasing? This can be attributed to a variety of factors, ranging from inflation, fines, service level agreement breaches and the cost of labour, call outs and replacement parts – but the biggest single reason is the growing dependency of corporate economic activity on digital services and on the data centre. The loss of a critical IT service often translates directly and immediately into disrupted business and lost revenue.”

Power issues still main cause of outages

On-site power problems remain the single biggest cause of significant site outages by a large margin, according to Uptime. In 2022, 44 per cent of respondents said power was the primary cause of their organisation’s most recent impactful incident or outage.

The next most common cause was network issues, cited by 14 per cent. Other notable causes include cooling failures (13 per cent), IT systems problems (13 per cent), and problems at third-party providers such as SaaS, hosting, and cloud providers (eight per cent).

Failure to back up apps in multiple cloud zones

There are mixed messages on the cloud front, too. On the one hand, enterprises are becoming more confident about using the cloud for mission-critical workloads.

In 2019, 74 per cent of respondents said they wouldn’t place mission-critical workloads in a public cloud. In 2022, that fell to 63 per cent. At the same time, the percentage of respondents who said they have adequate visibility into the resiliency of the service provided by a public cloud rose from 14 per cent to 21 per cent, Uptime reports.

“Organisations are becoming more confident in using the cloud for mission-critical workloads, partly due to a perception of improved visibility into operational resiliency,” Uptime wrote. “However, other data suggests cloud users’ confidence may be misplaced.”

The issue is availability zones. An availability zone typically has redundant power and networking, and cloud providers recommend that users distribute their workloads across multiple availability zones in case one zone suffers an outage, according to Uptime. The data suggests enterprises aren’t doing that as diligently as they ought to.

When asked about the potential impact if a primary cloud provider were to experience an outage across a single availability zone, 35 per cent of respondents said that it would result in significant performance issues or downtime, and another 49 per cent said minor performance issues or downtime would be expected.

“This presents a clear contradiction. Users appear more confident that the cloud can handle mission-critical workloads, yet over a third of users are architecting applications vulnerable to relatively common availability zone outages,” Uptime wrote.

Data centre staffing problems worsening

As the number and size of data centres worldwide continue to grow, the number of job openings is also growing and is outpacing recruiting efforts, according to Uptime. It estimates staff requirements will grow globally from about 2.0 million full-time employee equivalents in 2019 to nearly 2.3 million in 2025. Some of those data centre jobs are in new categories and require specialised skills.

From the report: “The staff shortage affects almost all data centre job roles globally. In mature data centre markets, such as North America and Western Europe, much of the existing workforce is aging and many professionals expect to retire around the same time, leaving data centres with a shortfall on both headcount and experience.

“Hiring efforts are often offset by jobseekers’ poor visibility of the sector. Efforts to bolster talent pipelines by attracting career-changers to the data centre industry are still nascent.”

In the 2022 survey, 53 per cent of data centre operators reported difficulty finding qualified employees, up from 47 per cent in 2021 and 38 per cent in 2018. In addition, 42 per cent reported issues with staff being hired away, in most cases by competitors. That’s a significant jump from just 17 per cent in 2018.

Failure to track environmental data

Most respondents say they report on overall data centre power use and power usage effectiveness (PUE), but many still are not tracking critical environmental metrics, according to Uptime. Most data centre operators expect to soon be required to report on carbon emissions, for example, yet many are unprepared to comply.

Among survey respondents, 63 per cent said they believe authorities in their region will require them to publicly report environmental data in the next five years. However, just 37 per cent collect and report carbon emissions data (up from 33 per cent in 2021) and only 39 per cent currently report their water use (down from 51 per cent in 2021).

New laws, standards, and requirements will force operators to address these gaps and establish more stringent sustainability tracking and reporting practices in the coming years, Uptime reports.

This year’s Global Data Centre Survey includes responses from 800 data centre owners and operators and input from 700 data centre suppliers, designers, and advisors worldwide.
