Servers can consume more than half of the energy in modern data centres, which makes server efficiency attractive to companies looking to hit carbon-neutral sustainability targets. Plus, reducing energy usage can save money.
To help reach that goal, here are five ways to boost server efficiency, according to recent research from the Uptime Institute, which is focused on improving the performance, efficiency, and reliability of business-critical infrastructure.
- Upgrade to a newer server generation. For decades, servers have steadily become more energy efficient thanks to more efficient processors.
- Pick servers with high compute capacity as measured in number of transactions per second. Those are the most energy efficient.
- Go for high core count. In general, efficiency improves with the number of cores, although there is some tapering off at the highest end.
- Be aware that a server's overall power consumption (watts) can rise even as its efficiency (transactions per second per watt) improves.
- Embrace power-management features in two ways: by reducing core CPU voltage and frequency when utilisation is low, and by moving unneeded cores to an idle state.
For its analysis, Uptime focused on servers that use AMD EPYC or Intel Xeon processors, and it examined server generations from 2017, 2019, and 2021 using data from The Green Grid’s SERT database (additional details on the SERT data can be found at the end of the article).
Get rid of old, power-sucking servers
Older servers are less energy efficient than new ones, says Jay Dietrich, Uptime Institute's research director of sustainability. For example, Intel servers' efficiency improved by 34% between 2017 and 2019 for CPUs running at 50% utilisation, according to a recent report he co-authored. And AMD-based servers saw a whopping 140% improvement, he says.
Upgrading from 2019 to 2021 CPU-based servers will increase efficiency by 32% for Intel servers, and by 47% for AMD servers. The improved efficiency numbers cut across all levels of utilisation.
When comparing AMD and Intel servers, Intel servers were more efficient in 2017 at all levels of CPU utilisation, but since 2019 AMD has leapt ahead. With 2021 servers running at 50% utilisation, the average AMD server is 74% more efficient than an Intel.
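As a rough sanity check, the generational gains quoted above can be chained together. The sketch below assumes, as a simplification the report may not support exactly, that the percentages are averages measured on a comparable basis at 50% utilisation and can therefore simply be multiplied. The variable names are illustrative.

```python
# Generation-over-generation efficiency gains quoted in the article,
# expressed as multipliers (a 34% improvement is a factor of 1.34).
intel_2017_to_2019 = 1.34
intel_2019_to_2021 = 1.32
amd_2017_to_2019 = 2.40  # a 140% improvement
amd_2019_to_2021 = 1.47

# Assumption: the gains compound multiplicatively across generations.
intel_total = intel_2017_to_2019 * intel_2019_to_2021
amd_total = amd_2017_to_2019 * amd_2019_to_2021

# The article says the average 2021 AMD server is 74% more efficient
# than an Intel one at 50% utilisation. Working backwards gives the
# implied AMD-to-Intel efficiency ratio in 2017.
amd_vs_intel_2021 = 1.74
implied_amd_vs_intel_2017 = amd_vs_intel_2021 * intel_total / amd_total

print(f"Intel 2017 to 2021: x{intel_total:.2f}")
print(f"AMD 2017 to 2021:   x{amd_total:.2f}")
print(f"Implied AMD/Intel ratio in 2017: {implied_amd_vs_intel_2017:.2f}")
```

Under that assumption, Intel efficiency rose by roughly 77% over the four years and AMD efficiency by roughly 250%. The implied 2017 ratio of about 0.87 is consistent with Intel servers having been the more efficient ones in 2017.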
Don't underuse servers
Like a car idling in traffic, servers that aren't running at full capacity are wasting energy.
According to a 2022 Uptime Institute data-centre survey, only 47% of companies achieved server utilisation of 50% or better, up from 36% in 2020. Those numbers might be somewhat inflated, because companies that responded may have reported only their best-performing servers, for example those running only batch jobs, which might push utilisation as high as 80%, Dietrich says.
Utilisation rates in general, though, would likely be lower because many applications don’t run consistently. Business and enterprise software, for example, is used heavily during working hours but much less after hours. The utilisation of servers can be increased by having the ones hosting business apps run less time-sensitive workloads during off-peak hours.
The effort is worth it. Doubling CPU utilisation from low levels (20% to 30%) to higher levels (40% to 60%) can boost average efficiency dramatically, Uptime says.
For maximum impact, companies should look at increasing utilisation while also upgrading servers to the latest models. According to Uptime, combining increased utilisation with a server refresh can more than double efficiency.
That means an increase of 100% or more in workload processed for the same amount of energy. When done at scale, this can result in significant capital and operational savings, reduce energy requirements, and improve sustainability performance.
On the flip side, directly replacing a legacy server with a higher capacity one without also increasing the legacy workload actually reduces utilisation rates, says Dietrich, thereby undoing some of the benefits of the upgrade.
Increasing utilisation while also upgrading hardware takes additional planning, but the payoff is not only better efficiency but possibly fewer servers, because fewer new machines may be needed to carry the same workload.
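To illustrate why a one-for-one swap lowers utilisation and how consolidation shrinks the fleet, here is a minimal sketch. All the figures (100 legacy servers at 25% utilisation, new servers with twice the capacity, a 50% utilisation target) and both function names are made up for illustration and are not drawn from Uptime's report.

```python
import math


def utilisation(total_work: float, server_count: int, capacity_per_server: float) -> float:
    """Average utilisation when total_work is spread evenly over the fleet."""
    return total_work / (server_count * capacity_per_server)


def servers_needed(total_work: float, capacity_per_server: float, target_utilisation: float) -> int:
    """Smallest fleet that carries total_work at or below the target utilisation."""
    return math.ceil(total_work / (capacity_per_server * target_utilisation))


# Hypothetical legacy fleet: capacity is in arbitrary work units per server.
legacy_count = 100
legacy_capacity = 1.0
legacy_utilisation = 0.25
total_work = legacy_count * legacy_capacity * legacy_utilisation  # 25 work units

# Hypothetical new server with twice the capacity of a legacy one.
new_capacity = 2.0

# Option 1: replace every legacy server one-for-one, same workload.
one_for_one = utilisation(total_work, legacy_count, new_capacity)

# Option 2: consolidate the same workload onto new servers run at 50%.
consolidated_count = servers_needed(total_work, new_capacity, 0.50)

print(f"One-for-one swap utilisation: {one_for_one:.1%}")
print(f"Servers needed at 50% utilisation: {consolidated_count}")
```

With these invented numbers, a like-for-like swap would halve utilisation to 12.5%, whereas consolidating at 50% utilisation would need 25 new servers instead of 100.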
Opt for more powerful machines
Buying more powerful hardware can also result in better energy efficiency. For AMD servers in particular, efficiency improves sharply as server work capacity increases. Upgrading from a low-end server that handles two million SSJ transactions per second (SSJ is the server-side Java test workload described at the end of this article) to a high-end server that can handle more than eight million can double server efficiency. For Intel servers, there are still efficiency benefits, though they are less dramatic, Uptime says.
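These figures also show how a more efficient server can still draw more power. Since efficiency is work done per watt, the power ratio between two servers is their capacity ratio divided by their efficiency ratio. The sketch below treats "more than eight million" as exactly eight million and "can double" as exactly two, and assumes both servers run at the same utilisation. The function name is illustrative.

```python
def relative_power(capacity_ratio: float, efficiency_ratio: float) -> float:
    """Power ratio implied by efficiency = work per watt.

    power = work / efficiency, so the ratio of power between two servers
    is the ratio of their work capacity divided by the ratio of their
    efficiency.
    """
    return capacity_ratio / efficiency_ratio


# Article figures, rounded: capacity grows from two million to eight
# million, and efficiency doubles.
capacity_ratio = 8_000_000 / 2_000_000
efficiency_ratio = 2.0

power_ratio = relative_power(capacity_ratio, efficiency_ratio)
print(f"Capacity x{capacity_ratio:.0f}, efficiency x{efficiency_ratio:.0f}, "
      f"power x{power_ratio:.0f}")
```

In this simplified case the high-end server does four times the work for about twice the power, so it is twice as efficient even though its draw in watts has gone up.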
Increase server cores
Another way to improve efficiency dramatically is increasing the number of processor cores. In the case of 2021 AMD servers, as the number of server cores increases from eight to 64, the efficiency triples, Uptime found. For Intel, the increase was less but still significant for 2021 machines.
It’s important to note that not all workloads are capable of using all available cores, says Dietrich. "Some workloads will work most efficiently on, say, a 12-core processor," he says. So it’s important to match processors’ ability with the needs of the applications running on the server in order to gain the most efficiency.
In some cases, hypervisors and virtual machines can be used to maximise usage, he says, but not all applications lend themselves to these environments.
Manage power effectively
Power-management features of servers can improve the energy-efficiency equation, according to Uptime’s research, boosting server efficiency by at least 10%.
The way this works is that CPU voltage and frequency can be increased or decreased, and unused cores can move into a low-power idle state. Many organisations don't use these features, however, because of performance worries or latency issues.
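How these features are switched on depends on the platform. As one illustration, and assuming a Linux host whose CPU-frequency driver exposes the standard cpufreq interface in sysfs, the following sketch reports which frequency-scaling governor each core is using. The function name is hypothetical, and firmware settings (not shown) may also need to be enabled.

```python
from pathlib import Path


def read_cpu_governors(cpu_root: Path = Path("/sys/devices/system/cpu")) -> dict:
    """Map each CPU (e.g. 'cpu0') to its current cpufreq scaling governor.

    Assumes the Linux cpufreq sysfs layout. Returns an empty dict when
    the files are absent, for example on a non-Linux host or when no
    cpufreq driver is loaded.
    """
    governors = {}
    for governor_file in sorted(cpu_root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        cpu_name = governor_file.parent.parent.name
        governors[cpu_name] = governor_file.read_text().strip()
    return governors


if __name__ == "__main__":
    for cpu, governor in read_cpu_governors().items():
        print(f"{cpu}: {governor}")
```

A governor of performance typically keeps cores at high frequency, while governors such as powersave or schedutil allow frequencies to drop when load is light.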
According to the Uptime Institute report, power management can increase latency by 20 to 80 microseconds, which is unacceptable for some types of workloads, such as financial trading.
"And there are some applications where you might decide not to use it because it will cause performance or response time problems," Dietrich says. But there are other applications where delays won’t have a business impact.
"The biggest mistake is that some operators are risk averse," he says. "They think that if they're going to save a couple of hundred bucks a server on their energy bill but are risking breaking their SLA which will cost them a million dollars, they're not going to turn [power management] on."
Dietrich recommends that when companies buy new servers and run performance tests, they check whether power management adversely affects their applications.
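One way to frame such a test, sketched here under stated assumptions, is to measure request latencies with power management enabled and compare a high percentile against the application's latency budget. The function names, sample latencies and budgets below are invented. Only the 80-microsecond worst-case penalty comes from the Uptime figure quoted above.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def power_management_acceptable(latencies_with_pm_us, latency_budget_us, pct=99):
    """True if the chosen latency percentile stays within the budget."""
    return percentile(latencies_with_pm_us, pct) <= latency_budget_us


# Hypothetical baseline latencies in microseconds, measured with power
# management switched off.
baseline_us = [200, 210, 220, 230, 250]

# Model the worst case from the article: every request is 80 us slower.
# A real test would use measured latencies instead of this estimate.
with_pm_us = [latency + 80 for latency in baseline_us]

print(power_management_acceptable(with_pm_us, latency_budget_us=500))
print(power_management_acceptable(with_pm_us, latency_budget_us=300))
```

With these made-up samples the 99th-percentile latency under power management is 330 microseconds, so an application with a 500-microsecond budget passes and one with a 300-microsecond budget does not.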
"If it doesn't bother them, then you can use power management," he says. "You can implement a set of power-management functions that will let you save energy and still provide response time and performance that your customers want."
How Uptime measured efficiency
Uptime analysed the efficiency of 429 server platforms using The Green Grid’s Server Efficiency Rating Tool (SERT) database. The Green Grid is a consortium whose goal is to create tools, provide technical expertise, and advocate for energy and resource efficiency in data centre environments.
The SERT suite is an industry standard for measuring server efficiency; mandatory server efficiency requirements set by the EU’s Ecodesign Directive and the US Energy Star program specify that servers report the SERT overall efficiency metric.
Uptime analysed AMD and Intel server data from the SERT database, noting that different processor types have advantages and disadvantages depending on the workload. Uptime focused on servers that use AMD EPYC or Intel Xeon processors, and analysed server generations from 2017, 2019, and 2021.
The institute ran the servers through their paces with a simulated enterprise online transaction-processing application that stresses processors and memory. That simulation is the SERT worklet server-side Java (SSJ).
Uptime says it was chosen in part because SSJ data is available for eight levels (rather than just four levels) of server utilisation (12.5%, 25%, 37.5%, 50%, 62.5%, 75%, 87.5% and 100%), which allows for a more granular analysis.