AI tackles workload management challenges in the data centre

AI is ready to automate essential data centre management tasks. But are data centre managers prepared to make the transition from human to machine management?

As data centre workloads spiral upward, a growing number of enterprises are looking to artificial intelligence (AI), hoping the technology will reduce the management burden on IT teams while boosting efficiency and slashing expenses.

AI promises to automate the movement of workloads to the most efficient infrastructure in real time, both inside the data centre and across hybrid-cloud settings comprising on-prem, cloud, and edge environments.

As AI transforms workload management, future data centres may look far different from today's facilities. One possible scenario is a collection of small, interconnected edge data centres, all managed by a remote administrator.

Due to a variety of factors, including tighter competition, inflation, and pandemic-necessitated budget cuts, many organisations are seeking ways to reduce their data centre operating costs, observes Jeff Kavanaugh, head of the Infosys Knowledge Institute, an organisation focused on business and technology trends analysis.

"AI and automation have proven to be powerful tools in workload management, as it frees employees from time-consuming and mundane tasks and allows them to focus on work that actually requires a human," he says.

Most data centre managers already use various types of conventional, non-AI tools to assist with and optimise workload management. Yet these tools tend to be reactive rather than proactive, says Sean Kenney, director, advisory, at professional services firm KPMG. "They react to the problems in the data centre, but they don't collect data to determine any foresight to reduce the problem behaviour," he notes.

Sanket Shah, a clinical assistant professor of biomedical and health information sciences at the University of Illinois, Chicago, believes that AI is now poised to help data centre managers who find themselves with no reliable way to predict or plan for future needs.

"With AI, capacity and horsepower can be allocated in a more efficient manner, allowing organisations to scale and become flexible," he explains. "Automating certain processes and shifting power where necessary will ultimately lower costs for those [managers] that have rapidly evolving data needs."

The idea of using AI technology for data centre management is hardly new. Back in 2014, for instance, Google disclosed that it was using technology acquired through its purchase of UK-based AI specialist DeepMind to improve the management of facilities equipment at several of its data centre sites.

Today, the AI workload management field has expanded considerably to include a number of startups, such as DLabs, Digitate, Redwood Software, and Tidal Software. Larger players, such as Cisco, IBM and VMware, have also started entering the market.

As with most things AI, workload management technology is advancing rapidly. "There are a ton of choices and a ton of limitations, but there are usually ways to mitigate those limitations," notes Bill Howe, an associate professor at The Information School of the University of Washington. "I don't see the problem of choosing the right methods and engineering solutions ... to be particularly more or less challenging in workload management than any other complex AI application," he observes.

Fulfilling a need

A top priority for most data centre managers is optimising operations to meet peak demand. Yet no matter how carefully they plan and prepare, demand peaks and valleys often remain beyond their control.

"Where AI can bring unique improvements is that it can understand workload patterns and match those demands with data centre capacity," says Goutham Belliappa, vice president of AI engineering at business advisory and consulting firm Capgemini North America.
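To make the idea of matching workload patterns to capacity concrete, here is a minimal Python sketch. It is an illustration only, not a method described by anyone quoted here: the function names, the four time slots per day, the sample demand figures, and the 80 percent headroom rule are all assumptions, and a simple per-slot average stands in for whatever model a real tool would use.

```python
def forecast_by_slot(history):
    """Average demand per time slot across past days (a naive stand-in for a model)."""
    slots = len(history[0])
    return [sum(day[s] for day in history) / len(history) for s in range(slots)]


def slots_over_capacity(forecast, capacity, headroom=0.8):
    """Return the slots where forecast demand exceeds the chosen share of capacity."""
    limit = capacity * headroom
    return [slot for slot, demand in enumerate(forecast) if demand > limit]


# Hypothetical demand (in arbitrary units) for three past days, four slots each.
history = [
    [40, 55, 90, 60],
    [42, 57, 94, 58],
    [38, 53, 92, 62],
]

forecast = forecast_by_slot(history)
at_risk = slots_over_capacity(forecast, capacity=100)
print(forecast, at_risk)
```

In this toy data only the third slot is forecast to exceed the headroom limit, which is the kind of signal a scheduler could use to move or defer work ahead of the peak.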

AI management promises to free data centre teams from attending to an array of mundane, repetitive tasks, including server management; security settings; compute, memory, and storage optimisation; load balancing; and power and cooling distribution.

"All of these workloads can be automated or enhanced by AI," says Lian Jye Su, principal analyst at tech market advisory firm ABI Research.

AI can help analyse the data collected from individual machines and spot anomalies in the parameters that are being monitored, says Ramprakash Ramamoorthy, product director for AI and ML at IT management software developer ManageEngine.

"AI can also help predict breakdowns and outages much earlier, and this can help the data centre management team to mitigate downtime and to keep the clusters up and running in good health," he adds. "AI can also enable better temperature and voltage management, thereby directly cutting down on operational costs and helping reduce carbon footprint."
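As a rough illustration of spotting anomalies in monitored parameters, the Python sketch below flags any reading that sits far outside the range of the readings just before it. It is a basic rolling z-score check, not ManageEngine's method; the function name, window size, threshold, and sample inlet temperatures are hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate sharply from the preceding window."""
    anomalies = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        spread = stdev(recent)
        if spread == 0:
            # A perfectly flat window gives no scale to judge against; skip it.
            continue
        if abs(readings[i] - mean(recent)) / spread > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical server inlet temperatures in degrees Celsius, with one spike.
inlet_temps_c = [22.1, 22.3, 22.0, 22.2, 22.4, 22.1, 22.3, 29.8, 22.2, 22.0]
print(find_anomalies(inlet_temps_c))
```

Production tools would likely use richer models and many more signals, but the principle is the same: learn what normal looks like, then surface the departures early.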

While various AI approaches can be used, a workload management tool should always ensure that model predictions are fully explainable, Ramamoorthy says. "More often than in other domains, a decision taken by an AI system in data centre workload management will be acted upon by a team or teams of people working together," he explains.

Therefore, AI model decisions should be interpretable, allowing the IT team to better understand the intent of the model's decision and to act accordingly. "AI models can, at best, be 80 to 85 percent accurate, so this would also help the human teams correlate sensible decisions by rightly interpreting the AI model's decision," he notes. It would also be useful for effective workload management if the AI model could give a confidence score to the decision it presents.
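One way to picture a confidence score in practice is as a gate between automatic action and human review. The Python sketch below assumes a hypothetical `Recommendation` record carrying an action, a confidence value between 0 and 1, and a plain-language reason; the 0.9 cut-off is an arbitrary example, not a figure from Ramamoorthy.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    action: str        # what the model proposes, e.g. "migrate vm-17 to rack B"
    confidence: float  # model's own confidence, from 0.0 to 1.0
    reason: str        # human-readable explanation shown to the IT team


def route(rec, auto_threshold=0.9):
    """Apply high-confidence recommendations automatically; send the rest to people."""
    if not 0.0 <= rec.confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    return "auto-apply" if rec.confidence >= auto_threshold else "human-review"


rec = Recommendation(
    action="migrate vm-17 to rack B",
    confidence=0.82,
    reason="rack A inlet temperature trending upward",
)
print(route(rec), "-", rec.reason)
```

Keeping the reason alongside the score means that whichever path a recommendation takes, the team can see why the model proposed it.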

As AI and ML tools become more widespread, organisations are learning that the best outcomes are achieved when human intelligence collaborates, not competes, with the technologies, says Richard Boyd, co-founder and CEO of artificial intelligence and machine learning developer Tanjo.

"Machines simply cannot replace humans in many respects, but there are certainly areas where machines are much better than humans," he says. "Popular opinion will shift once AI and ML become prevalent and workers adapt to this new partnership."

Data centres can leverage AI/ML to improve performance as well as to optimise configuration and deployments, says Brons Larson, AI strategy lead at Dell Technologies.

"AI/ML enables dynamic orchestration of resources versus workloads to optimise resource utilisation to better manage costs," he states. All AI solutions, regardless of application or vendor, require expertise to properly configure and optimise value, Larson adds. "This starts with properly capturing and evaluating data for training and testing and managing deployed models against drift and bias."
