Lessons learned from running the world’s largest data centers

15 Jan 18

While managing facility operations for large data centers certainly takes specialized skills in a range of disciplines, the more you do it, the better you get at it.

Given that Schneider Electric has more than 800 people managing facility operations for some 100 large data centers around the globe, it’s fair to say we’ve learned a great deal.

In fact, I recently viewed a webinar that a colleague of mine presented on the topic, “Lessons Learned from Running the World’s Largest Data Centers.” 

In this post, I’ll pass along at least a few of those lessons (and invite you to check out the webinar for the rest).

Most of the lessons we’ve learned fall into one of five general categories:

  • Competency
  • Standardization
  • Risk management
  • Tracking and reporting
  • Operation and maintenance costs

Competency

In terms of competency, the main issue is that most companies’ expertise lies in areas other than managing data centers, a topic we covered in a previous post.

That’s as it should be.

If you’re in, say, retail, healthcare or manufacturing, your expertise lies in those areas; the data center is merely a supporting function.

But it’s an issue if you want to run the data center using internal employees, because you don’t have a large workforce to pull from. I’ve been to conferences where entire panels have been dedicated to the issue of training millennials in data center operations. Universities are only now starting programs to address the issue.

As a result, we routinely see companies with data center infrastructure management (DCIM) and other tools installed, but they’re not using them to their full extent – because they simply don’t have the appropriate expertise.

Standardization

With respect to standardization, companies tend to run into trouble after mergers and acquisitions, or if they experience rapid growth.

They wind up with a series of data centers, with no common set of standards in terms of how to operate them.

Whether you’ve got two data centers or 20, you need to share lessons learned among all of them.

Schneider Electric’s standards and procedures are best in class in part because we are diligent about sharing what we learn at each of the 100 or so data centers we operate. We use those lessons to continually update our processes and procedures so that when a problem occurs, we have sound emergency procedures in place to follow.

Those procedures should include back-out steps to follow if something unexpected happens after a data center change, to keep the issue from getting worse.

Risk management

Such procedures are closely related to the risk management topic. One of the big lessons here is to have a full-system approach to data center management.

If you need to take a component out of service to perform maintenance, for example, you need to first understand the impact and dependencies of that component with respect to the rest of the data center.

Doing so requires a thorough understanding of the data center.
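To make the idea of impact and dependencies concrete, here is a minimal Python sketch of a dependency check you might run before taking a component out of service. The component names and the FEEDS map are hypothetical, invented purely for illustration; a real facility would build the map from its own drawings and asset records. The sketch also ignores redundancy (a rack fed from two PDUs, say), so it lists everything that could be affected rather than everything that will go down.

```python
from collections import deque

# Hypothetical dependency map: each key feeds the components listed as its values.
FEEDS = {
    "utility_feed": ["ups_a"],
    "generator": ["ups_a"],
    "ups_a": ["pdu_1", "pdu_2"],
    "pdu_1": ["rack_01", "rack_02"],
    "pdu_2": ["rack_03"],
    "chiller_1": ["crah_1"],
    "crah_1": ["rack_01", "rack_02", "rack_03"],
}


def downstream_impact(component, feeds=FEEDS):
    """Return every component that directly or indirectly depends on `component`."""
    affected = set()
    queue = deque([component])
    # Breadth-first walk over everything the component feeds.
    while queue:
        current = queue.popleft()
        for dependent in feeds.get(current, []):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return sorted(affected)


print(downstream_impact("pdu_1"))  # ['rack_01', 'rack_02']
```

Running the same function on "ups_a" returns both PDUs and all three racks, which is the kind of answer you want in hand before scheduling the work.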

For any data center we manage, Schneider Electric likes to get in on the construction phase, or as close to it as possible.

That way we can gain a thorough understanding of the architectural drawings, piping, wiring and so forth – all of which is knowledge that helps mitigate the risk that goes into operating a data center.

Tracking and reporting

Tracking and reporting is an area that gets overlooked far too often, which leads to wasted operational spending.

With proper tracking and reporting, you should be able to identify stranded IT capacity – that old rack of servers over in the corner, for example, that nobody is really sure still serves a purpose. (We’ve all seen those, right?) 

Reclaiming that capacity can help you stave off a data center expansion by getting more out of the space you’ve already got.
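As a rough illustration, the report that surfaces those racks can be quite simple. The Python sketch below assumes you track two numbers per rack, the power allocated to it and its average measured draw. The field names, the sample figures and the 20 percent cutoff are all assumptions made for the example, not values from any particular DCIM tool. A flagged rack is a candidate for investigation, not proof that it is unused.

```python
from dataclasses import dataclass


@dataclass
class Rack:
    name: str
    allocated_kw: float  # power reserved for the rack
    avg_draw_kw: float   # average measured draw over the reporting period


def stranded_capacity(racks, min_utilization=0.2):
    """Return (rack name, reclaimable kW) for racks drawing less than
    `min_utilization` of their allocation, largest first."""
    flagged = []
    for rack in racks:
        if rack.allocated_kw <= 0:
            continue  # nothing allocated, so nothing to reclaim
        utilization = rack.avg_draw_kw / rack.allocated_kw
        if utilization < min_utilization:
            reclaimable = round(rack.allocated_kw - rack.avg_draw_kw, 2)
            flagged.append((rack.name, reclaimable))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


racks = [
    Rack("rack_01", allocated_kw=8.0, avg_draw_kw=6.5),
    Rack("rack_07", allocated_kw=8.0, avg_draw_kw=0.5),
    Rack("rack_12", allocated_kw=5.0, avg_draw_kw=0.5),
]
print(stranded_capacity(racks))  # [('rack_07', 7.5), ('rack_12', 4.5)]
```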

Operation and maintenance costs

Which leads to the final area, operation and maintenance costs.

We’ve learned plenty of lessons in how to keep these costs down, like using condition-based and predictive maintenance to replace components only when they really need it, as opposed to when some schedule says they do. 
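Here is one way the condition-based idea might look in code. This Python sketch takes a short history of readings for a component and estimates how many reporting periods remain before the reading reaches a service threshold, using a simple straight-line trend. The readings, the thresholds and the straight-line assumption are all hypothetical; real predictive maintenance typically relies on richer models and manufacturer guidance.

```python
import math


def periods_until_threshold(history, threshold):
    """Estimate how many more reporting periods until the reading reaches
    `threshold`, using the average change per period across `history`.
    Returns 0 if already at or above it, None if the reading is not rising."""
    latest = history[-1]
    if latest >= threshold:
        return 0
    if len(history) < 2:
        return None  # not enough data to see a trend
    slope = (history[-1] - history[0]) / (len(history) - 1)
    if slope <= 0:
        return None  # flat or improving, no service date to predict
    return math.ceil((threshold - latest) / slope)


# Hypothetical readings, oldest first.
filter_pressure_drop_pa = [150, 160, 170, 180]
battery_resistance_mohm = [4.0, 4.6, 5.2]
chiller_vibration_mm_s = [2.2, 2.1, 2.1]

print(periods_until_threshold(filter_pressure_drop_pa, 250))  # 7
print(periods_until_threshold(battery_resistance_mohm, 5.0))  # 0
print(periods_until_threshold(chiller_vibration_mm_s, 4.5))   # None
```

In this made-up example the battery needs attention now, the filter can wait several periods, and the chiller shows no trend toward its limit, regardless of what a fixed calendar schedule would say.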

And if you effectively track your assets (see previous point), then you can start determining which ones require the most maintenance – and potentially save money by replacing them. 
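A small sketch of that comparison, again in Python: total the maintenance spend per asset from a work-order log, then flag any asset whose spend has reached a chosen fraction of its replacement cost. The work orders, the replacement costs and the 50 percent ratio are hypothetical figures chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical work-order log: (asset, cost of the maintenance event).
work_orders = [
    ("crah_1", 1200.0),
    ("crah_2", 300.0),
    ("crah_1", 1800.0),
    ("ups_a", 900.0),
    ("crah_1", 1500.0),
]

# Hypothetical replacement costs per asset.
replacement_cost = {"crah_1": 8000.0, "crah_2": 8000.0, "ups_a": 30000.0}


def replacement_candidates(orders, replacement, ratio=0.5):
    """Return (asset, total maintenance spend) for assets whose spend is at
    least `ratio` of their replacement cost, highest spend first."""
    totals = defaultdict(float)
    for asset, cost in orders:
        totals[asset] += cost
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(asset, total) for asset, total in ranked
            if total >= ratio * replacement[asset]]


print(replacement_candidates(work_orders, replacement_cost))  # [('crah_1', 4500.0)]
```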

Article by Anthony DeSpirito, Schneider Electric Data Center Blog 
