Opinion: How to capture “Lessons Learned” with cloud-based DCIM

09 Oct 18

Article by Schneider Electric Digital Services and Data Centre Software vice president and general manager Kim Povlsen

“Those who cannot remember the past are condemned to repeat it.” – George Santayana

From a distributed IT and data centre monitoring perspective, what if cloud-based data centre infrastructure management (DCIM) software could help you determine whether current events are critical or unimportant based on historical data?

Knowing the difference between a non-event, an emergency, and an outright crisis is vital to continuous and efficient operations. IT and data centre professionals must be able to differentiate between the thousands of alarms that can be activated at any time in a normal working day.

Different levels of event must drive different actions

Operators can face many challenging scenarios and must make quick decisions based on real-time data. They must learn to classify the impact of alarms to determine a proper response.

Software with predictive analytics could sift through routine events, such as an alarm firestorm, and identify which of them are true emergencies.

Other events are unique – so extreme they are impossible to predict based on historical information.

Then there are those events that are difficult, but not impossible, to predict.

In the European heat wave of 2018, one data centre operator reported getting 20,000-30,000 alarms a day because the outside air temperature was regularly rising above 35°C (95°F).

Operators quickly recognised a pattern. An uptick in temperature routinely happened before the cooling system kicked in and brought the temperature back down.

This repeating cycle caused temperatures within the data centre to regularly rise above, and then fall below, target thresholds as often as 30 times a day. Monitoring systems responded by generating thousands of alarms on the rack PDUs as the thresholds were crossed.

Analysis helped operators understand which alarms needed attention and which could be ignored.
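The repeating cycle described above can be illustrated with a minimal Python sketch. It counts how often a series of temperature readings rises above a threshold and flags the sensor as "flapping" when that happens more often than a chosen limit. The function names, the sample readings, and the limit of 10 crossings are illustrative assumptions, not part of any DCIM product.

```python
from typing import List


def count_threshold_crossings(readings: List[float], threshold: float) -> int:
    """Count upward crossings: a reading at or below the threshold
    followed by a reading above it."""
    crossings = 0
    for prev, curr in zip(readings, readings[1:]):
        if prev <= threshold < curr:
            crossings += 1
    return crossings


def is_flapping(readings: List[float], threshold: float,
                max_crossings: int = 10) -> bool:
    """Return True when the readings cross the threshold more often
    than max_crossings (a hypothetical limit chosen by the operator)."""
    return count_threshold_crossings(readings, threshold) > max_crossings


# Hypothetical readings in degrees Celsius that oscillate around a 27 degree threshold.
sample = [26.0, 28.0, 26.0, 28.0, 26.0, 28.0]
print(count_threshold_crossings(sample, 27.0))
print(is_flapping(sample, 27.0, max_crossings=2))
```

A sensor flagged this way could have its repeated alarms grouped into a single notification, which is one possible way to separate the alarms that need attention from those that can be ignored.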

A software system that can recognise the characteristics of an event based on historical data, and grade the effects of and responses to previous alarms, can tell personnel how critical such an event is likely to be.

Stop reacting, start predicting

New cloud-based data centre infrastructure management (DCIM) tools are constantly evolving to improve operations and mitigate potential issues.

By capturing large volumes of anonymised data points, cloud-based DCIM software is designed to help operators move from being reactive to being predictive.

The software system learns from operators’ reactions to different events and becomes “smarter” by establishing alarm priorities based on previously seen conditions.

By automatically building ratings of criticality each time an alarm occurs, the software can guide operators’ reactions with appropriate warnings.
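One simple way to picture such a rating is sketched below in Python. It assumes that each past alarm is logged together with whether an operator actually had to act on it, and it rates an alarm type by the share of past occurrences that needed action. The class name, method names, and alarm type strings are hypothetical, and real DCIM software may use a very different approach.

```python
from collections import defaultdict


class AlarmHistory:
    """Keeps per-alarm-type counts of occurrences and operator actions."""

    def __init__(self) -> None:
        self._seen = defaultdict(int)
        self._acted = defaultdict(int)

    def record(self, alarm_type: str, needed_action: bool) -> None:
        """Log one occurrence of an alarm and whether it needed action."""
        self._seen[alarm_type] += 1
        if needed_action:
            self._acted[alarm_type] += 1

    def criticality(self, alarm_type: str) -> float:
        """Share of past occurrences that needed action, from 0.0 to 1.0.
        Alarm types never seen before are rated 1.0 (an assumption:
        treat the unknown as critical until history says otherwise)."""
        seen = self._seen.get(alarm_type, 0)
        if seen == 0:
            return 1.0
        return self._acted.get(alarm_type, 0) / seen


history = AlarmHistory()
for needed_action in (False, False, False, True):
    history.record("rack_pdu_temp_high", needed_action)
history.record("ups_on_battery", True)

# Rank known alarm types from most to least critical.
ranked = sorted(["rack_pdu_temp_high", "ups_on_battery"],
                key=history.criticality, reverse=True)
print(ranked)
```

The rating is updated every time an alarm is recorded, so alarm types that operators routinely dismiss drift towards the bottom of the ranking while those that usually demand action stay at the top.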

If a crisis does occur, the system stands ready to highlight what is important and push to the background anything that is not. Through this support, operators can more quickly perceive what constitutes the real cause of a problem and respond appropriately.

The software system can become personalised to a particular IT environment and those who operate it in much the same way that a search engine, online retailer, or digital content provider learns your preferences and serves items that relate directly to your primary interest.

Discover today’s cutting-edge cloud-based DCIM tools

In a distributed IT or data centre environment, we are always learning more about real problems.

Improving today’s alarm systems will increase productivity and help deal with major hazards.

As systems scale and more data is captured, predictive analytics become feasible. This has the potential to raise the bar on what is possible through state-of-the-art, cloud-based DCIM software.
