The price of recovery

01 Aug 09

DR can cost more than it is really worth.

New Zealand businesses are increasingly interested in exploring disaster recovery (DR) policies and procedures. This new interest is driven by a variety of factors, such as the recentralisation of IT infrastructure, increased real estate costs, and server virtualisation.

As communications infrastructure improves and becomes cheaper, organisations are now able to recentralise much of their IT infrastructure. For example, instead of having email servers at each branch office, many companies are now bringing their email server infrastructure into one or two main data centres.

Data centres that have been installed in offices within the central business district are now being moved or redeployed to cheaper real estate on the outskirts of cities.

But the trouble with the centralisation of IT infrastructure and sharing of resources is that there are a lot more “eggs in one basket”. When you couple this with cheaper communications costs and a push for data centre redesign, many organisations (which may never previously have had a DR plan for more than a very small part of their infrastructure) are now investigating more comprehensive DR options.

The problem many organisations face when they look into DR is that the traditional costs are still prohibitively high. These costs are driven by factors such as duplicated servers, storage, networks and workstations sitting at the DR sites “doing nothing but eating power”.

Costs are also driven up by high-speed network connections for data replication, which cannot be used for anything else; by storage vendors that charge heavily for data replication on a per-TB basis; and by power, rack space and cooling costs that previously may have been absorbed by other parts of the business.

All of this just for DR, which, like most other forms of insurance, adds no immediate value to the business. These costs become especially difficult to justify when the main things they protect against are unlikely calamities such as random meteorite strikes or man-made disasters.

While outages caused by “acts of God”, terrorist attacks and utility failures garner significant press coverage, the more mundane “day to day” causes of data loss go unreported and generally unnoticed. Furthermore, a quick Google search on the term “causes of data loss” turns up far more results on what could be more accurately described as illegal data access, driven by legislation in the US that mandates public reporting of this class of information security failure. As NetApp founder Dave Hitz states in his blog, this kind of reporting “baffles our risk intuition and results in significant amounts of resources being dedicated to solve problems that may never occur”.

As a case in point, a recent survey of over 260 IT professionals found that internal infrastructure issues were the main cause of unexpected downtime, with database problems and other software issues making up a significant proportion of the remaining failures. An interesting feature of this report is that most business continuity plans still pay little attention to these common failures. Instead, companies focused their investments on mitigating problems that come from the outside, even though very few companies in the survey reported these kinds of external events ever contributing to downtime.

Disaster recovery plans tend to do little to protect the business against the main causes of data loss, such as administrator errors, and companies are still stuck with having to go back to their tape backup infrastructure to recover from the majority of problems.

NetApp believes that disaster recovery should provide ongoing business value, as well as protect against the worst-case scenarios. There are several technologies that can do this, such as MetroCluster, which allows both the main site and the DR site to be production sites; FlexClone, which allows multiple virtual copies of the data to be presented at the DR site, where all the other equipment can be used to accelerate software development and increase business agility; and Protection Manager, which automates the DR process, making it easy to use, deploy and test.

In this way, disaster recovery infrastructure is transformed from an insurance policy to something that can help drive business innovation.