Dr. Kai Li, a professor at Stanford University and co-founder of Data Domain, was recently in the country to give a guest lecture at AUT about de-duplication technology and the massive potential that can be harnessed by reducing redundant data.
Li said the technology will replace tapes and that data de-duplication will do for the storage industry what the iPod did for the music industry.
“As humans we just don’t like tapes,” he said. “That’s why we replaced music cassettes with the iPod, we’re replacing VCRs with DVRs, and now we’re replacing backup tapes with deduplication storage.”
Unlike backup tapes, from which stored data is slow to retrieve, de-duplicated storage can be accessed quickly and, if the software is well architected, data can be recovered reliably. Li said that with manual recovery from tape, the probability of a successful recovery is 90%, because 10% of the time the tape can’t be found or can’t be read back. Data de-duplication also presents opportunities in archival storage, especially for email, where many duplicate copies of the same content are typically stored, so the compression ratio is typically five to one.
“De-duplication can be done in multiple locations, it can be done at the source, or at the server, but in that kind of situation [archiving] de-duplication can help reduce network traffic in addition to reducing the space and power in the data centre,” he said.
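To illustrate Li's point about de-duplicating at the source, here is a minimal sketch rather than a description of any vendor's product. It assumes fixed-size 4 KB chunks identified by SHA-256 digests; the `BackupServer` class and the `backup_from_source` function are hypothetical names invented for this example. The client asks which chunks the server is missing and sends only those, which is where the saving in network traffic comes from.

```python
import hashlib


class BackupServer:
    """Hypothetical server that stores each unique chunk once, keyed by digest."""

    def __init__(self):
        self.chunks = {}

    def missing(self, digests):
        # Report which of the offered digests the server does not hold yet.
        return [d for d in digests if d not in self.chunks]

    def store(self, digest, chunk):
        self.chunks[digest] = chunk


def chunk_fixed(data, size=4096):
    # Assumption: fixed-size chunking, chosen here only for simplicity.
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup_from_source(server, data, size=4096):
    """Send only the chunks the server lacks; return the payload bytes sent."""
    by_digest = {hashlib.sha256(c).digest(): c for c in chunk_fixed(data, size)}
    sent = 0
    for digest in server.missing(list(by_digest)):
        server.store(digest, by_digest[digest])
        sent += len(by_digest[digest])
    return sent
```

Backing up the same data a second time sends no chunk payload at all, and changing one 4 KB chunk sends only that chunk. The digests themselves still cross the network, a cost this sketch does not count.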
As with any technology, Li said, there will always be those who are late to catch on, but, like the iPod, de-duplication is set to revolutionise the storage market.
“Any revolutionary replacement will take time. There are always people who are very eager to replace, there are always people who are resisting changes,” he said. “So this change will take time, but it is clear, I think in the industry, and in the marketplace, this is the right thing to do.”
But just as not all MP3 players are the same, Li said that as de-duplication becomes a ‘hot product’, adopters should be wary of the growing number of products in the marketplace and should test the actual compression ratio before committing to any one technology or product.
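One rough way to act on that advice is to estimate the ratio on a sample of your own data. The sketch below is an assumption-laden illustration, not how any particular product measures it: it uses fixed-size 4 KB chunks and SHA-256 digests, and `dedup_ratio` is a hypothetical helper name. The ratio is the logical size of everything submitted divided by the size of the unique chunks that would actually be stored.

```python
import hashlib


def dedup_ratio(blobs, chunk_size=4096):
    """Return logical bytes divided by unique-chunk bytes for a list of byte strings."""
    logical = 0
    unique = {}  # digest -> chunk length, one entry per distinct chunk
    for blob in blobs:
        logical += len(blob)
        for i in range(0, len(blob), chunk_size):
            chunk = blob[i:i + chunk_size]
            unique.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    stored = sum(unique.values())
    # With no data there is nothing to compress, so report a neutral 1.0.
    return logical / stored if stored else 1.0


# Five mailboxes each holding the same 8 KB attachment (made-up sample data).
attachment = b"a" * 4096 + b"b" * 4096
print(dedup_ratio([attachment] * 5))  # 5.0
```

Real products may use variable-size chunking and additional compression, so a figure from a sketch like this is only a baseline for comparison, not a prediction of what a given product will achieve.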
Li said: “You can call one thing data deduplication, but a good de-duplication technology will give you a much higher compression ratio, which is what you want.”