Monday, March 29, 2010

Thin provisioning, deduplication and storage virtualization

Akhtar Pasha writes that the market will grow in single digits and that server virtualization will force businesses to redesign their backup strategies. Thin provisioning, deduplication, storage virtualization and tape libraries will gain traction and SANs will be built using SSDs

In 2009, Indian enterprises moved to reclaim storage, redesign backup strategies and shift data to tape libraries, freeing up storage space rather than buying additional storage boxes. Surajit Sen, Director-Marketing & Alliances, NetApp, said, “Businesses were deleting data from their existing storage devices and moving it to cheaper media and, in this manner, adding capacity.”
The 40-60% increase in capacity was in sharp contrast to falling revenues in the external controller-based storage market, which contracted 17% in CY 2009 to $235.9 million from $285 million in CY 2008, as per IDC’s preliminary data. The worst quarters were Q2 and Q3, which completely eroded growth. The industry’s inflection point came only in Q4, when the market grew 29% to $65.9 million from $50.8 million in Q3. IBM was the clear leader in external controller-based storage with a 31.5% market share, followed by EMC with 22%.
Virtualization, deduplication and tape libraries carried the storage market
Customers were willing to do only incremental spending on storage. Therefore, technologies such as storage virtualization, deduplication, thin provisioning, tape automation and tape libraries gained traction as businesses looked at increasing the utilization of their networked storage assets. Sen added, “Their focus was on reevaluating their data storage strategy and increasing storage utilization, and therefore storage virtualization gained traction. We have sold our V-Series virtualization appliance, which can virtualize storage by sitting in front of the storage subsystem, to Apollo Hospitals, Aviva Life Insurance, TTSL and HSBC.”
Vivekanand Venugopal, Managing Director, Hitachi Data Systems India, added that one group, especially banks and telcos, was focused on reclaiming storage and therefore invested in technologies such as storage virtualization, dynamic provisioning of storage resources and thin provisioning, as well as in redesigning its backup strategies. The second group was focused on accelerating time to market and faster response times to improve operational efficiency. These customers built new data centers, and storage was an inevitable part of their IT infrastructure.
Manoj Chugh, President of EMC India & SAARC Region, and Director for Global Accounts for EMC APJ, said, “Though the revenue for external controller based storage did not grow in CY 2009, there was a 60% growth in data volumes. Unfortunately, IT budgets didn’t grow at that pace, leading to a challenge in information management. This forced customers to take a radical approach towards server virtualization as a natural corollary to reduce costs. Server virtualization has severely impacted backup and we saw quite a few customers reframing their strategies in this regard.”
Lakshman Narayanaswamy, Co-founder and Vice President-Products, Sanovi Technologies, also saw customers reworking their backup and DR strategies to meet the new requirements of storage virtualization. Customers have reworked their RTO, RPO and IOPS targets, as applications running in virtual machines (VMs) have to map to storage for high availability and reliability.
Sandeep Lodha, Vice President-Sales & Marketing, Netweb Technologies, noted that there was pent-up demand for unified storage among media houses and TV broadcasters. “We had a couple of engagements with news channels and TV broadcasters who are converting the data residing on tapes to disk. This was driven by two factors. One, to meet the regulatory compliance requirement under which they are supposed to store 90 days of news video that was broadcast. Two, they are moving from tapes to unified storage because they want to report their stories first, and for that they may need to link to a previous video.”
Another significant trend was the unprecedented demand for tape automation products and tape libraries, which grew on their own without much effort from vendors. According to Harmeet S. Malhotra, Senior Manager-Storage Marketing Asia Pacific & Japan, Dell India, “While networked storage took a hit, we saw huge growth in tape automation and tape libraries without spending a penny on marketing. Businesses continued to add raw capacity to their networked storage infrastructure, which kept the traction going in the secondary storage market.”
Niraj Mandal, Regional Sales Manager-SAARC & Middle East Countries, Tandberg Data (Asia) Pte. Ltd., added, “Many large and mid-market customers were simply adding raw physical capacity to manage their data explosion by investing in tape libraries and autoloaders.” Tandberg Data won the industry’s biggest deployment of tape libraries (in a single order) from LIC, through a tender published in August 2009. It consisted of 114 tape libraries with two FC disks for the insurer’s primary center and another six tape libraries for its DR center. Insurance companies need to preserve and archive policy details and transaction data throughout the lifecycle of the policy holder, typically spanning 25-35 years. Tape is the cheapest medium for long-term data retention and archival.
Another interesting trend in the secondary storage market was the increased number of AMC renewals for tape drives. Since investments in storage dried up and every project was questioned, customers wanted to extend the warranty (AMC renewal) of their tape drives and libraries; in other words, they were buying more AMCs. Mandal added, “With no fresh investments in storage, we saw a huge number of cases where customers asked us to extend the warranty of their existing tape drives and libraries rather than investing in new ones. This clearly signaled two things: that customers want to reuse storage, and that every rupee set against fresh purchases was questioned.”
Going forward, things are not expected to change substantially in the storage landscape in 2010. Analysts and vendors agree that external controller-based storage will see positive but single-digit growth in 2010, as storage budgets are not likely to grow any time soon. The CIO will have his task cut out, with KRAs directly linked to how much money he can save on IT spending.
Server virtualization will force businesses to redesign their backup strategies
"To get the maximum out of tiered storage, dynamic provisioning software would be required to create data volumes for each type of LUN. This would drive high utilization of disks"
- Vivekanand Venugopal
Managing Director, Hitachi Data Systems
"FC drives are losing their relevance to SSDs and we see the latter playing a big role in building SANs in the future along with SAS and SATA drives"
- Surajit Sen
Director-Marketing & Alliances, NetApp
"Ultimately, most backup will use source-based intelligence like Avamar. Along with continuous data technologies, it could kill the concept of the backup window"
- Manoj Chugh
President of EMC India & SAARC Region, and Director for Global Accounts for EMC APJ
Businesses are expected to take an aggressive stance on storage as a result of server virtualization projects. Pallab Talukdar, CEO, Fujitsu India Pvt. Ltd., said, “Thin provisioning, deduplication and tiered storage have become relevant as Indian companies shifted their focus to storage reclamation and on-demand storage. The savings from these technologies are measurable and they will continue to gain traction in 2010.”
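Thin provisioning itself is easy to picture: a volume promises the host more capacity than has physically been set aside, and real extents are allocated only when blocks are first written, which is what makes reclamation and on-demand storage possible. Below is a minimal conceptual sketch in Python; the ThinVolume class, the 4 MB extent size and the volume sizes are hypothetical illustrations, not any vendor's implementation.

```python
# Conceptual sketch of thin provisioning: the host sees a large virtual
# volume, but physical extents are allocated only on first write.
# All names and sizes here are hypothetical.

EXTENT_MB = 4  # assumed allocation unit


class ThinVolume:
    def __init__(self, virtual_gb):
        self.virtual_gb = virtual_gb      # capacity promised to the host
        self.extents = {}                 # extent index -> backing buffer

    def write(self, offset_mb, data):
        idx = offset_mb // EXTENT_MB
        if idx not in self.extents:       # allocate only on first write
            self.extents[idx] = bytearray(EXTENT_MB * 1024 * 1024)
        # copying `data` into the extent is omitted for brevity

    def allocated_gb(self):
        return len(self.extents) * EXTENT_MB / 1024


vol = ThinVolume(virtual_gb=500)          # host is promised 500 GB
vol.write(0, b"boot block")
vol.write(40_960, b"app data")            # a write ~40 GB into the volume
print(vol.allocated_gb())                 # ~0.008 GB actually consumed
```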
Virtual servers force users to address storage management and data-protection issues such as backup, remote replication, capacity planning and information security in new ways. Of all the concerns about implementing virtual server environments, performance comes out on top, although, collectively, storage management issues are also of great concern. Several factors affect backup strategy, and the design and architecture of backup processes therefore have to change in the wake of server virtualization.
Backup is disk I/O and network intensive; you may need more network bandwidth than you currently have within your virtual environment. Not all storage devices support protocols for LUN backup and mirroring in hardware, so you may need to spend more to achieve this level of backup. Not everything can be backed up without first handling data integrity issues, such as those required for databases, so businesses need to investigate how applications and storage will behave in a virtual environment. Off-site storage of backups, via tape or disk, may also be required. A rough sizing exercise is sketched below.
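As an illustration of the bandwidth point above, the sketch below estimates the sustained throughput needed to back up a set of VMs within a fixed window. The VM count, sizes and change rate are assumed figures chosen only to show the arithmetic, not numbers from the article.

```python
# Back-of-envelope sizing for a backup window in a virtualized environment.
# All inputs are hypothetical.

def required_throughput_mb_s(total_data_gb, window_hours, change_rate=1.0):
    """Sustained MB/s needed to move the data within the window.
    change_rate=1.0 models a full backup; 0.1 models a 10% incremental."""
    data_mb = total_data_gb * 1024 * change_rate
    return data_mb / (window_hours * 3600)

# Assume 40 VMs of 100 GB each and an 8-hour nightly window.
full = required_throughput_mb_s(40 * 100, 8)        # ~142 MB/s for a full backup
incr = required_throughput_mb_s(40 * 100, 8, 0.1)   # ~14 MB/s for a 10% incremental
print(round(full), round(incr))
```

A full backup at roughly 142 MB/s already exceeds what a single Gigabit Ethernet link can sustain, which is exactly why virtualized environments often need more backup bandwidth than expected.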
Consolidation of multiple OSs across different storage platforms, data deduplication and thin provisioning are having a severe impact on business continuity and disaster recovery, and there is considerable interest in continuous data protection (CDP). The challenge is to do all this with a single, standard platform for all kinds of backup.
Additionally, some large businesses that consolidated 10% of their servers in 2009 and have seen the benefits would like to ramp up their server virtualization projects from 10% to 25-30%. Chugh said, “As enterprises move into virtualization, they will have to overhaul their data backup and DR strategies because these won't apply so well in the new virtualized world.” There are two major reasons why virtualization requires a new approach to data backup and disaster recovery. One is virtual sprawl, the unchecked proliferation of virtual machines (VMs), which complicates matters from the data-protection perspective. The other is that distributing applications across VMs, or across VMs and physical servers, puts further strain on backup and recovery systems.
As businesses consolidate data from distributed physical servers onto VMs, they have to use storage technologies that will reduce storage costs and the information management chaos. As a result, they will have to redesign their storage and backup strategies for all kinds of data and tiers of storage (SSD, FC, SATA and SAS). Sandeep K. Dutta, Vice President-Storage, Systems & Technology Group, IBM India, said, “We had a couple of engagements with customers for our storage virtualization solution, the SAN Volume Controller (SVC), and for deduplication to reduce storage TCO. SVC is a virtualization engine that maps virtualized volumes and makes them visible to hosts and applications as physical volumes of storage devices. A large number of customers are reworking their backup strategies in the wake of server consolidation and virtualization.”
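Conceptually, a block-virtualization engine of this kind keeps a map from the extents of a virtual volume to extents on whatever physical arrays sit behind it, so host I/O can be redirected transparently and data can be moved without the host noticing. The sketch below illustrates only that general idea; the class names, array labels and extent layout are hypothetical and are not how SVC itself is implemented.

```python
# Conceptual sketch of block-level storage virtualization: a virtual volume
# is a table mapping virtual extents to extents on back-end arrays.
# All names below are hypothetical.

from dataclasses import dataclass


@dataclass
class PhysicalExtent:
    array: str      # which back-end storage box holds the extent
    lun: int        # LUN on that box
    offset: int     # extent offset within the LUN


class VirtualVolume:
    def __init__(self, name):
        self.name = name
        self.map = []                    # virtual extent index -> PhysicalExtent

    def add_extent(self, extent):
        self.map.append(extent)

    def locate(self, virtual_extent):
        return self.map[virtual_extent]  # translate a host I/O to its real home


vol = VirtualVolume("sap_data")
vol.add_extent(PhysicalExtent("array_a", lun=12, offset=0))
vol.add_extent(PhysicalExtent("array_b", lun=3, offset=7))
print(vol.locate(1))                     # I/O to extent 1 lands on array_b
```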
Improving disaster recovery is another driving force behind the combination of server virtualization and networked storage. A primary motivation for remote replication in the context of server virtualization is the desire to reduce the recovery time objective (RTO); replicating virtual machine images for disaster recovery helps lower RTO. One of the advantages of server virtualization is that it lets users replicate many servers to relatively inexpensive virtual machines rather than to physical servers, which significantly reduces the primary barrier to disaster recovery: high cost. In addition, disaster recovery with virtual machines can be less expensive than with physical servers because the process can, in many cases, be managed by the virtualization software.
Deduplication diminishes the appeal of VTL
The virtual tape library (VTL) is a technology that has seen little success in India.
Firstly, VTL makes sense where hundreds of servers with tons of data need to be backed up and, in the event of a failover, data must be retrieved quickly, as in the banking and telecom verticals; this clearly indicates that it will remain a niche technology for large enterprise customers only. Secondly, the cost of VTLs is prohibitive. Mandal explained, “VTL makes sense when you have hundreds of servers that need to be backed up. Currently, VTL vendors are charging license fees based on the number of servers that a customer wants to back up. This licensing policy has significantly increased the acquisition cost of VTLs. Additionally, VTL as a technology has slowly lost its sheen because of deduplication.”
Dutta agreed, “VTL has lost its relevance after deduplication strongly made inroads along with virtualization. We had a sizable pipeline for VTL in 2009 but we did not find many takers for this technology.”
Deduplication everywhere
"Thin provisioning, deduplication and tiered storage have become relevant as Indian companies shifted their focus to storage reclamation and on-demand
storage"
- Pallab Talukdar
CEO, Fujitsu India Private Ltd.
"Many large and mid-market customers were simply adding raw physical capacity to manage their data explosion by investing in tape libraries and autoloaders"
- Niraj Mandal
Regional Sales Manager-SAARC & Middle East Countries, Tandberg Data (Asia) Pte. Ltd.
Deduplication, which was hitherto applied at the time of backup onto secondary storage, is now going to happen on networked storage, starting with NAS appliances and then moving on to block devices. Mandal said that customers were using deduplication at the destination (secondary storage) to reduce backup times, but that the effort was slowly shifting to NAS, where customers now want to apply deduplication at the source rather than at the destination.
EMC bought Avamar for host-based deduplication. On the backup front, Chugh said that spreading data deduplication throughout the infrastructure, closer to the data source, will become more important in 2010, along with continuous data protection. EMC will offer primary storage deduplication for file systems with the next upgrade of its Celerra NAS platform, due early this year. “Ultimately, most backup will use source-based intelligence like Avamar. Along with continuous data technologies, it could kill the concept of the backup window,” said Chugh. Deduplication with VTL will help companies consolidate branch offices and reduce the backup window.
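The mechanics behind source-based deduplication are straightforward to sketch: the client chunks its data, fingerprints each chunk, and sends only chunks the backup target has never seen, so repeat backups move very little data over the network. The example below is a minimal illustration of that idea with an assumed fixed chunk size; it is not Avamar's actual chunking or hashing scheme.

```python
# Minimal sketch of source-side deduplication: only previously unseen
# chunks cross the network. Chunk size and data are hypothetical.

import hashlib
import os

CHUNK = 64 * 1024                     # assumed fixed chunk size
seen_on_target = set()                # fingerprints already stored at the target


def backup(data: bytes) -> int:
    """Return the number of bytes that actually had to be sent."""
    sent = 0
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in seen_on_target:   # new chunk: ship it
            seen_on_target.add(fingerprint)
            sent += len(chunk)
    return sent


day1 = os.urandom(1024 * 1024)        # 1 MB of data
print(backup(day1))                   # first run: everything is sent
print(backup(day1))                   # unchanged data: 0 bytes sent
```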
Malhotra added that deduplication was a feature and not a solution. He believed that there was a market for appliance-based target deduplication as well as a migration to other types of deduplication. Dell has a deduplication strategy for the high-end market thanks to an alliance with Data Domain. For the mid-range, it counts on the deduplication in Symantec Backup Exec 2010, and for the low end of the market its partner is CommVault, which has added block-level deduplication to the latest version of Simpana.
NetApp has 40,000 customers using deduplication on primary storage globally. In India, it has 300 customers using its deduplication technology on primary storage.
EMC sees deduplication in a different light, saying that it will become an integral feature of VTLs.
So we will see more vendors, especially NAS vendors, rolling out primary deduplication on those devices first. Tandberg too has announced an Atom-powered NAS/iSCSI box that will have deduplication and encryption as standard features. There will be quite a few of these devices over the next six to 12 months, and probably within 12 to 18 months some block vendors will come up with primary deduplication as well.
According to Darshan Joshi, Vice President, Storage and Availability Management Group, Symantec, deduplication will become widely deployed as a feature in 2010, rather than as a standalone technology. 70% of enterprises have still not deployed deduplication, but will do so going forward as the technology is built into tape libraries, NAS and, eventually, SAN.
SSDs in storage arrays
Solid state disks (SSDs) have started shipping in both servers and storage arrays, although the latter application seems more prevalent. They score on speed and offer long-term cost savings on power and cooling.
Although the price per gigabyte of SSDs is prohibitive compared to hard disk drives (HDDs), there are cases in which SSDs do save money over their HDD counterparts. This is possible in applications that use large numbers of HDDs at a fraction of their capacity in order to increase the storage system's I/Os per second (IOPS). Sen said, “FC drives are losing their relevance to SSDs and we see the latter playing a big role in building SANs in the future along with SAS and SATA drives.”
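The economics work roughly like this: when an application is bound by IOPS rather than capacity, the drive count is dictated by per-drive IOPS, and a large, mostly empty 15K HDD farm can end up costing more than a handful of SSDs. Every figure in the sketch below, per-drive IOPS and prices alike, is an assumption chosen purely for illustration.

```python
# Rough arithmetic behind "SSDs can be cheaper per IOPS".
# All performance and price figures are assumptions for illustration.

import math

target_iops = 30_000

hdd_iops, hdd_price = 180, 400        # assumed figures for a 15K HDD
ssd_iops, ssd_price = 10_000, 2_000   # assumed figures for an enterprise SSD

hdds_needed = math.ceil(target_iops / hdd_iops)   # 167 drives, capacity mostly unused
ssds_needed = math.ceil(target_iops / ssd_iops)   # 3 drives

print(hdds_needed * hdd_price)        # ~66,800 (before power, cooling and shelf space)
print(ssds_needed * ssd_price)        # ~6,000
```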
Dell has jumped on the SSD bandwagon with the PS6000S, a system with either 400 or 800 GB of capacity depending on the number of SSDs (8 or 16 of 50 GB each). The SSDs are made by Samsung. “While the rest of the market went for high-performance, expensive SSDs, we are taking a less expensive approach with the Samsung SSDs but are still offering a significant performance improvement compared to 15K hard drives. It gives our customers the benefits of solid state disk technology at a lower cost than what other systems are offering,” said Malhotra.
According to Chugh, the true benefit of tiered storage will come when data can be migrated automatically (based on the IO activity) from one disk (LUN) to another without disruption. The advent of new storage media (SSDs) and new automatic tiering methods such as Fully Automated Storage Tiering (FAST) have started to make people think less about physical storage and more about virtual storage. He added that EMC had quite a few customers that had started using SSDs. “We have strong reasons to believe that it will soon become a mainstream technology because of its power savings, space and cooling advantages,” he commented.
According to Venugopal, SSDs in storage arrays, alongside other disk technologies such as SATA and SAS, set the trend for tiered storage. “To get the maximum out of tiered storage, dynamic provisioning software would be required to create data volumes for each type of LUN. This would drive high utilization of disks. For example, HDFC Bank is using our dynamic provisioning software,” he said.
Another interesting technology is used in IBM’s XIV, which offers block SAN-like storage and uses SATA drives to deliver enterprise-class raw performance. The product is a cluster, or grid, of up to 15 storage and interface nodes linked by Ethernet and using 1 TB SATA drives. All of its capacity is virtualized into a single, mirrored pool of storage that delivers up to 79 TB of usable capacity. Dutta said, “Data is striped across all the drives in an XIV frame and this leads to RAID rebuild times of 30 minutes or so after a drive failure. Even if a drive fails, its contents are rebuilt automatically.” He added that extra capacity for an XIV frame is simply plugged in; the XIV software automatically discovers it, adds it to the pool of storage and then balances the current data load across it.
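A back-of-envelope calculation shows why such wide striping speeds up rebuilds: the lost data is re-created by every surviving drive in parallel rather than being rebuilt onto a single hot spare. All the figures below, the used capacity, the drive count and the per-drive rebuild bandwidth, are assumptions for illustration and not IBM specifications.

```python
# Illustrative rebuild-time estimate for a widely striped array.
# Every figure is an assumption, not a published specification.

used_on_failed_drive_gb = 900     # assumed data actually held by the failed 1 TB drive
remaining_drives = 179            # assumed surviving drives in the frame
per_drive_rebuild_mb_s = 3        # assumed rebuild bandwidth each drive can spare
                                  # without hurting host I/O

aggregate_mb_s = remaining_drives * per_drive_rebuild_mb_s      # ~537 MB/s in parallel
rebuild_minutes = used_on_failed_drive_gb * 1024 / aggregate_mb_s / 60
print(round(rebuild_minutes, 1))  # roughly half an hour under these assumptions
```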
Cloud storage would wait
Forrester released the findings of a global survey in which over 1,200 enterprise executives were interviewed; the data showed that barely 3% used the cloud for general storage purposes. Worse, a good 43% said that they were uninterested in cloud storage, citing issues such as service level guarantees, security and reliability as the principal reasons for holding back. If there is a silver lining for cloud providers, it is that interest in the cloud as a backup platform is slightly higher, most likely reflecting the fact that such an approach would still house critical data on local storage infrastructure, reserving the cloud for older data or for emergency use. Some say that while the survey may throw cold water on the general-purpose storage plans of cloud providers such as Amazon, Google and Microsoft, they can take heart from the rising number of backup and archival solutions that are adding built-in cloud compatibility. We found exactly the same development here in India.
Malhotra said he doubted that large enterprises would try cloud storage in 2010; they would rather adopt a wait-and-watch approach until it becomes necessary. We see this trend moving in two practical directions for deployment: cloud backups and non-critical data.
Venugopal said, “We see quite a bit of cloud storage as SPs ramp up their operations. This would boost dynamic block, file and content management to be available as private clouds ready for consumption. Additionally, I feel that with the availability of bandwidth, SPs will create content repositories say for record management (for banks and insurance companies). Telcos would consider backups to the cloud for their captive customers. These are some examples of new business models that are evolving in cloud storage.”
Some exciting cloud backup models already exist globally; in one, Fujitsu does cloud backup for SAP’s global operations.
Fundamentally, as far as taking non-mission-critical data and putting it into a public cloud service goes, there is no doubt that companies will become more comfortable with the idea over time. However, we see this type of exercise confined to file and object data rather than block storage, which still makes up the bulk of the storage business today. Ask any vendor what their private cloud infrastructure for block-based data looks like and they would have a tough time answering.
Renewed interest in tape automation
According to Talukdar, there will be a renewed thrust on tape automation and tape libraries in 2010 as businesses across verticals start to deploy video surveillance. This will generate tons of data that needs to be backed up, archived and vaulted, and made available for review. Given that data volumes are growing at 40-60% annually while IT budgets are not increasing in 2010, the relevance of tape and tape libraries will grow manifold as businesses continue to move their data-at-rest onto tape to cut networked storage costs. Tapes are not only cheaper and greener, they also help save money on power and cooling costs.
The industry has largely bid goodbye to DAT and DLT tapes and embraced LTO (LTO-3 and LTO-4), with LTO-5 expected to debut in 2010. One reason is that DLT is a proprietary Quantum technology and, like DAT, is no longer pursued by OEMs, be it IBM, HP, Dell or Tandberg. The cost of running DLT, and of its media, is also high compared to LTO, which offers higher capacity at a lower price.
LTO-5 takes the speed up to 180 MBps and the native (uncompressed) capacity to 1.6 TB. The compression ratio in the LTO-5 tape format is 2:1, the same as in earlier generations. Compared to LTO-4, generation 5 doubles the capacity and boosts the speed by another 50% while maintaining backward compatibility with previous generations. For those that need the speed or capacity, LTO-5 is a good option for early adopters.
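Using the figures quoted above, a quick calculation shows what they mean for a full backup pass; this is simple arithmetic on the stated specifications, nothing more.

```python
# Time to write a full LTO-5 cartridge at the quoted native figures.
# With 2:1 compression both capacity and effective speed double, so the
# time to fill a cartridge stays roughly the same.

native_capacity_gb = 1600     # 1.6 TB native, as quoted above
native_speed_mb_s = 180       # 180 MBps native, as quoted above

hours_to_fill = native_capacity_gb * 1024 / native_speed_mb_s / 3600
print(round(hours_to_fill, 1))   # ~2.5 hours for a full native-capacity pass
```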
Server virtualization has been—and will continue to be—a catalyst for new storage spending and increased adoption of networked storage. It will drive companies to invest in technologies that allow them to optimize their existing storage resources. Storage virtualization, thin provisioning and deduplication will be key technologies that will attract investments. The use of SSDs in storage arrays will start to take off while tape backup will continue to go strong.

akhtar.pasha@expressindia.com
