Storage Virtualization
Abstracting Physical Storage into Logical Pools for Cloud Scalability
What is Storage Virtualization?
Storage virtualization is the process of pooling physical storage from multiple network storage devices into what appears to be a single storage unit that is managed from a central console. It hides the complexity of the Storage Area Network (SAN) by presenting a logical view of the physical hardware to the user or application.
In a cloud environment, storage virtualization lets administrators perform tasks such as backup, archiving, and recovery more easily and in less time, because the complexity of the underlying storage infrastructure is hidden behind the logical view.
Key Mechanisms
Storage virtualization can be implemented at one of three layers: the host (server), the storage device (array), or the network (switch). Whichever layer is used, it relies on the following techniques:
- Block-level Virtualization: Decouples the logical drive from the physical disk blocks that back it. This is used in SAN environments to provide flexible "Virtual Volumes" to servers.
- File-level Virtualization: Used in Network Attached Storage (NAS) to eliminate dependencies between the data and the location where the files are physically stored.
- Pooling: Combining multiple physical disks (HDDs or SSDs) from different vendors into a single "Storage Pool" that can be sliced into smaller virtual disks.
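The pooling and block-mapping ideas above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular array or volume manager works: the `PhysicalDisk` and `StoragePool` classes, the disk names, and the first-fit allocation are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalDisk:
    name: str          # hypothetical device label
    num_blocks: int    # capacity expressed in fixed-size blocks


@dataclass
class StoragePool:
    """Pools the blocks of several disks and carves virtual volumes from them."""
    disks: list
    free: list = field(init=False)

    def __post_init__(self):
        # Every free extent is a (disk name, physical block number) pair.
        self.free = [(d.name, b) for d in self.disks for b in range(d.num_blocks)]

    def create_volume(self, num_blocks):
        if num_blocks > len(self.free):
            raise ValueError("pool has insufficient free blocks")
        # The mapping table: list index = logical block, value = (disk, physical block).
        mapping = self.free[:num_blocks]
        del self.free[:num_blocks]
        return mapping


pool = StoragePool([PhysicalDisk("hdd-a", 4), PhysicalDisk("ssd-b", 4)])
vol = pool.create_volume(6)
print(vol[0])          # ('hdd-a', 0)
print(vol[5])          # ('ssd-b', 1)
print(len(pool.free))  # 2
```

The server only ever sees logical blocks 0 to 5; the fact that the volume spans two disks is visible only in the mapping table.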
Key Use Cases in Cloud Computing
Storage virtualization underpins high-availability and disaster recovery strategies in the cloud, and it also improves capacity utilization:
Non-Disruptive Data Migration
Data can be moved from an old storage array to a new one while applications keep running, because the application sees only the logical address, which stays constant.
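A rough sketch of why the logical address can stay constant during a migration is shown below. The `VirtualVolume` class, the array names, and the block-by-block copy are hypothetical; a real system would also have to handle writes that arrive while a block is being copied, which this example ignores.

```python
class VirtualVolume:
    """Hypothetical virtual volume: applications address logical blocks only."""

    def __init__(self, backend, num_blocks):
        # array name -> {physical block number: data}
        self.backends = {backend: {}}
        # logical block -> (array name, physical block number)
        self.map = {lb: (backend, lb) for lb in range(num_blocks)}

    def write(self, lb, data):
        array, pb = self.map[lb]
        self.backends[array][pb] = data

    def read(self, lb):
        array, pb = self.map[lb]
        return self.backends[array].get(pb)

    def migrate(self, target):
        """Copy each block to the target array, then repoint its map entry."""
        self.backends.setdefault(target, {})
        for lb, (array, pb) in list(self.map.items()):
            if array == target:
                continue
            if pb in self.backends[array]:
                self.backends[target][lb] = self.backends[array].pop(pb)
            self.map[lb] = (target, lb)


volume = VirtualVolume("old-array", num_blocks=4)
volume.write(2, b"orders")
volume.migrate("new-array")
print(volume.read(2))   # b'orders'
print(volume.map[2])    # ('new-array', 2)
```

The application calls `read(2)` before and after the migration and gets the same data; only the mapping entry behind logical block 2 has changed.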
Thin Provisioning
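Allocating physical capacity on demand rather than up front, which lets administrators "over-subscribe" the pool. A virtual disk can be presented as 1 TB while consuming only the physical space actually written (e.g., 100 GB), allowing for better resource utilization. The trade-off is that the pool must be monitored, because it can run out of physical space if the virtual disks fill up.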
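As an illustration of allocate-on-write, the sketch below models a thin volume as a sparse dictionary of blocks. The `ThinVolume` class and the 4 KiB block size are assumptions chosen for the example, not the behavior of any specific product.

```python
class ThinVolume:
    """Hypothetical thin-provisioned volume: space is allocated on first write."""

    BLOCK_SIZE = 4096  # bytes per block (an assumption for this sketch)

    def __init__(self, virtual_size_bytes):
        self.virtual_size = virtual_size_bytes
        self.blocks = {}  # block index -> data; holds only blocks that were written

    def write(self, block_index, data):
        if (block_index + 1) * self.BLOCK_SIZE > self.virtual_size:
            raise IndexError("write beyond the advertised volume size")
        if len(data) > self.BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        # Pad short writes with zero bytes so every stored block is full-sized.
        self.blocks[block_index] = data.ljust(self.BLOCK_SIZE, b"\0")

    def read(self, block_index):
        # Unwritten blocks read back as zeros and occupy no physical space.
        return self.blocks.get(block_index, bytes(self.BLOCK_SIZE))

    @property
    def physical_bytes(self):
        return len(self.blocks) * self.BLOCK_SIZE


thin = ThinVolume(virtual_size_bytes=1024**4)  # advertised as 1 TiB
for index in (0, 7, 1_000_000):
    thin.write(index, b"payload")

print(thin.virtual_size)    # 1099511627776
print(thin.physical_bytes)  # 12288
```

The volume advertises 1 TiB, yet three written blocks consume only 12 KiB of backing space, which is what makes over-subscription possible.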
Storage Tiering
Automatically moving "hot" (frequently accessed) data to fast SSDs and "cold" data to cheaper HDDs, without the user needing to know where the data physically resides.
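One simple way to decide what counts as "hot" is to rank blocks by access count over a recent window. The sketch below assumes that heuristic; the `plan_tiers` function, the access log, and the SSD capacity are hypothetical, and production tiering engines typically use more elaborate policies.

```python
from collections import Counter


def plan_tiers(access_log, ssd_capacity):
    """Place the most frequently accessed blocks on SSD and the rest on HDD."""
    counts = Counter(access_log)
    # most_common() orders by descending count; ties keep first-seen order.
    hot = [block for block, _ in counts.most_common(ssd_capacity)]
    cold = [block for block in counts if block not in hot]
    return {"ssd": hot, "hdd": cold}


log = ["a", "b", "a", "c", "a", "b", "d"]
plan = plan_tiers(log, ssd_capacity=2)
print(plan)  # {'ssd': ['a', 'b'], 'hdd': ['c', 'd']}
```

Because placement is decided behind the logical address, rerunning the plan with a newer log can move blocks between tiers without the application noticing.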
Real-World Cloud Examples
Public cloud services and private cloud platforms alike rely on storage virtualization to offer scalable storage:
- AWS EBS (Elastic Block Store): Provides virtualized block-level storage volumes for use with EC2 instances.
- Google Cloud Persistent Disk: A durable, high-performance block storage service for Compute Engine instances.
- AWS S3 (Simple Storage Service): While an object storage service, it utilizes massive-scale storage virtualization to provide virtually unlimited capacity.
- Software-defined Storage (SDS): Used in private clouds built on platforms such as OpenStack to manage storage through software policy rather than proprietary hardware.
Integration with SDDC
Storage virtualization is a vital pillar of the Software-defined Data Center (SDDC). By decoupling data from physical disks, it enables the automation of storage provisioning and supports data portability, making it easier for consumers to move data across different cloud environments without being locked into specific hardware vendors.
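To make "automation of storage provisioning" concrete, the sketch below shows a policy-driven request in which the caller names a service level rather than a device. The policy names, their fields, and the `provision` function are invented for illustration and do not correspond to any real SDS API.

```python
# Hypothetical service levels; a real SDS platform would define its own.
POLICIES = {
    "gold": {"tier": "ssd", "replicas": 3},
    "bronze": {"tier": "hdd", "replicas": 1},
}


def provision(name, size_gb, policy):
    """Turn a policy name into a concrete volume specification."""
    if policy not in POLICIES:
        raise KeyError(f"unknown storage policy: {policy}")
    spec = POLICIES[policy]
    return {
        "name": name,
        "size_gb": size_gb,
        "tier": spec["tier"],
        "replicas": spec["replicas"],
        # Raw capacity consumed across all replicas.
        "raw_gb": size_gb * spec["replicas"],
    }


request = provision("db-data", size_gb=100, policy="gold")
print(request["tier"], request["raw_gb"])  # ssd 300
```

The request never mentions a vendor or a disk, so the same call could be satisfied by different hardware as long as the policy can be met.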