The Distributed Data Center

Posted on January 7, 2012 by TheStorageChap in Data Center, Federation, VPLEX

One of the key focuses for 2012 will be the distributed data center.

The key attributes of the distributed data center include:

  • ‘Building Blocks’ of resources
  • Load Balancing across sites
  • Federated data repository
  • Continuous or High Availability of applications and data
  • Optimal routing and response times

Why distribute your data center?

  • Increased Availability – Both DCs serve Production workloads whilst providing resiliency for the other.
  • Increased Asset Utilisation – Alternate DCs are expensive and their resources normally sit idle; in a distributed data center both sites do productive work.
  • Increased Performance (Locality of Data Access) – Data does not have to be read from the ‘Production’ site as the same data is R/W accessible at both sites.

Distributed Data Center Considerations

  • Front-End IP Access Layer – Needs to be able to achieve ‘Content Routing’ and Site Selection
  • Application and Database Layer – Needs to be able to support Workload Mobility, Load Balancing and Service Clustering
  • Back-End Data Access – Needs to be supported simultaneously across sites with locality of access

Storage Federation

One of the key elements of the distributed data center is the ability to federate your data across sites. There are effectively two methods of providing simultaneous data access across sites: a split node configuration and federated virtualization.

Split Node Configuration

A physical LUN is exposed from a storage array at each site to all members of the virtualization solution. A RAID 1 device is created from these LUNs, and a virtual volume is created from that device. Hosts at either site are dual-pathed across the sites. For any host at Site A or Site B, the virtual volume is accessed through a single preferred node, which accesses only the primary storage. For example, if the preferred node is in Site A and the host is in Site B, all of that host's traffic traverses the inter-site links.
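The path rule described above can be sketched in a few lines of Python. This is only an illustrative model, not any vendor's implementation: the class name SplitNodeVolume and the method crosses_site_link are hypothetical, and the model assumes the preferred node sits at the same site as the primary storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitNodeVolume:
    """Hypothetical model of a virtual volume in a split node configuration."""

    # Assumption: the preferred node and the primary storage share a site.
    preferred_node_site: str

    def crosses_site_link(self, host_site: str) -> bool:
        # All access goes through the single preferred node, so a host at
        # the other site must send its I/O over the inter-site links.
        return host_site != self.preferred_node_site


volume = SplitNodeVolume(preferred_node_site="A")
for site in ("A", "B"):
    path = "inter-site link" if volume.crosses_site_link(site) else "local"
    print(f"Host in Site {site}: {path}")
```

In this model, whichever site does not hold the preferred node always pays the inter-site penalty, regardless of where the mirrored copies physically live.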

Federated Virtualization

A physical LUN is exposed from the local storage array in Site A to the local virtualization cluster in Site A, and a physical LUN is exposed from the local storage array in Site B to the local virtualization cluster in Site B. A distributed device is created across the virtualization clusters, and a virtual volume that is accessible across both sites is created from that device. Hosts in Site A are connected to the virtualization cluster in Site A and hosts in Site B are connected to the virtualization cluster in Site B, yet all hosts have concurrent read/write access to the same distributed virtual volume. In this configuration the physical I/O is served from the storage array local to each site, enabling greater utilization of the storage at both sites.


EMC VPLEX uses a federated virtualization approach.

Each VPLEX site has a local VPLEX Cluster, and the physical storage and hosts at that site are connected to that VPLEX Cluster only. The VPLEX Clusters themselves are interconnected across the sites to enable federation. A device is taken from each of the VPLEX Clusters to create a distributed virtual volume. Hosts in Site A actively use the storage I/O capability of the storage in Site A; hosts in Site B actively use the storage I/O capability of the storage in Site B.

Split node virtualization solutions require all components to be cross-connected across sites. For example, Node A in Site A would need access to the storage in both Site A and Site B, and likewise for Node B in Site B. When a virtual LUN is created, it is mirrored across the physical disks, but only one of those disks is the primary for the virtual volume, whether it is accessed from Site A or Site B.

VPLEX distributed volumes are available from either VPLEX cluster and have the same LUN and storage identifiers when exposed from each cluster, enabling true concurrent read/write access across sites.

Most split node virtualization solutions have a preferred node, through which all access must be performed, irrespective of whether the host is at Site A or Site B.

When using a distributed virtual volume across two VPLEX Clusters, if the storage in one of the sites is lost, all hosts continue to have access to the distributed virtual volume, with no disruption. VPLEX redirects all read/write traffic to the remaining storage at the other site.
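The behaviour described above, local access in normal operation and redirection when one site's storage is lost, can be modelled with a small Python sketch. This is a simplified illustration under stated assumptions, not VPLEX code or its API: the class DistributedVolume and its methods fail_storage and serving_site are hypothetical names, and the model tracks only which site's storage serves a host's I/O.

```python
class DistributedVolume:
    """Hypothetical model of a distributed virtual volume with one storage leg per site."""

    def __init__(self, sites=("A", "B")):
        # Every site's storage leg starts out healthy.
        self.healthy = {site: True for site in sites}

    def fail_storage(self, site):
        # Simulate losing the physical storage at one site.
        self.healthy[site] = False

    def serving_site(self, host_site):
        # Normal case: I/O is served by the storage local to the host.
        if self.healthy[host_site]:
            return host_site
        # Local storage lost: redirect to the remaining healthy site.
        for site, ok in self.healthy.items():
            if ok:
                return site
        raise RuntimeError("no storage leg available")


volume = DistributedVolume()
print(volume.serving_site("A"))  # prints A: served locally
volume.fail_storage("A")
print(volume.serving_site("A"))  # prints B: redirected to the other site
```

The host keeps addressing the same volume throughout; only the site that physically serves the I/O changes.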

I will talk more about the distributed data center and the importance of Federated and Virtualised storage over the coming weeks; we will also be demonstrating the technology at @CiscoLive at the end of January.