Enterprise VDI

My name is Peter and I am the Principal Engineering Architect for Desktop Virtualization at Dell. :-)
VDI is a red-hot topic right now and there are many opinions out there on how to approach it. If you’re reading this, I probably don’t need to sell you on the value of the concept: more and more companies are deploying VDI instead of investing in traditional PC refreshes, and all trending data points to this shift only accelerating over the next several years as the technology gets better and better. VDI is a very interesting market segment as it encompasses the full array of cutting-edge enterprise technologies: network, servers, storage, virtualization, database, web services, highly distributed software architectures, high availability, and load balancing. Add high capacity and performance requirements to the list and you have a very interesting segment indeed! VDI is also constantly evolving, with a very rich ecosystem of players offering new and interesting technologies to keep up with. This post will give you a brief look at the enterprise VDI offering from Dell.
As a customer, and just a year ago I still was one, it’s very easy to get caught up in the marketing hype, making it difficult to realize the true value of a product or platform. With regard to VDI, we are taking a different approach at Dell. Instead of trying to lure you with inconceivable and questionable per-server user densities, we have decided to take a very honest and realistic approach in our solutions. I’ll explain this in more detail later.
Dell currently offers 2 products in the VDI space: Simplified, which is the SMB-focused VDI-in-a-box appliance I discussed here (link), and Enterprise, which can also start very small but has much longer legs to scale to suit a very large environment. I will be discussing the Enterprise platform in this post, which is where I spend the majority of my time. In the resources section at the bottom of this post you will find links to the 2 reference architectures that I co-authored; they serve as the basis for this article.

DVS Enterprise

Dell DVS Enterprise is a multi-tiered turnkey solution comprising rack or blade servers and iSCSI or FC storage, built on industry-leading hypervisors, software, and VDI brokers. We have designed DVS Enterprise with a great deal of flexibility to meet any customer need, and it can suit 50 to 50,000 users. As opposed to the more rigid “block” type products, our solutions are tailored to the customer to provide exactly what is needed, with flexibility for leveraging existing investments in network, storage, and software.
The solution stacks consist of 4 primary tiers: network, compute, management, and storage. Network and storage can be provided by the customer, provided the existing infrastructure meets our design and performance requirements. The compute tier is where the VDI sessions execute, whether running on local or shared storage. The management tier is where the VDI broker VMs and supporting infrastructure run. These VMs run from shared storage in all solutions, so management tier hosts can always be clustered to provide HA. All tiers, while inextricably linked, can scale independently.
[image]
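Because the tiers scale independently, rough sizing becomes a simple exercise once you know your target user count. Below is a minimal sketch of that arithmetic; the per-host densities and per-user IOPS figures are illustrative assumptions on my part, not validated DVS numbers, so substitute values from the reference architectures for real designs.

```python
# Rough capacity-planning sketch for the four-tier model described above.
# The densities below are illustrative placeholders, not validated DVS figures.
import math

USERS_PER_COMPUTE_HOST = 100   # assumed steady-state density per compute host
USERS_PER_MGMT_HOST = 2500     # assumed users served per management host
IOPS_PER_USER = 10             # assumed steady-state IOPS per VDI session

def size_tiers(user_count: int) -> dict:
    """Return a rough host/IOPS estimate for each independently scaling tier."""
    return {
        "compute_hosts": math.ceil(user_count / USERS_PER_COMPUTE_HOST),
        "mgmt_hosts": max(2, math.ceil(user_count / USERS_PER_MGMT_HOST)),  # min 2 for HA
        "tier1_iops": user_count * IOPS_PER_USER,
    }

if __name__ == "__main__":
    for users in (500, 5000, 50000):
        print(users, size_tiers(users))
```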

The DVS Enterprise portfolio consists of 2 primary solution models: “Local Tier 1” and “Shared Tier 1”. DVS Engineering spends considerable effort validating and characterizing core solution components to ensure your VDI implementation will perform as it is supposed to. Racks, blades, 10Gb networking, Fibre Channel storage…whatever mix of ingredients you need, we have it. Something for everyone.

Local Tier 1

“Tier 1” in the DVS context defines the disk source from which the VDI sessions execute, and is therefore the faster, higher-performing disk. Local Tier 1 applies only to rack servers (due to the amount of disk required), while Shared Tier 1 can be rack or blade. Tier 2 storage is present in both solution architectures and, while it has a reduced performance requirement, is used for user profiles/data and management VM execution. The graphic below depicts the management tier VMs on shared storage while the compute tier VDI sessions are on local server disk:
[image]
This Local Tier 1 Enterprise offering is uniquely Dell, as most industry players focus solely on solutions revolving around shared storage. The value here is flexibility: you can buy into high-performance VDI no matter what your budget is. Shared Tier 1 storage has its advantages but is costly and requires a high-performance infrastructure to support it. The Local Tier 1 solution is cost optimized and only requires 1Gb networking.

Network

We are very cognizant that network can be a touchy subject, with a lot of customers pledging fierce loyalty to the well-known market leader. Hey, I was one of those customers just a year ago. We get it. That said, a networking purchase from Dell is entirely optional as long as you have suitable infrastructure in place. From a cost perspective, PowerConnect provides strong performance at a very attractive price point and is the default option in our solutions. Our premium Force10 networking product line is positioned well to compete directly with the market leader, from top of rack (ToR) to large chassis-based switching. Force10 is an optional upgrade in all solutions. For the Local Tier 1 solution, a simple 48-port 1Gb switch is all that is required; the PC6248 is shown below:
[image]

Servers

The PowerEdge R720 is a solid rack server platform that suits this solution model well, with up to 2 x 2.9GHz 8-core CPUs, 768GB RAM, and 16 x 2.5” 15K SAS drives. There is more than enough horsepower in this platform to suit any VDI need. Again, flexibility is an important tenet of Dell DVS, so other server platforms can be used if desired to meet specific needs.
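To see how a host like this translates into user density, here is a quick back-of-the-envelope sketch. The per-user vCPU, RAM, and overcommit figures are assumptions for illustration only; real densities come from the workload characterization discussed in the Validation section.

```python
# Back-of-the-envelope density estimate for a dual-socket compute host.
# Per-user figures are assumptions for illustration, not validated numbers.
HOST_CORES = 16          # 2 x 8-core CPUs
HOST_RAM_GB = 768
VCPU_PER_USER = 2        # assumed vCPUs per desktop
OVERCOMMIT_RATIO = 8     # assumed vCPU:pCore overcommit for task workers
RAM_GB_PER_USER = 2.5    # assumed, including VM overhead
HYPERVISOR_RAM_GB = 16   # assumed reservation for the hypervisor itself

cpu_bound = (HOST_CORES * OVERCOMMIT_RATIO) // VCPU_PER_USER
ram_bound = int((HOST_RAM_GB - HYPERVISOR_RAM_GB) // RAM_GB_PER_USER)

print(f"CPU-bound density : {cpu_bound} users")
print(f"RAM-bound density : {ram_bound} users")
print(f"Plan around       : {min(cpu_bound, ram_bound)} users per host")
```

Whichever resource binds first (usually CPU) sets the planning number; the other simply becomes headroom.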

Storage

Shared Tier 2 storage is a required component of the Local Tier 1 architecture, but purchasing it from Dell is entirely optional. The EqualLogic 4100X is a solid entry-level 1Gb iSCSI storage array that can be configured to provide up to 22TB of raw storage running on 10K SAS disks. You can of course go bigger to the 6000 series in EqualLogic, or integrate a Compellent array with your choice of storage protocol. It all depends on your need to scale.
[image]
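As a sanity check on Tier 2 capacity, a rough calculation like the one below shows how many users a 22TB raw array might cover for profiles and home directories. The per-user quotas, management VM footprint, and usable-capacity factor are assumptions for illustration, not Dell guidance.

```python
# Rough Tier 2 capacity check: does the array cover user data, profiles,
# and the management tier VMs?  All per-user figures and the usable-capacity
# factor are assumptions for illustration only.
RAW_TB = 22
USABLE_FACTOR = 0.7          # assumed loss to RAID, spares, and formatting
PROFILE_GB_PER_USER = 1      # assumed roaming profile quota
USER_DATA_GB_PER_USER = 5    # assumed home-directory quota
MGMT_VM_TB = 2               # assumed space for management tier VM disks

def users_supported(raw_tb: float = RAW_TB) -> int:
    usable_gb = raw_tb * 1024 * USABLE_FACTOR
    per_user_gb = PROFILE_GB_PER_USER + USER_DATA_GB_PER_USER
    return int((usable_gb - MGMT_VM_TB * 1024) / per_user_gb)

print(f"~{users_supported()} users fit on a {RAW_TB}TB raw Tier 2 array")
```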

Shared Tier 1

In the Shared Tier 1 solution model, an additional shared storage array is added to handle the execution of the VDI sessions in larger-scale deployments. Performance is a key concern in the shared Tier 1 array and contributes directly to how the solution scales. All compute and management hosts in this model are diskless and can be either rack or blade. In smaller-scale solutions, the functions of Tier 1 and Tier 2 can be combined, as long as there is sufficient capacity and performance on the array to meet the needs of the environment.
[image]

Network

The network configuration changes a bit in the Shared Tier 1 model depending on whether you are using rack servers or blades and which block storage protocol you employ. Block storage traffic should be separated from the LAN, so iSCSI will leverage a discrete 10Gb infrastructure while Fibre Channel will leverage an 8Gb fabric. The PowerConnect 8024F is a 10Gb SFP+ based switch used for iSCSI traffic destined for either EqualLogic or Compellent storage, and it can be stacked to scale. The Fibre Channel industry leader Brocade is used for FC fabric switching.
On the blade platform, each chassis has 3 available fabrics that can be configured with Ethernet, FC, or InfiniBand switching. In DVS solutions, the chassis is configured with the 48-port M6348 switch interconnect for LAN traffic and either Brocade switches for FC or a pair of 10Gb M8024-K switches for iSCSI. Ethernet-based chassis switches are stacked for easy management.

Servers

Just like the Local Tier 1 solution, the R720 can be used if rack servers are desired, or the half-height dual-socket M620 if blades are desired. The M620 is on par with the R720 in all regards except for disk capacity and top-end CPU; the R720 can be configured with a higher 2.9GHz 8-core CPU to achieve greater user density in the compute tier. The M1000e blade chassis can support 16 half-height blades.

Storage

Either EqualLogic or Compellent arrays can be utilized in the storage tier. The performance demands of Tier 1 storage in VDI are very high, so design considerations dealing with boot storms and steady-state performance are critical. Each EqualLogic array is a self-contained iSCSI storage unit with an active/passive controller pair that can be grouped with other arrays for unified management. The 6110XS, depicted below, is a hybrid array containing a mix of high-performance SSD and SAS disks. EqualLogic’s active tiering technology dynamically moves hot and cold data between tiers to ensure the best performance at all times. Even though each controller now has only a single 10Gb port, vertical port sharing ensures that a controller port failure does not necessitate a controller failover.
Compellent can also be used in this space and follows a more traditional linear scaling model. SSDs are used for “hot” storage blocks, especially boot storms, and 15K SAS disks are used to store the cooler blocks on dense storage. To add capacity and throughput, additional shelves are looped into the array architecture. Compellent has its own auto-tiering functionality that can be scheduled off-hours to rebalance the array from day to day. It also employs a mechanism that puts hot data on the outer ring of the disk platters, where it can be read easily and quickly. High performance and redundancy are achieved through an active/active controller architecture. The 32-bit Series 40 controller architecture is soon to be replaced by the 64-bit SC8000 controllers, alleviating the previous x86-based cache limits.
Another nice feature of Compellent is its inherent flexibility. The controllers are flexible like servers, allowing you to install the number and type of IO cards you require: FC, iSCSI, FCoE, and SAS for the backend… Need more front-end bandwidth or another backend SAS loop? Just add the appropriate card to the controller.
In the lower user count solutions, Tier 1 and Tier 2 storage functions can be combined. In larger-scale deployments these tiers should be separated and scaled independently.
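To illustrate why boot storms dominate Tier 1 sizing, here is a small sketch comparing steady-state demand with a boot storm compressed into a short window. The IOPS-per-user and boot figures are common planning assumptions on my part rather than measured DVS results.

```python
# Why boot storms drive Tier 1 sizing: compare steady-state demand with a
# boot storm compressed into a short window.  All per-user figures are
# planning assumptions, not measured DVS numbers.
STEADY_IOPS_PER_USER = 10     # assumed steady-state IOPS per session
BOOT_IOPS_PER_DESKTOP = 200   # assumed IOPS while a desktop boots
BOOT_WINDOW_MIN = 30          # window over which all desktops power on
BOOT_DURATION_MIN = 2         # assumed boot time per desktop

def tier1_iops(users: int) -> tuple[int, int]:
    steady = users * STEADY_IOPS_PER_USER
    # Desktops booting concurrently at any instant within the window:
    concurrent = users * BOOT_DURATION_MIN / BOOT_WINDOW_MIN
    storm = int(concurrent * BOOT_IOPS_PER_DESKTOP)
    return steady, storm

for users in (500, 2000, 10000):
    steady, storm = tier1_iops(users)
    print(f"{users:>6} users: steady ~{steady:>7} IOPS, boot storm ~{storm:>7} IOPS")
```

The boot storm number is what the SSD tier (or cache) has to absorb; the steady-state number is what the spinning disks have to sustain all day.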

VDI Brokers

Dell DVS currently supports both VMware View 5 and Citrix XenDesktop 5 running on top of the vSphere 5 hypervisor. All server components run Windows Server 2008 R2, with database services provided by SQL Server 2008 R2. I have worked diligently to create a simple, flexible, unified architecture that expands effortlessly to meet the needs of any environment.
[image]
Choice of VDI broker generally comes down to customer preference, though each solution has its advantages and disadvantages. View has a very simple backend architecture consisting of 4 essential server roles: SQL, vCenter, View Connection Server (VCS), and Composer. Composer is the secret sauce that provides the non-persistent linked clone technology and is installed on the vCenter server. One downside to this is that, because of Composer’s reliance on vCenter, the total number of VMs per vCenter instance is reduced to 2000, instead of the published 3000 per HA cluster in vSphere 5. This means that you will have multiple vCenter instances, depending on how large your environment is. The advantage to View is scaling footprint, as 4 management hosts are all that are required to serve a 10,000 user environment. I wrote about View architecture design previously for version 4 (link).
[image]
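The practical consequence of the Composer/vCenter coupling described above is easy to quantify: divide your desktop count by the ~2000 VM ceiling to see how many vCenter instances you will be managing. A trivial sketch, using only the figure cited above:

```python
# Management-tier scaling consequence of the Composer/vCenter coupling:
# with ~2000 VMs per vCenter instance, larger View environments need
# multiple instances.  The 2000 figure is the limit cited in the text.
import math

VMS_PER_VCENTER_WITH_COMPOSER = 2000

def vcenter_instances(desktop_count: int) -> int:
    return math.ceil(desktop_count / VMS_PER_VCENTER_WITH_COMPOSER)

for desktops in (1000, 5000, 10000):
    print(f"{desktops:>6} desktops -> {vcenter_instances(desktops)} vCenter instance(s)")
```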
View Storage Accelerator (VSA), officially supported in View 5.1, is the biggest game-changing feature in View 5.x thus far. VSA changes the user workload IO profile, reducing the number of IOPS consumed by each user: a portion of the host server’s RAM can be enabled for host caching, largely absorbing read IOs. This reduces the demand of boot storms as well as making the Tier 1 storage in use more efficient. Before VSA there was a much larger disparity between XenDesktop and View users in terms of IOPS; now the gap is greatly diminished.
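A rough way to picture the effect: if a share of each user’s IOs are reads and most of those reads are served from the host cache, the IOPS actually reaching the array drop substantially. The read/write split and cache hit rate below are assumptions for illustration, not VMware figures.

```python
# Illustration of how a host-side read cache (View Storage Accelerator)
# shrinks the IOPS reaching Tier 1 storage.  The read/write split and
# cache hit rate are illustrative assumptions, not VMware figures.
def backend_iops(users: int, iops_per_user: float = 10,
                 read_ratio: float = 0.4, cache_hit_rate: float = 0.8) -> float:
    """Return estimated IOPS hitting the array after the host cache."""
    total = users * iops_per_user
    reads, writes = total * read_ratio, total * (1 - read_ratio)
    return writes + reads * (1 - cache_hit_rate)

users = 2000
print(f"Without host cache: {users * 10:.0f} IOPS to the array")
print(f"With host cache   : {backend_iops(users):.0f} IOPS to the array")
```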
View can be used with 2 connection protocols: the proprietary PCoIP protocol or native RDP. PCoIP is an optimized protocol intended to provide a greater user experience through richer media handling and interaction. Most users will probably be just fine running RDP, as PCoIP has a greater overhead that uses more host CPU cycles. PCoIP is intended to compete head-on with the Citrix HDX protocol, and there are plenty of videos running side-by-side comparisons if you’re curious. Below is the VMware View logical architecture flow:
XenDesktop (XD), while similar in basic function, is very different from View. Let’s face it, Citrix has been doing this for a very long time. Client virtualization is what these guys are known for, and through clever innovation and acquisitions over the years they have managed to bolster their portfolio as the most formidable client virtualization player in this space. A key difference between View and XD is the backend architecture. XD is much more complex and requires many more server roles than View, which affects the size and scalability of the management tier. This is very complex software, so there are a lot of moving parts: SQL, vCenter, license server, web interfaces, desktop delivery controllers, provisioning servers… there are more pieces to account for, and they all have their own unique scaling elements. XD is not as inextricably tied to vCenter as View is, so a single instance should be able to support the published maximum number of sessions per HA cluster.
[image]
One of the neat things about XD is that you have a choice in desktop delivery mechanisms. Machine Creation Services (MCS) is the default mechanism provided in the DDC. At its core this provides a dead simple method for provisioning desktops and functions very similarly to View in this regard. Citrix recommends using MCS only for 5000 or fewer VDI sessions. For greater than 5000 sessions, Citrix recommends using their secret weapon: Provisioning Server (PVS). PVS provides the ability to stream desktops to compute hosts using gold master vDisks, customizing the placement of VM write-caches, all the while reducing the IO profile of the VDI session. PVS leverages TFTP to boot the VMs from the master vDisk. PVS isn’t just for virtual desktops either, it can also be used for other infrastructure servers in the architecture such as XenApp servers and provides dynamic elasticity should the environment need to grow to meet performance demands. There is no PVS equivalent on the VMware side of things.
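One sizing item PVS introduces is the per-session write cache, which has to live somewhere (local disk, Tier 1, or RAM) for the life of the session. A minimal sketch, assuming a hypothetical per-session cache size and host density; Citrix sizing guidance should drive real designs.

```python
# Rough write-cache sizing when PVS streams a shared vDisk: each session
# needs its own write cache for the life of the session.  The per-session
# cache size and host density are assumptions for illustration only.
WRITE_CACHE_GB_PER_SESSION = 5   # assumed per-session write cache
SESSIONS_PER_HOST = 100          # assumed density per compute host

def host_write_cache_gb(sessions: int = SESSIONS_PER_HOST,
                        cache_gb: float = WRITE_CACHE_GB_PER_SESSION,
                        headroom: float = 1.2) -> float:
    """Capacity to reserve per host (local or Tier 1) for PVS write caches."""
    return sessions * cache_gb * headroom

print(f"Reserve ~{host_write_cache_gb():.0f}GB per compute host for write caches")
```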
With Citrix’s recent acquisition and integration of RingCube in XD, there are now new catalog options available for MCS and PVS in XD 5.6: pooled with personal vDisk or streamed with personal vDisk. The personal vDisk (PVD) is disk space that can be dedicated on a per-user basis for personalization information, application data, etc. PVD is intended to provide a degree of end-user experience persistence in an otherwise non-persistent environment. Additional benefits of XD include seamless integration with XenApp for application delivery, as well as the long-standing benefits of the ICA protocol: session reliability, encrypted WAN acceleration, NetScaler integration, etc. Below is the Citrix XenDesktop logical architecture flow:

High Availability

HA is provided via several different mechanisms across the solution architecture tiers. In the network tier, HA is accomplished through stacking switches, whether top of rack (ToR) or chassis-based. Stacking functionally unifies an otherwise segmented group of switches so they can be managed as a single logical unit. Discrete stacks should be configured for each service type, for example a stack for LAN traffic and a stack for iSCSI traffic. Each switch type has its stacking limits, so care has been taken to ensure the proper switch type and port count to meet the needs of a given configuration.
Load balancing is provided via native DNS in smaller stacks, especially for file and SQL services, and moves to a virtual appliance-based model above 1000 users. NetScaler VPX or F5 LTM-VE can be used to load balance larger environments. NetScalers are sized based on required throughput, as each appliance can manage millions of concurrent TCP sessions.
Protecting the compute tier differs a bit between the Local and Shared Tier 1 solutions, as well as between View and XenDesktop. In the Local Tier 1 model there is no shared storage in the compute tier, so vSphere HA can’t help us here. With XD, PVS can provide HA functionality by controlling the placement of VDI VMs, moving them from a failed host to a hot standby.
The solution for View is not quite so elegant in the Local Tier 1 model, as there is no mechanism to automatically move VMs off a failed host. What we can do, though, is mimic HA functionality by manually creating a resource reservation on each compute host. This creates a manual, RAID-like model where there is reserve capacity to host a failed server’s VDI sessions.
In the Shared Tier 1 model, the compute tier has shared storage, so we can take full advantage of vSphere HA. This also applies to the management tier in all solution models. There are a few ways to go here when configuring admission control; thankfully there are now more options than only calculating slot sizes and overhead. The simplest way to go is specifying a hot standby host dedicated to failover. The downside is that you will have gear sitting idle. If that doesn’t sit well with you, then you could specify a percentage of cluster resources to reserve. This will thin the load running on each host in the cluster but at least won’t waste resources entirely.
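For the percentage-based option, the math for an N+1 reservation is straightforward: reserve the equivalent of one host’s worth of resources spread across the cluster. A quick sketch of that arithmetic (no vSphere API calls, just the math):

```python
# Sketch of the "percentage of cluster resources" admission-control option:
# reserving the equivalent of one host (N+1) across the cluster instead of
# parking a dedicated hot standby.
import math

def n_plus_one_reserve_pct(hosts_in_cluster: int, hosts_to_tolerate: int = 1) -> int:
    """Percentage of cluster CPU/RAM to reserve so failover capacity exists."""
    return math.ceil(hosts_to_tolerate / hosts_in_cluster * 100)

for hosts in (4, 8, 16):
    print(f"{hosts}-host cluster: reserve {n_plus_one_reserve_pct(hosts)}% of resources")
```

The larger the cluster, the smaller the per-host tax, which is why the percentage option becomes more attractive as environments grow.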
If the use of DRS is desired, care needs to be taken in large-scale scenarios, as this technology will functionally limit each HA cluster to 1280 VMs.
Protection for the storage tier is relatively straightforward, as each storage array has its own built-in protections for controllers and RAID groups. In smaller solution stacks (under 1000 users) a file server VM is sufficient to host user data, profiles, etc. For deployments larger than 1000 users, we recommend that NAS be leveraged to provide this service. Our clustered NAS solutions for both EqualLogic and Compellent are high performing and scalable to meet the needs of very large deployments. That said, NAS is available as an HA option at any time, for any solution size.

Validation

The Dell DVS difference is that our solutions are validated and characterized around real-world requirements and scenarios. Everyone that competes in this space plays the marketing game, but we actually put our solutions through their paces. Everything we sell in our core solution stacks has been configured, tested, scrutinized, optimized, and measured for performance at all tiers in the solution. Additional consulting and blueprinting services are available to help customers properly size VDI for their environments by analyzing user workloads to build a custom solution to meet those needs. Professional services are also available to stand up and support the entire solution.

The DVS Enterprise solution is constantly evolving with new features and options coming this summer. Keep an eye out here for more info on the latest DVS offerings as well as discussions on the interesting facets of VDI.

References:

Dell DVS: VMware View Reference Architecture 

Dell DVS: Citrix XenDesktop Reference Architecture


2 comments:

  1. Thanks for the article. I like how you got rid of all the 'fluff' and got to the point on several topics. I do have a quick question for you... I tried emailing our Dell account rep, but have yet to get a response... I was wondering if you would be able to confirm that VMware running vSphere/vMotion (w/ VSA v5.1) is supporting the PowerEdge R720? I checked on the VMware HCL and it only lists the R510, R610, & R710... I would love to order 3 or 4 R720s and load them up with a few SSDs and several 15k SAS HDDs; and not have to worry about SAN/NAS. What are your thoughts? THANKS!

  2. Hi Randy,

    I don't believe there's been a formal announcement yet, but you can bet that it will be supported. Unfortunately, even with VSA 5.1 you'll still be limited to 3 hosts max. I do think VSAN, generally speaking, is very compelling in the local tier 1 space but the cost has got to be right to make it effective. I am also not personally willing to sacrifice performance or user experience for additional features (vMotion). But you are exactly right, local disks, VSAN, no shared storage.

    You could go the SSD route but 15K SAS should provide suitable boot storm and steady state performance based on the number of sessions you can reasonably squeeze onto a single host.

