metro


Well Christmas break started a little early for me yesterday with the crazy snow that piled up. I ended up working at home and missing the IT Christmas party due to the weather and the impossibility of driving down our little street.

I have the week off from work and it’s a good thing because I’m going to be all over the place. Monday and Friday are basically the only weekdays that I won’t be going somewhere. Tuesday we’re headed out to Chelsea to see my folks for the day and then go with them to my aunt’s annual Christmas-Eve-Eve party, then driving back home. Apparently another winter weather ‘event’ is supposed to break out on Tuesday so it could be some rough travel on the way home.

Wednesday is Ash’s dad’s side of the family’s annual Christmas Eve party at her parents’ house. Thursday is Christmas here and then going over to Ash’s mom’s side of the family.

So it’s going to be a busy week. Hopefully the weather holds up and things don’t get too ugly.

In other news, I’m still waiting to hear when AT&T is going to come out and hook me up with UVerse. The dude who signed me up claimed to be out of town when I called him today (sounded like he was in a bar) and he said he’d call/email me tomorrow about it. I think he signed my neighbor up across the street last Saturday after I signed up and my neighbor definitely had it installed a few days ago already. Comcast internet has been spotty around then too so all the more reason to switch away.

Also, I’m trying to arrange for yet another move of our humble webserver to bigger and better hardware, hopefully at a local (Grand Rapids-based) hosting provider. Hopefully we can find affordable colocation for a price similar to what we’re paying now. I don’t have high hopes that we can afford it, though.

That’s all for now. brb.

According to this article at SearchStorage, a Forrester analyst’s report questions the value of SANs.

Reichman maintains that SANs haven’t accomplished what they were designed to do: improve performance while lowering cost and complexity of managing applications and databases.

“It’s been the conventional wisdom of the past 10 years that to provide the best performance, protection and capacity utilization for applications and databases, you need a robust storage array in a storage area network,” Reichman wrote. “But with low capacity utilization, the inability to prioritize application performance, long provisioning times and soaring costs, SANs haven’t lived up to their promise.”

Reichman wrote that with application vendors putting more storage functionality into their applications, “the time has come for buyers to question the value of their SAN and consider simpler options that fit better with the applications they truly care about.”

The article has got me thinking about our own SAN and possible ways to move it forward. I already know that there are plenty of things on the high end arrays that could function just fine on a cheaper lower-end array. Also, some servers/applications would be just fine installing and running their application from locally attached storage!

The nice thing about the SAN is that everything is replicated to the SAN at the other datacenter so in case of a catastrophic failure, we don’t have to worry about trying to restore the information. But then disaster recovery situations need to be investigated and plans developed to recover from various types of them. Does it really make sense to replicate everything? Do we really need to worry about losing an entire array? Things to consider.

A friend pointed me to a few articles discussing some of the new features in VMware View 3. These articles went straight to the point for me with regard to the hype surrounding the linked-clone technology and how it might not be the great solution that everyone makes it out to be.

The first article comes from a blog that I will now start tracking: vinternals: VMware View – Linked Clones Not A Panacea for VDI Storage Pain!.

The author makes two points, the first being that “snapshots can grow up to the same size as the source disk.” While not a common situation, the author points out that the Windows NTFS filesystem will always write to blocks on the disk that are zero’d (completely empty) before it will write to blocks containing deleted files. The author gives the example of having 10GB free space on the filesystem according to the Windows guest OS and then writing/deleting a 1GB file 10 times will result in the snapshot growing to 10GB.
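The write/delete example above can be checked with a small simulation. This is only a sketch under the article's stated assumption that the filesystem always prefers never-written blocks over blocks freed by deletion. The function name and the whole-gigabyte block model are made up for illustration and are not taken from either article.

```python
def simulate_delta_growth(free_gb, file_gb, cycles):
    """Return the delta-disk size (GB) after writing and deleting a file `cycles` times.

    Assumption (from the article): the guest filesystem writes to
    never-written (zeroed) blocks before reusing blocks freed by deletion.
    """
    untouched = free_gb  # free space never written since the snapshot was taken
    freed = 0            # free space that previously held deleted data
    delta = 0            # space recorded in the snapshot delta disk

    for _ in range(cycles):
        from_untouched = min(file_gb, untouched)
        from_freed = file_gb - from_untouched
        if from_freed > freed:
            raise ValueError("file does not fit in the available free space")
        untouched -= from_untouched
        freed -= from_freed
        delta += from_untouched  # only first-time writes grow the delta
        freed += file_gb         # deleting frees the space but never shrinks the delta
    return delta


if __name__ == "__main__":
    # The article's example: 10GB free, a 1GB file written and deleted 10 times.
    print(simulate_delta_growth(free_gb=10, file_gb=1, cycles=10))
```

In this model the guest still reports 10GB free after every cycle, while the delta disk keeps growing until all the untouched space has been written once.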

The gist of this point is that folks make the whole linked-clone thing out to be a space-saving measure: all the clones reference the master snapshot for their base image and record any changes to their own snapshot delta disk. The problem arises if you have a user population with a very wide diversity of applications that, for various reasons, cannot be included in the base image. This means snapshot growth for installation, patching, regular use, etc. I can’t think of any good way to estimate how large a snapshot will grow without actually doing it. So it becomes very scary if you need to plan a storage environment of a certain size and cannot plan for the growth, other than by expecting the worst-case scenario.
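To put the planning problem in numbers, here is a rough sketch comparing an expected footprint against the worst case where every delta grows to the size of the base disk. The function name and the figures in the usage example (a 20GB base image, 100 clones, 2GB of expected delta per clone) are hypothetical placeholders, not measurements from any real environment.

```python
def linked_clone_capacity_gb(base_gb, clones, expected_delta_gb):
    """Return (expected, worst_case) capacity in GB for a pool of linked clones.

    Assumption: one shared base image plus one delta disk per clone, with
    each delta able to grow as large as the base disk in the worst case.
    """
    expected = base_gb + clones * expected_delta_gb
    worst_case = base_gb + clones * base_gb
    return expected, worst_case


if __name__ == "__main__":
    # Hypothetical pool: 20GB base image, 100 clones, 2GB expected delta each.
    expected, worst = linked_clone_capacity_gb(20, 100, 2)
    print(f"expected: {expected} GB, worst case: {worst} GB")
```

The gap between the two numbers is the part that cannot be planned for without actually running the workload.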

The second point that the author makes concerns a problem that I’ve battled a lot over the last couple of years: LUN locking on the storage array. As I already knew and as the author points out, “a lock is acquired on a VMFS volume whenever volume metadata is updated. Metadata updates occur everytime a snapshot file is incremented, at the moment this is hardcoded to 16MB increments.” This plays into the recommendation to keep the number of snapshotted VMs per LUN low. So if you have a very large VDI environment, a very large number of LUNs needs to be allocated in order to keep LUN locking/SCSI reservations under control, leading to a storage management nightmare. And VMFS locks aren’t only for snapshots; they also occur for VM power-on/off and VMotions. We don’t have a lot of VM power-on/off operations but VMotions are almost always happening.

The author’s main point though about LUN locking is that snapshots grow at 16MB increments so during an initial deployment when users start launching and installing applications (which again for various reasons can’t be in the master snapshot) there would be a lot of locks being acquired as snapshots expand.
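A back-of-the-envelope estimate shows how quickly those lock acquisitions add up. This sketch assumes one metadata update (and so one lock) per 16MB increment, as quoted above. The function name and the sample figures (1GB of growth per VM, 10 VMs on a LUN) are hypothetical.

```python
import math

INCREMENT_MB = 16  # snapshot growth increment quoted in the article


def metadata_updates(growth_mb, vms_per_lun=1):
    """Estimate VMFS metadata updates on one LUN caused by snapshot growth.

    Assumption: each VM's snapshot grows by `growth_mb`, and every 16MB
    increment triggers exactly one metadata update on the LUN.
    """
    per_vm = math.ceil(growth_mb / INCREMENT_MB)
    return per_vm * vms_per_lun


if __name__ == "__main__":
    # Hypothetical rollout: each of 10 VMs on a LUN writes 1GB of new data.
    print(metadata_updates(growth_mb=1024, vms_per_lun=10))
```

Every one of those updates is a chance for contention with the other hosts sharing the LUN, which is why the per-LUN VM count matters so much.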

These two points make me wary of using snapshots – something I have stayed away from in our server environment and will continue to use sparingly, only for temporary purposes such as upgrades.

The second article is a response to the first. Musings of Rodos: Linked Clones Not A Panacea takes the two points and responds in a way that seems positive to him but to me further confirms my fears about snapshots and LUN locking.

With regard to snapshot growth, the second author recommends automated desktop refreshes at a regular interval, where the VM delta file is removed and the machine is essentially reverted to a clean slate. For a large environment like mine, this is not practical. With so many users and a diverse collection of applications (over 400), it is not possible to force the users to reinstall their applications every couple of days in order to keep the snapshot growth low. And either way, once they reinstall, the snapshots again start growing out of control. Some would recommend ThinApp’ing these applications so there is no install in the VM, and I would agree that this is a good idea in theory, but it remains to be seen whether it is possible. The number of hours needed to change all our applications from their current delivery method to ThinApp is a hidden cost to the whole environment and basically not feasible given our manpower and other commitments. And one can pretty much guarantee that many of the apps will not play nice with ThinApp packaging in one way or another, requiring further hours to track down and resolve issues.

So frequently refreshing VMs would not be possible in my environment.

The second author’s response to the LUN locking issue is basically to pay close attention and rebalance datastores/LUNs if there are problems. That’s a great statement to make but again, I can imagine the few weeks of time that would take – not to mention the interruption to the users – so it becomes a pretty big deal and not just something to toss around lightly.

This author does make a point that the first author did not: the possibility of I/O storms in the VDI environment – something with which I am extremely familiar. All the author says is that “things could get very interesting.” Well, I’ll tell you how interesting they can get: how about bringing the whole environment to its knees? And this was without snapshots! Imagine all the LUN locking that would occur for snapshot growth if we were using thin-clones. And speaking of that, imagine just the growth! We already need to be careful with how we roll out updates for applications, antivirus, OS patches, etc. Adding these new storage pieces into the mix just makes things more complicated.

The first author (Stu) comments on the second author’s article and points out that the support cost for refreshing the environment every 9-12 months would have wiped out any storage savings. I think this needs to be brought front and center. Any savings achieved through technology are great, but if the technology demands a huge increase in manpower to manage it correctly, the savings then become negative. I don’t have any numbers, but I am certain that our overtime for this project, along with our reliance on enterprise storage, has negated any savings we hoped to have.

Basically, I feel like people are tripping over themselves to get into VMware View 3 and I don’t want to be the first to get there and discover new challenges that need solving. I need to hear all the mundane/minute details from enterprise customers with environments similar to mine, showing in fine detail how linked clones and application virtualization have saved them time and money, before I’ll give it a shot.

In defense of VMware, so this post doesn’t sound too negative, I tell them all the time that I am consistently impressed with their innovation and drive forward in this and other virtualization arenas. I am a fan of their virtualization goals and I always look forward to their software releases for all the new features and improvements. If View 3 and this implementation of linked clones isn’t the panacea that some make it out to be, then perhaps the next few iterations will make progress towards getting there.

I have achieved total celebrity. VMware View 3 launched today and Metro (me) is featured in a video on the product Overview page.

Also, a whitepaper: Solving the Desktop Dilemma

While searching for information about VMware Fault Tolerance in ESX4, I came across the x4live.com blog entry: New features on VMware ESX4. They point to a couple videos at VMware illustrating some of these really cool new features:

  • Fault Tolerance – A VM hosting a mission critical application gets a clone and constantly sends over its execution instructions using VMotion. If the ESX server hosting the VM goes down, the clone immediately takes over without missing a beat.
  • Distributed Switches – Clusters can have distributed virtual switches which is a virtual switch configuration that is applied to all hosts in the cluster. No more manually configuring identical vSwitches on each host in the cluster prior to using it.
  • Host Profiles and Linked VirtualCenters – speaking of no more manual switch creation, host profiles remove most of the work of setting up a new host. A host profile is like Active Directory group policy: a set of configuration settings that are applied to hosts when they are added to a cluster. That makes it really easy to bring new server capacity online in a cluster. Linked VirtualCenters are awesome too: a VI Client connects to one VC and sees the others that are linked to it. One interface, multiple VC servers. We have 3 VC servers here at Metro so it’d be great to unify them in one console.

Virtualization.info lists some more features we might expect in ESX4.

With ESX4 probably releasing in the first quarter of 2009, I’m pretty excited to get ahold of these long-awaited features.
