SharePoint Distributed Cache Service ….DANGER WILL ROBINSON!!

Posted: October 29, 2014 in SharePoint 2013, Uncategorized
Tags: ,

SharePoint Distributed Cache Service ….DANGER WILL ROBINSON!!

 

This is a late night blog so sorry for the brevity and less cohesive thought pattern. I just needed to get this out there as much for my benefit as anyone elses.

So I have to admit. This service has always been a bit of a mystery to me. Simply put, it always worked. Never had to dig. Never had to care. Just a few commandlets and it was running and I went on my merry way. Well I had a recent case where I lost a host and had to dig when it would not come back.

What a scary experience. All these tools to use, the App Fabric cache tool, App fabric cache commandlets, SP commandlets, Services MMC, and multiple CA points to manage it. And then the warnings. If you use the wrong ones, you may need to rebuild your farm. Yes, rebuild the farm. Rebuild a 13 server farm. Can you say Holy crap!

I found a lot of blogs on fixing it, a lot of MSDN and Technet articles on App Fabric cache architecture then I got a virtual and began dissecting it. I looked here (http://almondlabs.com/blog/manage-the-distributed-cache/ ) got some great pointers. I looked here http://www.microsoft.com/en-us/download/details.aspx?id=35557 and got some good reference material. I found lots of Commandlets on configuring the cluster and manhandling it. Pushing in providers (which they tell you your choices are XML or SQL), and found plenty of hacks. In many areas the lines between SharePoint 2013 and App Fabric blurred and people were unknowingly straying into dangerous areas without realizing it. Then the clouds parted around 2200 after 3 cups of coffee. The answer came to me when I looked at the HKEY_LOCAL_MACHINE -> SOFTWARE\Microsoft\AppFabric\V1.0\Providers\AppFabricCaching registry key. The provider listed was a third…SPDistributedCacheClusterProvider.

The fog lifted

                So for those who want to better understand it. In VERY simple terms. An app fabric cluster is a set of cache hosts tied together through a central point which is a SQL Server DB or a XML file on a file share. Similar to SharePoint and its reliability on a Config Db to help keep all the servers in line. In the case of SharePoint 2013, that cluster is managed through the SharePoint configuration database and the set of Distributed Cache Service instances in the farm. The SharePoint Distributed Cache Service sits on top of it. It manages it. It maintains the App Fabric cluster underneath, and keeps everything in line. This is why you only run SP commandlets to add/remove, etc. It is also why if you use alternate tools like the App Fabric cluster tool you can kill your farm. A SharePoint farms greatest weakness has always been its config DB. It is the 1 point where you can kill the farm. Using another tool risks putting the cache cluster in App Fabric out of sync with the Configuration database in SharePoint. Once that happens, well anything goes. You might be able to recover it, you might never get another DC service ever running on your farm again.

If you want to manage this service then, stick to those commandlets. Stick to the guidance on the Microsoft documents at http://www.microsoft.com/en-us/download/details.aspx?id=35557 for maintaining it. Be VERY careful on any mods outside those commandlets. The farm is literally at stake here. It is a very dangerous game and from the blog posts I have seen out there, a lot of folks are hacking in fixes.

Another great article I found which just about laid it out for me was here(I am dense and did not get the simple truth of it right away when I read this): http://social.technet.microsoft.com/wiki/contents/articles/20348.sharepoint-2013-appfabric-and-distributed-cache-service.aspx

 

Leave a comment