AH-
After another weekend of outages due to Savis, is there any hope of having a reasonable expectation that AH will be up and running at any given point? I do not know what the SLA with Savis is, but I do not think they are capable of delivering. Isn't it reasonable to expect at least five 9's (99.999%) uptime out of this service we all pay for?
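To put a number on what five 9's actually allows, here is a rough back-of-the-envelope sketch. The function name and the 365-day year are my own assumptions for illustration; I do not know how any real SLA with Savis measures its availability window.

```python
def allowed_downtime_minutes(availability_pct, period_days=365):
    """Minutes of downtime permitted over period_days at the given availability.

    Hypothetical helper for illustration only; a real SLA may define
    its measurement period and exclusions differently.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        print(f"{pct}% uptime allows about "
              f"{allowed_downtime_minutes(pct):.2f} minutes of downtime per year")
```

By this arithmetic, five 9's works out to a little over five minutes of downtime per year, so a single weekend outage is far outside that budget.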
I personally do not have experience with Savis, but I can tell you I have never heard anything positive about them. I have used hosting facilities such as UUNET, Digex, GenuityNet, AT&T, and Inflow (now SunGuard), and I have never seen as many outages as I have seen with AH since it moved to Savis.
What’s the plan here? Or are we (the players) just expected to live with the shoddy service being provided by Savis?
As a network engineer and Unix sysadmin, here are some other thoughts/suggestions.
Is it feasible to run backup bandwidth into your systems and use BGP to route the traffic when there is a failure? I have heard statements that the amount of data the end users need is relatively small. Could a single T1, or maybe even multiple T1s, be purchased through another bandwidth provider and run into your cabinets at Savis? This way, when Savis goes down, you will have another route into the servers.
Also, may I suggest fixing your DNS? hitechcreations.com lists two IPs as DNS servers, and both IPs are on the same network. When your network goes down, so does your DNS. I would suggest listing two or three more. What I typically do is have my service provider slave my zones from me, but actually list my provider's DNS servers as the primaries with InterNIC. That way their servers take all the hits, and I can update my zones at will on my own server without going through them. Of course, list your own servers as additional DNS servers...
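For what it's worth, the "both on the same network" problem is easy to check for. Below is a minimal sketch using Python's standard ipaddress module. The addresses are documentation placeholders (not the real hitechcreations.com name servers), and treating a /24 as "the same network" is just my assumption; the function names are made up for this example.

```python
import ipaddress


def distinct_networks(addresses, prefix_len=24):
    """Return the set of networks (at the given prefix length) covering the addresses."""
    return {
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    }


def shares_single_network(addresses, prefix_len=24):
    """True if every address falls inside one network, i.e. a single point of failure."""
    return len(distinct_networks(addresses, prefix_len)) == 1


if __name__ == "__main__":
    # Placeholder name server addresses, assumed for illustration only.
    current = ["192.0.2.10", "192.0.2.11"]
    with_offsite = current + ["198.51.100.5"]

    print("current setup on one network:", shares_single_network(current))
    print("with an off-site secondary on one network:", shares_single_network(with_offsite))
```

If the first check comes back True, one network outage takes out every listed name server at once, which is exactly why I would add a couple of off-site ones.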
Have you considered moving the website off that network and away from the game machines? I personally would consider moving it to another data center (one close to the main office could be an option) and using someone like Mirror Image or Akamai to deliver the downloads for you, so that bandwidth is kept away from the game servers. I am sure the stats stuff plays a role in the decision process, but all that backend stuff could still be the way it is. If for whatever reason that portion goes down, at least HTC's web presence would still exist and be able to provide notifications of any outages or issues....
Oh well, just some thoughts and rants about my disappointment with the availability of AH. I hope no offense is taken, and if there is anything I could do to help out with a solution, please let me know.
GrmRpr