Death to downtimes!
First off, I would like to introduce myself. I am CCP Hunter, Database Administrator in the Virtual World Operations department. The VW Ops department is responsible for the daily operation of Tranquility and all that relates to it: the databases, a seemingly endless number of servers, test servers and web servers.
The Big Announcement!
Since the launch of EVE Online in May 2003, our official daily downtime has been 60 minutes, from 11:00 - 12:00 UTC.
However, over the past two years we have been working aggressively to reduce this downtime. In practice, a typical downtime today is 20-30 minutes, while the official downtime has remained the same (drumroll) until now.
Starting November 1st, the new official daily downtime on Tranquility will be 30 minutes, 11:00 - 11:30 UTC.
Why does EVE even have daily downtime?
Tranquility is one of the largest single-sharded Massively Multiplayer Online Games in operation. The database it runs on is 1.5 terabytes, and in order to maintain the core systems of the game, we perform daily cleanups during these downtimes. Most of these cleanups are "under the hood" operations that users never notice but that are needed to keep the database healthy.
In addition to database cleanup some essential operations are also performed during downtime such as asteroid reseeding, outpost building, NPC standing updates, etc.
Lastly, during downtimes, the load balancer is also updated and fleet fight systems are made dedicated (remember to use our Fleet Fight Notification form if you are having a fleet fight!).
These operations can take a long time to complete but we have been taking steps to reduce them. Today the majority of the downtime involves shutting down and starting up the cluster.
What has been done to reduce downtime?
In the old days, systems in EVE Online were built on the fact that there was daily downtime. In the last few years no new code has been produced that relies on downtime and a great deal of work has been done in removing old dependencies on downtime. You could say that we are still paying for past sins.
In addition to this we have worked on the cluster shutdown procedure and startup procedure so that the cluster goes down and up faster.
What does the future hold, when will the daily downtime go away?
As a part of the Carbon initiative, cluster management is being re-architected. It is our goal that sometime in the not too distant future, EVE Online will have no daily downtime. How awesome will that be!
We are not there yet, but the duration of downtime is being reduced substantially. The actual time required for a typical downtime is just under 12 minutes. The rest of the time is a buffer for applying hotfixes, patches and short maintenance on the cluster when needed.
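To put rough numbers on that buffer, here is a small sketch in Python. The function name `downtime_budget` is made up for illustration, and it treats 12 minutes as an upper bound for a typical downtime (the real figure is "just under" that), so the buffer it reports is a minimum.

```python
from datetime import datetime, timedelta


def downtime_budget(start_utc: str, end_utc: str, typical_minutes: float):
    """Return (window, buffer) as timedeltas for a same-day downtime window.

    start_utc / end_utc are "HH:MM" strings in UTC.
    typical_minutes is the time a routine downtime is assumed to need.
    """
    fmt = "%H:%M"
    window = datetime.strptime(end_utc, fmt) - datetime.strptime(start_utc, fmt)
    buffer = window - timedelta(minutes=typical_minutes)
    return window, buffer


# Old official window (11:00 - 12:00 UTC) versus the new one (11:00 - 11:30 UTC),
# both measured against the assumed 12-minute typical downtime.
old_window, old_buffer = downtime_budget("11:00", "12:00", 12)
new_window, new_buffer = downtime_budget("11:00", "11:30", 12)

print("old window:", old_window, "buffer:", old_buffer)
print("new window:", new_window, "buffer:", new_buffer)
```

With the new 30-minute window, that leaves at least 18 minutes for hotfixes, patches and short maintenance, compared to 48 minutes under the old 60-minute window.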
We will continue our efforts to reduce daily downtime until we reach the ultimate goal of no daily downtime at all. I'll leave you with these graphs, which show the road towards no daily downtime.
Death to downtimes!
Thanks to CCP Atlas for helping with graphs and text.