EVE64 | EVE Online

EVE64

2008-10-01 - By CCP Explorer

We recently deployed a new technology to the EVE universe, StacklessIO, which is a new, robust, network technology for both the EVE server and clients. The server version was released 16 September and the client version was released 30 September with the Empyrean Age 1.1.1 patch. We have received great feedback and we hope you are enjoying it. In my dev blog on StacklessIO I mentioned that there would be a follow-up dev blog on related topics.

And here it is: 2EVE = EVE64.

StacklessIO, after years of development, has been a big success. We measured the improved performance, and you've told us on the forums and in the local chat in Jita that we have made a significant advancement in our goal of eliminating all lag from EVE Online.

Normally Jita reaches a maximum of about 800-900 pilots on any given Sunday. On the Friday following the deployment of StacklessIO, 19 September, there were close to 1,000 concurrent pilots in Jita, and on the Saturday, 20 September, the maximum reached 1,400. That is more pilots than had ever been in Jita at the same time. Under our old network technology Jita could become rather unresponsive at 800-900 pilots, but on the Sunday, 21 September, it was quite playable and very responsive with 800 pilots, thanks to StacklessIO.

Alas, there were teething problems. At 1,400 pilots the node hosting Jita ran out of memory and crashed. As crazy as it may sound, this was very exciting: we had never before been in a position to have that problem, since under our old network technology Jita would lag out before reaching that point. We immediately turned our attention to solving the challenge of giving the EVE server access to more memory.

CCP porkbelly wrote a dev blog three years ago entitled "64 Bits" where he described our first attempts at compiling the EVE server as a 64-bit program and the main reason for doing so: access to more memory. At that time we were not able to complete the 64-bit migration since the old network technology did not work correctly as a 64-bit program. Having replaced the old network technology with StacklessIO, we were in a position to continue that work.
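To put the memory motivation in numbers, here is a minimal sketch of the theoretical address space available with 32-bit and 64-bit pointers. The function name is made up for illustration, and the figures are upper bounds only; operating systems and hardware typically allow a process less than this in practice.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(pointer_bits: int) -> int:
    """Theoretical number of bytes reachable with pointers of the given width."""
    return 2 ** pointer_bits


if __name__ == "__main__":
    for bits in (32, 64):
        # A 32-bit pointer can name at most 4 GiB; a 64-bit pointer far more.
        print(f"{bits}-bit: {addressable_bytes(bits) // GIB} GiB addressable")
```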

And we started it, completed it and deployed EVE64 last week! Yes, we pulled it off in a single week! That might sound almost recklessly fast to some, but this was achieved with a strike team that stepped up to the challenge. There is a lot of enthusiasm within CCP today to tackle the lag monster now that we have this new platform to build on.

The EVE server runs on a cluster of blades and is divided into proxy nodes and server nodes. The EVE clients connect to the proxy nodes, which act as dispatchers and are also an outer layer of defense for the server nodes that run the solar systems simulation.

The proxies are now all running EVE64. We are planning to reduce the number of proxy nodes, which in turn will increase the overall performance of the EVE server, as the total number of proxy nodes in our system affects the scalability of our application layer. Now that the proxy nodes can address more memory, each can service more client connections, as their performance is mostly a function of IO capacity (StacklessIO) and memory. The proxies are essentially proprietary software routers that just became vastly more powerful under this new paradigm.

The server nodes will run as a mix of 32- and 64-bit nodes, since most nodes in the cluster don't need enough memory to require EVE64. Replacing 32-bit code with 64-bit code immediately requires more memory since, e.g., all memory pointers double in size. The need has to be clear, as there is no gain in running EVE64 in all cases, but where there is a need we are now able to respond to it. Our network protocols that run on top of StacklessIO make sure that this mixed-mode cluster configuration of EVE32 and EVE64 runs completely transparently to all code within the system.
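The pointer-size cost mentioned above can be sketched in a few lines. This is only an illustration: the function names are invented here, and the count of one million pointers is an arbitrary example, not a measurement from the EVE server.

```python
import struct


def pointer_size_bytes() -> int:
    """Size of a native pointer in the running interpreter (4 on 32-bit, 8 on 64-bit)."""
    return struct.calcsize("P")


def pointer_memory_bytes(num_pointers: int, pointer_bits: int) -> int:
    """Memory consumed by the pointers alone, ignoring whatever they point at."""
    return num_pointers * (pointer_bits // 8)


if __name__ == "__main__":
    count = 1_000_000  # arbitrary illustrative figure
    print("native pointer size:", pointer_size_bytes(), "bytes")
    print("32-bit build:", pointer_memory_bytes(count, 32), "bytes")
    print("64-bit build:", pointer_memory_bytes(count, 64), "bytes")
```

The same data structure therefore costs more memory in a 64-bit build before it stores anything extra, which is why a node only benefits from EVE64 when it actually needs the larger address space.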

The normal setup in the cluster for the server nodes is that each blade has two 64-bit processors, 4 GB of memory and runs Windows Server 2003 x64. Each blade runs two nodes and each node then hosts a number of solar systems. There are also dedicated nodes for the market, dedicated nodes for corporation services, a dedicated head node for the cluster, etc.

Finally, there is a pool of dedicated dual-CPU, dual-core machines that each run only a single EVE64 node. Jita and four other high-use solar systems are assigned to that pool. That pool is now running all native 64-bit code and the blades have been upgraded to 16 GB of memory. These blades also have more powerful CPUs, which has helped as well. We are currently working with our vendors on testing out even more powerful hardware options now that we can utilise the hardware much better.
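As a rough comparison of the two setups, the sketch below divides each blade's memory by the number of nodes it runs. The class and field names are hypothetical, and the even split between nodes is an assumption for illustration; the real nodes need not divide memory equally.

```python
from dataclasses import dataclass


@dataclass
class Blade:
    """Hypothetical model of one blade: total memory and how many nodes it runs."""
    memory_gb: int
    node_count: int

    def memory_per_node_gb(self) -> float:
        # Assumes memory is shared evenly between the nodes on the blade.
        return self.memory_gb / self.node_count


standard_blade = Blade(memory_gb=4, node_count=2)    # normal setup: two nodes per blade
dedicated_blade = Blade(memory_gb=16, node_count=1)  # dedicated pool: one EVE64 node

if __name__ == "__main__":
    print("standard:", standard_blade.memory_per_node_gb(), "GB per node")
    print("dedicated:", dedicated_blade.memory_per_node_gb(), "GB per node")
```

Under that assumption a dedicated node has eight times the memory of a node on a standard blade, and only a 64-bit process can address all of it.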

This Monday, 29 September, we saw a fleet battle with over 1,100 pilots reported in local. Field reports indicate that the fight was quite responsive for the first 10 minutes, but then the node "missed its heart beat", as we call it, and was removed from the cluster by our cluster integrity watchdog routines. This is another exciting problem, as we can now address it too in our StacklessIO world, and that will be the subject of the next blog.
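This post does not describe how the watchdog routines are implemented, so the following is only a generic sketch of heartbeat-based monitoring, not CCP's code. The class name, the timeout value and the node names are all hypothetical.

```python
class HeartbeatWatchdog:
    """Tracks the last heartbeat per node and drops nodes that go silent."""

    def __init__(self, timeout_seconds: float):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # node id -> time of the most recent heartbeat

    def heartbeat(self, node_id: str, now: float) -> None:
        """Record that a node reported in at time `now`."""
        self.last_seen[node_id] = now

    def sweep(self, now: float) -> list:
        """Remove and return every node whose last heartbeat is older than the timeout."""
        missed = sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )
        for node in missed:
            del self.last_seen[node]
        return missed


if __name__ == "__main__":
    watchdog = HeartbeatWatchdog(timeout_seconds=5.0)  # hypothetical timeout
    watchdog.heartbeat("node-a", now=0.0)
    watchdog.heartbeat("node-b", now=0.0)
    watchdog.heartbeat("node-a", now=4.0)  # node-b is too busy to report in
    print("removed:", watchdog.sweep(now=6.0))
```

A node that is too busy to send its heartbeat in time looks the same to such a watchdog as a node that has crashed, which matches the behaviour seen in the fleet battle above.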