64 bits

A common cause of node deaths is memory exhaustion. Sometimes this is due to some memory-eating monster bug, but often the 2 GB virtual address space of a 32-bit process simply fills up with legitimate user data. No matter how much memory is installed on the machine, each process can only address 2 GB of it.
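To put numbers on that limit: a 32-bit pointer can name 4 GB, and by default Windows hands the user-mode half of that, 31 bits' worth, to each process. The snippet below is only a small arithmetic sketch of that ceiling. The function names are made up for illustration and are not from our code base.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable with the given number of address bits (bits < 64). */
uint64_t addressable_bytes(unsigned bits)
{
    assert(bits < 64);              /* shifting by 64 would overflow uint64_t */
    return (uint64_t)1 << bits;
}

/* Hypothetical helper: would `wanted` bytes fit in a user address
   space that is `bits` bits wide? */
int fits_in_user_space(uint64_t wanted, unsigned bits)
{
    return wanted <= addressable_bytes(bits);
}
```

With these, addressable_bytes(31) comes out at exactly 2 GB, and a node that wants 3 GB of legitimate data fails fits_in_user_space(..., 31) no matter how much RAM is in the box.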

To alleviate this and to buy us more room for growth, we have been working on porting the server binaries to the new x64 architecture and running them as 64-bit processes under Windows Server 2003 x64, or even Windows XP x64.
Moving to this architecture should give us some other key benefits as well: there are more registers, which allows for better code optimization, and it uses SSE2 exclusively for floating point, which should mean better floating-point performance.

Having previously worked with 32/64 portable software in the Unix world, I was surprised at how relatively painless the switch is turning out to be. Regular integral datatypes such as int and long don't change at all on 64-bit Windows; only pointers and such old chestnuts as size_t and ptrdiff_t grow to reflect the new address space.
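A quick way to see which types move is simply to print their sizes in both builds. The sketch below does that, and adds a guard for the classic porting bug of stuffing a size_t into an int. The helper names are hypothetical, invented for this example; on a 64-bit Unix build you should expect long to come out wider than it does on Windows.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Print the widths that matter when moving between 32- and 64-bit builds. */
void print_type_sizes(void)
{
    printf("int       : %u bytes\n", (unsigned)sizeof(int));
    printf("long      : %u bytes\n", (unsigned)sizeof(long));
    printf("size_t    : %u bytes\n", (unsigned)sizeof(size_t));
    printf("ptrdiff_t : %u bytes\n", (unsigned)sizeof(ptrdiff_t));
    printf("void *    : %u bytes\n", (unsigned)sizeof(void *));
}

/* Hypothetical helper: 1 if n can be stored in an int without truncation. */
int fits_in_int(size_t n)
{
    return n <= (size_t)INT_MAX;
}

/* Hypothetical helper: narrow a size to int, trapping in debug builds
   if the value would be truncated. */
int checked_size_to_int(size_t n)
{
    assert(fits_in_int(n));
    return (int)n;
}
```

Calling print_type_sizes() once from each build gives a quick side-by-side of what grew. Code that keeps sizes and pointer differences in size_t and ptrdiff_t needs no such guard; checked_size_to_int is only for the places where an int is unavoidable.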

The biggest hurdle so far has been porting the Stackless Python code to support x64. Windows has completely revamped the ABI (application binary interface) for the x64 platform. Stack switching is assembler stuff, and by design there is no inline-assembly support in the 64-bit compilers. MASM, Microsoft's macro assembler, has a new 64-bit brother, MASM64, but it is very poorly documented. Still, we now have a running 64-bit Stackless in debug mode, although there are some issues in the optimized build left to iron out.

To test all of this, we have put in orders for some 64-bit machines to put into our clusters. We are going to compare offerings from both Intel and AMD, single and dual core. We expect to be able to start testing this seriously on Singularity in a few weeks' time.