[Note from Brian Madden on March 24, 2004: Since I originally posted this article, I received some corrections from David Solomon, author of the book "Inside Windows 2000." (Thanks David!) I've since rewritten some portions of this article to incorporate his corrections.]
There seems to be a lot of confusion in the industry about what's commonly called the Windows “4GB memory limit.” When talking about performance tuning and server sizing, people are quick to mention the fact that an application on a 32-bit Windows system can only access 4GB of memory. But what exactly does this mean?
By definition, a 32-bit processor uses 32 bits to refer to the location of each byte of memory. 2^32 = 4,294,967,296, which means a memory address that's 32 bits long can refer to only about 4.3 billion unique locations (i.e., 4GB).
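The arithmetic is easy to check. Here is a minimal Python sketch; the variable names are only illustrative:

```python
ADDRESS_BITS = 32                      # width of a memory address on a 32-bit processor
GIB = 2 ** 30                          # bytes in one gigabyte (binary)

# Every distinct 32-bit pattern names one byte of memory.
addressable_bytes = 2 ** ADDRESS_BITS

print(addressable_bytes)               # 4294967296
print(addressable_bytes // GIB)        # 4
```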
In the 32-bit Windows world, each application has its own “virtual” 4GB memory space. (This means that each application functions as if it has a flat 4GB of memory, and the system's memory manager keeps track of memory mapping, which applications are using which memory, page file management, and so on.)
This 4GB space is evenly divided into two parts, with 2GB dedicated for kernel usage, and 2GB left for application usage. Each application gets its own 2GB, but all applications have to share the same 2GB kernel space.
This can cause problems in Terminal Server environments. On Terminal Servers with a lot of users running a lot of applications, quite a bit of information from all the users has to be crammed into the shared 2GB of kernel memory. In fact, this is why no Windows 2000-based Terminal Server can support more than about 200 users—the 2GB of kernel memory gets full—even if the server has 16GB of memory and eight 3GHz processors. This is simply an architectural limitation of 32-bit Windows.
Windows 2003 is a little bit better in that it allows you to more finely tune how the 2GB kernel memory space is used. However, you still can't escape the fact that the thousands of processes from hundreds of users will all have to share the common 2GB kernel space.
Using the /3GB boot.ini switch (the feature is also known as 4GT, for "4-gigabyte tuning") is even worse in Terminal Server environments because it moves the boundary between the application memory space and the kernel memory space. The switch gives each application 3GB of memory, which in turn leaves only 1GB for the kernel. That's a disaster in Terminal Server environments!
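To make the two layouts concrete, here is a rough Python sketch that computes where the user/kernel boundary falls in each case. It is a simplification: the function names are made up for illustration, and it ignores the small reserved regions that real Windows keeps near the boundary.

```python
GIB = 2 ** 30

def split(user_gib):
    """Return ((user_lo, user_hi), (kernel_lo, kernel_hi)) for a 4GB virtual space."""
    boundary = user_gib * GIB          # first address that belongs to the kernel
    top = 4 * GIB - 1                  # highest 32-bit address
    return (0, boundary - 1), (boundary, top)

def kernel_gib(user_gib):
    """Size of the shared kernel space, in GB, for a given user-space size."""
    (_, _), (k_lo, k_hi) = split(user_gib)
    return (k_hi - k_lo + 1) // GIB

for label, user in (("default", 2), ("/3GB", 3)):
    (u_lo, u_hi), (k_lo, k_hi) = split(user)
    print(f"{label}: user 0x{u_lo:08X}-0x{u_hi:08X}, "
          f"kernel 0x{k_lo:08X}-0x{k_hi:08X} ({kernel_gib(user)}GB kernel)")
```

With the default split the kernel range starts at 0x80000000. With /3GB it starts at 0xC0000000, so everything that all users' sessions need from the kernel has to fit in half the space.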
People who are unfamiliar with the real meaning behind the 4GB Windows memory limit often point out that certain versions of Windows (such as Enterprise or Datacenter editions) can actually support more than 4GB of physical memory. However, adding more than 4GB of physical memory to a server still doesn't change the fact that it's a 32-bit processor accessing a 32-bit memory space. Even when more than 4GB of memory is present, each process still has the normal 2GB virtual address space, and the kernel address space is still 2GB, just as on a normal non-PAE system.
However, systems booted with the /PAE switch can support up to 64GB of physical memory. A 32-bit process can "use" large amounts of memory via the AWE (Address Windowing Extensions) functions, but it must map views of the physical memory it allocates into its 2GB virtual address space. Essentially, it can only use 2GB of memory at a time.
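The "window" idea may be easier to see in a toy model. The Python sketch below does not call the real AWE functions (on Windows those are Win32 calls such as AllocateUserPhysicalPages and MapUserPhysicalPages). The class and method names are hypothetical and exist only to show that a process can own many pages while reaching only the few that are currently mapped.

```python
class ToyAweProcess:
    """Toy model: the process owns many physical pages, but can only touch
    the ones currently mapped into its small, fixed-size virtual window."""

    def __init__(self, window_pages):
        self.physical = {}                     # page id -> page contents
        self.window = [None] * window_pages    # window slot -> page id (or None)

    def allocate(self, count, page_size=4096):
        """Allocate `count` physical pages and return their ids."""
        start = len(self.physical)
        ids = list(range(start, start + count))
        for page_id in ids:
            self.physical[page_id] = bytearray(page_size)
        return ids

    def map(self, slot, page_id):
        """Point one window slot at a physical page (replacing the old mapping)."""
        self.window[slot] = page_id

    def _page(self, slot):
        page_id = self.window[slot]
        if page_id is None:
            raise RuntimeError("window slot is not mapped")
        return self.physical[page_id]

    def write(self, slot, offset, value):
        self._page(slot)[offset] = value

    def read(self, slot, offset):
        return self._page(slot)[offset]


proc = ToyAweProcess(window_pages=2)   # tiny window, standing in for the 2GB space
pages = proc.allocate(8)               # more pages than the window can show at once

proc.map(0, pages[5])
proc.write(0, 10, 99)                  # reachable: page 5 is in the window
proc.map(0, pages[6])                  # page 5 is now out of view, but still allocated
proc.map(0, pages[5])                  # map it back in to see the data again
print(proc.read(0, 10))                # 99
```

The data in an unmapped page is not lost. It just can't be reached until the process maps that page back into its window, which is why the usable amount at any one moment is capped by the size of the virtual address space.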
Here are more details about what booting /PAE means from Chapter 7 of the book "Inside Windows 2000," by David Solomon and Mark Russinovich.
All of the Intel x86 family processors since the Pentium Pro include a memory-mapping mode called Physical Address Extension (PAE). With the proper chipset, the PAE mode allows access to up to 64 GB of physical memory. When the x86 executes in PAE mode, the memory management unit (MMU) divides virtual addresses into four fields.
The MMU still implements page directories and page tables, but a third level, the page directory pointer table, exists above them. PAE mode can address more memory than the standard translation mode not because of the extra level of translation but because PDEs and PTEs are 64-bits wide rather than 32-bits. The system represents physical addresses internally with 24 bits, which gives the x86 the ability to support a maximum of 2^(24+12) bytes, or 64 GB, of memory.
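[As an aside to the excerpt, here is a short Python sketch of the translation described above. The excerpt doesn't list the four field widths; the 2/9/9/12-bit layout used here is the standard x86 PAE layout for 4 KB pages and is my addition, as are the function and variable names.]

```python
def split_pae_virtual_address(va):
    """Split a 32-bit virtual address into the four PAE fields (4 KB pages)."""
    return {
        "pdpt_index": (va >> 30) & 0x3,    # 2 bits: page directory pointer table entry
        "pd_index":   (va >> 21) & 0x1FF,  # 9 bits: page directory entry
        "pt_index":   (va >> 12) & 0x1FF,  # 9 bits: page table entry
        "offset":     va & 0xFFF,          # 12 bits: byte within the 4 KB page
    }

# The virtual address is still 32 bits wide: 2 + 9 + 9 + 12 = 32.
print(split_pae_virtual_address(0xC0201003))

# Physical side: a 24-bit page frame number plus the 12-bit page offset.
PFN_BITS = 24
PAGE_OFFSET_BITS = 12
max_physical_bytes = 2 ** (PFN_BITS + PAGE_OFFSET_BITS)
print(max_physical_bytes // 2 ** 30)       # 64
```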
As explained in Chapter 2, there is a special version of the core kernel image (Ntoskrnl.exe) with support for PAE called Ntkrnlpa.exe. (The multiprocessor version is called Ntkrpamp.exe.) To select this PAE-enabled kernel, you must boot with the /PAE switch in Boot.ini.
This special version of the kernel image is installed on all Windows 2000 systems, even Windows 2000 Professional systems with small memory. The reason for this is to facilitate testing. Because the PAE kernel presents 64-bit addresses to device drivers and other system code, booting /PAE even on a small memory system allows a device driver developer to test parts of their drivers with large addresses. The other relevant Boot.ini switch is /NOLOWMEM, which discards memory below 4 GB and relocates device drivers above this range, thus guaranteeing that these drivers will be presented with physical addresses greater than 32 bits.
Only Windows 2000 Advanced Server and Windows 2000 Datacenter Server are required to support more than 4 GB of physical memory. (See Table 2-2.) Using the AWE Win32 functions, 32-bit user processes can allocate and control large amounts of physical memory on these systems.