PAE – what’s that, and how bad for performance?

Err, what's PAE?

PAE (Physical Address Extension) is a "workaround" that lets a 32-bit(!) x86 OS use more than 4GB of RAM, which is the limit for 32-bit physical addresses. On x86-64 processors running in 64-bit ("long") mode the workaround is not needed: long-mode paging is itself an extension of the PAE page-table format, with a far larger address space.

How does it work?

In short, it widens physical addresses by 4 bits (32-bit -> 36-bit) and adds one more level to the page-table lookup hierarchy, and voilà: the OS can access up to 64GB of RAM (which is not science fiction these crazy days..). Of course, a single 32-bit process is unaware of this and still has only 4GB of virtual address space, even with PAE.
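To make the arithmetic concrete, here is a small sketch of the numbers involved. It assumes 4KB pages and the usual x86 index splits (10/10/12 bits without PAE, 2/9/9/12 bits with PAE); the function names are mine, purely for illustration.

```python
# Sketch: address-space sizes, and how a 32-bit virtual address is split
# into page-table indexes without and with PAE (4KB pages assumed).

GIB = 1 << 30


def addressable_gib(address_bits):
    """Memory reachable with the given number of address bits, in GiB."""
    return (1 << address_bits) // GIB


def split_legacy(vaddr):
    """Classic 32-bit paging: 10-bit directory, 10-bit table, 12-bit offset."""
    return {
        "pd_index": (vaddr >> 22) & 0x3FF,
        "pt_index": (vaddr >> 12) & 0x3FF,
        "offset": vaddr & 0xFFF,
    }


def split_pae(vaddr):
    """PAE paging: an extra 2-bit top level, then 9 + 9 bits, 12-bit offset."""
    return {
        "pdpt_index": (vaddr >> 30) & 0x3,
        "pd_index": (vaddr >> 21) & 0x1FF,
        "pt_index": (vaddr >> 12) & 0x1FF,
        "offset": vaddr & 0xFFF,
    }


if __name__ == "__main__":
    print(addressable_gib(32), "GiB of physical memory without PAE")
    print(addressable_gib(36), "GiB of physical memory with PAE")
    print(split_legacy(0xC0101234))
    print(split_pae(0xC0101234))
```

Note that the virtual address is 32 bits wide in both cases; only the physical address stored in the page-table entries gets wider, which is why a single process still sees 4GB.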

Performance penalty: yes or no, and how much?

I was given the task of researching PAE so that I could recommend whether my company should use it, especially performance-wise.

PAE has been supported in hardware since the Intel Pentium Pro (back in the mid 90s..). "Hardware supported" can be misleading and make one think that it's all accelerated and there's no performance penalty. But even in hardware, PAE adds one more level to memory lookups, and my weak hardware knowledge tells me that it might still slow something down...

The kernel of most 32-bit Linux distributions (RedHat 5 in particular) ships with PAE disabled, with an optional PAE-enabled kernel available. On 32-bit Windows, PAE used to be disabled by default, but since WinXP SP2 Windows has been PAE-enabled by default. I've also seen some Linux distros enabling PAE in their default kernel recently.

Googling for "PAE performance effects" was not easy, and that's the main reason I wrote this article: to spread the knowledge. The best articles I've found are listed at the bottom.

My research conclusions, then:

  1. The main reason PAE is disabled by default seems to be hardware compatibility: hardware with no PAE support can't boot a PAE-enabled kernel. That's mostly history now anyway, since all recent x86 processors support PAE.
  2. The performance penalty is very low (according to RedHat, 1% on average and no more than 10%). Of course, it depends on your exact workload.
  3. As a friend pointed out to me: in most cases PAE is not needed, since x86-64 is so widespread. One can simply run 32-bit apps on a 64-bit OS. PAE lets the OS see up to 64GB, while x86-64 (current implementations) lets the OS see at least 16TB! So my main conclusion is.. that PAE is dead. All modern x86 processors (since ~year 2006) have x86-64 support.
  4. The best option is to compile your code as 64-bit and run a 64-bit OS, of course 🙂
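If you want to check what a given x86 Linux box supports before choosing a kernel, the CPU flags in /proc/cpuinfo tell you: `pae` means PAE is available and `lm` ("long mode") means the CPU is x86-64 capable. A minimal sketch, assuming an x86 Linux system; the helper names are made up for this post.

```python
# Sketch: read the CPU feature flags from /proc/cpuinfo-style text and
# report whether PAE ("pae") and x86-64 long mode ("lm") are available.


def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def describe(flags):
    """Turn a flag set into a one-line recommendation."""
    if not flags:
        return "no CPU flags found (not an x86 Linux system?)"
    if "lm" in flags:
        return "x86-64 capable: a 64-bit OS is an option"
    if "pae" in flags:
        return "32-bit only, PAE available"
    return "32-bit only, no PAE: a PAE kernel will not boot"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is not mounted
    print(describe(cpu_flags(text)))
```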

Accessing beyond the 4GB on 32bit mode

According to the PAE article on Wikipedia, there are ways for a process to access areas of RAM beyond its regular 4GB of virtual address space. Looks like Windows has a nice API for that, Address Windowing Extensions (AWE), and Linux is able to do it with the mmap() system call. I couldn't figure out how, though. Any idea? 🙂
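For what it's worth, here is my rough understanding of the mmap() idea as a sketch, not something I verified on a real PAE box: the data lives in a backing file that is larger than the process could map at once, and the program maps one fixed-size "window" of it at a time, moving the window by changing the offset. On a real system the backing file would presumably sit on a RAM-backed filesystem such as tmpfs (that part is my assumption); the sketch below uses a tiny temporary file so that it runs anywhere.

```python
# Sketch: slide a fixed-size mmap "window" over a larger backing file.
# The temporary file is a small stand-in for a big RAM-backed store.
import mmap
import tempfile

# mmap offsets must be multiples of the allocation granularity
WINDOW = mmap.ALLOCATIONGRANULARITY


def demo(windows=4):
    """Write a marker through each window, then read the markers back."""
    with tempfile.TemporaryFile() as f:
        f.truncate(windows * WINDOW)  # backing store: `windows` regions
        fd = f.fileno()

        # Map one region at a time and write a marker into it
        for i in range(windows):
            with mmap.mmap(fd, WINDOW, access=mmap.ACCESS_WRITE,
                           offset=i * WINDOW) as window:
                window[0] = i + 1

        # Remap each region again and read the marker back
        seen = []
        for i in range(windows):
            with mmap.mmap(fd, WINDOW, access=mmap.ACCESS_READ,
                           offset=i * WINDOW) as window:
                seen.append(window[0])
        return seen


if __name__ == "__main__":
    print(demo())
```

Only one window is mapped at any moment, so the process never needs more than `WINDOW` bytes of its own address space, no matter how big the backing store is.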

References

  1. RedHat article by Bob Matthews & Norm Murray: page 9, last paragraph till page 10.
  2. Novell NetWare article which seems relevant.

Phewww. My longest post so far, I think 🙂

8 thoughts on “PAE – what’s that, and how bad for performance?”

  1. Ofir Manor

    As for AWE/mmap - it allows a user-space application to access more than 2-3 GB of RAM. Your userspace program defines a "window" in virtual memory (e.g. 1 GB) and then constantly remaps it to different physical memory regions.
    The Oracle database has had this feature for a long time (SQL Server as well), but of course, as you said, 32-bit is dead.
    Wiki has some more info:
    http://en.wikipedia.org/wiki/Physical_Address_Extension

  2. Tomer Gabel

    The technique described above is roughly equivalent to XMS/EMS back in the real mode DOS days. It worked, but it also thoroughly sucked (from both programmatic and performance perspectives).

  3. Robert

    Well, the problem with using 64-bit just to be able to use more memory is this: compatibility. A lot of software is not written to run in 64-bit, so it needs compatibility libraries in order to do so. If the compatibility libraries are not present, then the program will either not run or will have reduced functionality. So for me, I would prefer to use 32-bit with PAE until all my programs cease to have issues with 64-bit.

  4. Malcolm McCaffery

    Robert: I'm running 64-bit operating systems (Windows 7 & Vista) at home and on my work laptop and haven't come across any 32-bit applications that don't run. 16-bit Windows applications don't run (i.e. from the Windows 3.1 days). However, if you're really desperate to keep software that old, you can use DosBox 🙂

    The only potential issue is if your manufacturer hasn't provided 64 bit drivers....

  5. Larswad

    I have an Ubuntu server (D510MO board) with 4GB RAM. I installed the PAE kernel and it works just fine. I agree 64-bit compatibility these days is just fine and there is no significant performance gain (or loss, for that matter), but there is one more counterargument to using a 64-bit OS. If you have 4GB RAM and want to make the most of it, a 64-bit system wastes quite a bit of that memory, since pointers are always 64 bits in size.
    I guess if you are below 8GB, there is no gain in using 64-bit from either a performance or a memory point of view. If you're on 3GB or lower you'd better use a standard 32-bit kernel; from 4GB up to, let's say, 8GB, I find the PAE kernel to be a really good alternative.
    Am I right?

  6. maxx

    I have Win7 32-bit with 4GB RAM, but it reads only 3GB. Usually PAE is used for making more than 4GB usable. 32-bit has a max of 4GB, so why does it read only 3? My question is: should I use PAE to get that 1 more GB?

  7. ProDigit

    I think there are benefits to PAE.
    1- You can address the full 4GB of RAM, unlike a 32-bit OS with PAE disabled, where a part of the RAM is used for other things (like VRAM).
    2- In case you have 4-8GB of RAM, you benefit from PAE vs 64-bit.
    True, a PAE-enabled 32-bit OS comes at a performance penalty compared to a standard non-PAE 32-bit one, but not as much as a 64-bit OS. A 64-bit OS runs much larger code, has even larger memory addresses (even if they're unused), and thus overall runs slower.

    For laptops or desktops with 4-8GB of memory installed, that can't upgrade from there, this is the perfect solution.
    Usually 6-64GB of RAM is better done with a 64bit OS.
    From a mobile point of view, 16 bit operating systems are much more efficient than 32 bit, and 32 bit are much more efficient than 64 bit. Mainly a matter of code and hardware power consumption.
