Just some facts.
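32-bit addressing gives 4 GB of directly addressable RAM. Intel's x86 CPUs have never had a truly flat memory model: even with 4 GB of addressable space, every access still goes through a segment register, just like in the old 16-bit days (the OS simply sets the segment up to cover the whole 4 GB). PAE widened the paging structures rather than the segments: page-table entries grew to 64 bits and a small extra top-level table was added, which adds 4 extra bits to the physical address (36 bits, so 64 GB), while each address space stays 4 GB long.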
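For anyone who wants to check the numbers, here is a quick Python sketch of the arithmetic. The function name is my own and purely illustrative.

```python
GIB = 2**30  # bytes in one GiB


def addressable_gib(address_bits: int) -> int:
    """Return how many GiB can be addressed with the given number of address bits."""
    return 2**address_bits // GIB


print(addressable_gib(32))  # plain 32-bit addressing -> 4
print(addressable_gib(36))  # 36-bit physical addressing with PAE -> 64
```

So the 4 extra bits multiply the physical limit by 16, while anything that still uses a 32-bit address stays at 4 GB.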
From a hardware perspective, it is no less efficient than using the current single 4 GB segment.
However, Microsoft's memory management is atrocious, which leads some to believe PAE is the issue. The fact of the matter is that Unix systems were using PAE for years before Microsoft and have never suffered a performance issue when using it. Only Microsoft operating systems suffer. Microsoft is also the one that arbitrarily limits how much RAM its OSes will address, regardless of what the hardware is capable of.
With PAE enabled in the CPU, there is only one additional look-up per address translation, and it is always served from inside the CPU (the handful of top-level entries are kept on-chip). That look-up is part of the page-table walk, which has to finish before the memory access can be issued anyway. Because of the way it works, it has no visible impact on performance at the hardware level.
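To make the extra look-up concrete, here is a rough Python sketch of how a 32-bit virtual address is carved up with and without PAE. The bit layout (10/10/12 without PAE, 2/9/9/12 with it) follows the standard x86 paging scheme; the function names and the sample address are my own, purely for illustration.

```python
def split_legacy(va: int) -> tuple[int, int, int]:
    """Classic 32-bit paging: 10-bit directory index, 10-bit table index, 12-bit offset."""
    return (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF


def split_pae(va: int) -> tuple[int, int, int, int]:
    """PAE paging: 2-bit top-level index, 9-bit directory index, 9-bit table index, 12-bit offset."""
    return (va >> 30) & 0x3, (va >> 21) & 0x1FF, (va >> 12) & 0x1FF, va & 0xFFF


va = 0xC0201234  # arbitrary sample address
print(split_legacy(va))  # two indices plus the page offset
print(split_pae(va))     # three indices plus the page offset
```

The only new step is the 2-bit index at the top, which selects one of just four entries. The virtual address itself is still 32 bits wide in both cases.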
Note that PAE still does not allow an application to run outside of a single 4 GB address space, meaning the application still has at most 4 GB of RAM to use.