AMD to extend x86 for multicore era

AMD has announced a new hardware specification it intends to incorporate into …

Joel Hruska

AMD has announced a new set of extensions to the x86-64 ISA that it may incorporate into future processors. Called Light-Weight Profiling (LWP), the new technology is the first in a series of initiatives that AMD is calling "Hardware Extensions for Software Parallelism." The general point of the newly announced LWP technology, and of the parallelism-related announcements that will follow it as AMD unveils more of its plans, is to make it easier for programmers to extract performance from multicore processors. LWP contributes to this goal by giving running processes a set of low-overhead profiling tools that enable them to get a better look at themselves and each other in real time, so that they can see what they're doing and adjust their behavior accordingly.

In theory, the feedback that LWP gives to processes and to the OS will be used to improve software parallelism and memory allocation on the fly, thereby increasing overall performance. Notably, LWP will apparently add very little overhead while gathering the data that drives such optimizations, and its benefits aren't strictly limited to multicore scenarios. Single-core products could also apparently benefit from the technology, though such products will make up an increasingly small share of AMD's sales as time progresses. AMD cites Java and Microsoft .NET as two runtime environments that could conceivably benefit from such technology.

According to the hardware specification (PDF) that AMD has posted, LWP proposes additional registers, memory structures, and instructions that operate in both legacy and long modes. A new model-specific register (MSR), populated by the OS, controls what types of events, if any, a process is allowed to monitor using LWP. When LWP is enabled for a process, the processor's profiling hardware checks a special LWP control block (LWPCB) that's stored in the process's memory space (and possibly cached in a special set of registers for quick access) in order to see what types of events it should be monitoring. It then monitors those events—cache misses, instructions and branches retired, instructions executed, etc.—using a set of counters and event records that are kept in memory and can be accessed by the process.
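The gating described above can be pictured with a small simulation. This is only a sketch, not the layout or behavior defined in AMD's specification: the `Event` names, the `ControlBlock` and `Profiler` classes, and the idea of combining the OS mask with the process's request via a bitwise AND are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Event(Flag):
    """Hypothetical event types; the names are illustrative, not from the spec."""
    CACHE_MISS = auto()
    BRANCH_RETIRED = auto()
    INSTRUCTION_RETIRED = auto()


@dataclass
class ControlBlock:
    """Stand-in for the per-process LWPCB: what the process asks to monitor."""
    requested: Event
    counters: dict = field(default_factory=dict)


class Profiler:
    """Toy model of the profiling hardware for a single process."""

    def __init__(self, os_allowed: Event, block: ControlBlock):
        # Assumption: the OS-populated mask (the MSR in the article) caps
        # whatever the process requests in its control block.
        self.active = os_allowed & block.requested
        self.block = block

    def observe(self, event: Event) -> None:
        # Count only events that are both permitted by the OS and requested.
        if event & self.active:
            self.block.counters[event] = self.block.counters.get(event, 0) + 1


block = ControlBlock(requested=Event.CACHE_MISS | Event.BRANCH_RETIRED)
profiler = Profiler(
    os_allowed=Event.CACHE_MISS | Event.INSTRUCTION_RETIRED,
    block=block,
)
for ev in [Event.CACHE_MISS, Event.BRANCH_RETIRED,
           Event.CACHE_MISS, Event.INSTRUCTION_RETIRED]:
    profiler.observe(ev)

# The process reads its own counters directly from its control block.
print(block.counters)
```

In this toy run only cache misses are counted, because that is the one event the process requested and the OS also allowed. The counters live in an ordinary object the process owns, mirroring the article's point that the data sits in the process's own memory.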

The novel thing about LWP, at least in the world of x86 hardware, is that it doesn't require interrupts in order to do the polling and event tracking. The polling and tracking happen automatically behind the scenes once LWP is enabled, so no costly context switch is required for the process to find out how it's doing.

LWP data can also be used to generate interrupts: a process can define a set of run-time conditions that, when met, trigger an interrupt and hand control to the OS. This interrupt mechanism is what allows processes to coordinate with each other based on the real-time performance measurements that LWP generates.
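One way to picture a condition-triggered interrupt is a counter that invokes a handler once it reaches a limit. The sketch below is a software analogy only. `ThresholdMonitor`, its `threshold` parameter, and the reset-on-fire behavior are hypothetical and are not taken from AMD's specification.

```python
from typing import Callable, List


class ThresholdMonitor:
    """Toy model: fire a handler when an event count reaches a threshold."""

    def __init__(self, threshold: int, on_interrupt: Callable[[int], None]):
        self.threshold = threshold
        self.on_interrupt = on_interrupt  # stands in for the OS interrupt handler
        self.count = 0

    def record(self, n: int = 1) -> None:
        # Events accumulate silently until the condition is met.
        self.count += n
        if self.count >= self.threshold:
            fired_at = self.count
            self.count = 0  # assumption: the counter restarts after firing
            self.on_interrupt(fired_at)


interrupts: List[int] = []
monitor = ThresholdMonitor(threshold=3, on_interrupt=interrupts.append)
for _ in range(7):
    monitor.record()

print(interrupts)      # [3, 3]
print(monitor.count)   # 1
```

Seven recorded events with a threshold of three produce two handler calls and leave one event pending, which illustrates why the common case stays cheap: the handler runs only when the process-defined condition is actually reached.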

Obviously, the operating system must be modified in order to take advantage of LWP. The OS has to save the new, LWP-related process state on each context switch, handle LWP-generated interrupts, and ensure the security and privacy of each thread's LWP data and control structures. It's likely that Linux support for the extensions will come fairly quickly, but more widespread support will depend on what performance improvements, if any, the technology yields in those initial Linux implementations.

Don't expect to see LWP debut any time soon, as AMD hasn't announced any products that would feature the new technology or delivered a timeline for incorporating it into the AMD64 standard.

Jon Stokes contributed to this report.
