Hardware IPC
One problem with all of these execution units is communicating among them. Moving data between the SIMD unit and the scalar part of a modern CPU is relatively expensive; moving data between the CPU and the GPU, even more so. Communicating between cores typically involves going via a shared cache, or via main memory if the cores don’t share a common cache.
The Transputer, produced in the 1980s, faced a similar problem. A Transputer system was built from a large number of cheap, relatively independent processors. Each one had four serial links, allowing it to talk very quickly to other processing units in close proximity. AMD's HyperTransport is similar in spirit, although it's generally used to implement shared memory rather than as a message-passing interface.
The closest descendant of the Transputer these days is the Cell. This design has a set of synergistic processing units (SPUs). Apart from having the highest buzzword density of any processor to date, these are interesting in the way that they process data. Most CPUs have a very fine-grained load-and-store mechanism: they load a word (typically 64 bits of data these days) from memory, process it, and write it out. This is a simplification; in practice, they'll typically interact with a layer of cache, which will look to a lower layer if it can't provide the data required. The Cell is different. Rather than providing a transparent cache, each SPU has a small amount of local memory. It loads a large chunk of data into this space in a single DMA transfer from main memory and then processes it. On the plus side, this means that you never get a cache miss, because all of your data is in "cache." The only difficulty is that you have to work out a way of partitioning your problem so that it can be solved one small block at a time.
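In code, the pattern looks something like the following C sketch. The dma_get()/dma_put() helpers here are hypothetical stand-ins for the Cell SDK's memory-flow-controller transfers (mfc_get()/mfc_put()), and the block size and kernel are illustrative assumptions, not real SPU code:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define BLOCK_SIZE (16 * 1024)  /* hypothetical chunk size; a real SPU has 256KB of local store */

    static uint8_t local_store[BLOCK_SIZE];

    /* Stand-ins for the Cell SDK's mfc_get()/mfc_put() DMA operations,
     * modelled here as plain memcpy() so the sketch compiles anywhere. */
    static void dma_get(void *ls, const uint8_t *main_mem, size_t n) { memcpy(ls, main_mem, n); }
    static void dma_put(uint8_t *main_mem, const void *ls, size_t n) { memcpy(main_mem, ls, n); }

    /* Hypothetical kernel: every access here touches local memory,
     * so nothing inside this loop can miss a cache. */
    static void process_block(uint8_t *block, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            block[i] ^= 0xFF;
    }

    /* Process a large buffer one local-store-sized block at a time. */
    void process_buffer(uint8_t *main_mem, size_t total)
    {
        for (size_t off = 0; off < total; off += BLOCK_SIZE) {
            size_t n = total - off < BLOCK_SIZE ? total - off : (size_t)BLOCK_SIZE;
            dma_get(local_store, main_mem + off, n);   /* one bulk transfer in     */
            process_block(local_store, n);             /* compute entirely locally */
            dma_put(main_mem + off, local_store, n);   /* one bulk transfer out    */
        }
    }

In practice, SPU code usually double-buffers this loop: while block n is being processed, the DMA for block n+1 is already in flight, hiding the transfer latency behind the computation.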
Once an SPU has completed processing a block of data, it might send it back to main memory. Another option is to pass it on to another SPU. This approach is potentially very interesting, but it creates some significant layout problems when you try to scale it to large numbers of cores. Each core will both consume and produce data, and most will then pass their output on to another core for further processing. The problem comes from the fact that the number of potential recipients is the number of cores. While it's easy to send a message to the nearest, say, four cores, sending it any further away is more difficult. This is even more complex in a system with heterogeneous cores, because some processes will need to run on specific areas of the chip.
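To make that cost concrete, here is a minimal C sketch, assuming a square mesh in which each core links only to its four nearest neighbours (the Transputer-style topology described above); the 8x8 size and the Manhattan-distance routing metric are illustrative assumptions, not a description of any shipping chip:

    #include <stdio.h>
    #include <stdlib.h>

    /* A hypothetical 2D mesh of cores, each linked to its four nearest
     * neighbours. A message to a non-adjacent core must be relayed hop
     * by hop, so its cost grows with the Manhattan distance between
     * sender and receiver. */
    typedef struct { int x, y; } core_id;

    static int hops(core_id from, core_id to)
    {
        return abs(from.x - to.x) + abs(from.y - to.y);
    }

    int main(void)
    {
        core_id producer   = { 0, 0 };
        core_id neighbour  = { 0, 1 };  /* adjacent: delivered in one hop */
        core_id far_corner = { 7, 7 };  /* opposite corner of an 8x8 mesh */

        printf("to neighbour:  %d hop(s)\n", hops(producer, neighbour));
        printf("to far corner: %d hop(s)\n", hops(producer, far_corner));
        return 0;
    }

Every extra hop consumes link bandwidth on the intermediate cores it passes through, which is why a pipeline whose stages end up placed far apart on the chip performs far worse than one whose stages are neighbours.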