Thursday, March 03, 2005

On multi-core CPUs in the x86+ world

I believe we're going to see a plateau in raw single-threaded CPU performance for the next 5 years or so. The only way CPU manufacturers are going to deliver more *OPS in the future is with many cores, running at the same clock speeds (GHz-wise) as today, or slower. To make programs run faster under those circumstances you need some kind of explicitly parallel programming, and we haven't found the right level of parallelism yet, IMHO. Unix started out with process-level parallelism, but it looks like thread-level parallelism has beaten it, even though threads are much more prone to programmer error.

At the other end of the scale, EPIC architectures like Itanium haven't been able to outcompete older architectures like x86, because clever run-time analysis in the hardware can extract implicitly the parallelism that EPIC makes explicit. Intel (and, of course, AMD) are their own worst enemy on the Itanium front: all the branch prediction, out-of-order execution, etc. in their x86 cores removes the benefit of the clever compiler that EPIC needs. Maybe some kind of middle ground can be reached between the two. Itanium instructions come bundled in triples, and you can effectively view the instruction set as programming three processors working in parallel but sharing one register set. That's close (but not quite the same) to what's going to be required to program multi-core CPUs efficiently, beyond simple SMP-style thread-level parallelism.

Maybe we need some kind of language with concurrency built in (something sort of akin to Concurrent Pascal, but much more up to date), or one with no shared mutable data, so programs can be decomposed and analyzed with complete information via the lambda calculus. I'm thinking of the functional languages, like ML (consider the F# that MS Research is working on) or Haskell. With a functional language, different cores can work on different branches of the overall expression graph and reduce them independently, before the results are tied together later on.
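To make the "different cores on different branches" idea concrete, here is a minimal sketch in Haskell, assuming GHC with the `parallel` package (the `nfib` function is just the standard illustrative example, not anything from this post). The `par` combinator hints to the runtime that one branch may be evaluated on another core while the current core works on the other:

```haskell
import Control.Parallel (par, pseq)

-- Naive doubly-recursive "fib": the two recursive calls share no data,
-- so they are independent branches of the expression graph.
nfib :: Int -> Int
nfib n
  | n < 2     = 1
  | otherwise = x `par` (y `pseq` x + y)  -- spark x on another core,
                                          -- evaluate y here, then combine
  where
    x = nfib (n - 1)
    y = nfib (n - 2)

main :: IO ()
main = print (nfib 20)  -- prints 10946
```

Note that `par` is only advice: with one core the program still runs correctly, just sequentially, which is exactly the "no data to share" property that makes this decomposition safe.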
It's hard to see the mindset change this kind of thinking requires happening very quickly in software development, though. We'll see. Interesting times.


Rick Byers said...

I agree completely. There is a lot of interesting research in this area, including at Microsoft. I'm still trying to learn a lot more about it, but one thing that looks very promising to me is software transactional memory. Tim Harris at MSR Cambridge has done some very interesting research here. From a programmer's perspective, you just put code in blocks labelled "atomic"; all memory updates (including those done by method calls) take effect atomically at the end of the block, and since there is no locking, deadlock is impossible. Transactional memory was a little disturbing to me at first, but the benefits are immense. The biggest drawback is that it performs poorly on 1 or 2 CPUs, but since it scales better than almost anything else, it is perfect for future multi-core CPUs. I went to a talk about this by Simon Peyton Jones (of Haskell fame), who has worked for a long time on Concurrent Haskell, and he was extremely excited about it. I think that says a lot.
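The Harris/Peyton Jones work has a concrete form in GHC's `Control.Concurrent.STM`, where `atomically` plays the role of the "atomic" block described above. A minimal sketch (the bank-transfer example is the usual illustration, not something from this comment):

```haskell
import Control.Concurrent.STM

-- Move money between two shared accounts. The whole transfer commits
-- atomically at the end of the transaction: other threads never see the
-- intermediate state, and with no locks taken, deadlock is impossible.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
    modifyTVar' from (subtract amount)
    modifyTVar' to   (+ amount)

main :: IO ()
main = do
    a <- newTVarIO 100
    b <- newTVarIO 0
    atomically (transfer a b 30)  -- the "atomic" block
    balA <- readTVarIO a
    balB <- readTVarIO b
    print (balA, balB)            -- prints (70,30)
</imports-placeholder>
```

Composability is the quiet win here: two `transfer` calls can themselves be wrapped in one larger `atomically` block and still commit as a single transaction, which lock-based code cannot do safely.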