There's been a minor scuffle between Ted Neward and Steve Vinoski over the wisdom of Erlang's approach to concurrency: on one hand, whether concurrency should be baked into the language or not; on the other, whether Erlang should run on the JVM or CLR.
I've already articulated my position on VMs, and I think it makes a lot of sense, particularly for prototyping, to build a VM specifically for a language implementation, especially if the language has primitives that are not normally available in commodity VMs. And to be frank, if one's language doesn't have some interesting new primitives, or combination of primitives, it is unlikely to move the state of the art forward.
Erlang uses the actor concurrency model combined with lightweight ("green") threads, though the execution engine may spawn as many native threads as needed to get genuine concurrency from an underlying parallel architecture, such as a multiprocessor or multicore machine. Erlang follows a shared-nothing model, however, so its "green threads" are really more like "green processes".
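To make the shared-nothing point concrete, here is a minimal sketch, in Java, of what actor-style interaction looks like: each "process" owns a private mailbox and private state, and the only way to interact with it is to send it a message. The names (`Actor`, `send`, `receive`) are illustrative, not Erlang's or any library's API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical minimal actor: a private mailbox, nothing shared.
class Actor {
    private final BlockingQueue<String> mailbox = new ArrayBlockingQueue<>(64);

    // Asynchronous send: the sender never touches the actor's state.
    void send(String msg) { mailbox.offer(msg); }

    // Blocking receive, loosely analogous to Erlang's `receive` expression.
    String receive() {
        try {
            return mailbox.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}

public class PingPong {
    public static void main(String[] args) throws Exception {
        Actor ponger = new Actor();
        Thread t = new Thread(() -> {
            // The "process" reads from its own mailbox; all state is local.
            System.out.println("got: " + ponger.receive());
        });
        t.start();
        ponger.send("ping");
        t.join();  // prints "got: ping"
    }
}
```

The essential difference from this sketch is that in Erlang each such process costs a few hundred bytes rather than a native thread, which is what makes tens of thousands of them practical.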
I think Ted underappreciates the power of the Erlang model, and in particular its choice of primitives. Ted points to an implementation of the Actor model in Lift (written in Scala), and in particular to some ballpark performance numbers:
We also had an occasion to have 2,000 simultaneous (as in at the same time, pounding on their keyboards) users of Buy a Feature and we were able to, thanks to Jetty Continuations, service all 2,000 users with 2,000 open connections to our server and an average of 700 requests per second on a dual core opteron with a load average of around 0.24... try that with your Rails app.
One obvious problem with this comment is that it doesn't sound very impressive when actually compared with Yaws, a web server implemented in Erlang:
Our figure shows the performance of a server when subject to parallel load. This kind of load is often generated in a so-called "Distributed denial of service attack".
Apache dies at about 4,000 parallel sessions. Yaws is still functioning at over 80,000 parallel connections.
What this disparity tells me is that the JVM and CLR are likely missing primitives that help Erlang achieve this kind of result. For straight-up processing code, Erlang isn't terribly fast, by all accounts. It likely wins because it avoids context-switching overhead through the use of lightweight processes. This in turn suggests that the equivalent of green threads, an automatic CPS transformation, or native support for continuations is needed for the CLR and JVM to be credible target platforms for Erlang. Lift currently uses Jetty Continuations, and when you read up on its implementation:
Behind the scenes, Jetty has to be a bit sneaky to work around Java and the Servlet specification as there is no mechanism in Java to suspend a thread and then resume it later. The first time the request handler calls continuation.suspend(timeoutMS) a RetryRequest runtime exception is thrown. This exception propagates out of all the request handling code and is caught by Jetty and handled specially. Instead of producing an error response, Jetty places the request on a timeout queue and returns the thread to the thread pool.
When the timeout expires, or if another thread calls continuation.resume(event) then the request is retried. This time, when continuation.suspend(timeoutMS) is called, either the event is returned or null is returned to indicate a timeout. The request handler then produces a response as it normally would.
Thus this mechanism uses the stateless nature of HTTP request handling to simulate a suspend and resume. The runtime exception allows the thread to legally exit the request handler and any upstream filters/servlets plus any associated security context. The retry of the request, re-enters the filter/servlet chain and any security context and continues normal handling at the point of continuation.
... you can see that it's clearly a hack to work around the limitations of the JVM, i.e. the fact that it has no user-schedulable green threads (layered on top of native threads, not as a replacement, a bit like fibers on Windows), no automatic CPS transformation with AOP-style weaving, and no native continuation support.
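The retry trick described in the quoted passage can be boiled down to a few lines. This is a stripped-down sketch of the pattern, not Jetty's actual code: suspension is simulated by throwing a runtime exception that unwinds the handler and frees the thread, and the "continuation" is really just replaying the whole request later, at which point `suspend` returns the delivered event instead of throwing. The names (`RetryRequest`, `Continuation`, `handle`) are illustrative.

```java
// Thrown to unwind the request handler; the container catches it specially.
class RetryRequest extends RuntimeException {}

class Continuation {
    private Object event;      // null until resume() delivers a result
    private boolean suspended;

    Object suspend() {
        if (!suspended) {      // first pass: escape the handler via exception
            suspended = true;
            throw new RetryRequest();
        }
        return event;          // retry pass: deliver the event (null on timeout)
    }

    void resume(Object e) { event = e; }
}

public class Handler {
    // One run of the "request handler"; returns a response string.
    static String handle(Continuation c) {
        Object event = c.suspend();  // throws RetryRequest on the first call
        return event == null ? "timeout" : "event: " + event;
    }

    public static void main(String[] args) {
        Continuation c = new Continuation();
        try {
            handle(c);               // first attempt unwinds out of the handler
        } catch (RetryRequest rr) {
            // The container would park the request here and reuse the thread.
        }
        c.resume("data");            // later, another thread resumes it
        System.out.println(handle(c));  // the retried request completes normally
    }
}
```

Note what the hack costs: all the handler code above the `suspend` call runs twice, which is only tolerable because HTTP request handling is (mostly) stateless up to that point. A VM with real continuations would simply freeze and thaw the stack instead.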
In conclusion, the primitives in the VM matter a great deal. With the right primitives, wholly different styles of application become possible, because the performance profile changes radically.