joseluis_estebanaparicio: Erlang and Scala parallelism (eng)

http://www.infoq.com/news/2008/06/scala-vs-erlang

There has been a somewhat heated debate about Scala vs. Erlang in the blogosphere recently. The future will be multi-cored, and the question is how the multi-core crisis will be solved. Scala and Erlang are two languages that aspire to be the solution, but they are a bit different. What are the pros and cons of their approaches?

The problem

The way we benefit from Moore’s law has changed. We don’t get the same increase in clock frequency as we used to. Instead, we get more cores. Today, even your laptop probably has two cores.

To utilize more than one core, your application has to be concurrency-aware. If your customer bought an eight-core machine, you will have a hard time explaining to them that it’s normal that the application only uses about 12% of the CPU capacity (one core out of eight), even if the machine is dedicated to that particular application.

In the future, this will be even worse. Not only will your sequential code not run faster, it will actually run slower. The reason is that the more cores you get, the slower each core will run, for power and heat reasons. In a couple of years, Intel will give us 32 cores, and there are trends that suggest that we will have thousands of cores before we know it. But each core will be very slow compared to today’s cores.

Concurrent code

One obvious way to solve the problem is to write (and rewrite) software to be concurrent. The most common way to do that is to use threads, but most developers consider thread-based applications particularly hard to write. Deadlocks, starvation and race conditions are concepts that are way too familiar to a majority of developers doing concurrency. Both Erlang and Scala take away a lot of that pain.

A brief overview of the languages

Scala is sometimes seen as the next big JVM language. It combines the object-oriented paradigm with the functional paradigm, has a terse syntax compared to Java, is statically typed and is as fast as, or sometimes even faster than, Java. There are numerous reasons to take a serious look at Scala.

Erlang is a language designed for robustness, and that same design also makes it a language that scales well. It predates Java, but it is often considered to be a language of the concurrent future. It is a dynamically typed, functional language, with some remarkable examples of uptime.

The debate

So what’s the Scala vs. Erlang debate all about? In the end, performance and scalability, but the debate includes other things like style, language features and library support as well. The debate was started unintentionally when Ted Neward gave his opinions about a number of languages and wrote that “the fact that [Erlang] runs on its own interpreter [is] bad”.

Steve Vinoski and Ted then had a couple of rounds of debate, but the discussion then moved to a couple of other blogs where it highlighted interesting differences and similarities between Scala and Erlang. We’ll summarize each interesting point to show pros and cons of each language and the different views on some problems.

Reliability

Steve Vinoski wrote a response to Ted’s post where he gave his take on the fact that Erlang runs on its own interpreter:

The fact that it runs on its own interpreter, good; otherwise, the reliability wouldn’t be there and it would be just another curious but useless concurrency-oriented language experiment.

Steve is talking about the problem that even if a language itself is reliable, everything it stands upon has to be reliable too. Since Erlang is designed from the bottom up for reliability and thereby concurrency, it doesn’t suffer from the usual problems when it comes to concurrency, primarily that the underlying libraries need to play well in a concurrent setting.

Scala on the other hand lives on top of the JVM and one of the most important selling points is the potential usage of all existing Java code. However, a lot of Java code is not designed for concurrency, and Scala code needs to take this into account.

Lightweight processes

To run massively concurrent applications, you need a lot of parallel execution. This can be done in several ways. Using threads is one common way; using processes is another. The difference is that a thread shares memory with other threads, while processes share nothing. That means that threads need locking mechanisms like mutexes to prevent two threads from manipulating the same memory at the same time. Processes don’t suffer from that problem and instead use some kind of message passing to communicate with other processes. But processes are normally expensive in terms of performance and memory, and that is a reason why people often choose thread-based concurrency, even though it’s a harder programming model.
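
The contrast between the two models can be sketched in a few lines of Scala. This is only an illustration under stated assumptions: the names SharedVsMessages, LockedCounter, withLocks and withMessages are made up for this example, and a plain java.util.concurrent queue stands in for a process mailbox. It is not how Erlang or Scala Actors are implemented.

```scala
import java.util.concurrent.LinkedBlockingQueue

object SharedVsMessages {
  // Shared-memory style: every thread mutates the same field,
  // so each access has to go through the same lock.
  final class LockedCounter {
    private var count = 0
    def increment(): Unit = this.synchronized { count += 1 }
    def value: Int = this.synchronized { count }
  }

  def withLocks(threads: Int, perThread: Int): Int = {
    val counter = new LockedCounter
    val workers = (1 to threads).map { _ =>
      new Thread(() => (1 to perThread).foreach(_ => counter.increment()))
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    counter.value
  }

  // Share-nothing style: workers never touch the total. They only send
  // messages, and a single consumer owns and updates the state.
  def withMessages(threads: Int, perThread: Int): Int = {
    val mailbox = new LinkedBlockingQueue[Int]() // hypothetical stand-in for a mailbox
    val workers = (1 to threads).map { _ =>
      new Thread(() => (1 to perThread).foreach(_ => mailbox.put(1)))
    }
    workers.foreach(_.start())
    var total = 0
    var received = 0
    while (received < threads * perThread) {
      total += mailbox.take() // blocks until a message arrives
      received += 1
    }
    workers.foreach(_.join())
    total
  }
}
```

Both functions compute the same total. In the second one, no lock appears in the application code because only one thread ever reads or writes the total; the synchronization is hidden inside the queue.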

Steve Vinoski writes:

Massive concurrency capabilities become far easier with an architecture that provides lightweight processes that share nothing, but that doesn’t mean that once you design it, the rest is just a simple matter of programming.

Erlang takes this approach to concurrency. An Erlang process is very lightweight, and Erlang applications commonly have tens of thousands of processes or more.

Scala on the other hand does the same thing with event-based actors. Yariv Sadan explains how:

Scala has two types of Actors: thread-based and event based. Thread based actors execute in heavyweight OS threads. They never block each other, but they don’t scale to more than a few thousand actors per VM. Event-based actors are simple objects. They are very lightweight, and, like Erlang processes, you can spawn millions of them on a modern machine.

Yariv explains that there is a difference though:

The difference with Erlang processes is that within each OS thread, event based actors execute sequentially without preemptive scheduling. This makes it possible for an event-based actor to block its OS thread for a long period of time (perhaps indefinitely).
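
The effect Yariv describes can be imitated without any actor library. The sketch below is an assumption-laden illustration: a single-thread executor stands in for one OS thread that runs several event-based handlers, and the names CooperativeScheduling and run are invented for this example. Because nothing preempts a running handler, a handler that blocks delays every handler queued behind it.

```scala
import java.util.concurrent.{CopyOnWriteArrayList, Executors, TimeUnit}

object CooperativeScheduling {
  // Runs each (name, blockMillis) handler on ONE worker thread and
  // returns the names in the order in which the handlers finished.
  def run(handlers: Seq[(String, Long)]): List[String] = {
    val pool = Executors.newSingleThreadExecutor()
    val finished = new CopyOnWriteArrayList[String]()
    handlers.foreach { case (name, blockMillis) =>
      pool.execute(() => {
        Thread.sleep(blockMillis) // stands in for a blocking call inside a handler
        finished.add(name)
      })
    }
    pool.shutdown()
    pool.awaitTermination(10, TimeUnit.SECONDS)
    finished.toArray(new Array[String](0)).toList
  }
}
```

Calling CooperativeScheduling.run(Seq("slow" -> 200L, "fast" -> 0L)) returns the names with "slow" first: the fast handler has no work to do, but it cannot start until the slow one gives up the thread.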

Immutability

Erlang is a functional language in which data is immutable, like Java’s strings, so no code can modify a value in place. Any operation on some data results in a new, modified version of that data, while the old one stays the same. Immutability is a highly regarded ingredient when it comes to robustness, since no code can unintentionally change data that someone else depends on, and it is just as important from a concurrency point of view. If data is immutable, there is no risk of it being changed by two parallel execution paths, and it can even be copied to other machines, since it cannot change and there is nothing to keep in sync.
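
The same "new version, old one untouched" behaviour can be seen with Scala’s immutable collections. This is a minimal sketch in Scala rather than Erlang, and the object and value names are made up for the example.

```scala
object ImmutableData {
  val original: List[Int] = List(1, 2, 3)
  // Prepending builds a new list; `original` is not modified.
  val extended: List[Int] = 0 :: original

  val prices: Map[String, Int] = Map("apple" -> 3)
  // "Updating" a key returns a new map; `prices` still holds the old value.
  val updated: Map[String, Int] = prices + ("apple" -> 4)
}
```

After this runs, original is still List(1, 2, 3) and prices("apple") is still 3, so any other thread holding a reference to either value can keep reading it without a lock.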

Since Scala is based upon the JVM and combines an object-oriented and a functional approach, it gives no guarantees of immutability the way Erlang does. However, the comment section of Yariv’s post contains a very interesting discussion between Yariv and David Pollak about differences between the languages. David, who is the creator of the Scala web framework Lift, gives his views on immutability.

Immutability — Erlang enforces this and there’s almost no way around it. But you trade the rest of Scala’s amazingly powerful type system for enforcement of this single type. I do my Scala Actor coding with immutable data and I have Scala’s type system to enforce the rest of my types.

Yariv asks:

Doesn’t sending only immutable types a big limitation? It means you can’t, for example, load a simple bean from Hibernate and send it to another actor.

David answers:

I’ve built a number of production systems based on Scala Actors. There’s very little work actually required to deal with the immutable issue. You just define your case classes (the messages) to be immutable and away you go.
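
What David describes might look roughly like the sketch below. The message names are hypothetical and no actor library is involved; the point is only that case class fields are vals by default, so a message cannot be modified after it has been constructed. Note that this is a convention, not a guarantee: a case class can still declare var fields or hold a reference to a mutable object.

```scala
object Messages {
  // Hypothetical message protocol, invented for illustration.
  sealed trait Message
  final case class Deposit(account: String, amount: Int) extends Message
  final case class Withdraw(account: String, amount: Int) extends Message

  // A receiver would typically pattern match on the message type.
  def describe(msg: Message): String = msg match {
    case Deposit(account, amount)  => s"deposit $amount to $account"
    case Withdraw(account, amount) => s"withdraw $amount from $account"
  }
}
```

A changed message is created with copy, for example Messages.Deposit("acc-1", 10).copy(amount = 20), which leaves the original instance as it was. An assignment such as d.amount = 20 does not compile.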

Type systems

Erlang is dynamically typed. Scala is statically typed and has a stronger type system than Java. A big difference compared to Java is that Scala has type inference. This means that you can omit a lot of type annotations, which makes the code cleaner, while the compiler still does all the checks.
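
A small sketch of what type inference means in practice. The names are made up for the example; none of the values below carries a type annotation, yet each has a static type that the compiler checks.

```scala
object Inference {
  val count = 42                       // inferred as Int
  val names = List("erlang", "scala")  // inferred as List[String]
  val lengths = names.map(_.length)    // inferred as List[Int]

  // The following line would be rejected at compile time,
  // even though no type was written for `count` above:
  // val broken: String = count
}
```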

The debate about pros and cons of dynamic and static type systems will likely never come to an end, but that is a noticeable difference between Erlang and Scala.

Tail recursion or loops

Yariv again:

Functional programming and recursion go hand-in-hand. In fact, you could hardly write working Erlang programs without tail recursion because Erlang doesn’t have loops — it uses recursion for everything (which I believe is a good thing :) ).

This is definitely something that sets Erlang apart from Scala. Scala has a much more traditional style of iteration, but David Pollak doesn’t see an advantage for tail recursion in this context:

Tail recursion — It’s a non-issue for event-based actors.

In that case, it all comes down to preference and style.
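
Scala supports both styles, so the difference can be shown side by side. The sketch below uses made-up names. The recursive version mirrors the accumulator style that is common in Erlang; the @tailrec annotation asks the Scala compiler to reject the method if the recursive call is not in tail position, so it is compiled to a loop and does not grow the stack.

```scala
import scala.annotation.tailrec

object Summing {
  // Recursive style: the state travels in the accumulator argument.
  @tailrec
  def sumRec(xs: List[Int], acc: Long = 0L): Long = xs match {
    case Nil          => acc
    case head :: tail => sumRec(tail, acc + head)
  }

  // Loop style: the state lives in a local mutable variable.
  def sumLoop(xs: List[Int]): Long = {
    var acc = 0L
    for (x <- xs) acc += x
    acc
  }
}
```

Both functions return the same result for the same list, including lists long enough that a non-tail-recursive version would risk a stack overflow.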

Hot swapping code

Since Erlang was designed for reliability, hot swapping code (replacing code at runtime) is built in.

The JVM has some support for hot swapping code. Classes can be changed, but due to the static type system, method signatures cannot be changed, only the content of a method. There are third-party tools to get around that, and there are frameworks that promote a programming style that makes it easier to swap classes in running systems. Because of how a Scala Actor is built, however, hot swapping works even though it runs on the JVM. Jonas Bonér gives a thorough example of how to do it.
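
The general idea of swapping an actor’s behaviour at runtime can be sketched as follows. This is not Jonas Bonér’s example and it does not use the Scala Actors API. It is a simplified, single-threaded illustration with hypothetical names (HotSwapSketch, Worker, HotSwap), assuming that the behaviour is just a function value that a control message can replace.

```scala
object HotSwapSketch {
  type Behavior = PartialFunction[Any, String]

  // Hypothetical control message that carries the replacement behaviour.
  final case class HotSwap(next: Behavior)

  // A minimal actor-like object: no mailbox and not thread-safe,
  // for illustration only.
  final class Worker(initial: Behavior) {
    private var behavior: Behavior = initial

    def receive(msg: Any): String = msg match {
      case HotSwap(next) =>
        behavior = next // later messages are handled by the new code
        "swapped"
      case other if behavior.isDefinedAt(other) => behavior(other)
      case _ => "unhandled"
    }
  }
}
```

A worker created with a behaviour that answers "ping" with "pong" keeps running after it receives a HotSwap message, but answers later "ping" messages with whatever the new behaviour returns. No class is redefined, so the JVM restriction on method signatures does not come into play.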

Summary

Both Scala and Erlang are languages that target the multi-core crisis. They come from different backgrounds and eras and thereby approach some problems differently, but in many ways they have more in common than they differ, at least when it comes to the concurrency issues.

Erlang has been around for a couple of decades, and has proved itself in many critical real-world systems. One of its drawbacks is that it is a bit of an island, and the recent polyglot programming trend is not likely to affect the Erlang community that much.

Scala, on the other hand, is the new kid on the block for the same type of applications. There are real-world applications just coming out the door, and there are companies that are betting their future on it. Scala’s biggest advantage compared to Erlang is that it runs on the JVM and can use all existing Java code, frameworks and many of the tools. That said, that power comes with great responsibility, since most Java code doesn’t automatically fit well with Scala’s Actor model.

Both languages offer similar ways to solve an increasingly pressing problem that mainstream languages don’t help developers with very well. Hopefully you have a slightly better insight into which language to take a closer look at for your particular situation after reading this debate summary.

The future is multi-cored. Scala and Erlang are likely to increase in popularity.

Tuesday, June 24, 2008