Interview with Joe Armstrong & Simon Peyton Jones
I'm Sadek Drobi. I'm here at Erlang Factory with Simon Peyton Jones and Joe Armstrong. Can you please tell us about yourselves and what you've been busy with lately?
JA: I'm Joe Armstrong and I'm at Erlang Factory. I've just been to a very nice talk where Simon told us about the birth of Haskell and Erlang and how they progressed along parallel routes, solving the same problems. I think we can talk a bit about that, because in your lecture you said things like "We tried that and it was an absolute disaster". We also tried - you know, this making everything parallel - we did that as well.
SPJ: Was it a disaster?
JA: Yes. You know this bit where you deleted all the people's code? We've done that as well.
SPJ: I'm Simon Peyton Jones; I work at Microsoft Research in Cambridge at the moment. Previously I was a professor at Glasgow University. I've been doing functional programming now for 30-odd years, because I got addicted to it when I was at university and have never been able to give it up. It's like a drug, but a good drug. Haskell is a programming language that evolved because a bunch of different academics around the world were working on lazy functional programming languages, but they were all different languages, and we realized that if we'd just agree on a common language, then we would be able to share a lot more work together. So we got together and formed a committee, which wasn't necessarily a very promising way to go about designing an elegant and beautiful language. But actually the committee worked very well, because we had enough common goals. That was in the very early '90s. Haskell has come a long way since then. Many languages die fairly quickly after they've been designed, but Haskell has lasted 20 years, as has Erlang.
JA: Erlang had a different start from Haskell. It started in an industrial environment - in the computer science lab at Ericsson - and the original goal was to find a better way of programming. It wasn't actually to make a programming language at all; it was somewhat after the event that we discovered we had created a programming language. That was kind of accidental.
SPJ: But you also had a very specific application at least, didn't you? You really were telecoms.
JA: We wanted to just program a small telephone exchange in the best way possible, and we wanted to just steal our ideas from different programming languages, put them together, try them out and see if they worked.
SPJ: You knew the application you needed, and you knew that concurrency was key - concurrency and robustness were key.
JA: I remember back in the '80s going to conferences where people talked about shared memory systems, and I always put up my hand and asked the same question over again: "Well, what happens if one of these shared memory things fails?" And they would say "Well, it all fails". That's what we didn't want, you know. We make products with... a standard Ericsson product should have a downtime of four minutes per year or something like that, or better. Our key problem is handling failure. Right from the early days, we thought "Well, the only way to handle failure is no shared memory and no transactional memory or anything that locks things", and that's why we copy everything. That's why we have sophisticated error handling. So, here we are back in the '80s, about 1984-1985, just thinking about how we can put these things together, and it was only later that we realized "Oh, we've made a programming language; we'd better give it a name", and people started using it and such.
SPJ: It never struck me quite as positively as this before, but Erlang was born very much out of demand pull. You had a telephone exchange, you wanted to program it, and the language happened to come out of that. Haskell was the exact reverse. We were on a mission from God to explain why laziness was fantastic for the world, so it was very much technology push. We wanted to explain functional programming - lazy functional programming in particular - through the medium of a language. You were at this end and we were at this end.
JA: But at the same point in time we were doing the same things. I remember one of the things we did in Erlang... First of all we implemented it in Prolog, and I did that. Later, Robert Virding came along, and Robert was very into parallel concurrent languages. We made one of our first big mistakes, which was announcing what the performance of something would be before we'd done it. You've made that one!
SPJ: No, no, no! We'd never do that!
JA: We cross-compiled Erlang to Strand, which was a parallel language, and we implemented a language like that, and we happily told everybody "It will be 6 times faster" before we'd actually done it.
SPJ: Was it?
JA: No. It didn't work.
SPJ: Didn't work at all! Naught times faster!
JA: The first thing we did, we wrote an Erlang program and cross-compiled it to all this parallel stuff. We had 3 Erlang processes, and it was as parallel as possible, and suddenly there were 6 million processes in the machine - 6 million threads. We got far too much parallelism. And that was just what you said in your talk: "Well, we tried to make everything parallel and it was just far too much parallelism." We did the same stuff at the same time.
SPJ: A nice remark I remember you making about Robert was that he came to you one day and asked to make a few small changes in the compiler. Then, the next line in your paper was "Robert is incapable of making a small change to anything, so he completely rewrote the whole thing."
JA: He left a little comment at the top, "Joe first wrote this", and then everything is completely different. We used to rewrite each other's code.
Haskell and Erlang have 2 distinct models of concurrency, right? Haskell is about being side-effect free; Erlang is about messaging. Can you talk a bit about and contrast both models of concurrency?
SPJ: I suppose Haskell initially wasn't a concurrent language at all. It was a purely functional language, and we had the idea from the beginning that a purely functional language was a good substrate for doing concurrency on. But it turned out to be a lot harder to turn that into reality than I think we really expected, because we thought "If you've got e1+e2, then you can evaluate e1 at the same time as e2 and everything will be wonderful, because they can't affect each other, because it's pure."
But it turned out to be very hard to turn that into actual wall-clock speedups on real processors, leaving aside all issues of robustness or that kind of stuff, because if you do that style of concurrency you get lots of very tiny fine-grained processes, you get no locality, and you get very difficult scheduling problems. The overheads overwhelm the benefits, in short. From there, we've evolved generally towards giving the programmer some control over that kind of concurrency. That was the first form of concurrency that Haskell grew: a controlled form of this implicit level of concurrency. We added other things afterwards that we'll talk about later, but it was very different to Erlang's model of concurrency.
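The naive scheme Simon describes - sparking both operands of an addition - can be sketched in GHC Haskell with the `par` and `pseq` combinators (assuming the `parallel` package; the function names are illustrative). Applied recursively, as below, it creates a flood of tiny sparks whose overheads can swamp the gains:

```haskell
import Control.Parallel (par, pseq)

-- Spark e1 for parallel evaluation while this thread evaluates e2,
-- then combine. This is the "evaluate e1 at the same time as e2" idea.
parAdd :: Integer -> Integer -> Integer
parAdd e1 e2 = e1 `par` (e2 `pseq` (e1 + e2))

-- Applied everywhere, it generates one spark per addition: millions
-- of tiny fine-grained tasks with no locality, as described above.
fib :: Integer -> Integer
fib n
  | n < 2     = n
  | otherwise = parAdd (fib (n - 1)) (fib (n - 2))
```

The result is correct either way; the problem is purely one of granularity and scheduling, which is why later Haskell designs made the parallelism explicit and controllable.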
JA: Erlang started with just this pure process and copying everything. The reason for copying was error handling. Error handling was central. The view of the world I've always had was all these little processes all talking to each other in a global namespace.
SPJ: But that already, as you know, was completely alien to Haskell, which is just a functional program which is evaluated - so a totally different model.
JA: In the first incarnation of Erlang, it was just... little black boxes communicating, copying their messages - it's a mailbox model, copying to a mailbox. What we were doing was relational: we had Prolog processes inside the black boxes.
SPJ: But you saw the light.
JA: We had to. Say you are launching a rocket: from a black-box perspective, if you send a certain sequence of messages in, you want the same sequence of messages to come out. You want that to be reproducible and deterministic, and Prolog isn't like that - it backtracks, and it has things like that. It just sort of became more natural to make it functional. We never made a decision about having types or not having types; that wasn't an issue. We started with Prolog. Prolog didn't have types, so we got the dynamic type system that Prolog had. The issues we were interested in were limiting errors, propagation of errors, restarting things, restarting bits of the system without taking down all of the system, having things which may appear inconsistent while you are upgrading them, continuously evolving systems - not systems you stop and restart.
We can't stop our systems and globally check they are consistent and then relaunch them. We incrementally change bits, and we recognize that they are inconsistent over short time periods, and we live with that. Finding ways of living with failure, making systems that work despite the fact that they are inconsistent, despite the fact that failures occur. The error model is very sophisticated. When I see things like Scala, or I see on the 'net this kind of "Erlang-like semantics", that usually means mailboxes and message passing; it doesn't mean all the error handling, it doesn't mean the live code upgrade. The live upgrade of code while you are running a system needs a lot of deep plumbing under the counter - it's not easy.
SPJ: I still think it's amazing you can make that work at all.
JA: Phil Wadler saw it... You know, you can have two versions of the same module with all the same names - he didn't believe it was possible until we showed it to him.
You mentioned that other languages got inspired by Erlang and by Haskell, also regarding concurrency - for example, Scala implements the dynamic part of Erlang's actors, in the same way that languages like F# and C# implement features coming from functional programming, more specifically from Haskell. Can you talk a bit about this and how things get implemented in the mainstream or in other languages?
JA: I don't know. People say "What will happen to Erlang?" and I have no idea. There are certain languages which are very influential but don't get widely used - like Smalltalk, for example. Smalltalk is the ultimate object-oriented language, and it has influenced Objective-C and all sorts of languages. It has this core band of pure Smalltalk programmers who don't believe in anything else, but it wasn't destined to take over the world. Erlang is probably... I don't know, maybe it's in that category; it influences other languages. It comes from a funny place - it comes from Ericsson, a telecoms company, and making languages is really not our main business. Making languages is Microsoft's business, and you have the sort of muscle to support a language. We have languages supported by companies like Microsoft and Sun - Java and C# and things like that. Then you have languages not supported by big companies, like Ruby and Perl, and they have their own communities. Erlang is in between: it's not supported on the big scale, in the sense of Microsoft or Sun, but it's not living all by itself in an open source community either. It has the financial resources to do the compiler and keep the core clean, which is very good. Where it goes from there, I don't know.
SPJ: But nevertheless, there is this company that supports it - there isn't one for Haskell, curiously. Microsoft is generously supporting myself and Simon Marlow, but they are not supporting us to work on Haskell. I'm hired as a researcher and I happen to work on Haskell.
JA: It's a good chance to work on C#.
SPJ: In principle I could. I don't think it's going to happen any time soon. I suppose Microsoft could turn around and tell me "Please don't work on Haskell any more."
Like they did with Erik Meijer, right?
SPJ: Erik is a whole different ball of wax. Erik works in the developer division in Microsoft, and that's much more product-focused, but amazingly Erik has brilliantly found a way to meld ideas from functional programming in quite a product-focused kind of way and build a little research bubble in the developer division that they perceive as being actively valuable to them. I think that's amazing! Whereas I'm a mere parasite - I'm a researcher who isn't required on a month-by-month basis, in the way that Erik is, to produce immediate value to the company.
But I think Haskell is also a bit different to Erlang.
I think of Haskell as a... Its principal value is as a kind of ideas factory. It's a laboratory in which ideas can grow and flow. It has a very active user community and a very active research community. Lots of research papers use Haskell as their substrate, and indeed use GHC, the compiler that Simon and I built, as a substrate for their work. It's a laboratory in which ideas can grow that may then get transferred elsewhere. We've seen lots of examples of that, not perhaps specifically for Haskell but for functional programming in general, going right back to garbage collection, which is now taken as "Of course you have garbage collection" - but it was a long time before that was taken as read; for a long time people thought it would never be taken as read - and that grew up in the functional programming community.
Generics in Java and C# grew up in the functional programming type system world and are now taken for granted. LINQ, the language-integrated query system in C#, is heavily informed by ideas from functional programming. I'm actually quite happy if Haskell serves the role of generating ideas that then move into the mainstream. Whether Haskell will itself ever become a truly mainstream language - it has hundreds of thousands of users, but not hundreds of millions of users, so it's a completely different kind of scale from Microsoft's product languages.
Whether it will ever become a language on that scale, I don't know, and I wouldn't want to give up being an ideas factory in order to get that. I'm just delighted that my colleague Don Syme, who is 3 doors down the corridor, has developed F#, so that now there's a kind of pipeline: he can get ideas out of Haskell into F#, and out of F# into C#. And I'm also delighted that he successfully made the case to Microsoft to turn it into a product that the company supports, in a way that Erlang is supported and Haskell is not. F# really is a Microsoft product. That's a huge breakthrough, for a big mainstream company like Microsoft to support a functional language. But the ideas factory is the bit that is the most important thing - the high-order bit.
JA: That's fun, isn't it? It's the same with Erlang. I think in a way you know that graph: you have this initial enthusiasm, you get up to there, and then there's this sort of plateau where not much happens. I think that happened with Erlang as well. I don't understand why; it seemed to be very stop-go. You work on things, then suddenly something happens, and I think it's the time taken to ingest the ideas - these ideas are fermenting in people's brains for a long time.
When people started being permanently connected to the Internet with broadband connections - that's a quantitative change. You saw the file-sharing networks. It's only about 4-5 years ago that people started deploying distributed algorithms. In the first few years of that they said "What shall we do?" Nobody knew, so they invented BitTorrent and things like that. You have social networking programs, you have things like that.
Google Wave will come along and replace email in a few years' time. Suddenly people want to know how to write distributed programs, which they never wanted to do before. So, how do you do distributed transactions? How do you do consistency? These distributed algorithms were always a pretty obscure part of computer science. Then we said "We've got to go parallel. How the heck do we program these things?" In Erlang or Haskell, these algorithms are just difficult, but in other languages they are...
SPJ: Downright impossible!
JA: Yeah, right! I mean, even if you have a very clean, pure programming language, and you take Paxos algorithms or something like that, they are complicated things. They make your head hurt. There is a whole branch of mathematics, a whole branch of computer science, to understand distributed algorithms, and they live at the bottom of these social networks and things, ticking along. So I don't think you are going to see Erlang replacing JavaScript or anything like that in browsers, but where you are seeing them being deployed is in the infrastructure, in the clouds and things like that, to glue the bits together properly, because that's where you need the complex algorithms. We are seeing it in infrastructure-building things - cloud computing, and how you organize computations inside modern massive multicores. That's where it's being used a lot.
SPJ: Whenever you've got concurrency and multiple processes working, you need to be very careful about side effects. Otherwise it just does your head in. Something that Haskell and Erlang both share is being careful about effects. Haskell is sort of super-careful and Erlang is merely careful, but in both cases we don't have unrestricted side effects all the time, the whole computational fabric being effectful. It seems to me that makes it jolly hard to write programs that exploit multiple threads.
JA: I didn't really know what thread safety was in Java, so I wrote a little Java Swing thing and asked a Java friend about it. I wrote this Java program and it worked fine. I could create one window; then I created 2 windows in a graphical program, and I drew a rectangle in one and I drew a rectangle in the other, and it crashed. And I said "Why did it crash?" And he said "Well, the Swing library's not thread-safe". Now, what does that mean? It means if you've got one thing that works and you run 2 of them in parallel, they interact in strange ways. I thought "How can you program like that? It's impossible to program!"
If you've got this non-thread-safety - maybe you've got this code that's thread-safe that reads stuff, and this code that's thread-safe that writes stuff - you try to compose them into a program that reads and writes, and then you go "Oh, it didn't work!" Then you are scratching your head and thinking "Maybe something is wrong. Let's put a mutex or a lock around the whole thing", and it works for a while, and then somebody else has forgotten to lock it. What happens when it crashes? How is the failure model integrated with the locking model? Because if you lock stuff and something fails, is the lock released? And then you hide that in the libraries and nobody can see it, and you inherit stuff from there, and you've got a mess.
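Joe's point - that two individually thread-safe operations do not compose into a thread-safe whole - can be sketched in Haskell with an `MVar` (the function names are illustrative):

```haskell
import Control.Concurrent.MVar

-- readMVar and swapMVar are each atomic on their own, but the
-- read-then-write composition is not: two threads can both read the
-- same n and both write n + 1, losing one increment.
racyIncrement :: MVar Int -> IO ()
racyIncrement v = do
  n <- readMVar v           -- "thread-safe" read
  _ <- swapMVar v (n + 1)   -- "thread-safe" write; the pair races
  return ()

-- The fix holds the lock across the whole composition:
safeIncrement :: MVar Int -> IO ()
safeIncrement v = modifyMVar_ v (return . (+ 1))
```

The race only bites under contention, which is exactly why such a program "works for a while" and then breaks when somebody forgets the lock.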
SPJ: One of the ways that I speculate functional languages may end up influencing the mainstream is that mainstream languages will gain a larger and larger subset of purely functional programming, which will make it easier to write functional programs that don't use side effects a lot. So you can get more and more computation done - useful and complicated computation, like the algorithms you were describing - without using side effects. I think that's a trend, as they get richer subsets. F# is an example of this, because it's built on a substrate that is completely imperative - the .NET system is an imperative system - but F# makes it easier: it reduces the barrier to entry for writing functionally. I think that's a trend that we will see continue.
JA: What I've noticed is that the Erlang libraries that are pure libraries - the pure functions - are incredibly reusable. They just work forever. You write them, you test them, you forget about them, and you can reuse them in many different contexts that you hadn't thought about. The goal in system design is to divide your system in such a way that as much as possible of it is pure, and the messy stuff is on the side.
SPJ: That's why you need those types, Joe!
JA: No, that's why you need good error recovery mechanisms, because even with your types, you need to test it. Say I want to write a factorial program, and my factorial program says "factorial of n is 42"; the type inference will then say "This is of type int to int", so it's OK, it's correct. You have to say factorial 3 is 6.
SPJ: You are misrepresenting me! I never said that types guarantee your program is correct. But you were talking about segregating effects, and types are a very good way to segregate things.
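Joe's factorial example, written out: both definitions below have the type `Int -> Int`, so the type checker accepts each one equally happily, and only a test such as `factorial 3 == 6` tells them apart. (A sketch; the names are illustrative.)

```haskell
-- Well-typed and wrong: "factorial of n is 42" typechecks as Int -> Int.
factorialWrong :: Int -> Int
factorialWrong _ = 42

-- Well-typed and right, with exactly the same type.
factorial :: Int -> Int
factorial 0 = 1
factorial n = n * factorial (n - 1)
```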
On the same topic: in Scala, which we said got inspired by Erlang for the actor model, they actually don't copy memory; they don't copy data structures, but rather share them, and there are no constraints on the data structures, so they could be mutable and shared. Martin Odersky, the father of Scala, wrote a paper about using types - be they inferred or not, type annotations - for guaranteeing that a reference is not consumed concurrently. So you can use mutable data structures, which is very nice for performance reasons, yet you get some guarantees of not having race conditions. Can you talk a little bit about that from the two perspectives?
SPJ: First thing to say: Scala is not just a knock-off of Erlang. It's a whole sophisticated language which rather elegantly combines functional and object-oriented systems in a rather innovative and unusual way. But I think you are right - one thing you can do with Scala is implement an Erlang-like model. Once you've got an imperative system in which you can mutate things, it is possible to use type systems to constrain exactly how you mutate them; ownership types in object-oriented languages are an example of doing this.
I don't know the paper of Martin's that you are describing, but systems that allow mutable objects to flow around, be used once, and be owned by different threads can work, though they tend to be rather complicated. My gut feeling is that you need type systems that tread this narrow line between being expressive enough to say what you want and not being so complicated that nobody can understand them anymore.
Ownership types for a conventional object-oriented language are a way of controlling some of these things, but that feels to me like taking a system that's already using too many effects and trying to retrospectively patch it up. It would be better - my preference would be to start with a system in which effects are (a) few, because the functional subset is very big, and (b) rather constrained. Now, Haskell has a rather crude distinction between effectful programs and not; just saying a program has some effects might not be enough. That gets you back into the land of controlling effects more precisely, but at least it's a smaller piece of your program.
At the moment I'm agnostic about whether types are the right mechanism for doing this much more fine-grained control of effects, in which you are saying "This reference is owned by that thread for this period and then the ownership moves to that one." That's a rather sophisticated thing to do. It's very clever; I'd be very interested in papers written about it.
JA: I don't like the idea of mixing things too much.
SPJ: What sort of mixing?
JA: If you look at Haskell, Erlang, Scala and F#, what do you see? If you look at Haskell, you see something which, within its context, within the little framework it sets up for its sandbox, is very consistent, very beautiful. You look at Erlang: it kind of fits, the bits fit together nicely. Of course, they don't fit together nicely with the JVM or with .NET or anything like that. If you want to use all the nice things that are there, you can't use them - or you can use them, but it's difficult. So the other approach is to say "Let's use the JVM and target lots of different languages, so that the different languages can use each other", or you can do that within the .NET framework - you get Scala and you get F#.
The benefit there is that you can use all these other things that are available, but in order to do that, you have to massively corrupt and break these abstraction boundaries, and I don't like that. I think you are breaking abstraction boundaries in the wrong place. How I would like to see systems built is through communicating black boxes. And I would like to see the type systems applied to the definition of the protocols themselves, and I haven't seen that done.
SPJ: There's been lots of work on that kind of stuff, on type systems for contracts.
JA: Yes, I've done some stuff myself, but it seems to have very little impact on how protocols are designed. The typical way protocols are designed is that you write RFCs and things like that, and they are defined in English, with text. A lot of people think they are defining protocols, but they are in fact defining message structures, and they don't say anything about ordering...
SPJ: You must send an A and then a B, and they must be applied in that order.
JA: Yes. Classically, in an API for files, say, you can open a file, you can close a file, you can read a file and you can write it. They don't tell you that you have to open it before you can read it, do they?
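The missing open-before-read rule can, in principle, be pushed into the types. Here is a sketch in Haskell using a phantom state parameter - all names are hypothetical, and real code would perform the actual I/O:

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

data FileState = Closed | Opened

-- The phantom parameter s records where we are in the protocol.
newtype File (s :: FileState) = File FilePath

open :: File 'Closed -> File 'Opened
open (File p) = File p            -- stand-in for the real open

readContents :: File 'Opened -> String
readContents (File _) = "..."     -- stand-in for the real read

-- readContents (File "f" :: File 'Closed)  -- rejected at compile time
```

An ordinary open/read/close API says nothing about ordering; here the ordering is the only thing the types talk about, which is the direction Joe is pointing at.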
SPJ: The very idea that there is a thing that you do things to is part of what gives rise to that problem.
JA: In a lot of programming languages, if you have a file with open and read and write, the programmer is supposed to know just by magic that you should open it before you read it. That makes sense on one processor - but what does it mean in a parallel program? What happens when you have 2 things that try to open the file at the same time? Do you get 2 copies and one of them fails? Is it like a transaction? Is the file put back to the old state? If you start trying to read the APIs, you can't find the answer in the API. John Hughes was saying "Did you know that if you open a file and write it and rewind it and do this and do this, then this happens, which is very counterintuitive", and you try and look in the POSIX specs and you can't find any reference to what they should do. This QuickCheck testing discovers things like this. If you are doing I/O "properly", in the sense that you don't do it at all but save it all up to the end and then do it, a lot of those problems go away.
SPJ: We don't, actually. It just looks like it. We certainly don't save it all up to the end!
JA: You're fibbing!
SPJ: There's "Tackling the Awkward Squad". Very detailed operational semantics are given there that explain that I/O happens as you go. It's no good if you've got to print a message and then read the response and then print something else - you can't save that up to the end!
That takes us to a discussion about the user's model. For me, as a user of Haskell, I have a mental model that everything is happening at the end. That's what the type system tells me - that everything is saved until the end.
SPJ: When you say the user model, you mean the programmer? Not the user user?
Yes. This is in sync with how things happen, but sometimes it gets out of sync. It's the same thing when you are dealing with processes. You are sending messages, but sometimes it's maybe not that cheap to send a message, and it also hits the limits of the mental model that you established. What do you think of this problem?
SPJ: I don't think I understood the question.
With any feature of the language, you define a mental model for the programmer, so that the programmer can reason about this feature. The implementation tries to stay in sync with this mental model, but that's not always true; in some cases it gets far from it. It's the same thing with processes, when you are sending messages - it's not exactly the same thing, because you need to copy or you don't need to copy. Sometimes it's not performant to do it that way, so you do it another way. What do you think of this problem? It's the same thing with I/O: as a programmer I think it is happening at the end, but it is actually happening on the fly, it's happening right away.
SPJ: It is happening right away, and by design. I think the presentation that I gave today, which said the main program is an I/O action which is gotten by stitching together other things - that's a first approximation. I've taken quite a lot of trouble to write a tutorial that explains it in rather more explicit detail. I mean, that model doesn't work at all for a concurrent program, for example: when you are spawning threads that have got to talk to each other, of course you aren't going to save all that up to the end.
So no, the I/O monad is an action that runs in an interactive fashion with other I/O threads that are going on at the same time, and it's possible to give a much more precise model for what's happening than saying you save up one big thing that's performed at the end. The one-big-thing-you-do-at-the-end story is just an oversimplification you might use for your first program that just printed 3 things; then you have to get a slightly more sophisticated understanding of what's going on.
| Or, alternatively, just bail out into "well it looks like a C program and | |
| actually it runs like a C program", you say "Do: print this, read that, send | |
| this message to this other process" and that's exactly what happens, just as if | |
| it were a C program. You can look at a monadic I/O Haskell program with 2 sets | |
| of spectacles. You can either think of it very much like an imperative program | |
| and that's a very accurate model actually, or you can think of it as the | |
| compiler does in this more functional way, and that's what makes the | |
| compiler's optimizations actually robust. I suppose that's the way I deal | |
| with the dissonance, is to try to say "that was an oversimplification to start | |
| with". | |
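The "two sets of spectacles" can be seen even on the three-line program SPJ mentions. This is a minimal illustrative sketch (the names are made up): the do-block reads like an imperative program, and the desugared form below it is the pure, functional view the compiler optimizes.

```haskell
-- Imperative spectacles: "do this, then this, then this",
-- just as if it were a C program.
threeThings :: IO ()
threeThings = do
  putStrLn "one"
  putStrLn "two"
  putStrLn "three"

-- Functional spectacles: the same action desugared into the
-- (>>) plumbing the compiler actually sees -- a pure value
-- describing an interaction, performed as the program runs.
threeThings' :: IO ()
threeThings' = putStrLn "one" >> putStrLn "two" >> putStrLn "three"

main :: IO ()
main = threeThings >> threeThings'
```

Both versions print the same three lines; after desugaring, GHC treats them identically.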
| JA: You don't tell all the truth. | |
| SPJ: That's right, the truth in bits. | |
| JA: For example, it might sound as though when you send messages between Erlang | |
| processes, your mental model of what's happening is that you copy everything. And | |
| indeed we do copy everything, apart from - now there's small print - if it's a | |
| big binary and you are on the same processor, then it's in a shared heap with a | |
| reference counting garbage collector. | |
| SPJ: And it's immutable, of course. | |
| JA: Yes, but it's like... What's it called in operating systems, when you have | |
| a process, if you don't write to the memory, you've got the same memory, but | |
| you do copy on write. So, when you write to the memory, you split it into 2. | |
| The binaries behave like that and they are in a shared heap. | |
| SPJ: You have to be careful about your cost model. If somebody suddenly | |
| redistributed that, so it suddenly was on a different processor... | |
| JA: If you distribute it, then it is copied. | |
| SPJ: That's what I mean - suddenly the program would run at a very different | |
| speed. | |
| JA: Even that statement, that it's on a shared heap and binaries are not | |
| copied, is not true for small binaries because small binaries are copied, | |
| because it's just not worth sharing them and doing things. Even that statement | |
| is not quite true, because sometimes you keep fragmented ones - there is an | |
| awful lot of small print. | |
| SPJ: It's the business of a runtime system to provide a simple abstraction for | |
| complicated stuff. | |
| JA: Should a programmer know about that? No. | |
| SPJ: No. But we are not talking about the cost model, right? Whereas you were | |
| talking about the understanding of what the semantics of the program is. | |
| Yes. When performance is a question, you need to think about performance. In | |
| the same sense, how much should the programmer know about what's actually | |
| happening with processors or processes? What we saw in your presentation is | |
| that you started with a quite transparent system where the programmer doesn't | |
| know on which thread things are executing, and you ended up with something very | |
| explicit, saying "this processor executes this, the other processor executes | |
| that", which is similar to Erlang. What do you think? | |
| JA: I think when you are saying the programmer needs to think about | |
| performance, I wouldn't say that - I'd say the programmer needs to measure the | |
| performance of the programs, because the one area where my intuition is really | |
| wrong is in guessing performance, apart from the fact that I know that parsing | |
| the input is usually the big bottleneck. When you put | |
| these, when you have virtual machines running on top of multicores with core | |
| affinities and you put virtual machines inside the virtual machines, your | |
| intuition about performance goes out of the window. | |
| Sometimes you give some hints to the compiler like "This is a future, this is | |
| delay" or whatever it is and sometimes you are really explicit and say "I need | |
| a process to execute this". Or, the same thing with data parallelism - I say "I | |
| want this list to be executed in several processors", so I'm very explicit | |
| about this. | |
| JA: I think that annotating your program with parallel things is very very | |
| difficult. Your intuition would probably be wrong. | |
| SPJ: What you're saying is you need good profiling tools. Your own intuition | |
| about how sequential programs run is often wrong, too. It's silly to spend time | |
| optimizing code that isn't going to run very much, so you need good profiling | |
| tools. One of the things that GHC, our compiler, has not been good at until | |
| really quite recently is having good profiling tools for parallel programs. | |
| Because you really want to know, why isn't that concurrent? Or maybe it is | |
| concurrent but the granularity is too small, so why? You need good insight into | |
| what's going on and that's quite hard to do. My colleagues Satnam Singh and | |
| Simon Marlow are working on a device called ThreadScope, that for the first | |
| time gives some insight into that. Lots of other people have built profiling | |
| systems for parallel systems - we've been lagging a bit there, really. | |
| JA: In Erlang everything is parallel because we start up with lots of | |
| processes. | |
| SPJ: Do you have any profiling tools? | |
| JA: Yes, sure. | |
| SPJ: So, you can see "Oh, this process is stalled." | |
| JA: What's interesting is when you are finding the sequentiality. Sometimes the | |
| system just degenerates to one sequential process and you think "Why? What's | |
| going wrong here?" That goes back to algorithms, because, a lot of the | |
| sequentiality is in this one place where you have to do allocation. If you're | |
| booking seats for an aircraft or something, you've got one allocator, but if | |
| you wanted to parallelize that, then maybe you've got 2 allocators: you | |
| allocate all the odd seats from one and all the even seats from the other. But | |
| then you can't sit next to your wife or something. Or you give half the seats | |
| to one allocator and half to the other, and if you want better than that you | |
| split it into 4. So probably you | |
| end up with this logarithmic allocator that's message passing to get rid of | |
| these bottlenecks. | |
| SPJ: You've got to narrow it down in the first place. That's what you need the | |
| profiling for. | |
| JA: I think the things you see in clusters - in a cluster you have variable | |
| latency, you have load balancing and things like that - you are going to have | |
| to put those in the underlying runtime system, sort of hide them under the | |
| counter and do exactly the same thing that people have done in clusters for | |
| years. Because a big multicore is a cluster and it's on a chip and the | |
| latencies are different - the different cores have different caches, so you can | |
| send messages between 2 cores very quickly and between 2 other cores very | |
| slowly, because of the memory cache architecture. I think that needs to be | |
| reflected in the internal load balancing and those sorts of things. A | |
| programmer shouldn't see that! | |
| SPJ: They are not new problems. People have been working on them for quite a | |
| long time, but I think the difference is, in languages like Haskell and Erlang, | |
| is that the programming abstractions are rather high-level. The high-level | |
| programming abstractions are good for the programmer, because it means that | |
| they don't have to worry about things, but it means that the runtime system and | |
| so forth has to cross a bigger abstraction gap. So the programmer in effect has | |
| less control. The more control a programmer is prepared to take, well then | |
| everything becomes easy. Not only does the programmer make better decisions, | |
| perhaps, at the cost perhaps of a lot of work, but also if you do have | |
| profiling tools at a lower level, then those things are easier to measure and | |
| present to the programmer. At a high level it's quite hard to think what to | |
| measure or how to present it. I think we are still finding out how to cross that | |
| abstraction barrier. | |
| JA: With each new generation of chips, it's... We don't know what's going to | |
| happen. | |
| SPJ: How you measure abstraction within the hardware, yes, as well. You have | |
| much less control than you think, really. | |
| JA: It's quite funny, because we got these new Nehalem processors and boards | |
| and I don't know what's going to happen. We'll make measurements on it. I think | |
| it's rather exciting now, because the last 10 years I knew what was going to | |
| happen more or less, I could predict, and then the big multicores started coming | |
| out and the Tilera chips and things like that. What is going to happen? I don't | |
| know, it's back to experimentation. Look at the Intel Polaris chips! 80 cores | |
| or so! A gigaflop at 11 watts! | |
| SPJ: Some of it's a big toss-up though, because these machines have lots of | |
| layers of abstractions in them and they are all not quite under your control. | |
| So, you spend a lot of time measuring things and sometimes it goes fast and | |
| sometimes it doesn't, so you might discover that the reason it's going slowly | |
| is because 2 unrelated variables happen to be allocated in a place that was in | |
| the same cache line. There are better things to do with your life than figure | |
| out the 2 variables! That kind of work doesn't feel productive to me, but the | |
| more sophisticated the underlying hardware gets, the more you have to do that. | |
| So, I don't know quite what to do about that, except those systems that have | |
| some kind of built-in locality, that's the data parallelism stuff I was talking | |
| about. That's my best hope for dealing with these machines with very nonuniform | |
| memory access. | |
| JA: In Erlang we might start migrating processes, you find that these 2 | |
| processes are talking to each other and just migrate and put them in the same | |
| core. | |
| SPJ: That'd be lovely! It's difficult to do well. We have our own stacks and | |
| heaps, of course, so we can move things around very freely, but then we find | |
| out, "The reason things are going so slowly is because this process keeps | |
| migrating around, it's never in the right place. It keeps migrating to where it | |
| thinks it's meant to be, but actually it turns out it's always in the wrong | |
| place." It's really annoying when that happens, and it always happens on the | |
| test program that the first student writes. With real systems it's often more | |
| OK. It's like factorial - you write factorial and it doesn't perform well. | |
| SPJ: So, I've told you what I most envy about Erlang. What do you most envy | |
| about Haskell? | |
| JA: All the types. I mean they're very nice. I wish we had them. On the other | |
| hand, wouldn't you love to have all these generic term_to_binary, these sort of | |
| things? How can you live without them? | |
| SPJ: I have a little bit of residual envy about generics. | |
| JA: You just take anything, pass it to the serializer and then send it? | |
| SPJ: That's sinfully easy, and shouldn't be allowed. | |
| JA: And the garbage collector doesn't need to know about what types are | |
| generated. | |
| SPJ: Oh, the garbage collector, that's not a problem. Our garbage collectors | |
| are complicated, but not because of types - the collector knows nothing of | |
| types. The garbage collector's fine, but I have to admit that, just walking | |
| over trees of other trees - generic programming - the lack of types is very | |
| convenient. Then again, haven't we enjoyed the whole cottage industry of | |
| papers about typed generic programming? It has been extraordinarily creative - | |
| people have done extremely clever things that would otherwise never have | |
| happened. Necessity is the mother of invention. | |
| JA: I read "Real World Haskell" - very nice. When I was reading "Real World | |
| Haskell", I started off reading the simple programs - they were obvious. For | |
| the more complicated programs, I was writing the Erlang program in the margin | |
| next to it to understand it. And at that point I saw how similar they | |
| are. The other thing is, you know you talk about the first thing a Haskell | |
| programmer must do is write down all the types, and then they're doing the | |
| design. That's exactly the first thing I do, in my head. And I write the odd | |
| comment down now and then. | |
| SPJ: I just want you to have a way to do that explicitly! | |
| JA: When you're developing Erlang, you don't have to write the entire program | |
| to get it to work, and you're not quite sure what the types are. In Erlang you | |
| just write some program here, and you evaluate something and it throws out this | |
| great big thing and you say: "Oh, that's what the type was". Then you can say: | |
| "Well that wasn't quite right, let's change it to that". But you're not | |
| restricted in having to get all the bits working before you can just do those | |
| experiments. Wouldn't you miss something like that? | |
| SPJ: You still don't have to get all the bits working, you have to think of | |
| what type you might have, but you might evolve that type. | |
| JA: Or you can just run it and see. We have things called parse transforms | |
| which allow you to change them. In a module you can say there's a parse | |
| transform, and the parse transform is given the parse tree of the program and | |
| it can turn it into another parse tree before it's given to the compiler. | |
| That's a way of doing deeply sinful things that you normally can't do. And the | |
| nice thing about that is, somebody has asked me "How do I get started with | |
| this?" | |
| Well, there is a function, epp parse_form(), which just takes an Erlang module | |
| and it produces all the abstract forms in it. And then you dump it into a file. | |
| Then you just open it up in a text editor; you don't need to know anything | |
| about the types, you just read it. It's pretty obvious - it says "This is the | |
| function" and you can sort of guess what the stuff does. | |
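A minimal sketch of the workflow Joe describes. The file names are hypothetical, and Joe's "epp parse_form()" is rendered here with OTP's actual `epp:parse_file/3`; a parse transform is just a module exporting `parse_transform/2`.

```erlang
%% Dumping a module's abstract forms to a file for inspection
%% (file names are hypothetical):
%%   1> {ok, Forms} = epp:parse_file("my_mod.erl", [], []).
%%   2> file:write_file("forms.txt", io_lib:format("~p.~n", [Forms])).

%% A do-nothing parse transform: it is handed the parse tree
%% (a list of abstract forms) and may return a rewritten one
%% before the compiler proper sees it.
-module(identity_transform).
-export([parse_transform/2]).

parse_transform(Forms, _Options) ->
    %% Rewrite Forms here; returning them unchanged leaves
    %% the module exactly as written.
    Forms.
```

A module opts in with `-compile({parse_transform, identity_transform}).` in its source.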
| SPJ: You're looking at the parse tree of a particular component... | |
| JA: Of a module, yes. And because it's got this generic sort of structure, it's | |
| dumped it all out and you can look at it. You don't really need to know what | |
| the types are. You can guess "Well, I can see that's a function application so | |
| I can start transforming it." You write generic functions that walk down the | |
| whole parse tree and extract all the variables from it without knowing what | |
| form of tree it is - tiny little functions. Don't you miss things like that? | |
| SPJ: I do, actually. | |
| JA: Actually, we're biased. We should ask John Hughes or Simon Thompson who | |
| have more experience of both worlds. | |
| SPJ: So John has promised me a paper on why static typing is harmful. Because | |
| he's someone who grew up with static types and he's done a lot of Erlang | |
| programming now, he has quite a visceral understanding of both, and he | |
| actually, he's told me that he will write a paper about this. | |
| JA: So, to me, good functional programmers have come to Erlang. Both Phil | |
| Wadler and John Hughes, when they came to Erlang, wrote the most bizarre Erlang | |
| programs I've ever seen. The first thing they do is define a pair of macros, | |
| "freeze" and "force": "freeze" makes an expression lazy, because it just | |
| returns a fun, and "force" evaluates it. And then they write their entire | |
| program which looks to be, from an Erlang point of view, like a Klein Bottle - | |
| the entire program is inside out, and it just sort of ends with this "force" or | |
| "freeze". And they write completely incomprehensible Erlang code. It all works | |
| beautifully. | |
| SPJ: I don't think that has anything to do with types though. | |
| JA: No, it has to do with lazy evaluation, they force you to think in a | |
| different way. | |
| SPJ: They want to write lazy Erlang programs. | |
| JA: Yeah, they do. That's the first thing they do. It's really weird Erlang. | |
| But it works! | |
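The trick Joe describes can be sketched with two macros. The names and details here are assumed; Wadler's and Hughes's actual code may well have differed.

```erlang
-module(lazy_demo).
-export([demo/0]).

%% FREEZE delays evaluation by wrapping an expression in a fun;
%% FORCE evaluates it by calling the fun.
-define(FREEZE(E), fun() -> E end).
-define(FORCE(F), ((F)())).

demo() ->
    Suspended = ?FREEZE(1 + 2),  %% nothing computed yet
    ?FORCE(Suspended).           %% computed here: returns 3
```

Wrapping every expression this way is what turns an ordinary strict Erlang program "inside out" in the way Joe describes.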
| SPJ: When you are going to write a lazy program, you probably wouldn't choose | |
| Erlang as your first language. Going back to what you were saying, when you | |
| start thinking about your Erlang program, you do think types in your head, so | |
| the ability to write them down in a formal way and have them checked... | |
| Perhaps without denying you that exploratory style. In fact several people - | |
| more than one of us - like Phil in the first place, and Kostis and some of his | |
| colleagues, have designed soft type systems for Erlang, but they never really | |
| caught on, and I'm never quite sure why. | |
| JA: If you're talking about the Dialyzer, we do run that. | |
| SPJ: You do use it on a regular basis. | |
| JA: Absolutely, especially in big projects. It's even part of... We have | |
| automatic testing, we run through these, and we look through. But they | |
| actually... By the time they've been through the test suites, it picks up very | |
| few errors. | |
| SPJ: But you use it as an error checker rather than a design tool. I mean you | |
| don't use it to write down your types is what I mean, you don't use it as part | |
| of your design flow. Because I think that's probably the most powerful thing | |
| about types is to use it as a design tool. | |
| JA: I think we should have gone down type checking rather than type inference, | |
| where we start by writing the type signatures and then type check, rather than | |
| just inferring what we've got. | |
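What Joe is asking for already has a partial form in Erlang: a `-spec` written before the code states the intended type, and Dialyzer checks the implementation against it. A small sketch (the module name and types are illustrative):

```erlang
-module(seats).
-export([allocate/1]).

%% The signature is written first, as a design statement;
%% Dialyzer then checks the clauses against it.
-spec allocate(odd | even) -> [pos_integer()].
allocate(odd)  -> [1, 3, 5];
allocate(even) -> [2, 4, 6].
```

Checked from the shell (PLT setup assumed) with `dialyzer seats.erl`.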
| SPJ: You are saying that writing a type doesn't mean the program is correct, | |
| but in some ways, for me, that's the whole point. A type is a weak | |
| specification, it's a partial specification which means that it can be small, | |
| so I can say in one line something that is meaningful to me as a programmer, | |
| and that helps me glue my program together without having to specify lots of | |
| details about exactly what it does - it's called "lookup", so I know what it | |
| does, more or less. Its type tells me a lot of detail. | |
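The `lookup` SPJ mentions is a real Prelude function, and its type is a good example of the one-line partial specification he has in mind:

```haskell
-- The type alone says a lot: lookup must compare keys (the Eq
-- constraint), and it may fail (the Maybe result) -- but it says
-- nothing about search order or cost.
--   lookup :: Eq a => a -> [(a, b)] -> Maybe b

found, missing :: Maybe Int
found   = lookup "b" [("a", 1), ("b", 2)]  -- Just 2
missing = lookup "z" [("a", 1), ("b", 2)]  -- Nothing

main :: IO ()
main = print (found, missing)
```

That one line of type is enough to glue the function into a program without reading its definition.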
| JA: I must admit to a secret. When I'm thinking about a program, you know | |
| you've got all these mental dialogues before you write the program, and my | |
| mental dialogue is as if I'm describing blog software: a blog is a | |
| sequence of chunks, and a chunk is a title or a paragraph and a paragraph is a | |
| something or a something. I'm very strictly having this little dialogue. | |
| SPJ: So secretly you're writing a Haskell program in your head, but it comes | |
| out in Erlang. | |
| JA: Then I type it in. And actually for difficult problems, I write these down | |
| in English or in formal notations. |
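Joe's mental dialogue, written down as the declarations SPJ is teasing him about (the names here are illustrative):

```haskell
-- "A blog is a sequence of chunks, and a chunk is a title
-- or a paragraph."
data Chunk = Title String | Paragraph String
  deriving (Show, Eq)

type Blog = [Chunk]

example :: Blog
example = [Title "Erlang and Haskell", Paragraph "An interview."]

main :: IO ()
main = mapM_ print example
```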