From wilson@cs.utexas.edu Mon Apr 7 16:05:56 1997
Date: 7 Apr 1997 04:47:45 -0500
From: Paul Wilson
To: python-list@cwi.nl
Subject: Reply to Ousterhout's reply (was Re: Ousterhout and Tcl ...)

In article <5i7euq$cmg@engnews2.Eng.Sun.COM>, John Ousterhout wrote:
>Wow, there's been quite a party going on over here on comp.lang.scheme!
>I'd like to respond to a few of the comments about my white paper on
>scripting, but first a couple of introductory remarks:
>
>1. The paper is not intended to be a complete taxonomy of all programming
>   languages nor is it intended to discuss every factor that contributes
>   to the usefulness of a language. The paper has a very narrow focus,
>   namely to explain what scripting is and why it's important.

I suggest saying this right up front in the paper. You could say that you're using Tcl as an example, but that many other languages, with minor modifications, might be comparably suitable for scripting, or even more so.

As it stands, it gives the impression that you're either very ignorant of programming languages, despite having designed a popular one, or just speciously hyping Tcl at the expense of languages that developed many of the same concepts decades ago (and in my opinion, generally did them better). I'll grant that this wasn't your intention, but it is the way it comes across.

This is partly due to the way you give a history of programming languages, without even having mentioned the most directly relevant early programming language---Lisp. Lisp has been around longer than any other noticeably living language except FORTRAN.

(The lack of reasonable discussion of prior work is particularly striking for a former Distinguished Professor of CS from Berkeley. Maybe it's not fair because you're in industry now, but a lot of people respect your technical opinions because they know something of your track record. And you and Sun don't hide your academic credentials.)

>   I intentionally limited the discussion to a few issues such as system
>   programming vs. scripting, components vs. glue, and strongly typed
>   vs. untyped.

These dichotomies have something to them, but the paper makes it sound like the issues are very simple and clear-cut. They're not.

For example, Lisp- and Smalltalk-style dynamic typing have big advantages over "untyped" conversions back and forth through a uniform string representation. For common cases where you want coercions, you could code them into the primitive operations. For uncommon cases, it's probably better to signal type mismatches, so that bugs are easier to track down. Avoiding continual coercions could also help keep a language from being dog slow.

>   Of course there are other issues in programming language
>   design. At the same time, I think that the issues in the paper explain
>   a lot about what's going on in real-world programming.

Perhaps. But for sophisticated audiences, there's little that's new, and much that's grating and seems wrong. For unsophisticated audiences, it's heavily biased---given your fame, Joe Programmer may take this kind of thing a little too seriously, and come away with the wrong impressions.

>2. Many people objected to the fact that their favorite programming
>   was left out of the white paper. Yes, I have heard of Scheme,
>   Smalltalk, ML, etc. I left these languages out because they
>   didn't seem particularly relevant for the discussion. No offense
>   intended...

Of course they're relevant, and you probably should know that by now. (Surely people have complained about this kind of thing before. I know *I* have heard it before, so I assume you have too.)

For example, a simple Scheme-like language might make a much better scripting language, with a few minor changes. (E.g., terser names for built-in procedures and special forms, some important automatic coercions, and a foreign function interface.)

>3. It's very hard to settle arguments about programming languages
>   because it's hard to produce meaningful quantitative evidence about
>   things like programmer productivity. I tried to illustrate my points
>   with historical examples and a few quantitative anecdotes, but I
>   admit that these are soft. I'd be delighted to see better
>   quantitative evidence either supporting or contradicting my
>   arguments. For example, if you know of any quantitative measurements
>   of productivity improvements caused by object-oriented programming,
>   please let me know.

I can't help here.

>When Alaric Williams told me about the flame-fest on comp.lang.scheme,
>he proposed a set of counter-arguments for me to respond to. Here
>they are, along with my responses.
>
> - Typlessness, as evident in TCL, is not necessarily the best solution.
>   Dynamic typing is generally agreed to be far more powerful and safe.
>
>Actually, I think Tcl is dynamically typed: everything is checked
>at runtime for at least syntactic validity.

I think it's somewhere in between untyped and dynamically typed. With dynamic types, every value has a well-defined type, rather than just "string" or whatever. Put in enough automatic (and unsafe) coercions, though, and the difference gets blurry.

>I used a slightly offbeat
>definition of "typing" in the paper: by my definition, "typing" means
>declaring the nature of something in advance in order to restrict its
>usage. You would probably call this "static typing", no?

Probably. But in a dynamically typed language, you do associate types with values. By creating an object that's a general array, or one that's a string of characters, you may restrict the ways that object can be used later, without explicit coercions. This is often a good thing, making the code clearer.

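To make the distinction concrete, here is a minimal sketch in Python, chosen only as a convenient dynamically typed language; neither post uses it, and the helper name `add_strict` is hypothetical. Each value carries its own type, so mixing a string with a number signals a mismatch at the point of use unless the coercion is written out.

```python
def add_strict(a, b):
    """Hypothetical helper: add two values; operand types are checked at run time."""
    return a + b

# Same-typed operands work, and each result has a well-defined type.
total = add_strict(2, 3)          # int + int
joined = add_strict("2", "3")     # str + str (concatenation, not arithmetic)

# Mixed operands are a type mismatch: the error is raised where the values
# are used, rather than a garbled value being passed further along.
try:
    add_strict("2", 3)
    mismatch = None
except TypeError as exc:
    mismatch = type(exc).__name__

# An explicit coercion states the intent.
coerced = add_strict(int("2"), 3)
```

A language that routes everything through strings would have to pick one meaning for the mixed case silently; here the programmer is told about it and chooses.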
> - TCL does not scale well to large systems; it is fine for small
>   "glueing" applications, but in the "real world", such applications
>   are expected to grow with time, and soon proper typing becomes
>   necessary, more efficiency becomes necessary, etc.
>
>When I started on Tcl I thought this would be true, but in fact many
>people have built surprisingly large programs in Tcl.

I know some people who have scores of thousands of lines of Tcl in their applications, but their code is a mess. Tcl simply doesn't have good abstractions for programming in the large, or even in the medium. Of course you can do it, given enough patience and care, but I don't think it proves much in terms of the actual language design. I'd bet that if those programs were written in a well-designed general-purpose language, they'd be easier to maintain, and tons faster to boot. (No, I don't have proof. But I don't think that large Tcl programs are good evidence for the points you're making. Often they're evidence against some of your points.)

The main benefits that the Tcl hackers I know get from Tcl are an interactive command loop, a standard way of gluing together code in other languages, and a standard, fairly functional graphics toolkit.

Those are great things, and you're to be applauded for realizing they're crucial for writing glue code before most other people did! (It certainly should embarrass the hell out of the programming languages community, of which I'm a part.)

An interactive command loop is incredibly valuable for increasing productivity over the usual compile-link-run-crash cycle. What's sad is how many Tcl programmers there are now who've never used any other interactive language, and think Tcl is great because it has that huge advantage over C++.

Other aspects of the language seem poorly motivated. The syntax is bizarre, the type system (such as it is) is weak and kludgey, and the idiomatic use of a general evaluator seems like a big mistake.
(You hardly ever need a general evaluator except at the top-level command loop. Having normal code use it inhibits the ability to compile to good code, and is error-prone. You're generally better off using macros for most things that people try to do with eval. If you have hygienic macros a la Scheme, it's much safer, and macros don't incur runtime overhead in compiled code.)

The lack of real data structures and garbage collection seems like an equally big mistake. These are lessons that Lisp people learned a long time ago, and many language designers have built on those lessons.

>For example,
>there is a real-time Tcl application containing several hundred thousand
>lines of code that controls a $5 billion oil well platform and (much to
>my shock) it seems to be quite maintainable. Sybase has something like
>a million lines of Tcl code in their test suite.

I pity them. :-)

>I think it depends a lot on the application. The oil well application
>actually subdivides into a whole bunch of small tasks, so it's really
>more like 500 smaller programs.

I find this hard to believe. Maybe the oil well people are incredibly lucky and their application decomposes wonderfully, but I've never seen an interesting large application that did that. You generally need some abstraction mechanisms to manage large programs, whether they're provided by the language, or hand-kludged for the app.

I suspect that this app is divided into too many too-short Tcl scripts, with too much flattening of data structures into strings and reconstructing the structures later. Or worse, it doesn't use interesting data structures, and is either much kludgier or vastly slower (or both) than it would be if it did.

And speed is important. I believe that you point out in the intro to your book that most Tcl programs, contrary to your expectations, are coded entirely in Tcl.
My understanding is that interactive program development is so much easier (and so much more fun) than using batch languages that people get addicted to it---they're often willing to sacrifice a lot of performance and important code- and data-structuring facilities to be able to do it.

>Also, if the application is
>fundamentally gluing (i.e. the complexity is in the interconnections)
>then switching to a more strongly typed language will just make things
>worse.

Maybe yes, maybe no. At the very least, true dynamic typing makes communicating between modules much less error-prone---you'll usually get a clear error message sooner than if you're just passing garbled strings around.

And have a look at ML. Its polymorphic type system LOOKS dynamically typed, but actually infers types automatically, and type-checks the program automatically. Pretty spiffy. You get most of the advantages of static typing and most of the advantages of dynamic typing, too.

>One final argument: suppose that Tcl code is harder to maintain,
>line for line, than code in a more strongly typed language (I suspect
>this is true). But if a Tcl application has only 1/5 or 1/10 the lines
>of code of the equivalent program in a strongly typed language, it may
>still be easier to maintain overall.

I think that you're oversimplifying here. Many of us will grant that scripting languages are good to have, and many of us will grant that something like dynamic typing is nice, at least compared to having to declare everything all the time.

But there's no reason why a scripting language can't be a reasonable subset of a general-purpose language (e.g., Scheme with objects), so that you can just COMPILE your code to make it run pretty fast, rather than having to REWRITE it in a different language.

There's no reason why a straightforward Scheme-style interpreter for a Tcl-sized subset of Scheme can't have a similarly small footprint and run much faster. And certainly you can compile it to vastly faster code.

>That said, I still suspect that as scripting applications grow it makes
>more and more sense to implement parts of them in a system programming
>language. The great thing about scripting languages is that this is
>easy to do.

Why not just write it in a good all-around interactive, compiler-friendly language in the first place?

>You can take the performance-critical kernel of a Tcl
>application and implement it in C or C++; ditto for any complicated data
>structures or algorithms. The simple, non-performance-critical parts
>can be left in Tcl.

I think you're really pushing a false dichotomy here. There's a big difference between simple and non-performance-critical, and each of those is a spectrum.

Tcl is not good for non-simple OR for performance-critical programming. It's so incredibly slow that you're forced to write many simple things in a different language to get decent performance. It's so incredibly limited (e.g., in its datatypes) that it's awkward to write code which may be somewhat sophisticated, e.g., using sophisticated data structures, but is not time-critical.

> I knew when I started on Tcl that it wouldn't be
>appropriate for all problems, so I designed it to work smoothly with
>other languages.

This is incredibly reasonable, but the scope of things for which Tcl is good is way, way smaller than it needs to be.

You originally thought that Tcl scripts would be a few lines, or tens of lines, and designed the language accordingly. But now people are writing things orders of magnitude larger than that. Either you made a mistake early on, or they're making one now.

>In contrast, most languages are egotistical: they
>expect you to do *everything* in that language and make it very hard to
>split the functionality of an application between multiple languages.
>For example, I've been involved with several attempts to make C and Lisp
>work together, and they all failed.
>
> - It is possible to make languages with execution speeds like C or C++,
>   that use dynamic typing successfully, whilst being high-level enough
>   in the creation of abstractions to "glue" things together quite
>   nicely and easily.
>
>Can you point to a specific language and identify a large community of
>users who agree with this assessment?

Depends on exactly what you're asking. I'm not saying there's a single implementation that gives you everything you want for Tcl-style apps, but there's NOTHING hard about it.

Consider SIOD, George Carrette's small interpretive implementation of most of Scheme, which is embeddable as a scripting language. You can keep the footprint small by running the GC at a high rate, which will slow things down a bit, but it'll still be a lot faster than Tcl.

You can write portable Scheme code using the SIOD interpreter, and then run it through Marc Feeley's Gambit compiler to get fast code. Maybe not as fast as C on average, but sometimes faster, and generally within spitting distance.

Similarly, if you want more scripting and process-control features, you can use scsh, the Scheme shell. (Unfortunately, scsh is currently a bit of a pig, because it's built on a fat implementation of Scheme, but it doesn't have to be---somebody could extend SIOD a little and port scsh to it, and you'd get the benefits of both. scsh would run more slowly than it does now, but again, way faster than Tcl.)

>Many people have made claims like
>this to me, but no one has been able to point to a good real-world
>example. The white paper argues that you can't have a jack-of-all-trades
>language. Either you have a strongly typed language, which gives high
>speed and manageability but makes gluing hard, or you have a weakly
>typed language with the opposite properties.

I guess you haven't read the literature on Lisp and Scheme. Don't be misled by the big-bag-of-features that Common Lisp became, or the pigginess of the implementations of it.
Lisp itself has always been amenable to tiny implementations, if you care more about footprint and startup times than running speed. And even a small Lisp or Scheme is powerful enough to let you implement an object system or a module system from within the language. You just need a macro facility, which isn't much code.

> - Do you really think that object orientation has failed? C++ is a bad
>   OO language, indeed, but what about Self, Java, and other such OO success
>   stories from... Sun Labs? Do I detect interdepartmental rivalry?
>
>I overstated the arguments against OO programming in the paper and I'll
>probably soften them a bit in the next draft. I actually think that
>there are some good aspects of OO programming, and I use them myself
>even when I'm not programming in an OO language. But I stand by the two
>main points in the paper, which are that (a) OO programming hasn't
>increased productivity dramatically because it doesn't raise the level of
>programming significantly (it may improve things 20-30%, but I doubt
>there's even a factor of 2, let alone 10) and (b) implementation
>inheritance really truly is a bad idea that tends to reduce reuse and
>productivity. I think you'll see substantial support for the second
>claim even among OO enthusiasts.

I don't think most of us would put it that way. Implementation inheritance CAN be a truly bad idea if you use it for the wrong things, but it's often a good thing if you know what you're doing and don't confuse interfaces with implementations.

>As for Java, it's hard not to be envious of its success (aren't you
>guys a bit envious too?), but Tcl is really symbiotic with Java, just
>as Tcl is symbiotic with C. I look on Java as a better system
>programming language that's particularly well-suited for creating
>portable Internet components. Tcl is moving to the Internet itself,
>and C isn't a good component language in that domain, so I'm delighted
>to have Java around for implementing Internet components that Tcl
>can then glue together.

One of the saddest things about Java is that it screams for an interactive implementation, rather than batch compilation. (Unlike C++, it's got enough checking by default to build an uncrashable interaction environment.)

Once there are interactive implementations, the dichotomy between scripting and systems languages will be less obvious. (And if you made type declarations optional in scripts, that would go much further.)

> [ ... ]
>
> His arguments on "typeless" languages is useless.
> You don't need a "scripting language" to
> get usable abstractions without the need
> to deal with low-level issues.
>
>   button .b -text Hello! -font {Times 16} -command {puts hello}
>
> In Macintosh Common Lisp I'll write this as:
>
>   (make-instance 'button-dialog-item
>     :dialog-item-text "Hello"
>     :view-font '("Times" 16)
>     :dialog-item-action (lambda (item) (print "hello")))
>
>I think this example supports my claim that scripting languages are a
>lot easier to use when you need to mix and match lots of things of
>different types. The MCL example is a lot more verbose and complicated
>than the Tcl example.

The MCL example is mostly more verbose because MCL uses longer identifiers. If MCL were intended as a scripting language, the identifiers could be changed and some of them could be made more intuitive for non-Lispers. You could also trivially define the instance-making macro (which I'll call new) to coerce a font name string into whatever representation it uses for a font spec:

  (new button :text "Hello" :font "Times 16"
       :cmd (proc (item) (print "Hello")))

There's nothing about Lisp that makes this much more verbose than the Tcl version. It's just that MCL isn't mainly designed as a scripting language, and Lispers usually like to write readable code.
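
The same idea of a terse constructor that coerces a font string can be sketched outside Lisp as well. What follows is a rough Python illustration under stated assumptions, not MCL's or Tk's actual API: `Button`, `parse_font`, and this `new` function are all hypothetical names, and a font spec is assumed to be a (family, size) pair.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union

FontSpec = Tuple[str, int]  # assumed representation: (family, size)

@dataclass
class Button:
    """Hypothetical widget record standing in for a real toolkit class."""
    text: str
    font: FontSpec
    cmd: Callable[[object], None]

def parse_font(spec: Union[str, FontSpec]) -> FontSpec:
    """Coerce "Times 16" into ("Times", 16); pass real font specs through."""
    if isinstance(spec, str):
        family, size = spec.rsplit(" ", 1)
        return (family, int(size))
    return spec

def new(cls, **kwargs):
    """Terse constructor: coerce the font argument, then build the instance."""
    if "font" in kwargs:
        kwargs["font"] = parse_font(kwargs["font"])
    return cls(**kwargs)

# Roughly as short as the Tcl one-liner, but the result is a typed object.
b = new(Button, text="Hello", font="Times 16", cmd=lambda item: print("Hello"))
```

The coercion lives in one place (`parse_font`), so callers get the convenience of a string while the rest of the program sees a structured font spec.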
This verbosity has little to do with anything deep about the language.

The nice thing about Lisp is that you can have your cake and eat it too. You can define the things that are frequently used in scripts so that they're terse, and the other things so that they're clear. (You can do this from within the language, by defining procedures and macros that are really just aliases for standard things, using abbreviated names.)

--
| Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (wilson@cs.utexas.edu)
| Papers on memory allocators, garbage collection, memory hierarchies,
| persistence and Scheme interpreters and compilers available via ftp from
| ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)