From wilson@cs.utexas.edu Mon Apr  7 16:05:56 1997
Date: 7 Apr 1997 04:47:45 -0500
From: Paul Wilson <wilson@cs.utexas.edu>
To: python-list@cwi.nl
Subject: Reply to Ousterhout's reply (was Re: Ousterhout and Tcl ...)

In article <5i7euq$cmg@engnews2.Eng.Sun.COM>,
John Ousterhout <ouster@tcl.eng.sun.com> wrote:
>Wow, there's been quite a party going on over here on comp.lang.scheme!
>I'd like to respond to a few of the comments about my white paper on
>scripting, but first a couple of introductory remarks:
>
>1. The paper is not intended to be a complete taxonomy of all programming
>   languages nor is it intended to discuss every factor that contributes
>   to the usefulness of a language.  The paper has a very narrow focus,
>   namely to explain what scripting is and why it's important.

I suggest saying this right up front in the paper.  You could say that
you're using Tcl as an example, but that many other languages, with
minor modifications, might be comparably suitable for scripting,
or even more so.  As it stands, it gives the impression that you're either
very ignorant of programming languages, despite having designed a popular
one, or just speciously hyping Tcl at the expense of languages that developed
many of the same concepts decades ago (and in my opinion, generally did them
better).

I'll grant that this wasn't your intention, but it is the way it comes
across.

This is partly due to the way you give a history of programming languages,
without even having mentioned the most directly relevant early programming
language---Lisp.  Lisp has been around longer than any other noticeably
living language except FORTRAN.  

(The lack of reasonable discussion of prior work is particularly striking
for a former Distinguished Professor of CS from Berkeley.  Maybe it's not
fair because you're in industry now, but a lot of people respect your
technical opinions because they know something of your track record.
And you and Sun don't hide your academic credentials.)

>   I intentionally limited the discussion to a few issues such as system
>   programming vs. scripting, components vs. glue, and strongly typed
>   vs. untyped.

These dichotomies have something to them, but the paper makes it sound like
the issues are very simple and clear-cut.  They're not.  For example,
Lisp- and Smalltalk-style dynamic typing have big advantages over 
"untyped" conversions back and forth through a uniform string representation.
For common cases where you want coercions, you could code them into
the primitive operations.  For uncommon cases, it's probably better to signal
type mismatches, so that bugs are easier to track down.  Avoiding
continual coercions could also help keep a language from being dog slow.
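
A minimal sketch of that difference, written in Python only because it
is compact to show.  The function names add_stringly and add_typed are
made up for illustration and come from neither Tcl nor the white paper.
The first coerces through strings on every operation; the second builds
the common int/float coercion into the primitive and signals everything
else.

```python
def add_stringly(a, b):
    """Everything-is-a-string style: parse both arguments on every call."""
    return str(float(a) + float(b))


def add_typed(a, b):
    """Dynamically typed style: numbers mix freely, anything else is an error."""
    for x in (a, b):
        if not isinstance(x, (int, float)):
            raise TypeError("expected a number, got %r" % (x,))
    return a + b  # the int/float coercion is built into the primitive


# A part number that merely looks numeric is silently treated as 1000.
silently_wrong = add_stringly("1e3", "1")   # "1001.0"

# The common coercion still works without any ceremony.
mixed = add_typed(1, 2.5)                   # 3.5

# The uncommon case is reported at the point of the mistake.
try:
    add_typed("1e3", 1)
    caught = False
except TypeError:
    caught = True
```

The stringly version also reparses its arguments on every call, which
is the "continual coercions" cost mentioned above.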

>   Of course there are other issues in programming language
>   design.  At the same time, I think that the issues in the paper explain
>   a lot about what's going on in real-world programming.

Perhaps.  But for sophisticated audiences, there's little that's new,
and much that's grating and seems wrong.  For unsophisticated audiences,
it's heavily biased---given your fame, Joe Programmer may take this
kind of thing a little too seriously, and come up with the wrong
impressions.

>2. Many people objected to the fact that their favorite programming
>   was left out of the white paper.  Yes, I have heard of Scheme,
>   Smalltalk, ML, etc.  I left these languages out because they
>   didn't seem particularly relevant for the discussion.  No offense
>   intended...

Of course they're relevant, and you probably should know that by now.
(Surely people have complained about this kind of thing before.   I
know *I* have heard it before, so I assume you have too.)

For example, a simple Scheme-like language might make a much better
scripting language, with a few minor changes.  (E.g., terser names
for built-in procedures and special forms, some important automatic
coercions, and a foreign function interface.)

>3. It's very hard to settle arguments about programming languages
>   because it's hard to produce meaningful quantitative evidence about
>   things like programmer productivity.  I tried to illustrate my points
>   with historical examples and a few quantitative anecdotes, but I
>   admit that these are soft.  I'd be delighted to see better
>   quantitative evidence either supporting or contradicting my
>   arguments.  For example, if you know of any quantitative measurements
>   of productivity improvements caused by object-oriented programming,
>   please let me know.

I can't help here.

>When Alaric Williams told me about the flame-fest on comp.lang.scheme,
>he proposed a set of counter-arguments for me to respond to.  Here
>they are, along with my responses.
>
>     - Typlessness, as evident in TCL, is not necessarily the best solution.
>    Dynamic typing is generally agreed to be far more powerful and safe.
>
>Actually, I think Tcl is dynamically typed: everything is checked
>at runtime for at least syntactic validity.

I think it's somewhere in between untyped and dynamically typed.
With dynamic types, every value has a well-defined type, rather than
just "string" or whatever.  Put in enough automatic (and unsafe) coercions,
though, and the difference gets blurry.

>I used a slightly offbeat
>definition of "typing" in the paper: by my definition, "typing" means
>declaring the nature of something in advance in order to restrict its
>usage.  You would probably call this "static typing", no?

Probably.  But in a dynamically typed language, you do associate types with
values.  By creating an object that's a general array, or one that's
a string of characters, you may restrict the ways that object can be used
later, without explicit coercions.  This is often a good thing, making
the code clearer.

>     - TCL does not scale well to large systems; it is fine for small
>    "glueing" applications, but in the "real world", such applications
>    are expected to grow with time, and soon proper typing becomes
>    necessary, more efficiency becomes necessary, etc.
>
>When I started on Tcl I thought this would be true, but in fact many
>people have built surprisingly large programs in Tcl.

I know some people who have scores of thousands of lines of Tcl in
their applications, but their code is a mess.  Tcl simply doesn't
have good abstractions for programming in the large, or even in the
medium.  Of course you can do it, given enough patience and care,
but I don't think it proves much in terms of the actual language design.

I'd bet that if those programs were written in a well-designed
general-purpose language, they'd be easier to maintain, and tons
faster to boot.  (No, I don't have proof.  But I don't think that
large Tcl programs are good evidence for the points you're making.
Often they're evidence against some of your points.)

The main benefits that the Tcl hackers I know get from Tcl are an
interactive command loop, a standard way of gluing together code
in other languages, and a standard, fairly functional graphics toolkit.
Those are great things, and you're to be applauded for realizing they're
crucial for writing glue code before most other people did!  (It certainly
should embarrass the hell out of the programming languages community, of
which I'm a part.)

An interactive command loop is incredibly valuable for increasing
productivity over the usual compile-link-run-crash cycle.  What's
sad is how many Tcl programmers there are now who've never used
any other interactive language, and think Tcl is great because it
has that huge advantage over C++.

Other aspects of the language seem poorly motivated.  The syntax is
bizarre, the type system (such as it is) is weak and kludgey, and
the idiomatic use of a general evaluator seems like a big mistake.
(You hardly ever need a general evaluator except at the top-level
command loop.  Having normal code use it inhibits the ability to
compile to good code, and is error-prone.  You're generally better
off using macros for most things that people try to do with eval.
If you have hygienic macros a la Scheme, it's much safer, and
macros don't incur runtime overhead in compiled code.)
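
A rough sketch of the "error-prone" part, in Python.  Python has no
macros, so a closure stands in here for what the paragraph above would
do with a macro; this illustrates only the hazard of splicing data into
code that is evaluated later, not the compilation point.  All names
(greet, make_string_callback, and so on) are hypothetical.

```python
def greet(name):
    return "hello, " + name


def make_string_callback(label):
    """Eval style: the callback is source text with the data spliced in."""
    return "greet('" + label + "')"


def make_closure_callback(label):
    """Closure style: the data stays data and is never reparsed as code."""
    return lambda: greet(label)


def run_string_callback(code):
    # Evaluate the text with only greet visible to it.
    return eval(code, {}, {"greet": greet})


# Both styles work for a harmless label.
ok_string = run_string_callback(make_string_callback("world"))
ok_closure = make_closure_callback("world")()

# A label containing a quote breaks the spliced source text.
try:
    run_string_callback(make_string_callback("O'Brien"))
    string_version_failed = False
except SyntaxError:
    string_version_failed = True

# The closure version is unaffected.
closure_result = make_closure_callback("O'Brien")()
```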

The lack of real data structures and garbage collection seems
like an equally big mistake.

These are lessons that Lisp people learned a long time ago, and
many language designers have built on those lessons.

>For example,
>there is a real-time Tcl application containing several hundred thousand
>lines of code that controls a $5 billion oil well platform and (much to
>my shock) it seems to be quite maintainable.  Sybase has something like
>a million lines of Tcl code in their test suite.

I pity them. :-)

>I think it depends a lot on the application.  The oil well application
>actually subdivides into a whole bunch of small tasks, so it's really
>more like 500 smaller programs.

I find this hard to believe.  Maybe the oil well people are incredibly
lucky and their application decomposes wonderfully, but I've never seen
an interesting large application that did that.  You generally need
some abstraction mechanisms to manage large programs, whether they're
provided by the language, or hand-kludged for the app.  

I suspect that this app is divided into too many too-short Tcl scripts,
with too much flattening of data structures into strings and reconstructing
the structures later.  Or worse, it doesn't use interesting data structures,
and is either much kludgier or vastly slower (or both) than it would be if
it did.
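
To make "flattening and reconstructing" concrete, here is a deliberately
naive sketch in Python.  It is not how Tcl's own list quoting works (Tcl
lists can quote embedded spaces); the flatten and unflatten helpers are
hypothetical, and the point is only what is lost when a record travels
as one flat string without such care.

```python
def flatten(items):
    """Join the fields into one space-separated string (no quoting)."""
    return " ".join(str(x) for x in items)


def unflatten(text):
    """Rebuild a list from the flat string; every field comes back a string."""
    return text.split(" ")


record = ["valve 7", 3, 0.5]          # a name, a count, a ratio
round_trip = unflatten(flatten(record))

# The name has split in two, and the numbers are now strings:
# round_trip is ['valve', '7', '3', '0.5']
fields_changed = len(round_trip) != len(record)
types_lost = all(isinstance(x, str) for x in round_trip)
```

Even with proper quoting, the receiving side still has to reparse the
string and decide again which fields were numbers.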

And speed is important.  I believe that you point out in the intro to
your book that most Tcl programs, contrary to your expectations, are
coded entirely in Tcl.  My understanding is that interactive program
development is so much easier (and so much more fun) than using batch
languages that people get addicted to it---they're often willing to
sacrifice a lot of performance and important code- and data-structuring
facilities to be able to do it.

>Also, if the application is
>fundamentally gluing (i.e. the complexity is in the interconnections)
>then switching to a more strongly typed language will just make things
>worse.

Maybe yes, maybe no.  At the very least, true dynamic typing makes 
communicating between modules much less error-prone---you'll usually
get a clear error message sooner than if you're just passing garbled
strings around.

And have a look at ML.  Its polymorphic type system LOOKS dynamically
typed, but actually infers types automatically, and type-checks the
program automatically.  Pretty spiffy.  Most of the advantages of
static typing and most of the advantages of dynamic typing, too.

>One final argument: suppose that Tcl code is harder to maintain,
>line for line, than code in a more strongly typed language (I suspect
>this is true).  But if a Tcl application has only 1/5 or 1/10 the lines
>of code of the equivalent program in a strongly typed language, it may
>still be easier to maintain overall.

I think that you're oversimplifying here.  Many of us will grant
that scripting languages are good to have, and many of us will
grant that something like dynamic typing is nice, at least compared
to having to declare everything all the time.  But there's no reason
why a scripting language can't be a reasonable subset of a general-purpose
language (e.g., Scheme with objects), so that you can just COMPILE
your code to make it run pretty fast, rather than having to REWRITE it
in a different language.

There's no reason why a straightforward Scheme-style interpreter
for a Tcl-sized subset of Scheme can't have a similarly small footprint
and run much faster.  And certainly you can compile it to vastly
faster code.

>That said, I still suspect that as scripting applications grow it makes
>more and more sense to implement parts of them in a system programming
>language.  The great thing about scripting languages is that this is
>easy to do.

Why not just write it in a good all-around interactive, compiler-friendly
language in the first place?

>You can take the performance-critical kernel of a Tcl
>application and implement it in C or C++; ditto for any complicated data
>structures or algorithms.  The simple, non-performance-critical parts
>can be left in Tcl.

I think you're really pushing a false dichotomy here.  There's a big difference
between simple and non-performance-critical, and each of those is a spectrum.
Tcl is not good for non-simple OR for performance-critical programming.
It's so incredibly slow that you're forced to write many simple things
in a different language to get decent performance.  It's so incredibly
limited (e.g., in its datatypes) that it's awkward to write code which
may be somewhat sophisticated, e.g., using sophisticated data structures,
but is not time-critical.

>  I knew when I started on Tcl that it wouldn't be
>appropriate for all problems, so I designed it to work smoothly with
>other languages.

This is incredibly reasonable, but the scope of things for which Tcl
is good is way, way smaller than it needs to be.  You originally thought
that Tcl scripts would be a few lines, or tens of lines, and designed
the language accordingly.  But now people are writing things orders of
magnitude larger than that.  Either you made a mistake early on, or
they're making them now.

>In contrast, most languages are egotistical: they
>expect you to do *everything* in that language and make it very hard to
>split the functionality of an application between multiple languages.
>For example, I've been involved with several attempts to make C and Lisp
>work together, and they all failed.
>
>     - It is possible to make languages with execution speeds like C or C++,
>    that use dynamic typing successfully, whilst being high-level enough
>    in the creation of abstractions to "glue" things together quite
>    nicely and easily.
>
>Can you point to a specific language and identify a large community of
>users who agree with this assessment?

Depends on exactly what you're asking.  I'm not saying there's a single
implementation that gives you everything you want for Tcl-style apps,
but there's NOTHING hard about it.

Consider SIOD, George Carrette's small interpretive implementation of most
of Scheme, which is embeddable as a scripting language.  You can
keep the footprint small by running the GC at a high rate, which will
slow things down a bit, but it'll still be a lot faster than Tcl.

You can write portable Scheme code using the SIOD interpreter, and then
run it through Marc Feeley's Gambit compiler to get fast code.
Maybe not as fast as C on average, but sometimes faster, and generally
within spitting distance.

Similarly, if you want more scripting and process-control features,
you can use scsh, the Scheme shell.  (Unfortunately, scsh is currently
a bit of a pig, because it's built on a fat implementation of Scheme,
but it doesn't have to be---somebody could extend SIOD a little
and port scsh to it, and you'd get the benefits of both.  scsh
would run more slowly than it does now, but again, way faster than Tcl.)

>Many people have made claims like
>this to me, but no one has been able to point to a good real-world
>example.  The white paper argues that you can't have a jack-of-all-trades
>language.  Either you have a strongly typed language, which gives high
>speed and manageability but makes gluing hard, or you have a weakly
>typed language with the opposite properties.

I guess you haven't read the literature on Lisp and Scheme.
Don't be misled by the big-bag-of-features that Common Lisp became, 
or the pigginess of the implementations of it.  Lisp itself has
always been amenable to tiny implementations, if you care more about
footprint and startup times than running speed.

And even a small Lisp or Scheme is powerful enough to let you implement
an object system or a module system from within the language.  You just
need a macro facility, which isn't much code.
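
As a sketch of the idea, here is the classic closure-based "object"
transcribed into Python.  In Scheme the same thing is a lambda closing
over its state, and a macro would supply nicer syntax on top; this
version uses closures only.  make_counter and its message names are
invented for the example.

```python
def make_counter(start=0):
    """Return a message-dispatching closure that behaves like an object."""
    state = {"count": start}          # private to this instance

    def increment(by=1):
        state["count"] += by
        return state["count"]

    def value():
        return state["count"]

    methods = {"increment": increment, "value": value}

    def dispatch(message, *args):
        """Look up the method named by the message and apply it."""
        try:
            method = methods[message]
        except KeyError:
            raise AttributeError("counter does not understand %r" % (message,))
        return method(*args)

    return dispatch


c = make_counter(10)
c("increment")          # count is now 11
c("increment", 5)       # count is now 16
total = c("value")

# Each instance has its own state.
other = make_counter()
other_total = other("value")
```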

>     - Do you really think that object orientation has failed? C++ is a bad
>    OO
>    language, indeed, but what about Self, Java, and other such OO success
>    stories from... Sun Labs? Do I detect interdepartmental rivalry?
>
>I overstated the arguments against OO programming in the paper and I'll
>probably soften them a bit in the next draft.  I actually think that
>there are some good aspects of OO programming, and I use them myself
>even when I'm not programming in an OO language.  But I stand by the two
>main points in the paper, which are that (a) OO programming hasn't
>increased productivity dramatically because it doesn't raise the level of
>programming significantly (it may improve things 20-30%, but I doubt
>there's even a factor of 2, let alone 10) and (b) implementation
>inheritance really truly is a bad idea that tends to reduce reuse and
>productivity.  I think you'll see substantial support for the second
>claim even among OO enthusiasts.

I don't think most of us would put it that way.  Implementation inheritance
CAN be a truly bad idea if you use it for the wrong things, but it's often
a good thing if you know what you're doing and don't confuse interfaces
with implementations.  

>As for Java, it's hard not to be envious of its success (aren't you
>guys a bit envious too?), but Tcl is really symbiotic with Java, just
>as Tcl is symbiotic with C.  I look on Java as a better system
>programming language that's particularly well-suited for creating
>portable Internet components.  Tcl is moving to the Internet itself,
>and C isn't a good component language in that domain, so I'm delighted
>to have Java around for implementing Internet components that Tcl
>can then glue together.

One of the saddest things about Java is that it screams for an
interactive implementation, rather than batch compilation.  (Unlike
C++, it's got enough checking by default to build an uncrashable
interaction environment.)

Once there are interactive implementations, the dichotomy between
scripting and systems languages will be less obvious.
(And if you made type declarations optional in scripts, that
would go much further.)

> [ ... ]
>
>    His arguments on "typeless" languages is useless.
>    You don't need a "scripting language" to
>    get usable abstractions without the need
>    to deal with low-level issues.
>    
>    button .b -text Hello! -font {Times 16} -command {puts hello}
>    
>    In Macintosh Common Lisp I'll write this as:
>    
>    (make-instance 'button-dialog-item
>      :dialog-item-text "Hello"
>      :view-font '("Times" 16)
>      :dialog-item-action (lambda (item) (print "hello")))
>
>I think this example supports my claim that scripting languages are a
>lot easier to use when you need to mix and match lots of things of
>different types.  The MCL example is a lot more verbose and complicated
>than the Tcl example.

The MCL example is mostly more verbose because MCL uses longer identifiers.
If MCL were intended as a scripting language, the identifiers could be changed
and some of them could be made more intuitive for non-Lispers.  You
could also trivially define the instance-making macro (which I'll call new)
to coerce a font name string into whatever representation it uses for a
font spec:

 (new button :text "Hello" :font "Times 16" :cmd (proc (item) (print "Hello")))

There's nothing about Lisp that makes this much more verbose than
the Tcl version.  It's just that MCL isn't mainly designed as a scripting
language, and Lispers usually like to write readable code.  This verbosity
has little to do with anything deep about the language.

The nice thing about Lisp is that you can have your cake and eat it too.
You can define the things that are frequently used in scripts so that
they're terse, and the other things so that they're clear.  (You can
do this from within the language, by defining procedures and macros
that are really just aliases for standard things, using abbreviated
names.)
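
The aliasing idea carries over to other dynamic languages.  A minimal
sketch in Python, assuming a stand-in Button class whose long keyword
names merely echo the MCL example (it is not the MCL API), with a
hypothetical new helper that maps terse names onto the long ones and
coerces a font string:

```python
class Button:
    """Stand-in widget with deliberately long, explicit keyword names."""

    def __init__(self, dialog_item_text, view_font, dialog_item_action):
        self.dialog_item_text = dialog_item_text
        self.view_font = view_font
        self.dialog_item_action = dialog_item_action


# Terse script-level names mapped onto the long library names.
ALIASES = {
    "text": "dialog_item_text",
    "font": "view_font",
    "cmd": "dialog_item_action",
}


def parse_font(spec):
    """Coerce "Times 16" to ("Times", 16); pass other values through."""
    if isinstance(spec, str):
        name, size = spec.rsplit(" ", 1)
        return (name, int(size))
    return spec


def new(cls, **short_args):
    """Build an instance from terse keywords, coercing the font if given."""
    long_args = {ALIASES.get(k, k): v for k, v in short_args.items()}
    if "view_font" in long_args:
        long_args["view_font"] = parse_font(long_args["view_font"])
    return cls(**long_args)


b = new(Button, text="Hello", font="Times 16",
        cmd=lambda item: print("Hello"))
```

The long names stay available for code that wants to be explicit; the
short ones exist only at the scripting level.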
-- 
| Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (wilson@cs.utexas.edu)
| Papers on memory allocators, garbage collection, memory hierarchies,
| persistence and  Scheme interpreters and compilers available via ftp from 
| ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)