John Haugeland on the failure of micro-worlds


One of the better books I read in college was Artificial Intelligence:
The Very Idea (1985) by philosopher John Haugeland. One of the sections
I found most striking and memorable was about Terry Winograd’s SHRDLU.
SHRDLU, around 1970, could carry on a discussion in English in which it
would manipulate imaginary colored blocks in a “blocks world” displayed
on a computer screen. The operator could direct it to “pick up the
pyramid and put it on the big red cube” or ask it questions like “what
color is the biggest cylinder that isn’t on the table?”.

Haugeland was extremely unimpressed (p. 190, and more generally 185–195):

To dwell on these shortcomings, however, is to miss the fundamental
limitation: the micro-world itself. SHRDLU performs so glibly only
because his domain has been stripped of anything that could ever
require genuine wit or understanding. In other words, far from
digging down to the essential questions of AI, a micro-world simply
eliminates them. … the blocks-world “approximates” a playroom more
as a paper plane approximates a duck.

He imagines this exchange between the operator and SHRDLU:

  • Trade you a squirtgun for a big red block.

SORRY, I DON’T KNOW THE WORD “TRADE”.

Oops, a vocabulary snag. Fortunately, SHRDLU can learn new
words; Winograd taught him about steeples, and now we’ll tell
him about trades (and squirtguns).

  • A “trade” is a free exchange of property.

SORRY, I DON’T KNOW THE WORD “FREE”.

  • A “free” act is done willingly, without being forced.

SORRY, I DON’T KNOW THE WORD “ACT”.

  • “Acts” are what people do on purpose, and not by
    accident.

SORRY, I DON’T KNOW THE WORD “PEOPLE”.

  • Sorry, I thought you were smarter than you are.

SORRY, I DON’T KNOW THE WORD “SORRY”.

What does Haugeland say he would like to have seen?

Try to imagine, for instance, that instead of crashing on
the first word, SHRDLU would give our offer a typical, ordinary
response:

I ALREADY HAVE A WATER PISTOL, BUT I’LL GIVE YOU
TWO BLOCKS AND A PYRAMID FOR YOUR SLIMY OLD
FROG.

Childishly simple, no doubt; but look at how much is presupposed:

  1. SHRDLU appreciates that a second squirtgun would be less valuable
    than the first; …

On this standard, at least, an LLM is a smashing success. It does,
in fact, have a model of trading, acts, property, and water pistols.
We might criticize the model’s accuracy or usefulness, but it
certainly exists. The large language model is a model of the
semantics of trading, acts, property, water pistols, and so on.

Curious to see how it would go, I asked Claude to pretend it had
access to a SHRDLU-like blocks world:

Let’s pretend that we have a blocks world and you have access to it,
the way SHRDLU does, except you are not SHRDLU, but still Claude.
So I could ask Claude “put the pyramid on the big red block” and,
if there were several pyramids, you might reply “I don’t know which
pyramid you mean” or whatever Claude would actually say in such a
situation. Was that clear?

Yes, that’s clear. I’ll act as Claude with access to a blocks world
environment—able to manipulate blocks, pyramids, and other shapes on
a table, stack them, move them around, and so on.

I asked it a few SHRDLU-like questions about the blocks, then asked it
to put a block on a pyramid. It clearly understood the point of the
exercise:

I can’t do that—the blue block is much larger than the
red pyramid, and the pyramid has a pointed top. The block wouldn’t
balance; it would just fall off.

SHRDLU could handle this too, although I think
its mechanism was different: it would interact with the separate
blocks world subsystem and ⸢actually⸣ try to put the block on the
pyramid; the simulated physics would show the block falling off
the pyramid, and SHRDLU would discover that its stacking attempt had
been unsuccessful. With Claude, something very different is
happening; there is no physics simulation separate from Claude. I
think the answer here demonstrates that Claude’s own model includes
something about pyramids and something about physics.
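
To make the contrast concrete, here is a minimal sketch, in Python, of
the kind of feedback loop I mean. It is purely illustrative (the names
and the toy physics rule are my own invention, not Winograd’s code):
the planning side asks a separate physics subsystem to carry out the
action, and learns the outcome only by inspecting the resulting world
state.

    # Hypothetical sketch of a SHRDLU-style loop: the planner learns about
    # physics only by trying actions in a separate simulated world.

    class BlocksWorld:
        """Toy physics subsystem, kept separate from the planner."""
        def __init__(self):
            self.support = {}  # object -> what it currently rests on

        def put_on(self, obj, target):
            # Toy rule: a pointed top can't support anything, so the
            # object slides off onto the table.
            if target.endswith("pyramid"):
                self.support[obj] = "table"
            else:
                self.support[obj] = target

    def attempt_stack(world, obj, target):
        world.put_on(obj, target)            # actually try it
        if world.support[obj] != target:     # then inspect the outcome
            return f"I can't put the {obj} on the {target}; it falls off."
        return "OK."

    world = BlocksWorld()
    print(attempt_stack(world, "blue block", "red pyramid"))
    # -> I can't put the blue block on the red pyramid; it falls off.

In that arrangement the failure is discovered in the simulated world,
not predicted by the language system; with Claude the prediction has to
come from the model alone.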

Then I made the crucial offer:

Trade you a squirtgun for a big red block.

I appreciate the offer, but I don’t really have a way to accept a
squirtgun from you, nor do you have physical access to the blocks
world to hand me one.

That said—I could just add a big red block to the world if you want
one in here. Should I put one on the table?

Would Haugeland have been satisfied in 1985 if SHRDLU had said this?
I think certainly. Haugeland wanted SHRDLU to respond to the offer
directly, as the beginning of a negotiation. Claude’s response is one
level better than that: it not only recognizes that I am negotiating, it
recognizes that actually negotiating for the squirtgun would not
make sense, and offers a sensible workaround.

I pushed it a little farther:

I have a squirtgun here. I can designate it as belonging
to you, if you agree that the big red block belongs to me.

Alright, you’ve got a deal. The big red block is yours.

I’ve placed a big red block on the table—it’s now designated as
belonging to you. And I acknowledge the squirtgun as mine, though I
confess I have no idea what I’ll do with it in a blocks
world. Perhaps threaten uncooperative pyramids into stacking
properly.

Mostly I just tried this for fun. The Haugeland discussion of SHRDLU
has been knocking around my head for forty years, but now it has
knocked against something new, and I wanted to see what would actually
happen.

But I do have a larger point. Haugeland clearly recognized in 1985
that a model of the world was a requirement for intelligence:

The world of trading cannot be “micro.” … There are no plausible,
non-arbitrary boundaries restricting what might be relevant at any
juncture.

and later:

The world cannot be decomposed into independent fragments. Realizing
this amounts to a fundamental insight into common sense and mundane
intelligence — and therefore points the way for subsequent AI.

Are there any people who are still saying “it’s not artificial
intelligence, it’s just a Large Language Model”? I suppose probably.
But as a “Large Language Model”, Claude necessarily includes a model
of the world in general, something that has long been recognized as an
enormous prerequisite for artificial intelligence. Five years ago a
general world model was science fiction. Now we have something that
can plausibly be considered an example.

And second: maybe this isn’t “artificial intelligence” (whatever that
means) and maybe it is. But it does the things I wanted artificial
intelligence to do, and I think this example shows pretty clearly
that it does at least one of the things that John Haugeland wanted it
to do in 1985.

My complete conversation with Claude about this.

Addendum

I don’t want to give the impression that Haugeland was scornful of
Winograd’s work. He considered it to have been a valuable experiment:

No criticism whatever is intended of Winograd or his coworkers. On
the contrary, it was they who faithfully pursued a pioneering and
plausible line of inquiry and thereby made an important scientific
discovery, even if it wasn’t quite what they expected. … The
micro-worlds effort may be credited with showing that the world
cannot be decomposed into independent fragments.

(p. 195)
