SemanticsOfProgrammingEnvironments

ThoughtStorms Wiki

I started thinking about this when I first saw Struts.

Now I loathe this framework with a passion, and the more I have to use it, the less I like it. But this isn't an anti-Struts rant. It tries to highlight one wrong assumption I think people make.

One of the alleged virtues of Struts is that it decouples the meanings of actions from their URIs. Instead of building a system whose URIs have the form http://blah/ACTION.cgi, where the URI directly triggers ACTION, you write http://blah/ACTION_SELECTOR, where ACTION_SELECTOR is an index into a lookup table in the Struts server configuration file, which maps onto ACTION. The reason this is meant to be good is that if we suddenly choose to make http://blah/ACTION_SELECTOR perform ACTION2, we don't need to run through all the code in the system, replacing references to ACTION with references to ACTION2. And if your mind is poisoned by too much exposure to computer science education, you'll agree that this is a very convenient thing.
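As a rough illustration of the indirection being described, here is a minimal sketch in Python rather than in Struts itself. The handler functions, the `ACTION_TABLE` dictionary and `dispatch` are all hypothetical names invented for this example; they are not Struts APIs, and the real Struts configuration is more involved than one dictionary.

```python
# Hypothetical stand-ins for two versions of an action (not Struts APIs).
def action(params):
    return "report for " + params.get("user", "anonymous")


def action2(params):
    return "v2 report for " + params.get("user", "anonymous")


# Plays the role of the server configuration file: the selector that
# appears in the URI is only a key, not the name of the action.
ACTION_TABLE = {
    "ACTION_SELECTOR": action,
}


def dispatch(uri, params, table=ACTION_TABLE):
    """Resolve the last path segment of the URI through the lookup table."""
    selector = uri.rstrip("/").rsplit("/", 1)[-1]
    handler = table[selector]
    return handler(params)


before = dispatch("http://blah/ACTION_SELECTOR", {"user": "alice"})

# Rebinding one table entry changes what the same URI does,
# without touching any page that links to it.
ACTION_TABLE["ACTION_SELECTOR"] = action2
after = dispatch("http://blah/ACTION_SELECTOR", {"user": "alice"})
```

Note that nothing in the URI tells you which handler ran; you have to go and read the table.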

Now, why do I (who was exposed to all this computer science education but somehow failed to believe it) think this is a damned stupid thing, and that the trouble it causes overrides the minuscule gain?

Well, because it ignores the fact that semantics is a social game. It's based on what's in people's heads as well as configuration files. So the first question is: what's the relation between ACTION_SELECTOR and ACTION? In a sensible world, both of these will have meaningful names. This helps both developers and users of a site / service: developers when they maintain the site, and users when they build systems on top of the service.

Ideally, the meanings of the ACTION_SELECTOR and ACTION should be so obviously associated with each other that the maintainer can infer one from the other automatically. But in Struts, you can never trust such an assumption. I've spent literally hours in reverse lookup, seeing an ACTION_SELECTOR in my browser and having to find the ACTION it corresponds to in order to find the bug. (This is a simplification; in Struts an error indicated at an ACTION_SELECTOR can be due to errors in 4 or 5 different places.)

You could choose a naming convention where ACTION_SELECTOR and ACTION are derivably related. But then you've just made a social pact NOT to take advantage of the main justification of the indirect mapping in the first place. So you're stuck with the overhead of the scheme (creating the mapping table) without any benefit. A better system would be an ad hoc one which does let you redirect from one URI to another, but in an unprincipled way; only if it is necessary as an emergency hack. The advantage of such a scheme is you could document the few exceptional cases in a succinct list which would be easy to consult when you suspect a redirect is taking place, and short enough to glance at to find out the new mapping.
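The ad hoc scheme could be sketched like this, again in Python and under invented names (`REDIRECTS`, `resolve` and `explain` are assumptions for illustration only). The point is that names normally mean themselves, and the whole list of exceptions is short enough to read at a glance.

```python
# The complete list of emergency exceptions. Anything not listed here
# runs under its own name. The single entry is a hypothetical example.
REDIRECTS = {
    "ACTION": "ACTION2",
}


def resolve(name, redirects=REDIRECTS):
    """Return the action to run: the name itself unless it is a listed exception."""
    return redirects.get(name, name)


def explain(name, redirects=REDIRECTS):
    """Say in plain words whether a redirect is in play for this name."""
    if name in redirects:
        return f"{name} is redirected to {redirects[name]}"
    return f"{name} runs as named"
```

When you suspect a redirect, `explain("ACTION")` answers the question in one lookup, and every name missing from the list can be trusted to mean what it says.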

But in fact, without any sort of redirect, maintaining your system is easier. Sounds crazy, huh? But it's true. Let's suppose you have a huge system of thousands of pages and scripts which refer to each other by hard-embedded URIs of the form http://blah/ACTION, all of which need to be changed to http://blah/ACTION2. Then a global search and replace on all those files will do the job beautifully.
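For what it's worth, here is one way such a global search and replace might look as a small Python script. The function name `replace_uri` and the default list of file suffixes are assumptions made for this sketch; on a Unix box the same job is usually a one-liner with existing tools. The pattern refuses to match when the old URI is immediately followed by a letter, digit or underscore, so http://blah/ACTION_SELECTOR and an already converted http://blah/ACTION2 are left alone.

```python
import re
from pathlib import Path


def replace_uri(root, old, new, suffixes=(".html", ".jsp", ".txt")):
    """Rewrite every occurrence of `old` to `new` in matching files under `root`.

    Returns a list of (path, number_of_replacements) for the files changed.
    """
    # Do not match when `old` is only the prefix of a longer name.
    pattern = re.compile(re.escape(old) + r"(?!\w)")
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8")
        # A function replacement keeps `new` from being read as a regex template.
        updated, count = pattern.subn(lambda match: new, text)
        if count:
            path.write_text(updated, encoding="utf-8")
            changed.append((str(path), count))
    return changed
```

Calling `replace_uri("site", "http://blah/ACTION", "http://blah/ACTION2")` on a hypothetical `site` directory would return the list of files it touched, which doubles as a record of what to check afterwards.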

Why might you disagree?

Well, you might think :

  1. I can't trust a simple search and replace not to get it wrong / miss stuff
  2. I'm relying on my platform to support global search and replace
  3. What about links from outside that I have no control over?

OK, well in the case of 1, why not? Probably because you are already trying to do something clever with your URI processing. Perhaps some of the http://blah/ACTION URIs which appear on a page are being generated dynamically? Why is this happening? Perhaps you have a place where it's necessary to dynamically generate the URI? OK, then here, in this dynamic generator, is where to make the change, so that the dynamic process spits out ACTION2. Otherwise you've got a piece of dynamic logic which is out of date, and you're using a separate process (in another part of the code) to fix it. Imagine the surprise of later developers who come and fix this dynamic generator to spit out ACTION3 and then can't understand why the system breaks totally.

In the case of 2, this is a good thing. Big complex systems which work are inescapably embedded in their environments. We discover that software can be abstracted away from this environment, that it can "run anywhere". This is a nice feature. But the downside is that we are tempted to want more of it, sometimes more than is strictly necessary or good for us. For example, we have a fantastic operating system like Linux, full of well written, well tested, small sort and search functions, and we decide that we have to reimplement sorting and searching again on top of a virtual machine that can run anywhere. What we basically end up doing is reinventing the wheel, many times, at each layer of our virtual machine. Stop it. It's just NotInventedHere syndrome. Sometimes it's better to use good tools that are already available.

Now, onto 3. This is a good worry to have. It sucks to break links from outside. Especially from people you provide a service to. So now's a good time to ask yourself some questions :

  • why do I want to change ACTION into ACTION2?
  • will this break any assumptions that all those unknown users out there, whose minds I can't read, are making about the functionality of ACTION?
  • if it won't, if it's a transparent and undeniably good improvement, why aren't I just making it hidden behind the existing abstraction of the API? ie. why don't I just change the functionality of ACTION itself?

There'll still be occasions when having a quick indirection is handy. But I suspect once these alternatives are considered thoroughly, you'll find that such redirection was usually being used to cover or avoid rethinking some other problem. Now hacks and patches are no bad thing; but the problem is that on the web you don't own the semantics of the URIs. You're tempted to think that, like Humpty Dumpty, you have the right to remap words to mean what you want them to mean. But this breaks the implicit understandings over the whole network.

More discussion in DoesAbstractionScale?

Note this issue also crops up when trying to share work between programmers and graphic designers who are working on webpages.

Often the wiring between the designers and coders is in the form of coder-defined tags. Once again, there's a lot of overhead in keeping track of multiple layers of indirection which allow the names of the things in code to slide against the names of the things in the tag language. But once again, inference between the two would be easier if there were constraints; and the semantics should be fixed by the real world conversation between the two communities.

(See also DecompositionByLanguageIsProbablyAModularityMistake)

Another Worry

I note the contrast here with LateBinding. I'm thinking how to square this circle. Maybe UsersFindAbstractionHard. See also SemanticWeb

Also, how does this square with the ideas in PhenotropicProgramming, where interaction between code chunks is fuzzy? Isn't the semantics of real language fuzzy? Weird to think of web services in a Wittgensteinian language game with each other!

Counter

Duh! BridgePattern?

Struts is, of course, a horrible muddle due to its naive understanding of ModelViewController.

CategoryDesign, CategorySemantics, CategoryComputerScience, CategorySoftware