Mastering Mumps: Part 2 Basic Commands

If you haven’t seen our introduction on MUMPS, also known as M, please check it out here. It has links to resources we’ve used to learn MUMPS and to our other posts on MUMPS. We have relied heavily on the documentation and this MUMPS presentation to learn the basic functionality of the language.

We could spend many pages discussing the basic functionality of most of M’s commands. The official documentation does a good job of that, so in this post we’ll go over the nuances and pitfalls that we’ve encountered with some of the commands.

We will be using the full names of the commands in this post, but M programmers typically use an abbreviated form. The abbreviation will appear in square brackets. For example, we might write “[q]uit” to indicate that quit can be spelled “quit” or “q”. As far as we have seen, command case does not matter when executing M code.

[n]ew

After reading the docs, we still did not understand the implications of using the new command. We had observed that our M code would run successfully with or without it, but we hadn’t written very complicated code yet. In complex routines, the new command will help avoid unexpected behavior caused by name collisions and prevent temporary memory leaks.

Take the following example:
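The original listing is not reproduced here, so what follows is a minimal sketch consistent with the description below. The tag names firstTag and secondTag and the variable firstName come from the text; the exact layout is an assumption.

```mumps
firstTag ; entry point
 new firstName
 set firstName="Alice"
 do secondTag
 write firstName,!
 quit
 ;
secondTag ; no new here, so this tag shares firstTag's firstName
 set firstName="Bob"
 quit
```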

This program outputs “Bob”, not “Alice” as you might expect. firstTag newed the firstName variable, which makes it visible to every subroutine that firstTag calls. secondTag is called from firstTag, so it has access to firstName. It can use and mutate the value of firstName as if it had defined it, which means we can’t be sure of the value of firstName in firstTag.

Adding new firstName to the beginning of secondTag saves us from this problem by temporarily storing the old value of firstName and unsetting it. At that point, secondTag is free to manipulate that variable without affecting its value in firstTag’s scope. As soon as secondTag ends, M automatically sets firstName back to its original value.
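As a sketch of that fix, under the same assumptions about layout as before, the only change is the new command at the top of secondTag:

```mumps
firstTag ; entry point
 new firstName
 set firstName="Alice"
 do secondTag
 write firstName,!  ; writes Alice
 quit
 ;
secondTag ; new hides the caller's firstName until this tag quits
 new firstName
 set firstName="Bob"
 quit
```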

The new command also ensures that M will automatically dispose of any variables created in a given scope. Without it, local variables will hang around taking up memory until the process ends. At Menlo, we made it policy to new all variables we need for each tag to avoid these issues.

[q]uit

In addition to triggering garbage collection, quit helps us control the flow of our program. Tags in M are not self-contained; they are only entry points. Once a tag has been called, M will simply continue executing lines, top to bottom, until it finds a quit command, even if that quit command is in another tag in the routine. We have to define the end point so M knows when to return to the caller on the call stack.

In the following snippet, if someone calls the doThis tag, doThat will also be executed because doThis does not end with a quit.
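The original snippet is missing, so this is a hedged reconstruction. The tag names doThis and doThat and the variable name come from the text; the bodies of the tags are assumptions.

```mumps
doThis ; sets name but never quits
 new name
 set name="Alice"
 ; no quit here, so execution falls through into doThat
doThat ; relies on name having been set by someone else
 write "Hello, ",name,!
 quit
```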

At Menlo, we would consider doThat to be dangerous, because it violates the “Separation of Concerns” principle. Any tag or programmer that calls doThat has to know to set the name variable first, which should only be the concern of doThat. If doThat were called while name was undefined, an error would occur.

[s]et and [m]erge

set and merge are fairly straightforward, but if you mistakenly use set where you want to use merge, it can be difficult to track the problem down.

Remember: set copies only the value of a single node, while merge copies the top-level node and all of its children. merge is generally non-destructive. If the variable into which you are merging an array already has children, those children will remain following the merge operation. merge is only destructive where the two arrays have children with the same subscripts: in that case the values from the source overwrite the values in the target.
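A small sketch of the difference may help. The tag and variable names (mergeDemo, source, target, copy) are hypothetical and chosen only for illustration.

```mumps
mergeDemo ; contrast set and merge on a small array
 new source,target,copy
 set source="top"
 set source("a")=1
 set source("b")=2
 set target("b")="old"
 set target("c")=3
 ; set copies only the value stored at the top-level node
 set copy=source
 ; copy is now "top"; copy("a") and copy("b") do not exist
 merge target=source
 ; target is now "top", target("a")=1, target("b")=2 (overwritten),
 ; and target("c")=3 (kept, because source has no "c" child)
 quit
```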

[d]o and $$

We use the do command to execute tags that do not return a value. If the tag you want to execute returns a value, it must instead be executed with $$. $$ can be thought of as “the return value of” a tag. To return a value from a tag, use the quit command followed by the value to return.
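The sketch below shows both calling styles side by side. The tags main, greet, and add, and the use of parameters, are assumptions made for the example rather than anything from our own code.

```mumps
main ; calls one tag with do and one with $$
 new total
 do greet("Alice")
 set total=$$add(2,3)
 write total,!
 quit
 ;
greet(name) ; returns nothing, so it is called with do
 write "Hello, ",name,!
 quit
 ;
add(a,b) ; returns a value, so it is called with $$
 quit a+b
```

Note that greet ends with a bare quit, while add ends with quit followed by the value to return.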

[i]f and Post-Conditionals

As a brief review, a post-conditional is a command, followed by a colon and a condition, followed by the argument to the command. write:1>2 "Math is broken" would not write the string “Math is broken” because 1 is not greater than 2.

The important difference between if and post-conditionals is how they affect the code that follows them on the same line. if commands determine whether all of the code following them will be run, while post-conditionals only pertain to a single command.
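The original lines are not reproduced here. A sketch that matches the discussion, assuming two flags named bool1 and bool2 and three write commands, could look like this:

```mumps
ifDemo ; the same three writes, guarded two different ways
 new bool1,bool2
 set bool1=1,bool2=0
 ; using ifs: the second if fails and skips the rest of the line
 if bool1 write "first " if bool2 write "second " write "third "
 write !
 ; using post-conditionals: only the guarded write is skipped
 write:bool1 "first " write:bool2 "second " write "third "
 write !
 quit
```

With post-conditionals, the second line writes “first ” and “third ”, because the false condition on bool2 affects only its own write.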

The line using ifs above would only write the string “first ”. When it evaluates the second if, it finds that bool2 is false and ignores all the following code.

M does not have a command for a while loop, but we can use a for loop that has a conditional quit in place of a stop value for its index. For the quit to work conditionally without affecting the execution of the rest of the loop, we must use a post-conditional instead of an if command.

In the following example, for i=1:1 tells the for loop to start at one and count up forever, because it doesn’t have a stop value. The “:” separates the start and step values of the incrementer and is unrelated to post-conditionals.
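The listing itself is missing, so this is a hedged sketch. The loop header for i=1:1 and the post-conditional quit:value=0 come from the text; the tag name, the starting value, and the loop body are assumptions.

```mumps
whileDemo ; emulate a while loop with for and a post-conditional quit
 new i,value
 set value=3
 ; quit takes no argument here, so two spaces follow it
 for i=1:1 quit:value=0  write "iteration ",i,! set value=value-1
 quit
```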

In the above example, quit:value=0 checks value at the start of each loop, and ends the loop only if value has been set to 0 during the previous iteration. If we had tried to use an if instead of a post-conditional, the loop would never have executed:
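A sketch of the broken variant follows. It assumes the same hypothetical value variable, and it adds a stop value of 5 to the for (an assumption, not part of the original) so that the demonstration terminates instead of spinning forever.

```mumps
ifLoopDemo ; an if in place of the post-conditional: the body never runs
 new i,value
 set value=3
 ; the stop value 5 is only here to keep the sketch from looping forever
 for i=1:1:5 if value=0 quit  write "iteration ",i,! set value=value-1
 quit
```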

When the if evaluates False, it stops everything following it from executing. When it evaluates True, quit executes and stops the loop. There are many more use cases for the for command, so we will cover it in a future post.

[e]lse

Remember to put two spaces between else and the command that follows it. M always expects the format: command, space, arguments, space, next command. For commands that take no arguments, like else, we leave space for the non-existent arguments:
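A minimal sketch of the spacing, assuming a hypothetical age variable:

```mumps
elseDemo ; else takes no argument, so two spaces follow it
 new age
 set age=20
 if age>17 write "adult",!
 else  write "minor",!
 quit
```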

Something we found very interesting about the else command is that it checks a MUMPS variable called $TEST or $T to decide whether it should execute the code in the else block. $TEST stores the truth value of the most recently executed if with a condition, or of the most recent open, lock, job, or read that used a timeout, in the current scope. We have tested it in Caché and found that $TEST in one scope does not affect $TEST in another scope. Take this example:
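The original example is missing. The sketch below assumes the nested scope was a dot block started by an argumentless do, and the names testDemo and x are hypothetical.

```mumps
testDemo ; $TEST inside the block does not leak into the outer scope
 new x
 set x=1
 if x=1 do
 . write "outer if ran",!
 . if x=2 write "never written",!
 . ; $TEST is 0 at this point because the inner if failed
 else  write "else ran",!
 quit
```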

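Our else command is not executed because $TEST was set to 1 by the first if command. It doesn’t matter that $TEST is 0 at the end of the if block. When execution moves back up to the higher scope, $TEST gets reset to its previous value. Note that standard M only guarantees this restore for argumentless do blocks and for extrinsic ($$) calls; a tag called with do and an argument can leave $TEST changed for its caller.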

A word of warning about if, else, and $TEST: else cannot take a condition, so it always checks $TEST. This means that you can use else without an if, and it will operate based on the current value of $TEST. if will also check $TEST if you don’t pass it a condition. As a special variable maintained by M itself, $TEST already has a value when you connect to your MUMPS server. At Menlo, we won’t use if and else this way, because we have no way of knowing what $TEST is, especially at the beginning of a tag. To illustrate, the following code will execute all the write commands.
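The original listing is missing, so this is a hedged sketch of code that behaves as described. The tag name and the written strings are assumptions.

```mumps
bareDemo ; argumentless if and stand-alone else both read $TEST
 if 1 write "first",!
 ; no condition, so this if reuses $TEST, which is still 1
 if  write "second",!
 ; an if with nothing after it still sets $TEST, here to 0
 if 0
 else  write "third",!
 ; $TEST is unchanged, so a second else also runs
 else  write "fourth",!
 quit
```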

Standard MUMPS has a few more commands that we have not used yet at Menlo. We plan to cover goto and xecute in a future post. Until then, have you found any nuances not mentioned here among M’s commands?