Mastering MUMPS: Part 3, Basic Operations

If you haven’t seen our introduction to MUMPS, also known as M, please check it out here. It has links to resources we’ve used to learn MUMPS and to our other posts on MUMPS. We have relied heavily on the documentation and this MUMPS presentation to learn the basic functionality of the language.

M has most of the operators we have come to expect from other programming languages, plus a few interesting ones that give M an edge. A lot of the operators you’re familiar with have different symbols in M, so we’ll mention those before diving into the details of the more unusual ones.

A note about code examples: we covered commands like write, read, do, and set in a previous post. In that post, we used the full name of each command in all the examples. However, M convention as we have learned it is to use the abbreviated form for commands. This post will use the abbreviated form of the commands in code examples, e.g. w instead of write. We have also put a ,! (newline) at the end of each write in the code examples so when you run them, each output will be on its own line.

What’s different, and what’s the same?

As you might expect, + is for addition, - for subtraction, * and / for multiplication and division, and ** for exponents. >, <, >=, and <= all behave like they do elsewhere. & represents logical AND.
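A quick sketch of these familiar operators, with the expected output in a trailing comment. The literal values are arbitrary; each line can be typed at a terminal prompt or placed in a routine (the leading space is what a routine line requires).

```mumps
 w 7+2,!   ; 9
 w 7-2,!   ; 5
 w 7*2,!   ; 14
 w 7/2,!   ; 3.5
 w 2**3,!  ; 8
 w 7>2,!   ; 1 (true)
 w 7<2,!   ; 0 (false)
 w 1&1,!   ; 1
 w 1&0,!   ; 0
```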

Interestingly, in MUMPS math, strings that begin with numeric characters are treated as numbers even if they contain letters: M uses the leading numeric portion and ignores the rest. Strings that begin with non-numeric characters are treated as strings, even if they contain numbers, and evaluate to 0 when used in arithmetic. The following code gives an example. The output is written in a comment following each line.
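A minimal sketch of this numeric interpretation, using made-up string values:

```mumps
 w "12abc"+1,!     ; 13 (the leading "12" is used, "abc" is ignored)
 w "abc12"+1,!     ; 1  ("abc12" counts as 0)
 w "3 apples"*2,!  ; 6
 w "abc"+0,!       ; 0
 w +"7up",!        ; 7  (unary + forces numeric interpretation)
```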

Officially, M considers the semicolon to be a command, but we thought it would be appropriate to discuss it here. ; denotes a comment, so M will ignore anything that comes after it on that line. Note that it is not required at the end of each line. Because empty lines are not allowed in M, we use ; on otherwise empty lines to give ourselves some whitespace and make our code more readable.

Similarly, the docs don’t mention ,!, which is used after a write command and its argument to append a newline character to the output. For example: write "Hello",! outputs the string “Hello” followed by a newline. We try to be in the habit of following every write command with ,!.

Other operators may look unfamiliar in M. # is the symbol for “modulo”, ' (apostrophe) is the symbol for logical NOT, and ! is logical OR. Depending on context, = is used both for assignment and for equality comparisons; there is no == in M.
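A short sketch of these symbols in use. The variable name x and the values are arbitrary illustrations.

```mumps
 w 7#3,!   ; 1 (remainder of 7 divided by 3)
 w '1,!    ; 0 (NOT true)
 w '0,!    ; 1 (NOT false)
 w 1!0,!   ; 1 (true OR false)
 w 0!0,!   ; 0
 s x=5     ; here = is assignment
 w x=5,!   ; 1 (here = is a comparison)
 w x=6,!   ; 0
```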

Finally, _ is for string concatenation. M will coerce numbers to strings during concatenation, so write 1_". "_12_" eggs" will output 1. 12 eggs.
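A few more concatenation cases, sketched with arbitrary values. The third line shows a number being joined as a string and then read back as a number.

```mumps
 w "abc"_"def",!         ; abcdef
 w 3_4,!                 ; 34
 w 3_4+1,!               ; 35 ("34" is built first, then 1 is added)
 w "Total: "_(2+3),!     ; Total: 5
```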

The \ operator

We oversimplified earlier: / is for decimal division. It will return decimal results if needed, so 1/4 is equal to 0.25. On the other hand, \ does integer division. You may pass it decimal numbers, but it always discards the fractional part of the result, so for positive numbers it behaves like a floor function.
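A small sketch comparing the two division operators on positive numbers (the values are arbitrary). Note that M prints fractions below 1 without a leading zero.

```mumps
 w 1/4,!    ; .25
 w 7/2,!    ; 3.5
 w 7\2,!    ; 3
 w 7.9\2,!  ; 3 (7.9/2 is 3.95; the fraction is dropped)
 w 1\4,!    ; 0
```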

The [ operator

Something we at Menlo have found useful in a handful of languages is a built-in way to check whether a string contains a substring. That’s what [ does for us in M. The lefthand operand is the string to search, and the righthand operand is the substring.
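For instance, a minimal sketch with made-up strings:

```mumps
 w "Mastering MUMPS"["MUMPS",!  ; 1
 w "Mastering MUMPS"["mumps",!  ; 0 (the match is case-sensitive)
 w "MUMPS"["Mastering MUMPS",!  ; 0 (the left operand is the one searched)
```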

The ] and ]] operators

The official documentation refers to ] as the “follows” operator and ]] as the “sorts after” operator, but it doesn’t explain what those words mean. If we try them out, we see that 4]3 and 4]]3 both evaluate to true, so what’s the difference?

] will evaluate to true if the left-hand operand follows the right-hand operand in M’s character encoding sequence. By default, this sequence is ASCII, so an ASCII table can help you determine what the outcome of a comparison with ] will be. M also has a built-in function, $ASCII, that will tell you the ASCII value of a character passed to it. Essentially, "B"]"A" is a shortcut for the expression $ASCII("B")>$ASCII("A"). Here are some more examples:
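A sketch of ] on a few arbitrary values, assuming the default ASCII sequence. Numbers are compared as strings of characters, which is why 10 does not follow 9.

```mumps
 w "B"]"A",!  ; 1 (66 > 65)
 w "A"]"B",!  ; 0
 w "A"]"A",!  ; 0 (a string does not follow itself)
 w "a"]"B",!  ; 1 (lowercase "a" is 97, uppercase "B" is 66)
 w 10]9,!     ; 0 ("1" is 49, "9" is 57)
 w 2]10,!     ; 1 ("2" is 50, "1" is 49)
```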

* Menlo uses the InterSystems Caché implementation of M.

]], on the other hand, evaluates whether the lefthand operand sorts after the righthand operand in the “subscript collation sequence” ¹. The subscript collation sequence is what M uses to sort the subscripts (child nodes) of an array. You may have used the $ORDER function (covered in another post). $ORDER returns the next node of an array using the subscript collation sequence. This sequence is slightly different from ASCII in that numbers come first. After that, ASCII values are followed. Some examples:
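A sketch contrasting ]] with ], assuming the default collation described above (numbers first in numeric order, then strings in ASCII order). The values are arbitrary.

```mumps
 w 10]]9,!     ; 1 (10 sorts after 9 numerically)
 w 10]9,!      ; 0 (as characters, "1" comes before "9")
 w 2]]10,!     ; 0
 w 2]10,!      ; 1
 w "A"]]10,!   ; 1 (strings sort after all numbers)
 w 10]]"A",!   ; 0
 w "B"]]"A",!  ; 1 (between strings, ASCII order applies)
```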

Order of Operations

Except for parentheses, M evaluates operators in a strict left-to-right order. This can cause unexpected behavior if you are used to languages that define an order of precedence for each operator. Here are some mathematical examples:
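A minimal sketch with arbitrary numbers. Each unparenthesized line gives a different answer than conventional precedence would.

```mumps
 w 2+3*4,!    ; 20 (2+3 is evaluated first, then *4)
 w 2+(3*4),!  ; 14
 w 10-2*3,!   ; 24
 w 10-(2*3),! ; 4
 w 1+2**2,!   ; 9
```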

Left-to-right operation is particularly tricky with logical operators. In the following examples, we will use 1 to represent ‘true’ and 0 to represent ‘false’. Remember, each line ends with ,!. The exclamation mark in the newline character is unrelated to logical OR.
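A sketch of how this plays out, using arbitrary expressions. In many languages AND binds tighter than OR and comparisons bind tighter than both; in M, only parentheses change the order.

```mumps
 w 1!0&0,!      ; 0 (1!0 is 1, then 1&0 is 0)
 w 1!(0&0),!    ; 1
 w 5>3&2>1,!    ; 0 (5>3 is 1, 1&2 is 1, then 1>1 is 0)
 w (5>3)&(2>1),! ; 1
```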

M does not use Short-Circuit Evaluation when evaluating logical operators. In some languages, if the first expression of a logical AND evaluates to false, the second expression will not be evaluated because the whole logical expression must be false. This is not the case in M. Both pieces of a logical expression are evaluated.

This is important to remember if you decide to use a tag with side effects as the expression following &. The tag will always be executed, and the side effects will always occur, as in the following example:
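A minimal sketch of such a routine, assuming Caché, where & always evaluates both operands. The routine label sideEffectDemo and the tag's return value are hypothetical choices made for illustration.

```mumps
sideEffectDemo ; hypothetical entry point
 ; the left operand is false, yet the right operand is still evaluated
 i 0&$$evalWithSideEffect() w "this will not print",!
 w "this will print",!
 q
 ;
evalWithSideEffect() ; tag with a side effect; returns true
 w "this is a side effect",!
 q 1
```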

This code outputs 2 strings: “this is a side effect” and “this will print”. Even though the first expression that & is evaluating is false, it still calls evalWithSideEffect. At Menlo, we are careful not to write code like this because some languages handle short-circuit evaluation differently than others.

M’s operators may seem strange to programmers who are used to more modern languages, but used correctly, they can do everything that we need them to do. They even include a few handy tools that other languages do not, like \ and [. We did not mention one of M’s operators in this post: ?, the pattern-matching operator. It can be tricky for newcomers to grasp, but it is extremely powerful. Check out the documentation for more information about the ? operator.

Footnotes:

[1] — For the definition of “subscript collation sequence” and discussion of ] and ]], see this page. It is specific to the GT.M implementation of MUMPS, but holds true in Caché.