Discussion:
It's all the fault of the ASCII committee
trijezdci
2016-08-30 13:07:10 UTC
The "|" is a logical OR separating the various cases and as such should
only appear between clauses and not precede the first one.
PIM and ISO define the | within CASE statements as case label separators.

M2 R10 defines them as case label prefixes, not as separators.

I will explain the rationale for prefixes a little further below but let me first talk about how this all came about.

Initially, we went along with the concept of using separators and we considered using double-semicolons as case label separators.

CASE x OF
a : ...; ... ;;
b : ...; ... ;;
c : ...; ...
END;

This would have freed up the | for use as a replacement for OR, which in turn would have allowed consistent use of symbols as logical operators, removing the reserved words AND, OR and NOT. Our rationale was: either all are symbols or all are reserved words, not both, not mixed.

However, considering the principle of least surprise, we eventually decided against double-semicolons and removed the & and ~ as synonyms for AND and NOT instead.

This turned out to be helpful later when we decided against the use of + as concatenation operator, since we were then able to use the freed-up & for that purpose.

Yet, while we had been looking for an alternative for | as case label separator, we felt that the ideal symbol would be a bullet and that this ought to be a prefix, not a separator:

CASE x OF
• a : ...
• b : ...
• c : ...
END;

I am pretty confident this is what Wirth would have designed had he had a bullet symbol at his disposal[1].


Our primary driver to switch from case label separator to case label prefix was readability when the CASE statement spans multiple lines, which is the case in the vast majority of uses.

CASE x OF
| a : ...
| b : ...
| c : ...
END;

In fact this had already been a discussion point at meetings of the ISO M2 working group. Many of the delegates felt that a prefix was more readable and a compromise was reached to allow both variants.

Since we don't like to have alternative variants of syntax, we decided in favour of prefixes and against separators.
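
To make the difference between the two conventions concrete, here is a minimal sketch in Python. It is not taken from any Modula-2 compiler or from the M2 R10 specification; the function names and the flat token lists are assumptions made purely for illustration, and the separator variant is deliberately strict (it rejects a leading |, which the ISO compromise described above would accept).

```python
def parse_arms_prefix(tokens):
    """Prefix style: every arm, including the first, begins with '|'."""
    arms, i = [], 0
    while i < len(tokens) and tokens[i] == "|":
        i += 1  # consume the prefix
        arm = []
        while i < len(tokens) and tokens[i] != "|":
            arm.append(tokens[i])
            i += 1
        if not arm:
            raise SyntaxError("'|' must be followed by a case label")
        arms.append(arm)
    if i != len(tokens):
        raise SyntaxError("every arm must start with '|'")
    return arms


def parse_arms_separator(tokens):
    """Strict separator style: '|' may only appear between two arms."""
    arms, arm = [], []
    for tok in tokens:
        if tok == "|":
            if not arm:
                raise SyntaxError("'|' may only appear between two arms")
            arms.append(arm)
            arm = []
        else:
            arm.append(tok)
    if not arm:
        raise SyntaxError("'|' may only appear between two arms")
    arms.append(arm)
    return arms


# Hypothetical token streams for the body of a CASE statement.
prefixed = "| a : s1 | b : s2".split()
separated = "a : s1 | b : s2".split()

print(parse_arms_prefix(prefixed))
print(parse_arms_separator(separated))
```

Both calls yield the same two arms. Feeding the prefixed stream to the strict separator parser raises a SyntaxError, which mirrors a compiler that does not accept | before the first case.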


[1] Unfortunately, the good people on the ASCII committee decided to waste 33 code points on control codes most of which were only useful for teletype machines, the replacement of which with video terminals was already foreseeable and had in fact already started while the committee was active. Due to this shortsightedness, there wasn't any space left for a bullet in the ASCII set.
It is inconsistent to put this "|" phantom in
and then fault trailing ";".
It would be so if | was a separator, but it isn't since it is a prefix. ;-)


Hope this clarifies.
r***@gmail.com
2016-09-01 22:49:48 UTC
Hi,

You're clearly very knowledgeable, so I'm not sure why I'm commenting.
You don't need my help. :-)
Post by trijezdci
The "|" is a logical OR separating the various cases and as such should
only appear between clauses and not precede the first one.
PIM and ISO define the | within CASE statements as case label separators.
IIRC, one or two compilers I tried (PIM, not ISO) didn't like '|' before
the first choice. Those may have been old PIM2, though.
Post by trijezdci
Initially, we went along with the concept of using separators and
we considered using double-semicolons as case label separators.
This would have freed up the | for use as a replacement for OR which
in turn would have allowed consistent use of symbols as logical operators,
removing reserved words AND, OR and NOT. Our rationale was: either all are
symbols or all are reserved words, not both, not mixed.
Presumably "OR" isn't considered too much harder to type than '|'.
IIRC, the aliases '~', '&' were only optional and introduced in PIM3
(although Wirth made them mandatory in Oberon).
Post by trijezdci
However, considering the principle of least surprise, we eventually
decided against double-semicolons and removed the & and ~ as synonyms
for AND and NOT instead.
So did Modula-3.
Post by trijezdci
Yet, while we had been looking for an alternative for | as case label
separator, we felt that the ideal symbol would be a bullet and that
Wasn't ":=" supposed to represent the left arrow? So why not go all out?
(Personally, I abhor that, I prefer old-fashioned ASCII.)
Post by trijezdci
I am pretty confident this is what Wirth would have designed had he
had a bullet symbol at his disposal[1].
Maybe so, but he changes his mind a lot.
Post by trijezdci
Our primary driver to switch from case label separator to case label
prefix was readability when the CASE statement spans multiple lines
which is the vast majority of use cases.
CASE x OF
| a : ...
| b : ...
| c : ...
END;
In fact this had already been a discussion point at meetings of the
ISO M2 working group. Many of the delegates felt that a prefix was
more readable and a compromise was reached to allow both variants.
They also allowed '!' instead, right? (Unlike PIM.)
Post by trijezdci
Since we don't like to have alternative variants of syntax,
we decided in favour of prefixes and against separators.
[1] Unfortunately, the good people on the ASCII committee decided
to waste 33 code points on control codes .... Due to this
shortsightedness, there wasn't any space left for a bullet in the
ASCII set.
So use '.' or '*' instead. Don't a lot of markup languages use the
asterisk for that?
Post by trijezdci
Hope this clarifies.
I wish you luck, but (to be honest) this all sounds way over my head.
trijezdci
2016-09-02 02:06:50 UTC
Post by r***@gmail.com
IIRC, one or two compilers I tried (PIM, not ISO) didn't like '|' before
the first choice. Those may have been old PIM2, though.
I don't think there would be any PIM compilers that permit this, but I do recall vividly that WG13 discussed and decided to permit this in ISO, although I don't quite remember which meeting it was.
Post by r***@gmail.com
Wasn't ":=" supposed to represent the left arrow? So why not go all out?
(Personally, I abhor that, I prefer old-fashioned ASCII.)
I think this is entirely unrelated, but if the ASCII committee had included a left arrow in the set and it had taken hold as an assignment operator, we would by now find it natural. You would likely have found the idea of using ":=" for assignment abhorrent then. ;-)
Post by r***@gmail.com
Post by trijezdci
I am pretty confident this is what Wirth would have designed had he
had a bullet symbol at his disposal[1].
Maybe so, but he changes his mind a lot.
Correct. However, I am nearly certain that, had there been a bullet available and had he used it as a prefix for case labels, he wouldn't have changed his mind on that, because it is the most natural-looking syntax possible. We'd all find it so obvious that nobody would question it.
Post by r***@gmail.com
They also allowed '!' instead [of '|'], right? (Unlike PIM.)
Indeed, thanks for reminding me of that, I forgot to mention it in the comparison chart.
Post by r***@gmail.com
So use '.' or '*' instead. Don't a lot of markup languages use the
asterisk for that?
That would make the syntax ambiguous.

You could have

CASE x OF
. foo : y := 5
.

where the . follows the 5, so it could just as well be read as the real number literal 5. rather than as a case label prefix. Admittedly, we forbid real number literals that start or end in a decimal point, so this could be resolved in the lexer.

However, the dot has a very small visual footprint which makes it unsuitable for this purpose since the whole rationale for the case label prefix is to stand out.


The asterisk is better in this regard, but it causes ambiguity, too. You could have

CASE x OF
* foo : y := 5
* bar :

which could mean y := 5*bar

The grammar would no longer be LL(1).

In fact, you would need an LL(*) grammar, that is, LL(k) for unbounded k, because the case label could be a constant expression of the form:

CASE x OF
* foo : y := 5
* bar*baz+bam/boo-blob : ...

You'd need to keep parsing the expression without knowing whether it belongs to the right hand side of the assignment to y or to the following case label, until you eventually find the : that ends the label. Since the expression has arbitrary length from the grammar's point of view, you would need unbounded lookahead.
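
The lookahead problem can be illustrated with a small Python sketch. This is a toy under stated assumptions, not the M2 R10 parser: the tokenizer, the operator set and the function tokens_needed_to_decide are all made up for this example. It counts how many tokens must be read, starting at the symbol that follows a complete expression, before it is known whether that symbol starts a new case label or continues the expression.

```python
import re

# Toy token set: ':=' must be tried before ':' so assignment is one token.
TOKEN = re.compile(r":=|[A-Za-z_]\w*|\d+|[|*+\-/:;]")

# Symbols that can continue an expression in this toy grammar.
OPERATORS = {"*", "+", "-", "/"}

# Tokens that settle the question: ':' means "that was a case label",
# the others mean "that was still part of the statement".
DECIDERS = {":", ":=", ";", "|", "END"}


def tokenize(src):
    return TOKEN.findall(src)


def tokens_needed_to_decide(tokens, i):
    """tokens[i] directly follows a complete expression.

    Return how many tokens, counting tokens[i] itself, must be inspected
    to tell a new case label from a continued expression.
    """
    if tokens[i] not in OPERATORS:
        return 1  # cannot be an operator, so one token is enough
    for k in range(i + 1, len(tokens)):
        if tokens[k] in DECIDERS:
            return k - i + 1
    return len(tokens) - i


bar_prefix = tokenize("y := 5 | bar*baz+bam/boo-blob : z := 1")
star_prefix = tokenize("y := 5 * bar*baz+bam/boo-blob : z := 1")

# Index 3 is the symbol right after the literal 5 in both streams.
print(tokens_needed_to_decide(bar_prefix, 3))   # 1
print(tokens_needed_to_decide(star_prefix, 3))  # 11
```

With | the decision takes a single token no matter how long the label is. With * the count grows with the length of the label expression, so no fixed k is sufficient.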


In any event, we settled for | as a prefix and this is not going to change now. It's not as nice as a true bullet would have been, but it is a good compromise. It stands out nicely ...

CASE x OF
| foo : ...
| bar : ...
| baz : ...
END

It stands out a little less when each case spans multiple lines ...

CASE x OF
| foo : ...
...; ...;
...; ...; ...
| bar : ...
...; ...;
...; ...; ...
| baz : ...
...; ...;
...; ...; ...
END;

but still enough to give a reader's eyes visual cues.
