Discussion:
New front-end for M2C operational -- testers wanted.
trijezdci
2015-12-11 11:03:25 UTC
I have now rewritten the front-end of Makarov's M2C Modula-2 compiler.

The front end can be downloaded from:

https://bitbucket.org/trijezdci/m2c-rework/src

Simply type make and hit return to build it.

m2c -h prints a help screen listing the available options.

All OS dependencies are in a single module called m2-fileutils, for which there is at present no Windows/DOS implementation. It is a small module with only a handful of functions, most of them wrappers, so it shouldn't be difficult to derive a Windows/DOS version from it. With a bit of luck, somebody may volunteer to contribute a Windows/DOS implementation of the module.

For those who don't know, M2C is a "via-C" Modula-2 compiler and translator that supports PIM3 and PIM4 plus some extensions. In translator mode it generates C output. In compiler mode it generates executables by calling the resident C compiler on the intermediate C code.

We are using M2C for our initial bootstrap compiler for Modula-2 R10. There are some annoyances with M2C however and its source code is rather incomprehensible due to the way it is structured and because the comments are all in Russian pidgin English. Since I am doing the same kind of work for our own compiler anyway (although in M2, not C) I decided to simply rewrite it, or at least the front end.

At present I am leaning towards rewriting the back end as well, but will take another look to assess the prospect of marrying the new front end with the existing back end, at least in the interim. It may, however, be less effort to just rewrite the back end cleanly than to wade through all that mud.

Anyway, it would be very much appreciated if folks here could test drive the front end on their real world PIM Modula-2 sources to give the parser plenty of scrutiny.

I added an undocumented flag, --parser-debug. When set, it prints the current production rule, line, column and lookahead symbol, which is useful for assessing what the parser is doing when it doesn't report any errors (as will be the case with error-free code).
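
A trace of this kind could be produced along the following lines. This is only a rough sketch in C: format_trace and its parameters are hypothetical names chosen for illustration, not the actual code in the repository, and the output layout is an assumption modelled on what the flag prints.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not taken from the m2c-rework sources:
 * formats one trace entry for the production rule being entered.
 * Returns the number of characters snprintf would have written. */
int format_trace(char *buf, size_t size, const char *rule,
                 unsigned line, unsigned column, const char *lookahead)
{
    assert(buf != NULL && rule != NULL && lookahead != NULL);
    return snprintf(buf, size,
                    "*** %s ***\n@ line: %u, column: %u, lookahead: %s\n",
                    rule, line, column, lookahead);
}
```

Each parsing function would call such a helper on entry, but only when the debug flag is set, so that normal runs stay quiet.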

Note that by default the front end expects C-style prefix literals and Oberon-style extensible record types. It can be put into strict PIM mode with the flags --pim3 and --pim4. The help screen explains all the option flags.
Nemo
2015-12-12 03:01:49 UTC
Post by trijezdci
I have now rewritten the front-end of Makarov's M2C Modula-2 compiler.
https://bitbucket.org/trijezdci/m2c-rework/src
Simply type make and hit return to build it.
The Makefile needs m2c.c and there appears to be no m2c.c in the
archive. Please advise.
trijezdci
2015-12-12 04:08:44 UTC
Post by Nemo
Post by trijezdci
I have now rewritten the front-end of Makarov's M2C Modula-2 compiler.
https://bitbucket.org/trijezdci/m2c-rework/src
Simply type make and hit return to build it.
The Makefile needs m2c.c and there appears to be no m2c.c in the
archive. Please advise.
Seems I had forgotten to check in the file, my apologies. It is now in the repo.

https://bitbucket.org/trijezdci/m2c-rework/commits/1decfb74c5d214fb4ea5843a8e4e97b1139b8cbc

thanks
trijezdci
2015-12-12 08:40:50 UTC
The repo's wiki page now shows instructions on how to build and test the front end:

https://bitbucket.org/trijezdci/m2c-rework/wiki/Home
Christoph Schlegel
2015-12-13 07:02:38 UTC
Post by trijezdci
Post by Nemo
Post by trijezdci
I have now rewritten the front-end of Makarov's M2C Modula-2 compiler.
https://bitbucket.org/trijezdci/m2c-rework/src
Simply type make and hit return to build it.
The Makefile needs m2c.c and there appears to be no m2c.c in the
archive. Please advise.
Seems I had forgotten to check in the file, my apologies. It is now in the repo.
https://bitbucket.org/trijezdci/m2c-rework/commits/1decfb74c5d214fb4ea5843a8e4e97b1139b8cbc
thanks
Hi,

stops here now:

$ make
make: *** No rule to make target 'print_first_sets.c', needed by 'print_first_sets.o'. Stop.

Regards,
C.
trijezdci
2015-12-13 08:58:18 UTC
Post by Christoph Schlegel
make: *** No rule to make target 'print_first_sets.c', needed by 'print_first_sets.o'. Stop.
Sorry, there is a copy-paste error in the Makefile. It should be

APP_1 = m2c
APP_2 = testlex
APP_3 = gen_first_sets
APP_4 = gen_follow_sets
APP_5 = gen_resync_sets
APP_6 = print_first_sets
APP_7 = print_follow_sets

but APP_2 was there twice.

I am stumped that this Makefile worked for me at all :-o

Anyway, I updated the repo with the fixed Makefile.

https://bitbucket.org/trijezdci/m2c-rework/commits/50d2c13fbba29039238d9899dcc4ddfc6e7b7be8

If you check it out now, it should work.

thanks for the feedback.
j***@gmail.com
2015-12-13 10:49:11 UTC
m2c goes berserk when a TAB is found:

***@oxygen ~/Modula/PIM$ ../m2c ggT.mod --pim3
m2c Modula-2 Compiler & Translator, version 1.00
line: 3, column: 7, invalid character, offending character code: 0u9
line: 5, column: 4, invalid character, offending character code: 0u9
line: 5, column: 9, invalid character, offending character code: 0u9
line: 9, column: 44, invalid character, offending character code: 0u9
line: 9, column: 59, invalid character, offending character code: 0u9
line: 9, column: 60, invalid character, offending character code: 0u9
line: 10, column: 44, invalid character, offending character code: 0u9
line: 10, column: 59, invalid character, offending character code: 0u9
line: 10, column: 60, invalid character, offending character code: 0u9
parse error count: 0
***@oxygen ~/Modula/PIM$
trijezdci
2015-12-13 11:59:26 UTC
Post by j***@gmail.com
line: 3, column: 7, invalid character, offending character code: 0u9
ASCII_TAB was incorrectly defined, fixed now:

https://bitbucket.org/trijezdci/m2c-rework/commits/4db8376f89d03c3f71a17f28843ec4adeae54e87

thanks
Christoph Schlegel
2015-12-13 14:12:05 UTC
m2c now builds on Windows7/Cygwin! I fed the first program from the Coronado Tutorial to it and got a weird result. Here is the program:

MODULE FirstEx;

FROM InOut IMPORT WriteLn, WriteString, WriteCard;

VAR Index : CARDINAL;

BEGIN

WriteString("This is our first example program");
WriteLn;
WriteLn;
FOR Index := 1 TO 12 DO
WriteString("The value of the index is now ");
WriteCard(Index,3);
WriteLn;
END

END FirstEx.

Here is the result:

***@Christoph-THINK ~/m2c-rework
$ ./m2c ./FirstEx.mod --parser-debug
m2c Modula-2 Compiler & Translator, version 1.00
*** programModule ***
@ line: 1, column: 1, lookahead: MODULE
*** import ***
@ line: 4, column: 1, lookahead: FROM
*** unqualifiedImport ***
@ line: 4, column: 1, lookahead: FROM
*** identList ***
@ line: 4, column: 19, lookahead: WriteLn
*** block ***
@ line: 7, column: 1, lookahead: VAR
*** declaration ***
@ line: 7, column: 1, lookahead: VAR
*** variableDeclaration ***
@ line: 7, column: 5, lookahead: Index
*** identList ***
@ line: 7, column: 5, lookahead: Index
*** type ***
@ line: 7, column: 13, lookahead: CARDINAL
*** qualident ***
@ line: 7, column: 13, lookahead: CARDINAL
*** statementSequence ***
@ line: 13, column: 4, lookahead: WriteString
*** statement ***
@ line: 13, column: 4, lookahead: WriteString
*** assignmentOrProcCall ***
@ line: 13, column: 4, lookahead: WriteString
*** designator ***
@ line: 13, column: 4, lookahead: WriteString
*** qualident ***
@ line: 13, column: 4, lookahead: WriteString
*** actualParameters ***
@ line: 13, column: 15, lookahead: WriteString
*** expressionList ***
@ line: 13, column: 16, lookahead: This is our first example program
*** expression ***
@ line: 13, column: 16, lookahead: This is our first example program
*** simpleExpression ***
@ line: 13, column: 16, lookahead: This is our first example program
*** term ***
@ line: 13, column: 16, lookahead: This is our first example program
*** simpleTerm ***
@ line: 13, column: 16, lookahead: This is our first example program
*** factor ***
@ line: 13, column: 16, lookahead: This is our first example program
*** statement ***
@ line: 15, column: 4, lookahead: WriteLn
*** assignmentOrProcCall ***
@ line: 15, column: 4, lookahead: WriteLn
*** designator ***
@ line: 15, column: 4, lookahead: WriteLn
*** qualident ***
@ line: 15, column: 4, lookahead: WriteLn
*** statement ***
@ line: 17, column: 4, lookahead: WriteLn
*** assignmentOrProcCall ***
@ line: 17, column: 4, lookahead: WriteLn
*** designator ***
@ line: 17, column: 4, lookahead: WriteLn
*** qualident ***
@ line: 17, column: 4, lookahead: WriteLn
*** statement ***
@ line: 19, column: 4, lookahead: FOR
*** forStatement ***
@ line: 19, column: 4, lookahead: FOR
*** expression ***
@ line: 19, column: 17, lookahead: 1
*** simpleExpression ***
@ line: 19, column: 17, lookahead: 1
*** term ***
@ line: 19, column: 17, lookahead: 1
*** simpleTerm ***
@ line: 19, column: 17, lookahead: 1
*** factor ***
@ line: 19, column: 17, lookahead: 1
*** expression ***
@ line: 19, column: 22, lookahead: 12
*** simpleExpression ***
@ line: 19, column: 22, lookahead: 12
*** term ***
@ line: 19, column: 22, lookahead: 12
*** simpleTerm ***
@ line: 19, column: 22, lookahead: 12
*** factor ***
@ line: 19, column: 22, lookahead: 12
*** statementSequence ***
@ line: 21, column: 7, lookahead: WriteString
*** statement ***
@ line: 21, column: 7, lookahead: WriteString
*** assignmentOrProcCall ***
@ line: 21, column: 7, lookahead: WriteString
*** designator ***
@ line: 21, column: 7, lookahead: WriteString
*** qualident ***
@ line: 21, column: 7, lookahead: WriteString
*** actualParameters ***
@ line: 21, column: 18, lookahead: WriteString
*** expressionList ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** expression ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** simpleExpression ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** term ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** simpleTerm ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** factor ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** statement ***
@ line: 23, column: 7, lookahead: WriteCard
*** assignmentOrProcCall ***
@ line: 23, column: 7, lookahead: WriteCard
*** designator ***
@ line: 23, column: 7, lookahead: WriteCard
*** qualident ***
@ line: 23, column: 7, lookahead: WriteCard
*** actualParameters ***
@ line: 23, column: 16, lookahead: WriteCard
*** expressionList ***
@ line: 23, column: 17, lookahead: Index
*** expression ***
@ line: 23, column: 17, lookahead: Index
*** simpleExpression ***
@ line: 23, column: 17, lookahead: Index
*** term ***
@ line: 23, column: 17, lookahead: Index
*** simpleTerm ***
@ line: 23, column: 17, lookahead: Index
*** factor ***
@ line: 23, column: 17, lookahead: Index
*** designatorOrFuncCall ***
@ line: 23, column: 17, lookahead: Index
*** designator ***
@ line: 23, column: 17, lookahead: Index
*** qualident ***
@ line: 23, column: 17, lookahead: Index
*** expression ***
@ line: 23, column: 23, lookahead: 3
*** simpleExpression ***
@ line: 23, column: 23, lookahead: 3
*** term ***
@ line: 23, column: 23, lookahead: 3
*** simpleTerm ***
@ line: 23, column: 23, lookahead: 3
*** factor ***
@ line: 23, column: 23, lookahead: 3
*** statement ***
@ line: 25, column: 7, lookahead: WriteLn
*** assignmentOrProcCall ***
@ line: 25, column: 7, lookahead: WriteLn
*** designator ***
@ line: 25, column: 7, lookahead: WriteLn
*** qualident ***
@ line: 25, column: 7, lookahead: WriteLn
line: 27, column: 4, unexpected reserved word END found
expected CASE, EXIT, FOR, IF, LOOP, REPEAT, RETURN, WHILE, WITH or identifier.
parse error count: 1

So m2c doesn't report correct line numbers. I also do not understand why it reports an error.

All the best and kind regards,
Christoph
trijezdci
2015-12-13 14:57:30 UTC
Post by Christoph Schlegel
MODULE FirstEx;
FROM InOut IMPORT WriteLn, WriteString, WriteCard;
VAR Index : CARDINAL;
BEGIN
WriteString("This is our first example program");
WriteLn;
WriteLn;
FOR Index := 1 TO 12 DO
WriteString("The value of the index is now ");
WriteCard(Index,3);
WriteLn;
END
END FirstEx.
Great that you got it working under Cygwin.
Post by Christoph Schlegel
So m2c doesn't report correct line numbers.
This must be due to Windows CR LF end-of-line markers. The filereader module should recognise this correctly and count up only once, but apparently it doesn't work properly. I have only tested on Posix systems with LF end-of-line markers. I will take a look and fix it. Thanks.
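
To illustrate what counting up only once means here, a minimal C sketch follows. It is not the filereader module from the repository; count_lines is a hypothetical function that treats LF, a lone CR, and a CR LF pair each as a single end-of-line marker.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the line number reached after scanning len characters,
 * with numbering starting at 1. LF, CR and CR LF each count as
 * exactly one end-of-line marker. Illustrative only. */
unsigned count_lines(const char *text, size_t len)
{
    assert(text != NULL);
    unsigned lines = 1;
    for (size_t i = 0; i < len; i++) {
        if (text[i] == '\r') {
            /* CR LF counts once: skip the LF that follows a CR */
            if (i + 1 < len && text[i + 1] == '\n') {
                i++;
            }
            lines++;
        } else if (text[i] == '\n') {
            lines++;
        }
    }
    return lines;
}
```

The same three-line text then gives the same line numbers whether it was saved with POSIX or with Windows line endings.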
Post by Christoph Schlegel
I also do not understand why it reports an error.
WriteLn;
END
line: 27, column: 4, unexpected reserved word END found
expected CASE, EXIT, FOR, IF, LOOP, REPEAT, RETURN, WHILE, WITH or identifier.
On the error, the parser is correct. The semicolon in a statement sequence is a separator, not a terminator: it can only appear between two statements, not between a statement and END.

See the grammar rules for the for loop and statement sequence:

http://modula-2.info/m2pim/pmwiki.php/SyntaxDiagrams/PIM4NonTerminals#forStatement

http://modula-2.info/m2pim/pmwiki.php/SyntaxDiagrams/PIM4NonTerminals#statementSequence
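
To make the separator rule concrete, here is a toy recognizer in C for a rule of the shape statementSequence := statement ( ';' statement )*. It is a sketch under simplifying assumptions, not the m2c parser: every statement is collapsed into one token, and the names token_t and parse_statement_sequence are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy token set: every statement is collapsed into a single token. */
typedef enum { TOK_STATEMENT, TOK_SEMICOLON, TOK_END } token_t;

/* statementSequence := statement ( ';' statement )*
 * Returns true if the tokens up to the closing END match the rule.
 * On failure, *error_pos is set to the index of the offending token. */
bool parse_statement_sequence(const token_t *toks, size_t count,
                              size_t *error_pos)
{
    assert(toks != NULL && error_pos != NULL);
    size_t i = 0;

    /* a statement is mandatory at the start */
    if (i >= count || toks[i] != TOK_STATEMENT) {
        *error_pos = i;
        return false;
    }
    i++;

    /* each semicolon commits the parser to another statement */
    while (i < count && toks[i] == TOK_SEMICOLON) {
        i++;
        if (i >= count || toks[i] != TOK_STATEMENT) {
            *error_pos = i;
            return false;
        }
        i++;
    }

    /* the sequence must be closed by END */
    if (i >= count || toks[i] != TOK_END) {
        *error_pos = i;
        return false;
    }
    return true;
}
```

Given statement, semicolon, END, the recognizer fails at the END token, which is the same position the error message above points to.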

thanks again
tbreeden
2015-12-13 15:32:20 UTC
Post by trijezdci
On the error, the parser is correct. The semicolon in a statement sequence is a separator, not a terminator: it can only appear between two statements, not between a statement and END.
Uh-oh.

Lots of semicolons in my code will hit this. I had forgotten that an empty statement
was not permitted in PIM. Didn't many compilers provide the redundant ";" as an
empty statement?

It certainly makes it easier to not worry so much about semicolon placement when adding
and deleting statements.

Consider an m2c option for this?

Tom

ISO

6.6.2 Empty Statements
An empty statement contains no symbols and denotes no action.
Its use permits the relaxation of punctuation rules in statement sequences.
Concrete Syntax
empty statement = ;
trijezdci
2015-12-13 16:25:46 UTC
Post by tbreeden
Lots of semicolons in my code will hit this. I had forgotten that an empty statement
was not permitted in PIM. Didn't many compilers provide the redundant ";" as an
empty statement?
Some did.
Post by tbreeden
It certainly makes it easier to not worry so much about semicolon placement when adding
and deleting statements.
In the interest of correctness, I prefer to pass on ISO here and keep it PIM.

Also, I had looked into this before and found that the grammar was no longer LL(1) when allowing for a final semicolon. The semicolon is the very symbol that makes the difference between continuing in the loop or leaving it.
Post by tbreeden
Consider an m2c option for this?
It should be possible to add an option to turn this particular error into a warning and continue. This could probably be done via error recovery. However, this will have to wait. It is more important to get the compiler back end working first.

regards
Chris Burrows
2015-12-13 21:40:12 UTC
Post by tbreeden
Lots of semicolons in my code will hit this. I had forgotten that an empty statement
was not permitted in PIM. Didn't many compilers provide the redundant ";" as an
empty statement?
On the contrary. PIM specifically states that superfluous semicolons / empty statements ARE allowed. A PIM-compatible compiler should allow this.

Chris Burrows
CFB Software
http://www.cfbsoftware.com/modula2
tbreeden
2015-12-14 03:27:28 UTC
Post by Chris Burrows
On the contrary. PIM specifically states that superfluous semicolons / empty
statements ARE allowed. A PIM-compatible compiler should allow this.
This seems to be right.

I looked at the original report on Modula-2 and the text on Statements contains a
sentence pretty much identical to the one I reference in the ISO document,

 An empty statement contains no symbols and denotes no action.
      Its use permits the relaxation of punctuation rules in statement sequences.

In both PIM3 and PIM4 it reads

"The syntax of statements implies that a statement may consist of no
symbols at all. In this case, the statement is said to be empty and evidently
denotes the null action. This curiosity among statements has a definite reason:
it allows semicolons to be inserted at places where they are actually
superfluous, such as at the end of a statement sequence."

The problem is that the actual EBNF syntax section at the end of the books does not
seem to allow defining an empty statement. However the railroad diagram for Statement
does.

Considering the explicit wording in the text part of the report, it seems that a PIM3/4 compiler
should really support the empty statement, but clearly the report has
inconsistencies. Oh well, we knew Wirth was not much interested in copy editing. :)

Tom
trijezdci
2015-12-14 04:46:42 UTC
Post by tbreeden
Post by Chris Burrows
On the contrary. PIM specifically states that superfluous semicolons / empty
statements ARE allowed. A PIM-compatible compiler should allow this.
This seems to be right.
I looked at the original report on Modula-2 and the text on Statements contains a
sentence pretty much identical to the one I reference in the ISO document,
 An empty statement contains no symbols and denotes no action.
      Its use permits the relaxation of punctuation rules in statement sequences.
In both PIM3 and PIM4 it reads
"The syntax of statements implies that a statement may consist of no
symbols at all. In this case, the statement is said to be empty and evidently
denotes the null action. This curiosity among statements has a definite reason:
it allows semicolons to be inserted at places where they are actually
superfluous, such as at the end of a statement sequence."
The problem is that the actual EBNF syntax section at the end of the books does not
seem to allow defining an empty statement. However the railroad diagram for Statement
does.
Considering the explicit wording in the text part of the report, it seems that a PIM3/4 compiler
should really support the empty statement, but clearly the report has
inconsistencies. Oh well, we knew Wirth was not much interested in copy editing. :)
Whenever I asked Wirth about these kinds of discrepancies before, he always answered that the grammar is the ultimate authority and one should always follow the grammar.

Since the syntax diagrams in PIM have errors, I will have to go with the EBNF as authoritative.

I don't care what the text says, when in similar circumstances Wirth insists that the grammar always has the last word.

Thus, it will remain an error and the only concession I am going to make is to provide an option that the error can be turned into a warning. Stricter is always better anyway. Programmers have already got too many liberties to be sloppy, which is the whole raison d'etre for languages like Ada and Modula-2: The forming of good habits by enforcing rules rigorously, not the encouragement of sloppiness.

regards
Chris Burrows
2015-12-14 08:51:17 UTC
Post by trijezdci
I don't care what the text says, when in similar circumstances Wirth insists that the grammar always has the last word.
Unfortunately, you do need to take heed of what is in the text as well, as the grammar alone is insufficient to cover all the nuances of the language. E.g., where in the grammar do you find comments specified?
trijezdci
2015-12-14 09:12:55 UTC
Post by Chris Burrows
Post by trijezdci
I don't care what the text says, when in similar circumstances Wirth insists that the grammar always has the last word.
Unfortunately, you do need to take heed of what is in the text as well, as the grammar alone is insufficient to cover all the nuances of the language. E.g., where in the grammar do you find comments specified?
Wirth says that the grammar has the ultimate authority, so I am going to enforce what the grammar says.

Far more importantly, the job of the Modula-2 language and compiler is to educate users about their bad habits, and thereby -- over time -- form good habits.

In M2C this will be achieved by providing a compiler option --sloppy-semicolon which then reports the erroneous semicolon as a warning instead of an error.

There is nothing in PIM that says that you cannot do so.

And this is the end of this discussion.
Chris Burrows
2015-12-14 10:31:19 UTC
Post by trijezdci
In M2C this will be achieved by providing a compiler option --sloppy-semicolon which then reports the erroneous semicolon as a warning instead of an error.
Maybe OK if you only ever use code that you have written yourself. Not much use if you've inherited somebody else's code and all the noise leads you to ignore / miss the other warning messages that you might really need to worry about.

Better to include this sort of checking in a separate optional analysis tool. As well as detecting superfluous semicolons you should warn about variables that are never used, never set, set but never used etc. etc. An example of such a tool is DevAnalyzer included with Blackbox Component Pascal:

http://www.oberon.ch/blackbox.html

The source code is included - shouldn't be too difficult to adapt it to handle Modula-2 sources.

Regards,
Chris Burrows
http://www.cfbsoftware.com/modula2
trijezdci
2015-12-14 10:58:12 UTC
Post by Chris Burrows
Post by trijezdci
In M2C this will be achieved by providing a compiler option --sloppy-semicolon which then reports the erroneous semicolon as a warning instead of an error.
Maybe OK if you only ever use code that you have written yourself. Not much use if you've inherited somebody else's code and all the noise leads you to ignore / miss the other warning messages that you might really need to worry about.
This is precisely the point. If you inherited code with lots of technical debt, the compiler should pester you about that technical debt until you fix it.

I am afraid to disappoint you but there is no way in the universe that you will ever convince me that sloppiness should be tolerated. The state of software is so bad today precisely because of this kind of attitude.

What we need is more thoroughness.
Post by Chris Burrows
Better to include this sort of checking in a separate optional analysis tool.
I disagree again. This is precisely the problem. Correctness and thoroughness have become an option in this day and age, and this means that they are conveniently skipped when there is pressure in the form of looming delivery dates and cost cutting, which then leads to more and more technical debt.
Post by Chris Burrows
As well as detecting superfluous semicolons you should warn about variables that are never used, never set, set but never used etc. etc.
If you look at the first post in this thread you will see I posted a link to a wiki page which lists the deliverables and milestones. The present milestone is syntax analysis.

Semantic analysis is the next stage.

regards
Martin Brown
2015-12-14 11:21:36 UTC
Post by trijezdci
Post by Chris Burrows
Post by trijezdci
In M2C this will be achieved by providing a compiler option --sloppy-semicolon which then reports the erroneous semicolon as a warning instead of an error.
Maybe OK if you only ever use code that you have written yourself. Not much use if you've inherited somebody else's code and all the noise leads you to ignore / miss the other warning messages that you might really need to worry about.
This is precisely the point. If you inherited code with lots of technical debt, the compiler should pester you about that technical debt until you fix it.
Yes iff the defect is capable of making a real difference but the extra
semicolon/null statement error is on a par with faulting a blank line.
Post by trijezdci
I am afraid to disappoint you but there is no way in the universe that you will ever convince me that sloppiness should be tolerated. The state of software is so bad today precisely because of this kind of attitude.
What we need is more thoroughness.
Yes, but having a compiler that annoys its users by mithering about
insignificant "defects" will mask the ones that really matter like
pointers and variables possibly used before initialisation and the like.
Post by trijezdci
Post by Chris Burrows
Better to include this sort of checking in a separate optional analysis tool.
I disagree again. This is precisely the problem. Correctness and thoroughness has become an option in this day and age, and this means that it is conveniently skipped when there is pressure in form of looming delivery dates and cost cutting, which then leads to more and more technical debt.
I am with you on this one. Having parsed the language you may as well go
on to use dataflow analysis and to find potentially uninitialised
variables and the like but it is secondary to having the compiler work.

Similarly it makes sense to generate some code metrics like McCabe's CCI
that show complexity/testability and therefore risk of latent bugs.
Although they could just as easily be done as a standalone component.
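
As a very rough illustration of such a metric, the C sketch below approximates cyclomatic complexity for one procedure as the number of decision points plus one. Which symbols count as decisions varies between tools; the selection here, like the function name, is an assumption made for the sketch, and a real tool would work on the syntax tree rather than on raw tokens.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reserved words that open a branch or a loop in Modula-2.
 * CASE labels and boolean operators are deliberately left out. */
static const char *const decision_words[] = {
    "IF", "ELSIF", "WHILE", "REPEAT", "FOR"
};

/* Approximates cyclomatic complexity as decision points + 1
 * for the token list of a single procedure body. */
unsigned cyclomatic_complexity(const char *const *tokens, size_t count)
{
    assert(tokens != NULL);
    const size_t words = sizeof decision_words / sizeof decision_words[0];
    unsigned decisions = 0;
    for (size_t i = 0; i < count; i++) {
        for (size_t k = 0; k < words; k++) {
            if (strcmp(tokens[i], decision_words[k]) == 0) {
                decisions++;
                break;
            }
        }
    }
    return decisions + 1;
}
```

Straight-line code scores 1, and each loop or branch adds one to the score.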
Post by trijezdci
Post by Chris Burrows
As well as detecting superfluous semicolons you should warn about variables that are never used, never set, set but never used etc. etc.
Unused imports is my favourite for culling on a large project. It is
astonishing what dross can be still linked in but doing nothing useful.

A tool to generate the who imports what from whom and a summary of the
connectedness of the imported definition modules is also useful on
larger projects when looking at cohesiveness (or the opposite :( )
Post by trijezdci
If you look at the first post in this thread you will see I posted a link to a wiki page which lists the deliverables and milestones. The present milestone is syntax analysis.
Looks like promising progress.
--
Regards,
Martin Brown
trijezdci
2015-12-14 12:45:55 UTC
Post by Martin Brown
Yes iff the defect is capable of making a real difference but the extra
semicolon/null statement error is on a par with faulting a blank line.
I look at this from the perspective of psychology.

The compiler enforces the specification and the programmer needs to be psychologically conditioned to accept the fact that the specification reigns supreme over the programmer.

This is like boot camp at the military. Your personality will be broken down in order to be rebuilt afterwards. The compiler takes the place of the drill sergeant. A nice drill sergeant is of no use.
Post by Martin Brown
Yes, but having a compiler that annoys its users by mithering about
insignificant "defects" will mask the ones that really matter like
pointers and variables possibly used before initialisation and the like.
In fact, from a psychology point of view it is a perfect scenario when you have a frequently made error that seems insignificant but the compiler pesters you until you fix it.

The whole point is to build a pedantic mindset.

For M2 R10 we had a discussion with Harry Sneed, one of Europe's foremost experts on software quality auditing and assurance. He explained to us what tools he has built to capture and manage technical debt in software and based on his input we have removed the ability to use empty statements altogether and replaced them with a TO DO statement that stands in for an empty statement but is part of the language and has parameters such as an identifier and description as well as optional time estimates.

The point is twofold. The first is on a psychological level. It is about making programmers aware of technical debt and building a mindset that this will not go away until they fix it.

The second is on a management level. It is about making technical debt visible, reportable, recordable, retrievable, computable. In other words, making it manageable.

Anyway, this is my philosophy and I learned it the hard way, going through mile high shit in a rotten industry that has too many prima donnas. Break the prima donnas and things will improve. Ask Harry Sneed what he thinks of programmers. He even suggested withholding part of their salaries and paying the remainder based on how much technical debt they work off.

If finance people get desperate enough to implement this -- and the employment market is heading in a direction where they could actually get away with it -- then you will find that a tool that pedantically reports technical debt, no matter how annoying you may find it today, will become a very welcome tool to make sure you get that withheld optional salary.

In short, an anally pedantic compiler is your best friend.
Post by Martin Brown
I am with you on this one. Having parsed the language you may as well go
on to use dataflow analysis and to find potentially uninitialised
variables and the like but it is secondary to having the compiler work.
This is why we have added a CONST attribute for formal parameters and an <*OUT*> pragma for formal VAR parameters in M2 R10. The former tells the compiler that the parameter shall not and will not be written to, the latter tells it that the parameter shall and will be written to. This then allows for the analysis to be local to the current scope.

Whilst those are not features of M2C, I might consider them later. We'll see.
Post by Martin Brown
Unused imports is my favourite for culling on a large project. It is
astonishing what dross can be still linked in but doing nothing useful.
:-)
Post by Martin Brown
A tool to generate the who imports what from whom
Yes, that's on my wish list, too. ;-)
Post by Martin Brown
Post by trijezdci
If you look at the first post in this thread you will see I posted a link to a wiki page
which lists the deliverables and milestones. The present milestone is syntax analysis.
Looks like promising progress.
It just makes sense to do it in that order.

Of course there are some trivial things you might want to do on the fly, like making sure that you don't have any duplicates in an identifier list of a variable declaration:

VAR foo, foo, foo : Foo;

You probably don't want to get that into your AST only to filter it out again later.

But for type checking, unused variable checking and the like, you want to build your AST first.
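
As an illustration of the on-the-fly duplicate check, here is a minimal sketch in C. The function name first_duplicate_ident and the array-of-strings representation of the identifier list are assumptions made up for this example; this is not code from m2c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of m2c.
 * Returns the index of the first identifier that repeats an earlier one
 * in the list, or -1 if all identifiers are distinct. The comparison is
 * case sensitive, as Modula-2 identifiers are. The search is quadratic,
 * which is acceptable for the short lists found in declarations. */
int first_duplicate_ident(const char *const idents[], size_t count) {
    assert(idents != NULL || count == 0);
    for (size_t i = 1; i < count; i++) {
        for (size_t j = 0; j < i; j++) {
            if (strcmp(idents[i], idents[j]) == 0) {
                return (int)i;
            }
        }
    }
    return -1;
}
```

For the list foo, foo, foo the function returns 1, the position of the first repeat, so a parser could report the duplicate there and simply not add it to the AST node.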

regards
j***@gmail.com
2015-12-14 13:47:44 UTC
Post by trijezdci
I look at this from the perspective of psychology.
The compiler enforces the specification and the programmer needs to be psychologically conditioned to accept the fact that the specification reigns supreme over the programmer.
Which makes you a kind of God....

I don't need a drill sergeant to make a better me. I choose the language/compiler which I feel comfortable with. Mocka, FST, obc, StonyBrook, XDS. But I'm not feeling comfortable with M2R10.

After consulting several 'experts' you come to the conclusion that a loose semi is on the same par as a dangling pointer... We tend to disagree.
Martin Brown
2015-12-14 14:43:23 UTC
Post by j***@gmail.com
Post by trijezdci
I look at this from the perspective of psychology.
The compiler enforces the specification and the programmer needs to be
psychologically conditioned to accept the fact that the specification
reigns supreme over the programmer.
Which makes you a kind of God....
I see what he means. Although I disagree about the merits of faulting
this particular "error" at all. I used to believe that missing the last
";" was correct and did so but I saw later that lesser qualified
maintenance programmers invariably tripped up over it far too often.
Post by j***@gmail.com
I don't need a drill sergeant to make a better me. I choose the language/compiler
which I feel comfortable with. Mocka, FST, obc, StonyBrook, XDS.
But I'm not feeling comfortable with M2R10.
That is a shame because it has the potential to be very good.
Post by j***@gmail.com
After consulting several 'experts' you come to the conclusion that a loose semi
is on the same par as a dangling pointer... We tend to disagree.
I don't think that is what he said. Though I do think this particular
"defect" should be a maskable warning and certainly not an error.

One heretical thought would be to alter the language specification to
make ";" optional on lines immediately prior to any reserved word or
containing just a single statement (ie promote <cr> to a soft ";").

IOW ";" only necessary where multiple statements per line occur.
(and for disambiguation where it may be essential)

I expect this will cause a few parsing ambiguities somewhere along the
lines but at first glance it looks to me like it would be feasible.
--
Regards,
Martin Brown
trijezdci
2015-12-14 16:12:29 UTC
M2C is and will remain a PIM compiler.

The only extensions it will gain are Oberon style extensible record types and some minor extensions that are purely *lexical*, such as lowlines '_' in identifiers, all of which can be turned on and off by compiler options.

The semicolon will remain a separator in statement sequences just as Wirth's AUTHORITATIVE grammar says.

The option to reduce the error of an errant semicolon at the end of a statement sequence to a mere warning is a very reasonable and completely satisfactory solution and the matter should thus be accepted as settled.

Anyone who would argue with that exposes themselves to be arguing for the sake of arguing. Whoever wants to engage in such an argument for argument's sake, please start your own thread.
Chris Burrows
2015-12-15 02:43:52 UTC
Post by trijezdci
M2C is and will remain a PIM compiler.
Good.
Post by trijezdci
The semicolon will remain a separator in statement sequences just as Wirth's AUTHORITATIVE grammar says.
Good.
Post by trijezdci
The option to reduce the error of an errant semicolon at the end of a statement sequence to a mere warning is a very reasonable and completely satisfactory solution and the matter should thus be accepted as settled.
Not good.

The phrase "an errant semicolon at the end of a statement sequence" indicates to me that you have some misunderstanding of the Modula-2 syntax. Read the Modula-2 EBNF again. EBNF allows for options and repetition. If a construct A may be either a B or nothing (empty), this is expressed as

A = [B].

Now, in Modula-2 statement is defined as:

statement = [assignment | ProcedureCall | IfStatement | CaseStatement | WhileStatement | RepeatStatement | LoopStatement | ForStatement | WithStatement | EXIT | RETURN [expression] ].

As the definition above is bracketed by [ ] a statement can be nothing (empty).

The following statement:

IF E
THEN S;
END

is perfectly valid Modula-2 syntax according to this rule. The semicolon is not a statement *terminator*, it *separates* the statement S from the null (empty) statement before the END.

If you want to provide the user with an option to warn about the existence of null statements all well and good. However, if he opts not to be warned the compiler should accept it and NOT report an error.
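
The rule above can be sketched as a tiny recursive-descent recognizer in C. The single-character token alphabet and all function names are assumptions invented for this illustration; they have nothing to do with m2c's internals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy token alphabet (an assumption for this sketch):
 *   'S' = any non-empty statement, ';' = semicolon, 'E' = END */

/* statement = [ "S" ] .
 * The enclosing brackets make the empty statement legal, so this
 * function cannot fail: it consumes an 'S' if one is present. */
static const char *parse_statement(const char *p) {
    return (*p == 'S') ? p + 1 : p;
}

/* StatementSequence = statement { ";" statement } . */
static const char *parse_statement_sequence(const char *p) {
    p = parse_statement(p);
    while (*p == ';') {
        p = parse_statement(p + 1);
    }
    return p;
}

/* Accepts exactly: StatementSequence "E" */
bool accepts_block(const char *tokens) {
    assert(tokens != NULL);
    const char *p = parse_statement_sequence(tokens);
    return p[0] == 'E' && p[1] == '\0';
}
```

With this, accepts_block("S;E") holds, because the semicolon separates S from an empty statement, while accepts_block("SSE") does not, because two non-empty statements need a separator between them.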
m***@gmail.com
2015-12-14 18:29:59 UTC
Post by j***@gmail.com
Post by trijezdci
I look at this from the perspective of psychology.
The compiler enforces the specification and the programmer needs to be psychologically conditioned to accept the fact that the specification reigns supreme over the programmer.
Which makes you a kind of God....
Why all the complaints and bickering about this issue? What on Earth is your Major Malfunction? Let me guess -- Mommie didn't send you a Christmas Card this year, right? :-P
j***@gmail.com
2015-12-14 21:30:43 UTC
Post by m***@gmail.com
Let me guess -- Mommie didn't send you a Christmas Card this year, right? :-P
Santa Claus is a fraud. Only Sinterklaas and Zwarte Piet exist. And they're back in Spain already. And they gave me a new Dell Latitude.
Martin Brown
2015-12-15 07:38:19 UTC
Post by m***@gmail.com
Post by j***@gmail.com
Post by trijezdci
I look at this from the perspective of psychology.
The compiler enforces the specification and the programmer needs to be psychologically
conditioned to accept the fact that the specification reigns
supreme over the programmer.
Post by m***@gmail.com
Post by j***@gmail.com
Which makes you a kind of God....
Why all the complaints and bickering about this issue?
Because what has been implemented is not correct PIM3 M2 behaviour. RTFM
Post by m***@gmail.com
What on Earth is your Major Malfunction?
Let me guess -- Mommie didn't send you a Christmas Card this year,
right? :-P
--
Regards,
Martin Brown
Nemo
2015-12-14 16:00:43 UTC
Wirth says that the grammar has the ultimate authority,[...]
Perhaps but his compiler did otherwise and many considered his compiler
the ultimate authority. (I have yet to try m2c on his compiler sources
-- that should be interesting.)

N.
trijezdci
2015-12-14 17:17:36 UTC
Post by Nemo
I have yet to try m2c on his compiler sources
-- that should be interesting.
That would be helpful. Please do share the results if you do. Thanks.

I could of course grab various sources here and there myself and spend the next couple of weeks testing to give the parser some scrutiny, but if others help testing, it will allow me to proceed with the back end and get the job done faster. Thanks.
Christoph Schlegel
2015-12-14 22:30:59 UTC
Post by trijezdci
Post by Nemo
I have yet to try m2c on his compiler sources
-- that should be interesting.
That would be helpful. Please do share the results if you do. Thanks.
I could of course grab various sources here and there myself and spend the next couple of weeks testing to give the parser some scrutiny, but if others help testing, it will allow me to proceed with the back end and get the job done faster. Thanks.
Hi,

as I don't really know where to post it here I just created an issue over at Bitbucket to separate reports from the discussion.

Regards,
Christoph
trijezdci
2015-12-15 05:31:48 UTC
Post by Christoph Schlegel
as I don't really know where to post it here I just created an issue over at Bitbucket to separate reports from the discussion.
You need to use --pim3 or --pim4, more detail is on the tracker.

thanks.
Richard
2015-12-14 18:50:49 UTC
Post by trijezdci
Whenever I asked Wirth about these kinds of discrepancies before, he
always answered that the grammar is the ultimate authority and one
should always follow the grammar.
Wirth said, wrote, and programmed a lot of things. Sometimes there are
inconsistencies, sometimes he also changes his mind. Why do you obsess
about this single "rule" then?

BTW: The EBNF of the Oberon Report has allowed empty statements, and
thus superfluous semicolons, since the very beginning. So, this is
certainly the behaviour preferred by Wirth.
trijezdci
2015-12-14 19:58:40 UTC
Post by Richard
Post by trijezdci
Whenever I asked Wirth about these kinds of discrepancies before, he
always answered that the grammar is the ultimate authority and one
should always follow the grammar.
Wirth said, wrote, and programmed a lot of things. Sometimes there are
inconsistencies, sometimes he also changes his mind. Why do you obsess
about this single "rule" then?
Aristotle said

"Excellence is not an act, it is a habit"

In modern times, "excellence" has been replaced with "quality" in that quote.

You either have the habit, or you don't.

If you feel that discipline is not worth the effort, then there are plenty of languages and compilers to choose from that don't give a damn about discipline. Those are the ones that do not form any good habits, but it is your privilege. Ada and Modula-2 are the last bastions of an era in which discipline meant something. An era which understood the meaning behind Aristotle's quote.

If the grammar says there is no semicolon after a statement sequence, then it is an error when there is one. If a compiler implementor, Wirth or whoever else, chooses to give you some lenience, then that is their choice and it is merely an act of either pity or courtesy, depending on viewpoint. I do not believe that such leniency is in the spirit of the language and I know for a fact that it is not helpful. Ignoring laid down rules only leads to erosion of good habits.

In other words, this is an attitude problem, not a technology problem. It can be addressed by constant reminders, and that is what the parser is doing: reminding you that whoever wrote this code did not muster the discipline to write it to specification. A leniency switch that doesn't remind you will not address the attitude problem, it only proliferates it.
Post by Richard
BTW: The EBNF of the Oberon Report has allowed empty statements, and
thus superfluous semicolons, since the very beginning. So, this is
certainly the behaviour preferred by Wirth.
This is not about Wirth's preferences. However, I agree with him when he says that the grammar should be the ultimate authority. If the Oberon grammar specifies the semicolon as a terminator, fine with me. But the Modula-2 grammar certainly does not.

When the grammar says there shouldn't be a semicolon, then it is an error to have one there. A leniency switch should therefore report it as a warning.

The only proper way to allow a semicolon as a terminator without any such warning is indeed to change the language and specify it accordingly. I didn't feel like doing that, hence the warning.

My message is very simple: If you lay down rules, follow them. If you want to change the rules, lay down the new rules first, then follow the new rules. Don't just ignore your laid down rules for it undermines the very concept of having rules. It undermines discipline. Discipline is more important than convenience, at least in engineering.

With all due respect to Prof. Wirth, he hasn't always consistently followed the rules he has laid down. Sometimes out of oversight, sometimes out of convenience, both of which are human and understandable. But it doesn't imply that the rules are meaningless.
Chris Burrows
2015-12-14 21:18:12 UTC
Post by trijezdci
With all due respect to Prof. Wirth, he hasn't always consistently followed the rules he has laid down. Sometimes out of oversight, sometimes out of convenience, both of which are human and understandable. But it doesn't imply that the rules are meaningless.
That does not apply here. I see no ambiguity or oversight in the relevant section of PIM (Third, Corrected Edition: 1985)

<quote>
This statement separator (not terminator) indicates that the action specified by S0 is to be followed immediately by the action corresponding to S1. A sequence of statements is syntactically defined as

$ StatementSequence = statement {";" statement}.

The syntax of statements implies that a statement may consist of no symbols at all. In this case, the statement is said to be empty and evidently denotes the null action. This curiosity among statements has a definite reason: it allows semicolons to be inserted at places where they are actually superfluous, such as at the end of a statement sequence.
</quote>

If you want to define your own language you are perfectly free to do so - just don't try to pretend it's Modula-2.
trijezdci
2015-12-15 05:28:16 UTC
You are wrong. Wirth's grammar says there is no semicolon after a statement sequence. Consequently there is a discrepancy between his text and the grammar and in such cases the grammar has authority. The text describes a leniency switch and Wirth is wrong to simply ignore the error without warning. It remains an error for as long as the grammar does not permit it. Consequently it remains a leniency switch for as long as the grammar does not permit it. Consequently it should generate an error without leniency enabled and a warning with leniency enabled.

This is the last word on this matter in this thread, you can open your own thread if you like. Don't hijack this one. It is off-topic. thanks.
Chris Burrows
2015-12-15 07:44:15 UTC
Post by trijezdci
You are wrong.
No - you are wrong.
Post by trijezdci
Wirth's grammar says there is no semicolon after a statement sequence.
A statement sequence can contain null statements and there can be a semicolon *before* a null statement. It is a subtle difference. Take some time to think about it before replying again.

I would also be interested in hearing how your compiler handles empty statements in other circumstances e.g. the empty statement in the body of the infinite loop:

WHILE TRUE DO END
Post by trijezdci
Consequently there is a discrepancy between his text and the grammar
That is not true - there is no discrepancy between his text and the grammar. Please stop repeating this misinformation.
Post by trijezdci
This is the last word on this matter in this thread
If you don't want to say any more that is your choice. If you keep repeating fallacies I will keep objecting.
j***@gmail.com
2015-12-15 08:44:26 UTC
Post by trijezdci
This is the last word on this matter in this thread, you can open your own thread if you like. Don't hijack this one. It is off-topic. thanks.
This is an open newsgroup. You do not own this group. Neither do I or Chris or Martin, Nemo, even MSC386 doesn't.

It looks like you're creating Modula-2++, building on Modula-2 syntax and reputation, thereby hijacking the Modula-2 language, much like Stroustrup did with the C language (to the discontent of Dennis Ritchie).
Martin Brown
2015-12-15 07:20:29 UTC
Post by Chris Burrows
Post by trijezdci
With all due respect to Prof. Wirth, he hasn't always consistently followed the rules he has laid down.
Sometimes out of oversight, sometimes out of convenience, both of
which are human and understandable.
Post by Chris Burrows
Post by trijezdci
But it doesn't imply that the rules are meaningless.
That does not apply here. I see no ambiguity or oversight in the relevant section of PIM (Third, Corrected Edition: 1985)
<quote>
This statement separator (not terminator) indicates that the action specified by S0 is to be followed
immediately by the action corresponding to S1. A sequence of
statements is syntactically defined as
Post by Chris Burrows
$ StatementSequence = statement {";" statement}.
The syntax of statements implies that a statement may consist of no symbols at all.
In this case, the statement is said to be empty and evidently denotes
the null action.
Post by Chris Burrows
This curiosity among statements has a definite reason: it allows
semicolons to be inserted
Post by Chris Burrows
at places where they are actually superfluous, such as at the end of
a statement sequence.
Post by Chris Burrows
</quote>
If you want to define your own language you are perfectly free to do so - just don't try to pretend it's Modula-2.
Mine is a second edition and initially seems to require that statements
are not empty and in addition all the examples follow the no trailing
";" rule so there is apparently some revision going on here. p168 gives

53 statement = [assignment|ProcedureCall|
54 IfStatement|CaseStatement|WhileStatement|
55 RepeatStatement|LoopStatement|ForStatement|
56 WithStatement|EXIT|RETURN [expression]].

However, the *ERROR* is in the language specification itself. The EBNF
for "statement" is incomplete and does not match the verbal description
given in the language report which explicitly *permits* a null statement
but neither defines one nor includes it in the grammar.

EmptyStatement is left undefined and never referenced again.
EmptyStatement = ; would be sufficient to resolve the angst
or appending |; to the allowed statements

The offending paragraph is section 9. Statements on p151 in my copy
where the grammar is reproduced same as above sans numbers but with two
explanatory sentences underneath which say:

"A statement may also be empty, in which case it denotes no action. The
empty statement is included to relax punctuation rules in statement
sequences."

However, nowhere in the EBNF grammar is this rule formally expressed. It
speaks volumes about the number of people who have actually read the
report that this defect has lain undetected until now. I can't believe
that WG13 missed this inconsistency. And if they did then it shows the
fallibility of human effort to make failsafe computer languages.

I can't recall ever encountering an M2 compiler that would fault a
superfluous semicolon - a null statement is trivial to compile.
(about as difficult as a blank line or other white space)

In view of this evidence that the language M2 intended to permit
superfluous semicolons I suggest the compiler flag be renamed
-anal-semicolons and that the default is to allow them.
--
Regards,
Martin Brown
Chris Burrows
2015-12-15 07:57:37 UTC
Post by Martin Brown
Mine is a second edition and initially seems to require that statements
are not empty and in addition all the examples follow the no trailing
";" rule so there is apparently some revision going on here. p168 gives
53 statement = [assignment|ProcedureCall|
54 IfStatement|CaseStatement|WhileStatement|
55 RepeatStatement|LoopStatement|ForStatement|
56 WithStatement|EXIT|RETURN [expression]].
However, the *ERROR* is in the language specification itself. The EBNF
for "statement" is incomplete and does not match the verbal description
given in the language report which explicitly *permits* a null statement
but neither defines one nor includes it in the grammar.
53 statement = assignment|ProcedureCall|
54 IfStatement|CaseStatement|WhileStatement|
55 RepeatStatement|LoopStatement|ForStatement|
56 WithStatement|EXIT|RETURN [expression].
i.e. the enclosing [ ] that indicate the whole item is optional would have to be removed.
Post by Martin Brown
EmptyStatement is left undefined and never referenced again.
EmptyStatement = ; would be sufficient to resolve the angst
or appending |; to the allowed statements
None of that is necessary. The rule that allows superfluous semicolons is

StatementSequence = statement {; statement}

Given that
S0, S1 are valid statements
null is a valid statement

the following parse as valid statement sequences:

S0
S0;S1
null
S0;null

It is this last one that misleads some people into thinking that the semicolon is a terminator when it is in fact a separator.
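
The four cases can also be checked mechanically by treating the semicolon strictly as a separator: n semicolons always delimit n+1 statements, some of which may be empty. Below is a minimal sketch in C, in which any character other than ';' stands for part of a statement. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Result of splitting a statement sequence at its separators. */
typedef struct {
    size_t total;  /* number of statements, always semicolons + 1 */
    size_t empty;  /* how many of those statements are empty */
} stmt_count_t;

/* Hypothetical helper: counts the statements in a sequence written
 * with ';' as separator. Every other character belongs to a statement. */
stmt_count_t count_statements(const char *seq) {
    assert(seq != NULL);
    stmt_count_t n = { 1, 0 };
    bool current_empty = true;
    for (const char *p = seq; *p != '\0'; p++) {
        if (*p == ';') {
            if (current_empty) {
                n.empty++;
            }
            n.total++;            /* a separator starts a new statement */
            current_empty = true;
        } else {
            current_empty = false;
        }
    }
    if (current_empty) {
        n.empty++;                /* the last statement may be empty too */
    }
    return n;
}
```

Writing A for S0 and B for S1, the input "A;" yields two statements of which one is empty, which is the S0;null case, whereas "A;B" yields two statements and no empty one.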
Post by Martin Brown
In view of this evidence that the language M2 intended to permit
superfluous semicolons I suggest the compiler flag be renamed
-anal-semicolons and that the default is to allow them.
I agree.
Martin Brown
2015-12-15 08:17:59 UTC
Post by Chris Burrows
Post by Martin Brown
Mine is a second edition and initially seems to require that statements
are not empty and in addition all the examples follow the no trailing
";" rule so there is apparently some revision going on here. p168 gives
53 statement = [assignment|ProcedureCall|
54 IfStatement|CaseStatement|WhileStatement|
55 RepeatStatement|LoopStatement|ForStatement|
56 WithStatement|EXIT|RETURN [expression]].
However, the *ERROR* is in the language specification itself. The EBNF
for "statement" is incomplete and does not match the verbal description
given in the language report which explicitly *permits* a null statement
but neither defines one nor includes it in the grammar.
53 statement = assignment|ProcedureCall|
54 IfStatement|CaseStatement|WhileStatement|
55 RepeatStatement|LoopStatement|ForStatement|
56 WithStatement|EXIT|RETURN [expression].
i.e. the enclosing [ ] that indicate the whole item is optional would have to be removed.
I had missed the enclosing [ ] when I wrote that.
Post by Chris Burrows
Post by Martin Brown
EmptyStatement is left undefined and never referenced again.
EmptyStatement = ; would be sufficient to resolve the angst
or appending |; to the allowed statements
None of that is necessary. The rule that allows superfluous semicolons is
StatementSequence = statement {; statement}
Given that
S0, S1 are valid statements
null is a valid statement
S0
S0;S1
null
S0;null
It is this last one that misleads some people into thinking that the
semicolon is a terminator when it is in fact a separator.
Agreed. That is what I had intended to say. Although the way I looked at
it there first needs to be a rule that permits an Empty statement. I
hadn't spotted the enclosing [...] and had misread it as ... my mistake.
Post by Chris Burrows
Post by Martin Brown
In view of this evidence that the language M2 intended to permit
superfluous semicolons I suggest the compiler flag be renamed
-anal-semicolons and that the default is to allow them.
I agree.
--
Regards,
Martin Brown
trijezdci
2015-12-15 10:04:20 UTC
Post by Martin Brown
Agreed. That is what I had intended to say. Although the way I looked at
it there first needs to be a rule that permits an Empty statement.
The sole semicolon rule you suggested as empty statement renders the grammar non-LL(1).

As for the naming of the flag, I had changed it to --errant-semicolon but have not uploaded it yet.

regards
Chris Burrows
2015-12-15 10:36:49 UTC
Post by trijezdci
The sole semicolon rule you suggested as empty statement renders the grammar non-LL(1).
No need to worry - the "sole semicolon" rule is not required to allow superfluous semicolon statement separators. See my previous response to Martin and his subsequent reply.
tbreeden
2015-12-13 15:13:14 UTC
m2c also made, almost without glitches, on my AmigaOS PPC system. (!)

A little more testing info:

MODULE Hello;
VAR i, j :INTEGER;
BEGIN
i := 22*j+13;
END Hello.

Gives the error:

line: 5, column: 1, unexpected reserved word END found
expected CASE, EXIT, FOR, IF, LOOP, REPEAT, RETURN, WHILE, WITH or identifier.

But if you remove the semicolon in the line before "END Hello" line, there is no error:

MODULE Hello;
VAR i, j :INTEGER;
BEGIN
i := 22*j+13
END Hello.

Also, this parses with no errors:

MODULE Hello;
VAR i, j :INTEGER;
BEGIN
IF i = 34 THEN
i := 22*j+13
END;
J := 123
END Hello.

But if you remove the semicolon after the if statement "END"

MODULE Hello;
VAR i, j :INTEGER;
BEGIN
IF i = 34 THEN
i := 22*j+13
END
J := 123
END Hello.

You get parse errors:

m2c Modula-2 Compiler & Translator, version 1.00
line: 7, column: 1, unexpected identifier 'J' found
expected reserved word END
line: 7, column: 3, unexpected symbol ':=' found
expected symbol '.'

Some error with Statement or Statement sequence parsing?

Tom
Martin Brown
2015-12-14 09:20:46 UTC
Post by tbreeden
m2c also made, almost without glitches, on my AmigaOS PPC system. (!)
MODULE Hello;
VAR i, j :INTEGER;
BEGIN
i := 22*j+13;
END Hello.
line: 5, column: 1, unexpected reserved word END found
expected CASE, EXIT, FOR, IF, LOOP, REPEAT, RETURN, WHILE, WITH or identifier.
The final ";" before an END is superfluous but should still be tolerated
in PIM as a null statement. I never thought having the option of missing
the final ";" out before "END" was sensible in PIM M2; it used to
regularly catch out the unwary when making code modifications.
Post by tbreeden
MODULE Hello;
VAR i, j :INTEGER;
BEGIN
i := 22*j+13
END Hello.
This is allowable under the rule that ";" is optional before "END".
Post by tbreeden
MODULE Hello;
VAR i, j :INTEGER;
BEGIN
IF i = 34 THEN
i := 22*j+13
END;
J := 123
END Hello.
Again this is consistent with the ";" is optional before "END".
Post by tbreeden
But if you remove the semicolon after the if statement "END"
MODULE Hello;
VAR i, j :INTEGER;
BEGIN
IF i = 34 THEN
i := 22*j+13
END
J := 123
END Hello.
m2c Modula-2 Compiler & Translator, version 1.00
line: 7, column: 1, unexpected identifier 'J' found
expected reserved word END
line: 7, column: 3, unexpected symbol ':=' found
expected symbol '.'
Some error with Statement or Statement sequence parsing?
That is correct behaviour. It is only the ";" immediately before an
"END" keyword that can be legitimately omitted. Anywhere else it should
ideally fail with an error "Missing semicolon at line N".
(rather than found unexpected identifier at line N+1)

If you deleted the line J := 123 it would be happy again since "END" is
syntactically valid in that position.
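
One way to get the more specific message is sketched below in C: once a statement has been parsed, a lookahead token that can itself start a statement suggests a missing separator, and the report can point at the line where the previous statement ended. The token names, the function names and the message wording are assumptions for illustration, not m2c's actual code. The set of statement-starting tokens is taken from the "expected ..." list in the error output quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical token kinds, just enough for this sketch. */
typedef enum {
    TOK_CASE, TOK_EXIT, TOK_FOR, TOK_IF, TOK_LOOP, TOK_REPEAT,
    TOK_RETURN, TOK_WHILE, TOK_WITH, TOK_IDENT,
    TOK_SEMICOLON, TOK_END
} token_t;

/* True if the token can begin a non-empty statement. */
static bool starts_statement(token_t t) {
    switch (t) {
    case TOK_CASE: case TOK_EXIT: case TOK_FOR: case TOK_IF:
    case TOK_LOOP: case TOK_REPEAT: case TOK_RETURN: case TOK_WHILE:
    case TOK_WITH: case TOK_IDENT:
        return true;
    default:
        return false;
    }
}

/* Call right after a statement has been parsed. If the lookahead starts
 * another statement, write a "missing semicolon" message that refers to
 * the line on which the previous statement ended and return true.
 * Otherwise leave an empty string in buf and return false. */
bool diagnose_after_statement(token_t lookahead, unsigned prev_line,
                              char *buf, size_t buflen) {
    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';
    if (starts_statement(lookahead)) {
        snprintf(buf, buflen,
                 "line: %u, missing semicolon after statement", prev_line);
        return true;
    }
    return false;
}
```

For the example above, the IF statement ends on line 6 and the lookahead is the identifier J, so the sketch would report the problem on line 6 instead of an unexpected identifier on line 7.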
--
Regards,
Martin Brown
trijezdci
2015-12-14 10:04:53 UTC
Due to popular request, I have added a compiler option --sloppy-semicolon which will treat a semicolon after a statement sequence as a warning instead of an error.

https://bitbucket.org/trijezdci/m2c-rework/commits/38cabb3586810dede6845fad1b4a9bbf3d25cf86

I agree that there are certain scenarios where error messages can be made more specific, but this should be considered refinement. At this stage of development it is sufficient to use generic syntax error messages while attending to bugs and proceeding with work on the back end.
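
The effect of the option can be pictured with the following C sketch. It is not the code from the commit above. The one-character token alphabet ('S' for a statement, ';' for a semicolon, 'E' for END), the enum and the function name are all assumptions for illustration, and other sequence terminators such as ELSE or UNTIL are left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical diagnostic severities. */
typedef enum { DIAG_NONE, DIAG_WARNING, DIAG_ERROR } diag_t;

/* Looks for a semicolon directly followed by END in a toy token string.
 * Without the leniency flag this is reported as an error, with the flag
 * it is downgraded to a warning. All other input yields no diagnostic. */
diag_t check_trailing_semicolon(const char *tokens, bool sloppy_semicolon) {
    assert(tokens != NULL);
    for (const char *p = tokens; *p != '\0'; p++) {
        /* p[1] is safe to read here because p[0] is not the terminator */
        if (p[0] == ';' && p[1] == 'E') {
            return sloppy_semicolon ? DIAG_WARNING : DIAG_ERROR;
        }
    }
    return DIAG_NONE;
}
```

So check_trailing_semicolon("S;E", false) gives DIAG_ERROR, the same input with the flag set gives DIAG_WARNING, and "S;SE" gives DIAG_NONE either way.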

thanks and regards
trijezdci
2015-12-14 10:18:16 UTC
$ ./m2c EBNFScanner.mod --pim4 --sloppy-semicolon
m2c Modula-2 Compiler & Translator, version 1.00
line 20, column 5, warning: semicolon after statement sequence
line 43, column 3, warning: semicolon after statement sequence
line 51, column 5, warning: semicolon after statement sequence
line 54, column 5, warning: semicolon after statement sequence
line 56, column 4, warning: semicolon after statement sequence
line 81, column 35, warning: semicolon after statement sequence
line 85, column 4, warning: semicolon after statement sequence
line 94, column 4, warning: semicolon after statement sequence
line 103, column 4, warning: semicolon after statement sequence
parse error count: 0
trijezdci
2015-12-13 15:39:34 UTC
Christoph, I cannot replicate the double incrementing on CR LF end-of-line markers.

Could you replace function m2c_read_char in module m2-filereader.c at line 196 with the following:

int m2c_read_char (m2c_infile_t infile) {
  char ch;

  /* check pre-conditions */
  if (infile == NULL) {
    return ASCII_NUL;
  } /* end if */

  if (infile->index == infile->buflen) {
    infile->status = M2C_INFILE_STATUS_ATTEMPT_TO_READ_PAST_EOF;
    return EOF;
  } /* end if */

  ch = infile->buffer[infile->index];
  infile->index++;

  /* if new line encountered, update line and column counters */
  if (ch == ASCII_LF) {
    infile->line++;
    infile->column = 1;
  }
  else if (ch == ASCII_CR) {
    infile->line++;
    infile->column = 1;

    printf("### end-of-line marker encountered ###\n");
    printf(" infile->buffer[%u] = CR\n", infile->index-1);

    /* if LF follows, skip it */
    if ((infile->index < infile->buflen) &&
        (infile->buffer[infile->index] == ASCII_LF)) {

      printf(" infile->buffer[%u] = LF\n", infile->index);

      infile->index++;
    } /* end if */

    ch = ASCII_LF;
  }
  else {
    infile->column++;
  } /* end if */

  infile->status = M2C_INFILE_STATUS_SUCCESS;
  return ch;
} /* end m2c_read_char */


then do make clean and make and run it again on the same input and let me know what it prints please.

thanks.
Christoph Schlegel
2015-12-13 16:12:03 UTC
Permalink
Raw Message
Post by trijezdci
Christoph, I cannot replicate the double incrementing on CR LF end-of-line markers.
int m2c_read_char (m2c_infile_t infile) {
char ch;
/* check pre-conditions */
if (infile == NULL) {
return ASCII_NUL;
} /* end if */
if (infile->index == infile->buflen) {
infile->status = M2C_INFILE_STATUS_ATTEMPT_TO_READ_PAST_EOF;
return EOF;
} /* end if */
ch = infile->buffer[infile->index];
infile->index++;
/* if new line encountered, update line and column counters */
if (ch == ASCII_LF) {
infile->line++;
infile->column = 1;
}
else if (ch == ASCII_CR) {
infile->line++;
infile->column = 1;
printf("### end-of-line marker encountered ###\n");
printf(" infile->buffer[%u] = CR\n", infile->index-1);
/* if LF follows, skip it */
if ((infile->index < infile->buflen) &&
(infile->buffer[infile->index] == ASCII_LF)) {
printf(" infile->buffer[%u] = LF\n", infile->index);
infile->index++;
} /* end if */
ch = ASCII_LF;
}
else {
infile->column++;
} /* end if */
infile->status = M2C_INFILE_STATUS_SUCCESS;
return ch;
} /* end m2c_read_char */
then do make clean and make and run it again on the same input and let me know what it prints please.
thanks.
Ok, source unchanged:

$ ./m2c ./FirstEx.mod --parser-debug
m2c Modula-2 Compiler & Translator, version 1.00
*** programModule ***
@ line: 1, column: 1, lookahead: MODULE
### end-of-line marker encountered ###
infile->buffer[16] = CR
infile->buffer[17] = LF
### end-of-line marker encountered ###
infile->buffer[18] = CR
*** import ***
@ line: 4, column: 1, lookahead: FROM
*** unqualifiedImport ***
@ line: 4, column: 1, lookahead: FROM
*** identList ***
@ line: 4, column: 19, lookahead: WriteLn
### end-of-line marker encountered ###
infile->buffer[70] = CR
infile->buffer[71] = LF
### end-of-line marker encountered ###
infile->buffer[72] = CR
*** block ***
@ line: 7, column: 1, lookahead: VAR
*** declaration ***
@ line: 7, column: 1, lookahead: VAR
*** variableDeclaration ***
@ line: 7, column: 5, lookahead: Index
*** identList ***
@ line: 7, column: 5, lookahead: Index
*** type ***
@ line: 7, column: 13, lookahead: CARDINAL
*** qualident ***
@ line: 7, column: 13, lookahead: CARDINAL
### end-of-line marker encountered ###
infile->buffer[95] = CR
infile->buffer[96] = LF
### end-of-line marker encountered ###
infile->buffer[97] = CR
### end-of-line marker encountered ###
infile->buffer[104] = CR
infile->buffer[105] = LF
### end-of-line marker encountered ###
infile->buffer[106] = CR
*** statementSequence ***
@ line: 13, column: 4, lookahead: WriteString
*** statement ***
@ line: 13, column: 4, lookahead: WriteString
*** assignmentOrProcCall ***
@ line: 13, column: 4, lookahead: WriteString
*** designator ***
@ line: 13, column: 4, lookahead: WriteString
*** qualident ***
@ line: 13, column: 4, lookahead: WriteString
*** actualParameters ***
@ line: 13, column: 15, lookahead: WriteString
*** expressionList ***
@ line: 13, column: 16, lookahead: This is our first example program
*** expression ***
@ line: 13, column: 16, lookahead: This is our first example program
*** simpleExpression ***
@ line: 13, column: 16, lookahead: This is our first example program
*** term ***
@ line: 13, column: 16, lookahead: This is our first example program
*** simpleTerm ***
@ line: 13, column: 16, lookahead: This is our first example program
*** factor ***
@ line: 13, column: 16, lookahead: This is our first example program
### end-of-line marker encountered ###
infile->buffer[160] = CR
*** statement ***
@ line: 15, column: 4, lookahead: WriteLn
*** assignmentOrProcCall ***
@ line: 15, column: 4, lookahead: WriteLn
*** designator ***
@ line: 15, column: 4, lookahead: WriteLn
*** qualident ***
@ line: 15, column: 4, lookahead: WriteLn
### end-of-line marker encountered ###
infile->buffer[173] = CR
*** statement ***
@ line: 17, column: 4, lookahead: WriteLn
*** assignmentOrProcCall ***
@ line: 17, column: 4, lookahead: WriteLn
*** designator ***
@ line: 17, column: 4, lookahead: WriteLn
*** qualident ***
@ line: 17, column: 4, lookahead: WriteLn
### end-of-line marker encountered ###
infile->buffer[186] = CR
*** statement ***
@ line: 19, column: 4, lookahead: FOR
*** forStatement ***
@ line: 19, column: 4, lookahead: FOR
*** expression ***
@ line: 19, column: 17, lookahead: 1
*** simpleExpression ***
@ line: 19, column: 17, lookahead: 1
*** term ***
@ line: 19, column: 17, lookahead: 1
*** simpleTerm ***
@ line: 19, column: 17, lookahead: 1
*** factor ***
@ line: 19, column: 17, lookahead: 1
*** expression ***
@ line: 19, column: 22, lookahead: 12
*** simpleExpression ***
@ line: 19, column: 22, lookahead: 12
*** term ***
@ line: 19, column: 22, lookahead: 12
*** simpleTerm ***
@ line: 19, column: 22, lookahead: 12
*** factor ***
@ line: 19, column: 22, lookahead: 12
### end-of-line marker encountered ###
infile->buffer[214] = CR
*** statementSequence ***
@ line: 21, column: 7, lookahead: WriteString
*** statement ***
@ line: 21, column: 7, lookahead: WriteString
*** assignmentOrProcCall ***
@ line: 21, column: 7, lookahead: WriteString
*** designator ***
@ line: 21, column: 7, lookahead: WriteString
*** qualident ***
@ line: 21, column: 7, lookahead: WriteString
*** actualParameters ***
@ line: 21, column: 18, lookahead: WriteString
*** expressionList ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** expression ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** simpleExpression ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** term ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** simpleTerm ***
@ line: 21, column: 19, lookahead: The value of the index is now
*** factor ***
@ line: 21, column: 19, lookahead: The value of the index is now
### end-of-line marker encountered ###
infile->buffer[268] = CR
*** statement ***
@ line: 23, column: 7, lookahead: WriteCard
*** assignmentOrProcCall ***
@ line: 23, column: 7, lookahead: WriteCard
*** designator ***
@ line: 23, column: 7, lookahead: WriteCard
*** qualident ***
@ line: 23, column: 7, lookahead: WriteCard
*** actualParameters ***
@ line: 23, column: 16, lookahead: WriteCard
*** expressionList ***
@ line: 23, column: 17, lookahead: Index
*** expression ***
@ line: 23, column: 17, lookahead: Index
*** simpleExpression ***
@ line: 23, column: 17, lookahead: Index
*** term ***
@ line: 23, column: 17, lookahead: Index
*** simpleTerm ***
@ line: 23, column: 17, lookahead: Index
*** factor ***
@ line: 23, column: 17, lookahead: Index
*** designatorOrFuncCall ***
@ line: 23, column: 17, lookahead: Index
*** designator ***
@ line: 23, column: 17, lookahead: Index
*** qualident ***
@ line: 23, column: 17, lookahead: Index
*** expression ***
@ line: 23, column: 23, lookahead: 3
*** simpleExpression ***
@ line: 23, column: 23, lookahead: 3
*** term ***
@ line: 23, column: 23, lookahead: 3
*** simpleTerm ***
@ line: 23, column: 23, lookahead: 3
*** factor ***
@ line: 23, column: 23, lookahead: 3
### end-of-line marker encountered ###
infile->buffer[295] = CR
*** statement ***
@ line: 25, column: 7, lookahead: WriteLn
*** assignmentOrProcCall ***
@ line: 25, column: 7, lookahead: WriteLn
*** designator ***
@ line: 25, column: 7, lookahead: WriteLn
*** qualident ***
@ line: 25, column: 7, lookahead: WriteLn
### end-of-line marker encountered ###
infile->buffer[311] = CR
line: 27, column: 4, unexpected reserved word END found
expected CASE, EXIT, FOR, IF, LOOP, REPEAT, RETURN, WHILE, WITH or identifier.
### end-of-line marker encountered ###
infile->buffer[319] = CR
infile->buffer[320] = LF
### end-of-line marker encountered ###
infile->buffer[321] = CR
parse error count: 1

By the way after converting the file to UNIX LF it worked as expected.
trijezdci
2015-12-13 16:31:15 UTC
Permalink
Raw Message
Post by Christoph Schlegel
$ ./m2c ./FirstEx.mod --parser-debug
m2c Modula-2 Compiler & Translator, version 1.00
*** programModule ***
@ line: 1, column: 1, lookahead: MODULE
### end-of-line marker encountered ###
infile->buffer[16] = CR
infile->buffer[17] = LF
### end-of-line marker encountered ###
infile->buffer[18] = CR
*** import ***
@ line: 4, column: 1, lookahead: FROM
*** unqualifiedImport ***
@ line: 4, column: 1, lookahead: FROM
*** identList ***
@ line: 4, column: 19, lookahead: WriteLn
### end-of-line marker encountered ###
infile->buffer[70] = CR
infile->buffer[71] = LF
### end-of-line marker encountered ###
infile->buffer[72] = CR
[snip]

According to the output, your input file ends the lines not with CR LF but with CR LF CR.

If this is indeed the case, then the behavior of the filereader is correct. It counts CR LF as one line and the following sole CR as another line. The syntax for end-of-line allows LF or CR or CR LF.
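To make the end-of-line rule concrete, here is a minimal sketch of a counter that treats LF, CR and CR LF each as a single marker. The function name count_eol_markers is made up for illustration and is not part of the M2C sources; it only mirrors the rule stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the M2C sources.
 * Counts end-of-line markers in buf[0..len-1], where LF, CR and
 * CR LF each count as exactly one marker. */
static unsigned count_eol_markers(const char *buf, size_t len) {
    unsigned count = 0;
    size_t i = 0;

    assert(buf != NULL || len == 0);

    while (i < len) {
        char ch = buf[i++];

        if (ch == '\n') {
            count++;
        } else if (ch == '\r') {
            count++;
            /* a LF directly after a CR belongs to the same marker */
            if (i < len && buf[i] == '\n') {
                i++;
            }
        }
    }
    return count;
}
```

Under this rule a buffer holding CR LF CR contains two markers, so a reader that reports two line increments there is following the rule, provided the buffer really holds those three characters.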

Can you verify that the file has CR LF CR at the positions reported in the output?

thanks
Christoph Schlegel
2015-12-13 17:11:06 UTC
Permalink
Raw Message
Post by trijezdci
Post by Christoph Schlegel
$ ./m2c ./FirstEx.mod --parser-debug
m2c Modula-2 Compiler & Translator, version 1.00
*** programModule ***
@ line: 1, column: 1, lookahead: MODULE
### end-of-line marker encountered ###
infile->buffer[16] = CR
infile->buffer[17] = LF
### end-of-line marker encountered ###
infile->buffer[18] = CR
*** import ***
@ line: 4, column: 1, lookahead: FROM
*** unqualifiedImport ***
@ line: 4, column: 1, lookahead: FROM
*** identList ***
@ line: 4, column: 19, lookahead: WriteLn
### end-of-line marker encountered ###
infile->buffer[70] = CR
infile->buffer[71] = LF
### end-of-line marker encountered ###
infile->buffer[72] = CR
[snip]
According to the output, your input file ends the lines not with CR LF but with CR LF CR.
If this is indeed the case, then the behavior of the filereader is correct. It counts CR LF as one line and the following sole CR as another line. The syntax for end-of-line allows LF or CR or CR LF.
Can you verify that the file has CR LF CR at the positions reported in the output?
thanks
This is the file with inserted CRs and LFs as shown by Notepad++ (show all characters on):

MODULE FirstEx;CRLF
CRLF
FROM InOut IMPORT WriteLn, WriteString, WriteCard;CRLF
CRLF
VAR Index : CARDINAL;CRLF
CRLF
BEGINCRLF
CRLF
WriteString("This is our first example program");CRLF
WriteLn;CRLF
WriteLn;CRLF
FOR Index := 1 TO 12 DOCRLF
WriteString("The value of the index is now ");CRLF
WriteCard(Index,3);CRLF
WriteLn;CRLF
ENDCRLF
CRLF
END FirstEx.

No additional CR -
j***@gmail.com
2015-12-13 15:34:02 UTC
Permalink
Raw Message
A source too big to post here, compiles cleanly on mocka. Here it is:

http://fruttenboel.verhoeven272.nl/m4m/data/Plov022.mod

When running it through m2c, it reports:

***@oxygen ~/Modula/Plov$ cat plov022.m2c
m2c Modula-2 Compiler & Translator, version 1.00
line: 167, column: 1, unexpected reserved word VAR found
expected reserved word BEGIN
line: 185, column: 16, unexpected symbol ';' found
expected symbol '.'
parse error count: 2

As far as I can tell, the line numbering is correct.

m2c does not accept local vars in the second procedure of the source.
Also, m2c thinks the procedure is the MAIN module.
trijezdci
2015-12-13 15:57:03 UTC
Permalink
Raw Message
Post by j***@gmail.com
m2c does not accept local vars in the second procedure of the source.
Also, m2c thinks the procedure is the MAIN module.
thanks.

there was a bug in the rule for procedure declaration in that it was looking for token BEGIN instead of FIRST(block), probably a copy-paste artefact. I thought I had fixed this already, but maybe that was in module declaration. The second error is almost certainly caused by the first. Fixed now.

https://bitbucket.org/trijezdci/m2c-rework/commits/ada06ade32cb2fdb8b3e279430bf27bbeb856815
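
For readers unfamiliar with FIRST sets, the difference between testing for BEGIN and testing for FIRST(block) can be sketched as follows. This is only an illustration: the token names, the bitset representation and both function names are hypothetical, and the contents of FIRST(block) are my reading of the PIM grammar (block = {declaration} [BEGIN statementSequence] END), not a copy of the M2C tables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token kinds; the real compiler defines its own. */
typedef enum {
    TOK_CONST, TOK_TYPE, TOK_VAR, TOK_PROCEDURE, TOK_MODULE,
    TOK_BEGIN, TOK_END, TOK_IDENT, TOK_SEMICOLON, TOK_COUNT
} token_t;

typedef unsigned int tokenset_t;

#define TOKEN_BIT(t) (1u << (t))

/* FIRST(block), assuming block = {declaration} [BEGIN statementSequence] END */
static const tokenset_t FIRST_BLOCK =
    TOKEN_BIT(TOK_CONST) | TOKEN_BIT(TOK_TYPE) | TOKEN_BIT(TOK_VAR) |
    TOKEN_BIT(TOK_PROCEDURE) | TOKEN_BIT(TOK_MODULE) |
    TOKEN_BIT(TOK_BEGIN) | TOKEN_BIT(TOK_END);

static bool in_set(tokenset_t set, token_t token) {
    assert((unsigned)token < TOK_COUNT);
    return (set & TOKEN_BIT(token)) != 0;
}

/* The kind of check described as buggy: only BEGIN may start the block,
 * so a procedure with local declarations is rejected. */
static bool block_may_start_buggy(token_t lookahead) {
    return lookahead == TOK_BEGIN;
}

/* The corrected kind of check: any token in FIRST(block) may start it. */
static bool block_may_start(token_t lookahead) {
    return in_set(FIRST_BLOCK, lookahead);
}
```

With the first check a lookahead of VAR after a procedure heading is reported as unexpected, which is the symptom in the report above; with the second it is accepted.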

regards
j***@gmail.com
2015-12-13 15:54:55 UTC
Permalink
Raw Message
***@oxygen ~/Modula/klim$ ../../Langs/cc/princ/detab <klim08.mod >klim08.m
***@oxygen ~/Modula/klim$ m2c klim08.m --pim3 >klim08.m2c
bash: m2c: command not found
***@oxygen ~/Modula/klim$ ../m2c klim08.m --pim3 >klim08.m2c
***@oxygen ~/Modula/klim$ less klim08.m2c
***@oxygen ~/Modula/klim$ cat klim08.m2c
m2c Modula-2 Compiler & Translator, version 1.00
invalid filename, suffix must be .def, .DEF, .mod or .MOD
parse error count: 3

error count = 3, only one error made... error inflation? ;)

it makes a big difference if I enter

m2c --pim3 sourcefile

or

m2c sourcefile --pim3

the latter won't run, to be precise.

***@oxygen ~/Modula/klim$ ../m2c klim.mod --pim3
m2c Modula-2 Compiler & Translator, version 1.00

no 'parse error count' produced. It looks like m2c choked on it....

this is the sourcefile:

http://fruttenboel.verhoeven272.nl/tmp/klim.mod
trijezdci
2015-12-13 16:07:26 UTC
Permalink
Raw Message
Post by j***@gmail.com
invalid filename, suffix must be .def, .DEF, .mod or .MOD
parse error count: 3
it shouldn't report any parse error count at all if the input file is invalid.

fixed now

https://bitbucket.org/trijezdci/m2c-rework/commits/78b6e4a07c9b0e0344f551d4cf3b53c61bb36530
Post by j***@gmail.com
it makes a big difference if I enter
m2c --pim3 sourcefile
usage is

m2c sourcefile [option]

see m2c -- help
Post by j***@gmail.com
m2c Modula-2 Compiler & Translator, version 1.00
no 'parse error count' produced. It looks like m2c choked in it....
you can use option --parser-debug to get a print out of what the parser is doing.
Post by j***@gmail.com
http://fruttenboel.verhoeven272.nl/tmp/klim.mod
thanks I will take a look at it later.
j***@gmail.com
2015-12-13 16:08:41 UTC
Permalink
Raw Message
How do I change my local files with hg? hg clone only works once. hg update?
trijezdci
2015-12-13 16:32:16 UTC
Permalink
Raw Message
Post by j***@gmail.com
How do I change my local files with hg? hg clone only works once. hg update?
yes, correct.

https://www.selenic.com/hg/help/update
j***@gmail.com
2015-12-14 12:52:16 UTC
Permalink
Raw Message
It's not that difficult to incorporate a null statement in a recursive descent parser. A warning about a null statement is a definite no-no.

It gives me a warm feeling to NEVER use a semi before a closing-keyword but sometimes, when looking for errors and making quick fixes, you forget to take out a loose ';'.
trijezdci
2015-12-14 13:06:44 UTC
Permalink
Raw Message
Post by j***@gmail.com
It gives me a warm feeling to NEVER use a semi before a closing-keyword but sometimes, when looking for errors and making quick fixes, you forget to take out a loose ';'.
Indeed, and the cure is to fix it. Very simple.
trijezdci
2015-12-14 13:13:49 UTC
Permalink
Raw Message
Since the code is already there for the warning, I rearranged it such that M2C will also report a specific message when the option is set to treat it as an error.

https://bitbucket.org/trijezdci/m2c-rework/commits/ae5f70aea197c7f7c96dd87189b529eca9a72b2a

$ ./m2c EBNFScanner.mod --pim4
m2c Modula-2 Compiler & Translator, version 1.00
line 20, column 5, error: semicolon at end of statement sequence
line 43, column 3, error: semicolon at end of statement sequence
line 51, column 5, error: semicolon at end of statement sequence
line 54, column 5, error: semicolon at end of statement sequence
line 56, column 4, error: semicolon at end of statement sequence
line 81, column 35, error: semicolon at end of statement sequence
line 85, column 4, error: semicolon at end of statement sequence
line 94, column 4, error: semicolon at end of statement sequence
line 103, column 4, error: semicolon at end of statement sequence
parse error count: 9

$ ./m2c EBNFScanner.mod --pim4 --sloppy-semicolon
m2c Modula-2 Compiler & Translator, version 1.00
line 20, column 5, warning: semicolon at end of statement sequence
line 43, column 3, warning: semicolon at end of statement sequence
line 51, column 5, warning: semicolon at end of statement sequence
line 54, column 5, warning: semicolon at end of statement sequence
line 56, column 4, warning: semicolon at end of statement sequence
line 81, column 35, warning: semicolon at end of statement sequence
line 85, column 4, warning: semicolon at end of statement sequence
line 94, column 4, warning: semicolon at end of statement sequence
line 103, column 4, warning: semicolon at end of statement sequence
parse error count: 0


For consistency, this applies to a semicolon following a statement sequence in any context, that is before any symbol that is in set FOLLOW(statementSequence).

FOLLOW(statementSequence) = {
RW-ELSE,
RW-ELSIF,
RW-END,
RW-UNTIL,
VERTICAL-BAR
};
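
As a rough illustration of the rule above, the decision can be written as a small function: a semicolon is only diagnosed when the symbol after it is in FOLLOW(statementSequence), and the option decides whether that diagnosis is a warning or an error. All names here (token_t, diag_t, classify_semicolon) are hypothetical and not taken from m2-parser.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token kinds, only those needed for the illustration. */
typedef enum {
    TOK_ELSE, TOK_ELSIF, TOK_END, TOK_UNTIL, TOK_BAR,
    TOK_IDENT, TOK_IF, TOK_COUNT
} token_t;

typedef enum { DIAG_NONE, DIAG_WARNING, DIAG_ERROR } diag_t;

/* Membership test for FOLLOW(statementSequence) as listed above. */
static bool in_follow_statement_sequence(token_t token) {
    switch (token) {
        case TOK_ELSE:
        case TOK_ELSIF:
        case TOK_END:
        case TOK_UNTIL:
        case TOK_BAR:
            return true;
        default:
            return false;
    }
}

/* Classifies a semicolon by the symbol that follows it.
 * sloppy_semicolon models the --sloppy-semicolon option. */
static diag_t classify_semicolon(token_t after_semicolon,
                                 bool sloppy_semicolon) {
    assert((unsigned)after_semicolon < TOK_COUNT);

    if (!in_follow_statement_sequence(after_semicolon)) {
        return DIAG_NONE; /* ordinary statement separator */
    }
    return sloppy_semicolon ? DIAG_WARNING : DIAG_ERROR;
}
```

A semicolon followed by another statement is never diagnosed; one followed by END, ELSE, ELSIF, UNTIL or '|' yields an error by default and a warning when the option is set.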
Christoph Schlegel
2015-12-14 23:28:12 UTC
Permalink
Raw Message
Post by trijezdci
I have now rewritten the front-end of Makarov's M2C Modula-2 compiler.
https://bitbucket.org/trijezdci/m2c-rework/src
Simply type make and hit return to build it.
m2c -h will print a help with available options.
All OS dependencies are in a single module called m2-fileutils. For this module there is at present no Windows/DOS implementation. It is a small module with only a handful of functions and most are wrappers. It shouldn't be difficult to modify this to derive a Windows/DOS version. With a bit of luck somebody may volunteer to contribute an Windows/DOS implementation of the module.
[politely snipped]

Hi Benjamin,

just wanted to let you know m2c builds under MinGW32/MSYS which means there is a Windows binary file for those interested or willing to help in testing.

http://freepages.modula2.org/downloads/m2c.exe

Regards,
Christoph
trijezdci
2015-12-15 05:38:43 UTC
Permalink
Raw Message
Post by Christoph Schlegel
just wanted to let you know m2c builds under MinGW32/MSYS which means there is a Windows binary file for those interested or willing to help in testing.
http://freepages.modula2.org/downloads/m2c.exe
thanks.

Can the binary work without MinGW or do you still need to install MinGW to run it?

In any event I presume it expects Unix style pathnames still. The reason we want a Windows/DOS specific version of m2-fileutils is that it would then work with Windows/DOS pathnames.

regards
Christoph Schlegel
2015-12-16 06:55:10 UTC
Permalink
Raw Message
Post by trijezdci
Post by Christoph Schlegel
just wanted to let you know m2c builds under MinGW32/MSYS which means there is a Windows binary file for those interested or willing to help in testing.
http://freepages.modula2.org/downloads/m2c.exe
thanks.
Can the binary work without MinGW or do you still need to install MinGW to run it?
In any event I presume it expects Unix style pathnames still. The reason we want a Windows/DOS specific version of m2-fileutils is that it would then work with Windows/DOS pathnames.
regards
Hi,

Programs compiled with MinGW are native Windows programs. Nothing else needed.

Yes, you are right, the program crashes immediately.
trijezdci
2015-12-16 09:09:02 UTC
Permalink
Raw Message
Post by Christoph Schlegel
Post by trijezdci
In any event I presume it expects Unix style pathnames still. The reason we want a Windows/DOS specific version of m2-fileutils is that it would then work with Windows/DOS pathnames.
Yes, you are right, the program crashes immediately.
Does it print any error message?

It would be interesting to find out whether this happens as a result of calling the system calls that are OS specific (test if a file exists, get a file's size, get the current working directory) or the functions I wrote to verify that a filename or pathname is correct (which checks for ./ ../ and / and only lets characters pass that are allowed in M2 identifier names because the input files need to match up with module identifiers).

Do you know what the system calls are on Windows to

(1) check if a file exists
(2) get the filesize of a file
(3) get the current working directory

and in which include files those system calls are defined?

Further, perhaps you can tell me if .\ and ..\ are valid pathname syntax on Windows and mean the same as ./ and ../ on Posix. If so, we can quickly change / into \ to derive a Windows version of the module.

Posix doesn't have device names like C: or D: so this would need some extra work still, but at least relative pathnames should then work.
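
To illustrate the kind of filename check described above, here is a minimal sketch that validates only the final path component: an identifier followed by one of the four accepted suffixes. It assumes PIM-style identifiers (a letter followed by letters and digits) and does not handle the ./ ../ and / prefixes; the function name is_valid_m2_filename is hypothetical and the actual code in m2-fileutils may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not taken from m2-fileutils.
 * Returns true if name has the form <ident><suffix>, where ident is a
 * letter followed by letters or digits (assumed PIM identifier syntax)
 * and suffix is one of .def, .DEF, .mod or .MOD. */
static bool is_valid_m2_filename(const char *name) {
    static const char *const suffixes[] = { ".def", ".DEF", ".mod", ".MOD" };
    size_t i = 0;

    assert(name != NULL);

    /* the identifier must start with a letter */
    if (!isalpha((unsigned char)name[0])) {
        return false;
    }

    /* skip the remaining identifier characters */
    while (isalnum((unsigned char)name[i])) {
        i++;
    }

    /* whatever follows the identifier must be an accepted suffix */
    for (size_t k = 0; k < sizeof suffixes / sizeof suffixes[0]; k++) {
        if (strcmp(&name[i], suffixes[k]) == 0) {
            return true;
        }
    }
    return false;
}
```

A Windows variant would only need different handling of the directory part in front of this component, since the module identifier and suffix rules stay the same.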
Marco van de Voort
2015-12-16 10:00:58 UTC
Permalink
Raw Message
Post by trijezdci
It would be interesting to find out whether this happens as a result of calling the system calls that are OS specific (test if a file exists, get a file's size, get the current working directory) or the functions I wrote to verify that a filename or pathname is correct (which checks for ./ ../ and / and only lets characters pass that are allowed in M2 identifier names because the input files need to match up with module identifiers).
Do you know what the system calls are on Windows to
(1) check if a file exists
(2) get the filesize of a file
both can be done with findfirstfile/findnextfile.

https://msdn.microsoft.com/en-us/library/windows/desktop/aa364418%28v=vs.85%29.aspx
Post by trijezdci
(3) get the current working directory
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364934%28v=vs.85%29.aspx
Post by trijezdci
and in which include files those system calls are defined?
In the core (325k) windows api header windows.h (which includes winbase.h and
winuser.h)
Post by trijezdci
Further, perhaps you can tell me if .\ and ..\ is valid pathname syntax on
Windows and means the same as ./ and ../ on Posix. If so, we can quickly
change / into \ to derive a Windows version of the module.
Yes, but Windows also has drive letters. Files might be on different
drives. Though for bootstrap functionality you might require everything on
one drive.
Post by trijezdci
Posix doesn't have device names like C: or D: so this would need some
extra work still, but at least relative pathames should then work.
Devices are things like CON: and PRN: and the nt devices like
\\.\PhysicalDiskX. IOW the things that are mounted, not
the place they are mounted on like driveletters. Those are volumes or
mountpoints.
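
For reference, the three operations asked about can be sketched in one file with a conditional Windows branch. Note that this sketch does not use FindFirstFile as suggested above; it swaps in GetFileAttributesA and GetFileAttributesExA, which answer the same two questions without opening a search handle, plus GetCurrentDirectoryA. The function names file_exists, file_size and current_dir are made up for the example and are not the names used in m2-fileutils.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/stat.h>
#include <unistd.h>
#endif

/* Returns true if path names an existing file that is not a directory. */
static bool file_exists(const char *path) {
    assert(path != NULL);
#ifdef _WIN32
    DWORD attr = GetFileAttributesA(path);
    return attr != INVALID_FILE_ATTRIBUTES &&
           !(attr & FILE_ATTRIBUTE_DIRECTORY);
#else
    struct stat st;
    return stat(path, &st) == 0 && S_ISREG(st.st_mode);
#endif
}

/* Stores the file size in bytes in *size; returns false on failure. */
static bool file_size(const char *path, unsigned long long *size) {
    assert(path != NULL && size != NULL);
#ifdef _WIN32
    WIN32_FILE_ATTRIBUTE_DATA data;
    if (!GetFileAttributesExA(path, GetFileExInfoStandard, &data)) {
        return false;
    }
    *size = ((unsigned long long)data.nFileSizeHigh << 32) |
            data.nFileSizeLow;
    return true;
#else
    struct stat st;
    if (stat(path, &st) != 0) {
        return false;
    }
    *size = (unsigned long long)st.st_size;
    return true;
#endif
}

/* Copies the current working directory into buf; false if it does not fit. */
static bool current_dir(char *buf, size_t buflen) {
    assert(buf != NULL);
#ifdef _WIN32
    DWORD n = GetCurrentDirectoryA((DWORD)buflen, buf);
    return n != 0 && n < buflen;
#else
    return getcwd(buf, buflen) != NULL;
#endif
}
```

Keeping both branches behind the same three signatures means the rest of the compiler would not need to know which platform it runs on, which is the point of isolating the OS dependencies in one module.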
trijezdci
2015-12-16 10:53:08 UTC
Permalink
Raw Message
Post by Marco van de Voort
findfirstfile/findnextfile.
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364418%28v=vs.85%29.aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364934%28v=vs.85%29.aspx
In the core (325k) windows api header winuser.h (which includes winbase.h and
winuser.h)
Thanks for that, I will take a look at it.
Post by Marco van de Voort
Post by trijezdci
Posix doesn't have device names like C: or D: so this would need some
extra work still, but at least relative pathames should then work.
Devices are things like CON: and PRN: and the nt devices like
\\.\PhysicalDiskX. IOW the things that are mounted, not
the place they are mounted on like driveletters. Those are volumes or
mountpoints.
Storage medium support should be perfectly sufficient. Who wants to read a source from a console only to end up with a compilation product for which the source no longer exists?!

As for reading a source file from a printer, good luck with that. I am not going to even imagine to try. ;-)
Christoph Schlegel
2015-12-16 13:22:21 UTC
Permalink
Raw Message
Post by trijezdci
Post by Christoph Schlegel
Post by trijezdci
In any event I presume it expects Unix style pathnames still. The reason we want a Windows/DOS specific version of m2-fileutils is that it would then work with Windows/DOS pathnames.
Yes, you are right, the program crashes immediately.
Does it print any error message?
A small system window opens up and informs me that the program doesn't work any more. This is a Windows message. I started the executable from the MSYS shell. As soon as I close the window (which also informs that Windows is searching for a solution for the problem) m2c.exe terminates without message etc.

Interesting fact: I tried to find out where it crashes and ran the program within gdb: there it works fine, parse error count 0.
Post by trijezdci
It would be interesting to find out whether this happens as a result of calling the system calls that are OS specific (test if a file exists, get a file's size, get the current working directory) or the functions I wrote to verify that a filename or pathname is correct (which checks for ./ ../ and / and only lets characters pass that are allowed in M2 identifier names because the input files need to match up with module identifiers).
Do you know what the system calls are on Windows to
(1) check if a file exists
(2) get the filesize of a file
(3) get the current working directory
and in which include files those system calls are defined?
Further, perhaps you can tell me if .\ and ..\ is valid pathname syntax on Windows and means the same as ./ and ../ on Posix. If so, we can quickly change / into \ to derive a Windows version of the module.
Posix doesn't have device names like C: or D: so this would need some extra work still, but at least relative pathames should then work.
trijezdci
2015-12-16 13:26:31 UTC
Permalink
Raw Message
Post by Christoph Schlegel
A small system window opens up and informs me that the program doesn't work any more. This is a Windows message. I started the executable from the MSYS shell. As soon as I close the window (which also informs that Windows is searching for a solution for the problem) m2c.exe terminates without message etc.
Interesting fact: I tried to find out where it crashes and run the program within gdb: here it works fine, parse error count 0.
That's funny.

Perhaps you can try out the Windows specific version of fileutils I made (using diverse online resources).

https://bitbucket.org/trijezdci/m2c-rework/src/tip/m2-fileutils.win.c

You will need to rename this file to m2-fileutils.c

regards
Christoph Schlegel
2015-12-19 19:22:32 UTC
Permalink
Raw Message
Post by Christoph Schlegel
Post by trijezdci
Post by Christoph Schlegel
Post by trijezdci
In any event I presume it expects Unix style pathnames still. The reason we want a Windows/DOS specific version of m2-fileutils is that it would then work with Windows/DOS pathnames.
Yes, you are right, the program crashes immediately.
Does it print any error message?
A small system window opens up and informs me that the program doesn't work any more. This is a Windows message. I started the executable from the MSYS shell. As soon as I close the window (which also informs that Windows is searching for a solution for the problem) m2c.exe terminates without message etc.
Interesting fact: I tried to find out where it crashes and run the program within gdb: here it works fine, parse error count 0.
Post by trijezdci
It would be interesting to find out whether this happens as a result of calling the system calls that are OS specific (test if a file exists, get a file's size, get the current working directory) or the functions I wrote to verify that a filename or pathname is correct (which checks for ./ ../ and / and only lets characters pass that are allowed in M2 identifier names because the input files need to match up with module identifiers).
Hi,

the MinGW journey continues. The program works well when called with the options --pim4 --parser-debug. Without --parser-debug it crashes. I cannot find out why as everything works fine in gdb. Searching for alternative debugging methods I found DrMinGW, a postmortem debugging tool for MinGW.

I added -ggdb -fno-omit-frame-pointer to CFLAGS in the Makefile, following the DrMinGW docs. Here is the output:

m2c.exe caused an Access Violation at location 00000000FFFFFFFF Reading from location 00000000FFFFFFFF.

Loading symbols... done.

Registers:
eax=ffffffff ebx=0028f890 ecx=76bc830c edx=00000008 esi=769c030c edi=00401000
eip=ffffffff esp=0028f7bc ebp=0028f860 iopl=0 nv up ei ng nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010286

AddrPC Params
FFFFFFFF 0000000B 00000000 769C030C
004010F9 0028F890 7737332C 00000000 m2c.exe!_gnu_exception_handler [e:\p\giaw\src\pkg\mingwrt-4.0.3-1-mingw32-src\bld/../mingwrt-4.0.3-1-mingw32-src/src/libcrt/crt/crt1.c @ 137]
7737344F 00000000 0028FFD4 7732C540 ntdll.dll!***@8
7737332C 00000000 00000000 00000000 ntdll.dll!@***@8
773731D1 FFFFFFFE 0028FFC4 0028F9E0 ntdll.dll!__except_handler4
7735B60D 0028F990 0028FFC4 0028F9E0 ntdll.dll!***@20
7735B5DF 0028F990 0028FFC4 0028F9E0 ntdll.dll!***@20
7735B580 0028F990 0028F9E0 0028F990 ntdll.dll!***@8
77310133 0028F9E0 C0000005 00000000 ntdll.dll!***@8
0028F990 00000000 0064187C 00000003
0040AB83 0064187C 000000B3 00000003 m2c.exe!m2c_get_string_for_slice [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2-unique-string.c @ 453]
451: if /* match found in current entry */
452: ((this_entry->key == key) &&
> 453: matches_str_and_length(this_entry->str, str, length)) {
454:
455: /* get string object of matching entry and retain it */
00402BF8 00641858 4628FE59 0028FD78 m2c.exe!m2c_read_marked_lexeme [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2-filereader.c @ 305]
303:
304: /* copy lexeme */
> 305: lexeme = m2c_get_string_for_slice
306: (infile->buffer, infile->marked_index, length, &status);
307:
004023B8 00641698 0028FD90 0028FDC8 m2c.exe!get_ident_or_resword [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2-lexer.c @ 987]
985:
986: /* get lexeme */
> 987: lexer->lookahead.lexeme = m2c_read_marked_lexeme(lexer->infile);
988:
989: /* check if lexeme is reserved word */
00401F8A 00641698 0040FA50 0028FE08 m2c.exe!get_new_lookahead_sym [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2-lexer.c @ 729]
727: case 'Z' :
728: /* identifier or reserved word */
> 729: next_char = get_ident_or_resword(lexer, &token);
730: break;
731:
00401867 00641698 0040F5D0 00410000 m2c.exe!m2c_consume_sym [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2-lexer.c @ 287]
285:
286: /* read new lookahead symbol and return it */
> 287: get_new_lookahead_sym(lexer);
288: return lexer->lookahead.token;
289:
004085DC 00641A20 0040F5C0 0040FA50 m2c.exe!statement_sequence [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2-parser.c @ 2635]
2633: line_of_semicolon = m2c_lexer_lookahead_line(p->lexer);
2634: column_of_semicolon = m2c_lexer_lookahead_column(p->lexer);
> 2635: lookahead = m2c_consume_sym(p->lexer);
2636:
2637: /* check if semicolon occurred at the end of a statement sequence */
00407922 00641A20 0040F530 0040F990 m2c.exe!block [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2-parser.c @ 2177]
2175: /* statementSequence */
2176: else if (match_set(p, FIRST(STATEMENT_SEQUENCE), FOLLOW(STATEMENT))) {
> 2177: lookahead = statement_sequence(p);
2178: } /* end if */
2179: } /* end if */
0040765A 00641A20 006419F8 00000000 m2c.exe!program_module [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2-parser.c @ 2082]
2080: /* block */
2081: if (match_set(p, FIRST(BLOCK), FOLLOW(PROGRAM_MODULE))) {
> 2082: lookahead = block(p);
2083:
2084: /* moduleIdent */
004048E6 006419F8 0028FED8 0028FEE0 m2c.exe!m2c_syntax_check_mod [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2-parser.c @ 219]
217: }
218: else if (lookahead == TOKEN_MODULE) {
> 219: program_module(p);
220: }
221: else /* not a program or implementation part */ {
0040C424 00000003 00641680 00641D20 m2c.exe!main [C:\MinGW\msys\1.0\home\Christoph\trijezdci-m2c-rework-e41d6ef82d51/m2c.c @ 185]
183: }
184: else if (IS_MOD(file_suffix)) {
> 185: m2c_syntax_check_mod(filename, &parser_status, &error_count);
186: printf("parse error count: %u\n", error_count);
187: }
00401413 00000001 00000000 00000000 m2c.exe!__mingw_CRTStartup [e:\p\giaw\src\pkg\mingwrt-4.0.3-1-mingw32-src\bld/../mingwrt-4.0.3-1-mingw32-src/src/libcrt/crt/crt1.c @ 254]
00401585 7EFDE000 7772ED0A 00000000 m2c.exe!mainCRTStartup [e:\p\giaw\src\pkg\mingwrt-4.0.3-1-mingw32-src\bld/../mingwrt-4.0.3-1-mingw32-src/src/libcrt/crt/crt1.c @ 272]
77339882 00401570 7EFDE000 00000000 ntdll.dll!***@8
77339855 00401570 7EFDE000 00000000 ntdll.dll!***@8

Maybe this is useful? Not for me...

Regards
C.
r***@gmail.com
2015-12-17 00:17:29 UTC
Permalink
Raw Message
Hi,
Post by trijezdci
It would be interesting to find out whether this happens as a result
of calling the system calls that are OS specific (test if a file
exists, get a file's size, get the current working directory) or the
functions I wrote to verify that a filename or pathname is correct
(which checks for ./ ../ and / and only lets characters pass that are
allowed in M2 identifier names because the input files need to match
up with module identifiers).
Do you know what the system calls are on Windows to
(1) check if a file exists
(2) get the filesize of a file
(3) get the current working directory
and in which include files those system calls are defined?
This may be naive, but the obvious thing to do would be to glean
the required functions from FreePascal's RTL sources (since that
does support Win32/64, among others):

1). http://www.freepascal.org/docs-html/rtl/sysutils/fileexists.html
2). http://www.freepascal.org/docs-html/rtl/system/filesize.html
3). http://www.freepascal.org/docs-html/rtl/system/getdir.html
trijezdci
2015-12-16 12:57:41 UTC
Permalink
Raw Message
On Wednesday, 16 December 2015 15:55:11 UTC+9, Christoph Schlegel wrote:

Christoph, could you get a copy of

https://bitbucket.org/trijezdci/m2c-rework/src/8e00051beb6256a539e00d318c9cbdee14177a29/m2-fileutils.win.c

rename it to m2-fileutils.c, replace the version in your m2c working directory with it, and try that please?

Note, I have not built this because I do not have the required headers. So, if there are any build errors, just post the entire output and I'll fix it.

regards
Christoph Schlegel
2015-12-16 13:37:44 UTC
Permalink
Raw Message
Post by trijezdci
Christoph, could you get a copy of
https://bitbucket.org/trijezdci/m2c-rework/src/8e00051beb6256a539e00d318c9cbdee14177a29/m2-fileutils.win.c
rename it to m2-fileutils.c, replace that with the version in your m2c working directory and try that please?
Note, I have not built this because I do not have the required headers. So, if there are any build errors, just post the entire output and I'll fix it.
regards
Hi,

just compiled m2c with the new file. There was a missing ")" in line 144:59. Now I get "invalid filename" when m2c is called like "m2c ./FirstEx.mod --pim4", the program still crashes without more info when called "m2c FirstEx.mod --pim4" - maybe I'll find time to look into the C code in the evening.

Regards
C.
trijezdci
2015-12-16 14:04:52 UTC
Permalink
Raw Message
Post by Christoph Schlegel
just compiled m2c with the new file. There was a missing ")" in line 144:59.
Thanks. I have now fixed that in the latest commit.
Post by Christoph Schlegel
Now I get "invalid filename" when m2c is called like "m2c ./FirstEx.mod --pim4"
Of course it would. The Windows specific version will need backslashes as directory separators.

Try m2c .\FirstEx.mod --pim4 instead.
Post by Christoph Schlegel
the program still crashes without more info when called "m2c FirstEx.mod --pim4" - maybe I'll find time to look into the C code in the evening.
If it doesn't report an invalid filename as it does when you pass an invalid name, that suggests that the crash occurs in one of the system calls to test if the file exists or to get the filesize.

I am using _stat() following the info on this page below:

https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx

perhaps you can verify if these calls succeed.

thanks & regards
Martin Brown
2015-12-16 14:05:06 UTC
Permalink
Raw Message
Post by Christoph Schlegel
Post by trijezdci
Christoph, could you get a copy of
https://bitbucket.org/trijezdci/m2c-rework/src/8e00051beb6256a539e00d318c9cbdee14177a29/m2-fileutils.win.c
rename it to m2-fileutils.c, replace that with the version in your m2c working directory and try that please?
Note, I have not built this because I do not have the required headers. So, if there are any build errors, just post the entire output and I'll fix it.
regards
Hi,
just compiled m2c with the new file. There was a missing ")" in line 144:59. Now I get "invalid filename"
when m2c is called like "m2c ./FirstEx.mod --pim4",
Suspect that strip pathname isn't behaving quite right so it fails on
checking the module name. Old M2C did that under DOS too.

It worked OK for files in the same directory as the executable.
Post by Christoph Schlegel
the program still crashes without more info when called "m2c
FirstEx.mod --pim4" -
Post by Christoph Schlegel
maybe I'll find time to look into the C code in the evening.
That is probably invoking the M2C compiler and then hitting something
that is Unix specific and not available. Suspect stat as a first guess.
--
Regards,
Martin Brown
trijezdci
2015-12-16 14:38:32 UTC
Permalink
Raw Message
Post by Martin Brown
Suspect that strip pathname isn't behaving quite right so it fails on
checking the module name. Old M2C did that under DOS too.
Note that this front end including its driver program is a complete rewrite.

There is no pathname stripping. The function filetype() in m2-fileutils.c scans the entire path, remembers a pointer to the last period found, and this pointer is NULLed when it finds a directory separator. It returns the remembered pointer when it finds the terminating ASCII NUL.

The only difference between the POSIX and Windows versions of this function is the directory separator, a slash for POSIX and a backslash for Windows.

The driver program then tests the returned pointer for ".def", ".DEF", ".mod" and ".MOD" and reports an error if none of those match.
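The scan described above can be sketched roughly as follows. This is not the code from m2-fileutils.c; the function names and the separator parameter are made up for illustration, and the real module presumably fixes the separator per platform rather than passing it in.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: returns a pointer to the last period that follows
 * the last directory separator in path, or NULL if there is none.
 * The separator is '/' for POSIX and '\\' for Windows. */
const char *last_period_in_filename(const char *path, char separator) {
  const char *last_period = NULL;
  if (path == NULL) {
    return NULL;
  }
  for (; *path != '\0'; path++) {
    if (*path == '.') {
      last_period = path;   /* remember the most recent period */
    } else if (*path == separator) {
      last_period = NULL;   /* periods in directory names do not count */
    }
  }
  return last_period;
}

/* Hypothetical sketch of the driver-side test: returns 1 if suffix is
 * one of ".def", ".DEF", ".mod" or ".MOD", 0 otherwise. */
int is_def_or_mod_suffix(const char *suffix) {
  return suffix != NULL &&
         (strcmp(suffix, ".def") == 0 || strcmp(suffix, ".DEF") == 0 ||
          strcmp(suffix, ".mod") == 0 || strcmp(suffix, ".MOD") == 0);
}
```

With this sketch, "./FirstEx.mod" scanned with '/' as separator yields ".mod", while a path such as "dir.d/Foo" yields NULL because the only period precedes the last separator.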

Since this function works just fine in the POSIX version, it is unlikely to fail, let alone crash, in the Windows version, because the only change is the '\\' instead of '/' in the separator test.

It is more likely that one of the system calls is to blame. There is a plethora of similar calls (_stat, _wstat, _stat32, _wstat32, _stat64, _wstat64) and I have absolutely no clue about the Windows programming environment. I simply picked _stat since that looked closest to the POSIX version of the same system call.

It could well be that you have to test various environment dependent macros and call one of these many stat functions depending on whatever the OS environment is. If that's the case it would not be surprising if it crashes when you pick the wrong system call.
Post by Martin Brown
Post by Christoph Schlegel
the program still crashes without more info when called "m2c
FirstEx.mod --pim4" -
That is probably invoking the M2C compiler and then hitting something
that is Unix specific and not available.
This is a standalone front end, it does not call and is not linked to any backend code of the old M2C.
Marco van de Voort
2015-12-16 14:52:58 UTC
Permalink
Raw Message
Post by trijezdci
The only difference between the POSIX and Windows versions of this
function is the directory separator, a slash for POSIX and a backslash for
Windows.
(best to simply change all forward slashes to backward on entry of
every filename handling procedure. Windows shells and explorers are lax with
this, and that rubs off on users).
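A minimal sketch of that normalisation step, assuming the path is held in a writable buffer. The function name is hypothetical and nothing like it is claimed to exist in m2-fileutils.

```c
#include <stddef.h>

/* Hypothetical helper: replaces every '/' in path with '\\', in place.
 * Returns the number of characters that were replaced. */
size_t slashes_to_backslashes(char *path) {
  size_t replaced = 0;
  if (path == NULL) {
    return 0;
  }
  for (; *path != '\0'; path++) {
    if (*path == '/') {
      *path = '\\';
      replaced++;
    }
  }
  return replaced;
}
```

Called on entry of each filename handling procedure, this would turn "./FirstEx.mod" into ".\FirstEx.mod" before any separator test runs.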

In general, forget about POSIX and _stat as soon as you enter windows, and
do things the windows way. That saves you a lot of pain in the long run.

Most of the POSIX users don't really grok that it is shell that rewrites
commandlines for programs, and not the program itself, so assuming
shell parameter rewriting ports badly.
trijezdci
2015-12-16 15:21:10 UTC
Permalink
Raw Message
Post by Marco van de Voort
Post by trijezdci
The only difference between the POSIX and Windows versions of this
function is the directory separator, a slash for POSIX and a backslash for
Windows.
(best to simply change all forward slashes to backward on entry of
every filename handling procedure. Windows shells and explorers are lax with
this, and that rubs off on users).
That's precisely what I have described above.

The windows version of the file uses backslashes and the posix version uses slashes.

Apart from that, the windows version also handles a storage device (C:, D: etc) and a server name (\\foobar) while the posix version handles a home directory symbol (~).

There are three functions, one checks sole filenames, no slashes and no backslashes, that's the same for windows and posix, one that checks pathnames, and one that returns a pointer to the last period in a NUL terminated string. None of this uses any libraries or system calls.
Post by Marco van de Voort
In general, forget about POSIX and _stat as soon as you enter windows, and
do things the windows way. That saves you a lot of pain in the long run.
The _stat functions are documented on the Microsoft site. I also assume those will work on DOS, not only on Windows.

Under no circumstances am I going to do iteration over directories if all I need is a boolean test if a given file exists or not. Nor am I going to do it when all I need is the file size of a given file.

If the Windows programming environment is such a horrible mess that there aren't any extremely simple file-exists-yes-or-no, file-size-in-bytes-or-error functions, then I am afraid somebody else will have to step up and do this.
Post by Marco van de Voort
Most of the POSIX users don't really grok that it is shell that rewrites
commandlines for programs, and not the program itself, so assuming
shell parameter rewriting ports badly.
Whatever the shell passes to the program should be a path nevertheless and that path should meet the pathname syntax rules. Are you suggesting it turns the path into some internal gobbledeegook that is undocumented and cannot be verified?

It wasn't like that 25 years ago which was the last time I did some programming on a DOS machine.

But hey, it may well be so nowadays, in which case somebody else will have to step forward and do it.

regards
Marco van de Voort
2015-12-16 18:50:07 UTC
Permalink
Raw Message
Post by trijezdci
There are three functions, one checks sole filenames, no slashes and no backslashes, that's the same for windows and posix, one that checks pathnames, and one that returns a pointer to the last period in a NUL terminated string. None of this uses any libraries or system calls.
Post by Marco van de Voort
In general, forget about POSIX and _stat as soon as you enter windows, and
do things the windows way. That saves you a lot of pain in the long run.
The _stat functions are documented on the Microsoft site.
They are part of the Visual Studio runtime, not the OS. One version of the
VS runtime (the most recent one when that version of the OS came out) is
installed by default, though. I don't know whether these calls are really meant
to be used or are there for legacy unix/posix apps. I always see code using the
native win32 calls, not the posix emulation functions in the runtime.
Post by trijezdci
I also assume those will work on DOS, not only on Windows
There is no default C runtime on Dos. Maybe your compiler brings one though,
and then the availability depends on that runtime.
Post by trijezdci
Under no circumstances am I going to do iteration over directories if all
I need is a boolean test if a given file exists or not. Nor am I going to
do it when all I need is the file size of a given file.
Findfirst has filespecs. If you pass the exact filename, it doesn't iterate.
Stat will use findfirst under the hood. The C runtime works on top of win32,
not the other way around. (the win32 lies on top of the NT API, but that is
a thin layer for calls like this)

Also, if you want unicode support, move directly to the two-byte (UTF-16) APIs.
One-byte APIs are a pain.
Post by trijezdci
If the Windows programming environment is such a horrible mess that there
aren't any extremely simple file-exists-yes-or-no,
file-size-in-bytes-or-error functions, then I am afraid somebody else will
have to step up and do this.
Under the hood it has to trawl directory entries anyway, so this is a
sterile difference. I believe you can use GetFileAttributes too

https://msdn.microsoft.com/en-us/library/windows/desktop/aa364944%28v=vs.85%29.aspx

and GetFileSize(Ex):
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364955%28v=vs.85%29.aspx
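For what it is worth, here is a rough sketch of a combined file-exists and file-size test. The function name is made up. On Windows it uses GetFileAttributesExA rather than the two calls linked above, as an assumption on my part that one call without opening a handle is preferable. The POSIX branch uses plain stat() so the same function builds on both sides. It has not been tried against m2c.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/stat.h>
#endif

/* Hypothetical helper: returns 1 if path names an existing file
 * (a regular file on POSIX, anything that is not a directory on Windows),
 * 0 otherwise. If size is not NULL, the file size in bytes is stored
 * there on success. */
int file_exists_with_size(const char *path, unsigned long long *size) {
#ifdef _WIN32
  WIN32_FILE_ATTRIBUTE_DATA info;
  if (!GetFileAttributesExA(path, GetFileExInfoStandard, &info)) {
    return 0; /* not found or not accessible */
  }
  if (info.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
    return 0;
  }
  if (size != NULL) {
    *size = ((unsigned long long)info.nFileSizeHigh << 32) |
            (unsigned long long)info.nFileSizeLow;
  }
  return 1;
#else
  struct stat info;
  if (stat(path, &info) != 0) {
    return 0; /* not found or not accessible */
  }
  if (!S_ISREG(info.st_mode)) {
    return 0;
  }
  if (size != NULL) {
    *size = (unsigned long long)info.st_size;
  }
  return 1;
#endif
}
```

Neither branch iterates over a directory from the caller's point of view: the result is a plain yes/no plus a byte count.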
Post by trijezdci
Post by Marco van de Voort
Most of the POSIX users don't really grok that it is shell that rewrites
commandlines for programs, and not the program itself, so assuming
shell parameter rewriting ports badly.
Whatever the shell passes to the program should be a path nevertheless and
that path should meet the pathname syntax rules. Are you suggesting it
turns the path into some internal gobbledeegook that is undocumented and
cannot be verified?
No. I'm saying *nix shells do a lot of things that way. Like expanding
wildcards and ~. Windows mostly only handles quoting of arguments.
Post by trijezdci
It wasn't like that 25 years ago which was the last time I did some
programming on a DOS machine.
If you move to Windows, best leave your Dos legacy at the door. Windows NT
and Dos are mostly unrelated. Windows 64-bit doesn't even run dos apps.
trijezdci
2015-12-16 20:03:13 UTC
Permalink
Raw Message
Post by Marco van de Voort
Also, if you want unicode support, directly move to two-byte (UTF16 apis).
One byte apis are pain.
Why would I want unicode support for a filename that is confined to characters of Modula-2 identifiers as it must match the module identifier?
Post by Marco van de Voort
getfileattributes
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364944%28v=vs.85%29.aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364955%28v=vs.85%29.aspx
I'll have a look but my hunch at this point is that somebody else will need to do the Windows part.
Post by Marco van de Voort
No. I'm saying *nix shells do a lot of things that way. Like expanding
wildcards and ~. Windows mostly only handles quoting of arguments.
Ok, I see I misread, but that doesn't really have any impact on verifying a pathname string on Windows.
Post by Marco van de Voort
If you move to Windows,
I am not.
Post by Marco van de Voort
best leave your Dos legacy at the door.
Not much of a legacy there after 25+ years. I hardly remember anything; what I do remember is that it wasn't that hard to test whether a file existed or to get the file size of a given file.
Post by Marco van de Voort
Windows NT and Dos are mostly unrelated.
Windows 64-bit doesn't even run dos apps.
Well, BSD, Linux and MacOS X -- 64 bit or not -- do run DOS apps via a utility called DOSbox which many folks are using and this allows you to share directories with the host system. It is therefore desirable to have M2C build and run under DOSbox.

Anyway, thanks for the URLs

regards
Nemo
2015-12-16 23:22:36 UTC
Permalink
Raw Message
[...] do run DOS apps via a utility called DOSbox
Turbo C 2.01 is available from the Borland Museum (via the Wayback
Machine
https://web.archive.org/web/20040202043446/http://community.borland.com/article/0,1410,20841,00.html)
and is easily installed under DOSbox. The first problem is to shorten
the names of the files in m2c.

DOSbox is fairly portable. I run mine on a PPC.

N.
r***@gmail.com
2015-12-17 00:32:20 UTC
Permalink
Raw Message
Hi,
Post by Nemo
[...] do run DOS apps via a utility called DOSbox
Turbo C 2.01 is available from the Borland Museum (via the Wayback
Machine and is easily installed under DOSbox. The first problem is
to shorten the names of the files in m2c.
DOSbox is fairly portable. I run mine on a PPC.
Honestly, such talk about DOS here feels very incomplete. I shall
open a new thread (in vain) to discuss it further.
trijezdci
2015-12-17 09:44:22 UTC
Permalink
Raw Message
m2c.exe seems to work okay for me. Though it does error out (incorrectly)
VAR blah: PROCEDURE(): INTEGER;
This should be fixed now in the latest commit.

https://bitbucket.org/trijezdci/m2c-rework/commits/9f8c3d1516988e88b457d3431b9f49eb3334137f

thanks for reporting.
Marco van de Voort
2015-12-17 09:26:06 UTC
Permalink
Raw Message
Post by trijezdci
Post by Marco van de Voort
Also, if you want unicode support, directly move to two-byte (UTF16 apis).
One byte apis are pain.
Why would I want unicode support for a filename that is confined to characters of Modula-2 identifiers as it must match the module identifier?
You might have a user with an accent in his name, and he might have that
perfectly old low ascii 8.3 name in a directory in his homedir. Then the
path could still require unicode. (depending on if he has a windows version
that has that accented character in its ACS 1 byte codepage)

(real example from FPC deployment experience)
Post by trijezdci
Post by Marco van de Voort
No. I'm saying *nix shells do a lot of things that way. Like expanding
wildcards and ~. Windows mostly only handles quoting of arguments.
Ok, I see I misread, but that doesn't really have any impact on verifying
a pathname string on Windows.
It means that if you rely on unix filename expanding instead of checking
your arguments for wildcards yourself, you will need to implement it
on windows yourself.
Post by trijezdci
Post by Marco van de Voort
best leave your Dos legacy at the door.
Not much of a legacy there after 25+ years. I hardly remember anything,
what I do remember is that it wasn't that hard to test if a file existed
and what the filesize of a given file was.
Nearly all systems do this by returning an error when retrieving some attribute of the file.
Post by trijezdci
Well, BSD, Linux and MacOS X -- 64 bit or not -- do run DOS apps via a
utility called DOSbox which many folks are using and this allows you to
share directories with the host system. It is therefore desirable to have
M2C build and run under DOSbox.
(I prefer native and emulators apart, unless they are deeply integrated to
the system and nearly transparent, like the Linuxator)
trijezdci
2015-12-17 09:56:09 UTC
Permalink
Raw Message
Post by Marco van de Voort
Post by trijezdci
Why would I want unicode support for a filename that is confined to characters of Modula-2 identifiers as it must match the module identifier?
You might have an user with an accent in his name, and he might have that
perfectly old low ascii 8.3 name in a directory in his homedir. Then the
path could still require unicode. (depending on if he has a windows version
that has that accented character in its ACS 1 byte codepage)
Then he or she will only be able to use relative paths.

At this stage in the development I have more important things to do.

I will not let myself get sidetracked with extreme borderline use cases.
Post by Marco van de Voort
It means that if you rely on unix filename expanding instead of checking
your arguments for wildcards yourself, you will need to implement it
on windows yourself.
The point is that I do not rely on it. That's why I have these filename and pathname verification functions. I also need to reject filenames that use characters allowed by the OS/filesystem but which are not M2 identifiers because all input files need to be 1:1 mapped to their M2 module identifiers.
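To illustrate the kind of check meant here, a small sketch that accepts a bare filename only if the part before its last period has the shape of a PIM identifier (a letter followed by letters and digits). The function name is made up, this is not the code in m2-fileutils, and any extra characters that M2C extensions may allow in identifiers are deliberately left out.

```c
#include <stddef.h>
#include <string.h>

/* ASCII-only classification, independent of the C locale. */
static int is_letter(char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

static int is_digit(char c) {
  return c >= '0' && c <= '9';
}

/* Hypothetical check: returns 1 if the part of filename before its last
 * period is a letter followed by letters and digits only, 0 otherwise.
 * filename is expected to contain no directory part. */
int filename_matches_ident_syntax(const char *filename) {
  const char *dot;
  size_t len, i;
  if (filename == NULL) {
    return 0;
  }
  dot = strrchr(filename, '.');
  len = (dot != NULL) ? (size_t)(dot - filename) : strlen(filename);
  if (len == 0 || !is_letter(filename[0])) {
    return 0;
  }
  for (i = 1; i < len; i++) {
    if (!is_letter(filename[i]) && !is_digit(filename[i])) {
      return 0;
    }
  }
  return 1;
}
```

Under these assumptions "FirstEx.mod" passes, while "First-Ex.mod" and "1st.mod" are rejected even though the filesystem accepts both.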
Post by Marco van de Voort
Post by trijezdci
Well, BSD, Linux and MacOS X -- 64 bit or not -- do run DOS apps via a
utility called DOSbox which many folks are using and this allows you to
share directories with the host system. It is therefore desirable to have
M2C build and run under DOSbox.
(I prefer native and emulators apart, unless they are deeply integrated to
the system and nearly transparent, like the Linuxator)
This is not about preferences, it is about what meets the goals of the project.

It is more likely that somebody will just download DOSbox than have an actual DOS machine. It is also more likely that somebody will just download DOSbox than install DOS on a virtual machine.

We have one contributor who is traveling a lot and does everything he does on a Netbook where the resources are limited to the point that he doesn't want a virtual machine nor a dual boot setup. DOSbox is the tool that takes the least effort and the least amount of resources, in terms of storage, memory and CPU. But the trouble was syncing M2 sources between M2M and M2C then, which means everyone else would have to use M2M, which is not desirable. Besides, it is always good to have control over the bootstrap system, too. It affords more flexibility and more independence.
Marco van de Voort
2015-12-17 10:18:49 UTC
Permalink
Raw Message
Post by trijezdci
Post by Marco van de Voort
You might have an user with an accent in his name, and he might have that
perfectly old low ascii 8.3 name in a directory in his homedir. Then the
path could still require unicode. (depending on if he has a windows
version that has that accented character in its ACS 1 byte codepage)
Then he or she will only be able to use relative paths.
Writing obsolete-by-design code. How nice :-)
Post by trijezdci
Post by Marco van de Voort
Post by trijezdci
M2C build and run under DOSbox.
(I prefer native and emulators apart, unless they are deeply integrated to
the system and nearly transparent, like the Linuxator)
This is not about preferences, it is about what meets the goals of the project.
I was talking terminology. Anyway, it doesn't matter, I got sidetracked by
the other dos related opinions. Anyway, I'll just backtrack and give my
opinion on the whole dos issue:

I would start by stating that I don't want issues like dos having problems
with LFN to hamper the compiler development on all targets (wrt directory
names, module names and other tooling) forever.

So no allowances for LFN, support dos by crosscompiling only (iow only
target, not host). Saves a lot of (recurring) discussion. So sorry, but
we live in almost 2016 now. Some things change.
trijezdci
2015-12-17 10:55:38 UTC
Permalink
Raw Message
Post by Marco van de Voort
Post by trijezdci
Then he or she will only be able to use relative paths.
Writing obsolete-by-design code. How nice :-)
No, it's called limiting scope and keeping focus on the upcoming milestones.

It makes no sense to get sidetracked with accommodating extreme borderline cases and in the process delay far more important deliverables. You can always make up many more extreme borderline cases.
Post by Marco van de Voort
I would start by stating that I don't want issues like dos having problems
with LFN to hamper the compiler development on all targets (wrt directory
names, module names and other tooling) forever.
Precisely.
Post by Marco van de Voort
So no allowances for LFN, support dos by crosscompiling only (iow only
target, not host). Saves a lot of (recurring) discussion. So sorry, but
we live in almost 2016 now. Some things change.
There are two issues with long filenames.

The first is the C side for the C sources. This is the painful part because we don't really want to have to maintain separate versions of the C sources with different filenames.

The second is the M2 side. This could be addressed with a module identifier to filename translation layer. The venerable M2SDS system for DOS did that.

Of course, none of that is a priority, but eventually, it will be desirable to be able to both build and run in DOSbox.

I agree that if this can't be easily achieved in a single step, then it makes sense to start with building on a hosted DOS system and then only deploy to DOSbox, so that at least the compiler can be used there.
Martin Brown
2015-12-17 17:31:13 UTC
Permalink
Raw Message
Post by trijezdci
Post by Marco van de Voort
Post by trijezdci
Why would I want unicode support for a filename that is confined to characters of Modula-2 identifiers as it must match the module identifier?
You might have an user with an accent in his name, and he might have that
perfectly old low ascii 8.3 name in a directory in his homedir. Then the
path could still require unicode. (depending on if he has a windows version
that has that accented character in its ACS 1 byte codepage)
Then he or she will only be able to use relative paths.
At this stage in the development I have more important things to do.
I will not let myself get sidetracked with extreme borderline use cases.
I can't see any point in supporting 8.3 names in a newly developed
compiler (although, if it has to be done, then allow a case-insensitive
match on the first 8 characters of the full name).
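A rough sketch of such a match, assuming the DOS basename is passed without its extension. The function and macro names are invented for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define DOS_BASENAME_MAX 8 /* length limit of the name part in an 8.3 name */

/* Hypothetical check: returns 1 if dos_basename equals the first eight
 * characters of module_ident, ignoring case, and 0 otherwise. */
int matches_dos_basename(const char *module_ident, const char *dos_basename) {
  size_t ident_len = strlen(module_ident);
  size_t expected =
      ident_len < DOS_BASENAME_MAX ? ident_len : DOS_BASENAME_MAX;
  size_t i;
  if (strlen(dos_basename) != expected) {
    return 0;
  }
  for (i = 0; i < expected; i++) {
    if (toupper((unsigned char)module_ident[i]) !=
        toupper((unsigned char)dos_basename[i])) {
      return 0;
    }
  }
  return 1;
}
```

The obvious catch is that two module identifiers sharing their first eight characters map to the same file, so a scheme like this needs either a naming discipline or a translation table.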

It is so unlikely that anyone will be running legacy bare 8.3 filename
MSDOS today that you can pretty much assume long filenames now and
accept losing a tiny fraction of the old embedded market.
Post by trijezdci
Post by Marco van de Voort
It means that if you rely on unix filename expanding instead of checking
your arguments for wildcards yourself, you will need to implement it
on windows yourself.
The point is that I do not rely on it. That's why I have these filename and pathname verification functions.
I also need to reject filenames that use characters allowed by the
OS/filesystem but which are not M2 identifiers
Post by trijezdci
because all input files need to be 1:1 mapped to their M2 module
identifiers.

Main gotcha difference in DOS/Win is that filename capitalisation is
ignored. So MyFile.mod and MYFILE.mod are the same file on MSDOS.
Post by trijezdci
Post by Marco van de Voort
Post by trijezdci
Well, BSD, Linux and MacOS X -- 64 bit or not -- do run DOS apps via a
utility called DOSbox which many folks are using and this allows you to
share directories with the host system. It is therefore desirable to have
M2C build and run under DOSbox.
Understood. I presume the ultimate aim is to have it run under any of
Windows (as a console DOSbox program), Unix (DOSbox) or Unix (native).
(even if a few compilation flags and link libraries have to change)

I will look at it again properly when I get a chance, but year end looms
large right now.

Provided you avoid using Posix fork this should not be too difficult.
(almost everything else has some way of doing it in DOS/Windows)
Post by trijezdci
Post by Marco van de Voort
(I prefer native and emulators apart, unless they are deeply integrated to
the system and nearly transparent, like the Linuxator)
This is not about preferences, it is about what meets the goals of the project.
It is more likely somebody will just download DOSbox, than having an actual DOS machine,
it is also more likely somebody will just download DOSbox instead of
installing DOS on a virtual machine.

It is more likely today that they will be on a Win7 or Win10 box with
DOS running inside the Windows environment as a console session.

Surely anyone on a Unix system will run Unix tools under native Unix.
Initially get it going in Unix and then we can tweak it later for DOS.
--
Regards,
Martin Brown
trijezdci
2015-12-17 18:14:16 UTC
Permalink
Raw Message
On Friday, 18 December 2015 02:31:22 UTC+9, Martin Brown wrote:

No, I won't be using fork().

Yes, the priority is on getting a working compiler on Posix asap, then Windows native, others come after that. Although, Tom Breeden has got it working on Amiga OS already so it may earn the distinction of being the first supported non-Posix platform even before Windows :-)
j***@gmail.com
2015-12-16 09:38:43 UTC
Permalink
Raw Message
All known PIM sources are online now at http://fruttenboel.verhoeven272.nl/tmp/ so everyone can now run the tests, no need to wait until someone else has done so.
j***@gmail.com
2015-12-17 09:27:31 UTC
Permalink
Raw Message
Has someone tested the lot on CP/M 2.2 or perhaps on CP/M 3? It was the major operating system until the late 80's.

To be honest: why would m2c need to parse the source on syntax? m2c is only fed with error free modula-2 code. The code runs in any Modula-2 compiler and then m2c is used to create C source, instead of machine code.

So, for example:

MODULE HiThere;

IMPORT InOut;

BEGIN
  LOOP ;;
    InOut.WriteString ("Hi There ")
  END
END HiThere.

which is perfectly good Modula-2 source code and it ought to be translated into

#include <stdio.h>

int main ()
{
label:
  printf ("Hi There ");
  goto label;
  return 0;
}

which is perfectly good C source code.

Only when m2c is meant to be a front end to gcc, syntax checking makes sense.
trijezdci
2015-12-17 10:08:00 UTC
Permalink
Raw Message
Post by j***@gmail.com
why would m2c need to parse the source on syntax?
In order to generate output, you need to recognise the structure of the input.

In order to recognise the structure of the input, you need to parse it.

Parsing it means you are verifying the syntax. Call it a byproduct of parsing if you like.
Post by j***@gmail.com
m2c is only fed with error free modula-2 code.
No, it is not.

We use it as a bootstrap tool, too.
Post by j***@gmail.com
Only when m2c is meant to be a front end to gcc, syntax checking makes sense.
You could omit semantic checking, but syntax checking is always required for any translator.
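A toy illustration of that point, not taken from M2C: a translator for a made-up mini-language of the form "Ident ; Ident ; ... ." that emits one C call per identifier. All names are hypothetical. There is no separate syntax checker in it. The errors simply fall out of the places where the translator cannot recognise what comes next.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

static const char *skip_spaces(const char *p) {
  while (*p == ' ') {
    p++;
  }
  return p;
}

/* Translates "Ident ; Ident ; ... ." into "Ident();\n" lines in out.
 * Returns 0 on success, the 1-based column of the first syntax error,
 * or -1 if out is too small. */
int translate_calls(const char *src, char *out, size_t out_size) {
  const char *p = src;
  size_t used = 0;
  if (out_size == 0) {
    return -1;
  }
  out[0] = '\0';
  for (;;) {
    const char *start;
    int written;
    p = skip_spaces(p);
    if (!isalpha((unsigned char)*p)) {
      return (int)(p - src) + 1; /* identifier expected */
    }
    start = p;
    while (isalnum((unsigned char)*p)) {
      p++;
    }
    written = snprintf(out + used, out_size - used, "%.*s();\n",
                       (int)(p - start), start);
    if (written < 0 || (size_t)written >= out_size - used) {
      return -1;
    }
    used += (size_t)written;
    p = skip_spaces(p);
    if (*p == ';') {
      p++;
      continue; /* another identifier must follow */
    }
    if (*p == '.') {
      return 0; /* end of input reached */
    }
    return (int)(p - src) + 1; /* ';' or '.' expected */
  }
}
```

Given "Foo; Bar." this produces two call statements. Given "Foo Bar." it stops at column 5, because the only way to know where one call ends and the next begins is to see the separator.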
trijezdci
2015-12-17 15:26:04 UTC
Permalink
Raw Message
In addition to Oberon style extensible record types I have now also added C style variable sized record types to the M2C parser.

TYPE String = VAR RECORD
  size : CARDINAL
VAR
  str : ARRAY size OF CHAR
END;

We are making use of this technique in M2C itself

typedef struct m2c_string_struct_t {
  unsigned int size;
  char str[];
};

And this is also how M2C will translate it to C.

Note that, in C this is not bounds checked, but in M2C it will be bounds checked and the field that determines the size of the array will automatically be assigned when a variable of the type is allocated and the field will then become immutable.

VAR s : String;

NEW(s, 20); (* allocated new string s of size 20 *)

WriteCard(s^.size); => 20
s^.str := "up to 19 characters";
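On the C side, the allocation and the bounds check described above might look roughly like the sketch below. The type and function names are invented here and are not M2C's. How M2C will actually emit the check is not settled by this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable sized record using a C99 flexible array member. */
typedef struct {
  unsigned int size; /* capacity of str, set once at allocation */
  char str[];
} vstring_t;

/* Allocates a record with room for size characters, including the
 * terminating NUL. Returns NULL if size is 0 or allocation fails. */
vstring_t *vstring_new(unsigned int size) {
  vstring_t *s;
  if (size == 0) {
    return NULL;
  }
  s = malloc(sizeof(vstring_t) + size);
  if (s == NULL) {
    return NULL;
  }
  s->size = size;
  s->str[0] = '\0';
  return s;
}

/* Bounds checked assignment: copies text into s->str and returns 1,
 * or returns 0 without copying if text plus its NUL does not fit. */
int vstring_assign(vstring_t *s, const char *text) {
  size_t len;
  if (s == NULL || text == NULL) {
    return 0;
  }
  len = strlen(text);
  if (len >= s->size) {
    return 0;
  }
  memcpy(s->str, text, len + 1);
  return 1;
}
```

With a record allocated by vstring_new(20), assigning "up to 19 characters" succeeds (19 characters plus NUL), while a 20 character string is refused.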

Like all non-PIM features, this will be enabled by compiler switch, but I have not assigned a compiler option for this as yet.

This is now in the latest version on the repo.
trijezdci
2015-12-19 08:03:36 UTC
Permalink
Raw Message
Post by Christoph Schlegel
There was one quirk that only shows up when using --parser-debug,
related to the NOT keyword, but I don't know why it isn't an error
otherwise.
That would be interesting to know more about.
*** simpleTerm ***
@ line: 120, column: 8, lookahead: NOT
line: 120, column: 8, unexpected reserved word NOT found
expected identifier, string, integer, real number, character code, '(' or '{'.
IF NOT Done THEN WriteString("Oops!"); WriteLn; HALT END;
*** simpleTerm ***
@ line: 164, column: 22, lookahead: NOT
line: 164, column: 22, unexpected reserved word NOT found
expected identifier, string, integer, real number, character code, '(' or '{'.
BEGIN NEW(stk); IF NOT dynalloc THEN
This is now fixed in the latest commit.

https://bitbucket.org/trijezdci/m2c-rework/commits/56c21c049ebf71db863ec1efdedd8694dc06ce34

Thanks for reporting.
trijezdci
2015-12-19 09:56:51 UTC
Permalink
Raw Message
C:\tmp>trijezdci\m2c m2p.mod --pim3
m2c Modula-2 Compiler & Translator, version 1.00
line: 98, column: 19, unexpected symbol '[' found
expected symbol ';'
line: 98, column: 29, unexpected symbol ';' found
expected reserved word END
line: 99, column: 10, unexpected symbol '=' found
expected symbol '.'
parse error count: 3
C:\tmp>sed 96,101!d m2p.mod
TYPE
natural = INTEGER[0..32767];
This is fixed now in the latest commit

https://bitbucket.org/trijezdci/m2c-rework/commits/c486dbebc2f4b97471eb1c4baf65f3353c8a1ebd#chg-EmptyStmtSeq.mod

Thanks for reporting.
trijezdci
2015-12-19 15:43:16 UTC
Permalink
Raw Message
I have now added a tests directory with source files for specific test scenarios

https://bitbucket.org/trijezdci/m2c-rework/src/tip/tests

This will gradually be populated with more tests.

Meanwhile, any test case contributions are welcome.
trijezdci
2015-12-19 22:21:33 UTC
The latest version now prints the source lines for warnings and errors when in verbose mode. This should make reporting bugs easier.

https://bitbucket.org/trijezdci/m2c-rework/commits/b21996595958bb7d8a572f2a5d5030c415518908

Example:

$ ./m2c ./tests/Semicolon.mod --pim4 --errant-semicolon --verbose
m2c Modula-2 Compiler & Translator, version 1.00
processing ./tests/Semicolon.mod
line 4, column 28, warning: semicolon at end of field list sequence

TYPE R = RECORD i : INTEGER; (* errant semicolon *) END;
^

line 9, column 22, warning: semicolon at end of field list sequence

bar : i : INTEGER; (* errant semicolon *)
^

line 10, column 23, warning: semicolon at end of field list sequence

| baz : n : CARDINAL; (* errant semicolon *)
^

line 12, column 14, warning: semicolon at end of field list sequence

ch : CHAR; (* errant semicolon *)
^

line 13, column 6, warning: semicolon at end of field list sequence

END; (* errant semicolon *)
^

line 18, column 18, warning: semicolon at end of field list sequence

size : CARDINAL; (* errant semicolon *)
^

line 20, column 30, warning: semicolon at end of field list sequence

buffer : ARRAY size OF CHAR; (* errant semicolon *)
^

line 24, column 37, warning: semicolon at end of formal parameter list

PROCEDURE Foo ( bar : Bar; baz : Baz; (* errant semicolon *) );
^

line 26, column 9, warning: semicolon at end of statement sequence

Barbaz; (* errant semicolon *)
^

line 31, column 18, warning: semicolon at end of statement sequence

WITH foo DO bar; (* errant semicolon *) END;
^

line 33, column 18, warning: semicolon at end of statement sequence

IF foo THEN bar; (* errant semicolon *) END;
^

line 35, column 37, warning: semicolon at end of statement sequence

IF foo THEN bar ELSIF baz THEN bam; (* errant semicolon *) END;
^

line 37, column 27, warning: semicolon at end of statement sequence

IF foo THEN bar ELSE baz; (* errant semicolon *) END;
^

line 40, column 14, warning: semicolon at end of statement sequence

bar : bam; (* errant semicolon *)
^

line 41, column 14, warning: semicolon at end of statement sequence

| baz : boo; (* errant semicolon *)
^

line 43, column 9, warning: semicolon at end of statement sequence

dodo; (* errant semicolon *)
^

line 46, column 11, warning: semicolon at end of statement sequence

LOOP bar; (* errant semicolon *) END;
^

line 48, column 19, warning: semicolon at end of statement sequence

WHILE foo DO bar; (* errant semicolon *) END;
^

line 50, column 13, warning: semicolon at end of statement sequence

REPEAT bar; (* errant semicolon *) UNTIL foo;
^

line 52, column 26, warning: semicolon at end of statement sequence

FOR i := 0 TO 99 DO bar; (* errant semicolon *) END;
^

line 54, column 17, warning: semicolon at end of statement sequence

Foobar; Bazbam; (* errant semicolon *)
^

parse error count: 0
trijezdci
2015-12-19 22:25:20 UTC
Post by trijezdci
line 4, column 28, warning: semicolon at end of field list sequence
TYPE R = RECORD i : INTEGER; (* errant semicolon *) END;
^
Note that the position marker will only be in the right place if you are using a monospaced font to display the newsgroup post. If you use Google's web interface, the position markers will most likely be off.
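
For readers wondering how such a marker line can be produced, here is a minimal sketch in C. It is not taken from the M2C sources: the function names format_marker and print_source_with_marker are hypothetical, and it assumes 1-based columns (which matches the first warning above, where the semicolon is the 28th character of the line) and source lines without tab characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of M2C: writes a marker line into buf,
 * consisting of (column - 1) spaces followed by '^'.
 * Columns are assumed to be 1-based.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_marker(char *buf, size_t bufsize, unsigned column) {
  assert(buf != NULL);
  assert(column >= 1);
  if ((size_t)column + 1 > bufsize) {
    return -1; /* need column characters plus the terminating NUL */
  }
  memset(buf, ' ', column - 1);
  buf[column - 1] = '^';
  buf[column] = '\0';
  return (int)column;
}

/* Prints the offending source line, followed by the marker line
 * if the marker fits into the local buffer. */
void print_source_with_marker(FILE *out, const char *line, unsigned column) {
  char marker[256];
  fprintf(out, "%s\n", line);
  if (format_marker(marker, sizeof marker, column) > 0) {
    fprintf(out, "%s\n", marker);
  }
}
```

Because the marker is padded with plain spaces, it only lines up when every character of the source line is rendered at the same width, which is why a proportional font or a tab in the source line throws it off.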
trijezdci
2015-12-21 12:31:57 UTC
As of the latest commit, the front end now supports CONST parameters via compiler switch --const-parameters.

TYPE P = PROCEDURE ( CONST ARRAY OF CHAR );

PROCEDURE WriteString ( s : CONST ARRAY OF CHAR );

This option is turned off in pim3 and pim4 mode.

$ ./m2c ./tests/ProcType.def --pim4 --verbose
m2c Modula-2 Compiler & Translator, version 1.00
processing ./tests/ProcType.def
line 9, column 23, error: unexpected reserved word CONST found
expected symbol ')'

TYPE P4 = PROCEDURE ( CONST CHAR );
^

line 15, column 23, error: unexpected reserved word CONST found
expected symbol ')'

TYPE P7 = PROCEDURE ( CONST ARRAY OF CHAR );
^
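
For anyone curious how a dialect mode can override an extension switch like this, the following is a rough sketch in C under stated assumptions. None of these names (dialect_t, options_t, make_options, starts_formal_type, the token constants) come from the M2C sources; they are made up for illustration, and the real front end may organise this quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dialect modes: two strict PIM modes and one extended mode. */
typedef enum { DIALECT_PIM3, DIALECT_PIM4, DIALECT_EXT } dialect_t;

typedef struct {
  dialect_t dialect;
  bool const_parameters; /* effective setting after applying the dialect */
} options_t;

/* The strict PIM modes force the extension off, whatever was requested. */
options_t make_options(dialect_t dialect, bool const_parameters_requested) {
  options_t opts;
  opts.dialect = dialect;
  opts.const_parameters =
      (dialect == DIALECT_EXT) && const_parameters_requested;
  return opts;
}

/* Hypothetical subset of tokens relevant to a formal type. */
typedef enum {
  TOKEN_CONST, TOKEN_VAR, TOKEN_ARRAY, TOKEN_IDENT, TOKEN_RPAREN
} token_t;

/* Returns true if lookahead may start a formal type inside the
 * parameter list of a procedure type. CONST is only accepted when
 * the extension is effectively enabled. */
bool starts_formal_type(const options_t *opts, token_t lookahead) {
  assert(opts != NULL);
  switch (lookahead) {
    case TOKEN_CONST:
      return opts->const_parameters;
    case TOKEN_VAR:
    case TOKEN_ARRAY:
    case TOKEN_IDENT:
      return true;
    default:
      return false;
  }
}
```

With options built by make_options(DIALECT_PIM4, true), the check rejects TOKEN_CONST, so a parser using it would report CONST as unexpected, as in the output above. In the hypothetical extended mode with the switch set, the same token is accepted.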

regards
trijezdci
2015-12-21 12:33:30 UTC
Post by trijezdci
As of the latest commit, the front end now supports CONST parameters via compiler switch --const-parameters.
TYPE P = PROCEDURE ( CONST ARRAY OF CHAR );
PROCEDURE WriteString ( s : CONST ARRAY OF CHAR );
This option is turned off in pim3 and pim4 mode.
$ ./m2c ./tests/ProcType.def --pim4 --verbose
m2c Modula-2 Compiler & Translator, version 1.00
processing ./tests/ProcType.def
line 9, column 23, error: unexpected reserved word CONST found
expected symbol ')'
TYPE P4 = PROCEDURE ( CONST CHAR );
^
line 15, column 23, error: unexpected reserved word CONST found
expected symbol ')'
TYPE P7 = PROCEDURE ( CONST ARRAY OF CHAR );
^
regards
That should have been PROCEDURE WriteString ( CONST s : ARRAY OF CHAR ) anyway.
trijezdci
2015-12-21 12:37:35 UTC
Please also note that various test files now come in two versions, one for PIM and one for M2C extensions. Files with PIM in the name are the PIM versions, for example ProcType.PIM.def.