Rendering Modula-2 source code

Discussion:

trijezdci

2015-08-17 11:49:37 UTC

This may be of some interest to readers of this group.

Years ago I contributed Modula-2 plug-ins and language description files for a variety of source code rendering frameworks and other syntax highlighting tools. To me the most important of these is the Pygments framework, because Bitbucket uses it and I keep most of my Modula-2 sources on Bitbucket.

More recently, it became apparent that it would be useful to render source code depending on the dialect in which the sources are written rather than settling for lowest common denominator capability. To this end I replaced my earlier Modula-2 plug-in for Pygments with a new one that has multi-dialect support. The multi-dialect plug-in is now part of the official Pygments distribution. It can also be found at:

https://bitbucket.org/trijezdci/m2r10/src/tip/_GRAMMAR/pygments.lexers.modula2.py

Detailed documentation can be found within the source of the file above.
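
For readers who want to try the plug-in from Python rather than on Bitbucket, a minimal sketch follows. It assumes a Pygments release that ships the multi-dialect Modula2Lexer, and that the lexer accepts a `dialect` option as its in-source documentation describes. The sample module and the function name `render_modula2` are made up for illustration.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import Modula2Lexer

# Made-up sample source, for illustration only.
SAMPLE = """MODULE Hello;
IMPORT InOut;
BEGIN
  InOut.WriteString("Hello");
  InOut.WriteLn
END Hello.
"""

def render_modula2(source, dialect=None):
    """Render Modula-2 source as an HTML fragment.

    Assumption: the lexer takes a 'dialect' option such as 'm2pim' or
    'm2iso'. When no dialect is given, the lexer uses its default mode.
    """
    options = {"dialect": dialect} if dialect else {}
    lexer = Modula2Lexer(**options)
    return highlight(source, lexer, HtmlFormatter())

if __name__ == "__main__":
    print(render_modula2(SAMPLE, dialect="m2pim"))
```

The HTML fragment still needs a style sheet on the page to show any styling.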

Although it is in principle possible to determine the dialect automatically, doing it right would require a full syntax analysis of the input source, and most of these frameworks are not designed for that. Also, most of these frameworks are written in inefficient languages such as Python and Ruby, where a full syntax analysis prior to rendering would add considerable server-side load. This may cause some large sites like Bitbucket to choke and possibly to remove such plug-ins, which the site operators would consider wasteful.

I thus decided on a design where the dialect is selected by a special comment tag within the source, ideally placed at the top, alongside the copyright notice. I invited the maintainers of Modula-2 compilers, as far as their contact details are known to me, to a discussion to agree on what the special comment tags should look like. The comment tags the plug-in recognises are an outcome of that discussion.

A dialect tag is a special comment, defined by the following EBNF:

dialectTag :
OpeningCommentDelim Prefix dialectOption ClosingCommentDelim ;

dialectOption :
baseDialect ( '+' languageExtension )? ;

baseDialect : 'm2pim' | 'm2iso' | 'm2r10' | 'objm2' ;

languageExtension :
'gm2' | 'mocka' | 'aglet' | 'gpm' | 'p1' | 'sbu' | 'xds' ;

Prefix : '!' ;

OpeningCommentDelim : '(*' ;

ClosingCommentDelim : '*)' ;

No whitespace is permitted between the tokens of a dialect tag.

The following is an example of such a dialect tag at the beginning of a Modula-2 source file:

(*!m2pim+gm2*) DEFINITION MODULE FooLib;
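
To make the grammar concrete, here is a rough sketch of how such a tag could be recognised with a regular expression. This is not the code used in the plug-in; the names `DIALECT_TAG` and `find_dialect_tag` are hypothetical, and the pattern simply mirrors the EBNF above, including the rule that no whitespace may appear between the tokens.

```python
import re

# Hypothetical helper, not taken from the plug-in: a pattern that
# mirrors the EBNF for dialect tags given in this post.
BASE_DIALECTS = ("m2pim", "m2iso", "m2r10", "objm2")
EXTENSIONS = ("gm2", "mocka", "aglet", "gpm", "p1", "sbu", "xds")

DIALECT_TAG = re.compile(
    r"\(\*!"                                           # OpeningCommentDelim Prefix
    r"(?P<base>" + "|".join(BASE_DIALECTS) + r")"      # baseDialect
    r"(?:\+(?P<ext>" + "|".join(EXTENSIONS) + r"))?"   # optional '+' languageExtension
    r"\*\)"                                            # ClosingCommentDelim
)

def find_dialect_tag(source):
    """Return (base, extension) for the first dialect tag, or None.

    The extension is None when the tag names only a base dialect.
    """
    match = DIALECT_TAG.search(source)
    if match is None:
        return None
    return match.group("base"), match.group("ext")

if __name__ == "__main__":
    print(find_dialect_tag("(*!m2pim+gm2*) DEFINITION MODULE FooLib;"))
```

A comment such as `(* !m2pim *)` is not treated as a tag by this sketch, because it contains whitespace between the tokens.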

A distinct benefit of dialect selection by an embedded special comment tag is that the tag states the intent of the author. Syntax used within the source file that is not supported by the specified dialect can then be rendered in red to indicate errors. The plug-in supports this via default style sheets.

Another feature of the plug-in is a special rendering mode called Algol Publication mode. In this mode, source text is rendered for publication in scientific papers and academic texts following the format of the Revised Algol-60 Language Report, which set a de facto standard for rendering algorithms in scientific publications. Many scientific texts use it to this day.

When rendering Modula-2 source text in Algol Publication mode, reserved words are rendered lowercase boldface (optionally underlined) and built-in identifiers are rendered lowercase boldface italic. In other words, the capitalisation of reserved words and built-in identifiers is then considered to be a form of stropping (see the Algol report for a definition of stropping).

The Algol Publication mode is activated by command line switch when invoking a local Pygments installation. It cannot be activated by special comment tags.
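
When Pygments is driven from Python instead of the command line, the same mode can presumably be requested through options. The sketch below assumes that the lexer switches to Algol Publication mode when it is given a `style` option of `algol`, and that a style named `algol` is installed alongside it, as the plug-in documentation describes. The function name `render_algol` and the sample source are illustrative only.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import Modula2Lexer

# Made-up sample source, for illustration only.
SAMPLE = """MODULE Hello;
BEGIN
END Hello.
"""

def render_algol(source):
    """Render Modula-2 source in Algol Publication mode (assumed options).

    Assumption: passing style='algol' to the lexer enables the lowercase
    conversion, and the formatter style 'algol' supplies the typefaces.
    """
    lexer = Modula2Lexer(style="algol")
    formatter = HtmlFormatter(style="algol")
    return highlight(source, lexer, formatter)

if __name__ == "__main__":
    print(render_algol(SAMPLE))
```

The option has to be given to both the lexer and the formatter here, since under these assumptions the lexer does the case conversion while the formatter picks the typefaces.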

Some example PDFs of rendered output can be downloaded below:

https://bitbucket.org/trijezdci/m2r10/downloads/M2LexerTestReport.ISOplusGM2.pdf

https://bitbucket.org/trijezdci/m2r10/downloads/M2LexerTestReport.PIMplusGM2.pdf

https://bitbucket.org/trijezdci/m2r10/downloads/M2LexerTestReport.M2R10.pdf

The Pygments rendering framework can be downloaded from

http://pygments.org

We have also looked into supporting GitHub, but they have recently started a migration to their own rendering framework. While in transition, they use various different renderers depending on the language, which makes it extremely difficult to provide support. We'll just have to wait until they have completed the migration.

I have also made a multi-dialect plug-in for Algol and one for Wirth's LOLA-2 HDL but these have not been committed yet to the Pygments distribution. The maintainer is only doing these commits sporadically. At a later date, I might also make a plug-in for Oberon but currently I have other priorities.

I hope this is useful to somebody out there.

j***@gmail.com

2015-08-18 08:14:28 UTC

Post by trijezdci
http://pygments.org

It's nice but if I need syntax highlighting, Kate is the best, so far. Kate also highlights standard TYPES (Cardinal, Integer, etc).

Gour

2015-08-18 08:48:32 UTC

Post by j***@gmail.com
It's nice but if I need syntax highlighting, Kate is the best, so
far. Kate also highlights standard TYPES (Cardinal, Integer, etc).

What if you e.g. need syntax highlighting in a blog post?

Sincerely,
Gour

--
He is a perfect yogī who, by comparison to his own self,
sees the true equality of all beings, in both their
happiness and their distress, O Arjuna!

trijezdci

2015-08-18 10:43:10 UTC

Post by Gour

Post by j***@gmail.com
It's nice but if I need syntax highlighting, Kate is the best, so
far. Kate also highlights standard TYPES (Cardinal, Integer, etc).

What if you e.g. need syntax highlighting in a blog post?

This reminds me of a movie I watched recently. The movie was "The Imitation Game". In one scene, Alan Turing as a young boy, was given a book on cryptography by his friend who explained that it was about text in plain view that nobody can understand unless they have the key. Alan Turing asked back "How is this different from ordinary speech?". Throughout the movie there was this theme of people using a common language but meaning different things. The movie portrayed Alan Turing to always be precise in his use of terminology, often to the point of annoying his coworkers.

I differentiate the terminology in question in this way: Syntax highlighting is what you do in an editor, source code rendering is what you do for publishing. Doing the one is similar to doing the other, but not the same.

Gour

2015-08-18 10:51:15 UTC

Post by trijezdci

Post by Gour

Post by j***@gmail.com
It's nice but if I need syntax highlighting, Kate is the best, so
far. Kate also highlights standard TYPES (Cardinal, Integer, etc).

What if you e.g. need syntax highlighting in a blog post?

I differentiate the terminology in question in this way: Syntax
highlighting is what you do in an editor, source code rendering is
what you do for publishing. Doing the one is similar to doing the
other, but not the same.

You’re right. I should use ’syntax highlighting’ in my question how kate
can help with source rendering. ;)

Sincerely,
Gour

--
As a blazing fire turns firewood to ashes, O Arjuna, so does the
fire of knowledge burn to ashes all reactions to material activities.

j***@gmail.com

2015-08-19 09:16:38 UTC

Post by trijezdci
I differentiate the terminology in question in this way: Syntax highlighting is what you do in an editor, source code rendering is what you do for publishing. Doing the one is similar to doing the other, but not the same.

Now you sound like Alan Turing.... I mentioned Kate. But 'jed' has excellent source code highlighting too, just like just about any editor.

trijezdci

2015-08-19 09:31:12 UTC

Post by j***@gmail.com
Now you sound like Alan Turing.... I mentioned Kate. But 'jed' has excellent source code highlighting too, just like just about any editor.

I will add some annoyance for you as well. The topic of this thread is rendering existing source text for publishing. Editing source code is off-topic. Editors are off-topic. You are welcome.

j***@gmail.com

2015-08-19 09:10:56 UTC

Post by Gour
What if you e.g. need syntax highlighting in a blog post?

Kate has an export function, pumping out HTML source. It's easy to cut/paste it in any webpage. Try it. You'll be surprised.

trijezdci

2015-08-19 09:25:37 UTC

Post by j***@gmail.com

Post by Gour
What if you e.g. need syntax highlighting in a blog post?

Kate has an export function, pumping out HTML source. It's easy to cut/paste it in any webpage. Try it. You'll be surprised.

That may well be fine for the odd infrequent blog entry, in other words occasional use.

But if you run a blog where you add source code on a daily basis, then you don't want to cut and paste, you want your blog to understand the language and automatically render your Modula-2 sources without you doing anything other than marking the source section, for example [[code=Modula2]]...[[/code]] or whatever your markup may be.
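
As a rough sketch of what such automatic rendering could look like, the code below replaces every section marked with the made-up `[[code=...]]` markup by its rendered form. All names here are hypothetical. The Pygments-based renderer is imported lazily, so the default fallback (plain HTML escaping) works even where Pygments is not installed.

```python
import html
import re

# Matches the hypothetical blog markup [[code=Language]]...[[/code]].
CODE_SECTION = re.compile(r"\[\[code=(\w+)\]\](.*?)\[\[/code\]\]", re.DOTALL)

def escape_only(language, source):
    """Fallback renderer: no styling, just HTML-escaped text."""
    return "<pre>" + html.escape(source) + "</pre>"

def pygments_renderer(language, source):
    """Render with Pygments; imported lazily so it is only needed if used."""
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import get_lexer_by_name
    lexer = get_lexer_by_name(language.lower())
    return highlight(source, lexer, HtmlFormatter())

def render_post(text, renderer=escape_only):
    """Replace every marked code section with its rendered form."""
    def replace(match):
        language, source = match.group(1), match.group(2)
        return renderer(language, source.strip("\n"))
    return CODE_SECTION.sub(replace, text)

if __name__ == "__main__":
    post = "Intro\n[[code=Modula2]]\nMODULE Foo;\nEND Foo.\n[[/code]]\nOutro"
    print(render_post(post))
```

Calling `render_post(post, renderer=pygments_renderer)` would hand the marked sections to Pygments instead, so the author only has to mark the source section.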

It gets even worse if you run a site like Bitbucket or GitHub with repositories of different authors; then you cannot use anything that requires cutting and pasting. Whenever somebody views a file in a given repository, that file needs to be rendered on the fly because the files are subject to change and new files are being checked in all the time. The repo has to do all of that automatically, ideally with a framework that is integrated into the content management system.

trijezdci

2015-08-18 09:59:34 UTC

Post by j***@gmail.com
It's nice but if I need syntax highlighting, Kate is the best, so far. Kate also highlights standard TYPES (Cardinal, Integer, etc).

Pygments will do that, too, if you want it. It uses style sheets and you can select from various pre-defined style sheets or you can make your own style sheets. It is up to the plug-in to recognise different syntactical entities and mark them accordingly. A lazily written plug-in might only mark reserved words and built-in identifiers and not distinguish anything. In that case the style sheet will not help you to render different entities in different styles of course. However, many of the plug-ins I looked at do distinguish between different syntactical entities in fair detail.
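
To illustrate the point about making your own style sheet, here is a small sketch of a custom Pygments style and the CSS generated from it. The class name `SoberStyle`, the function `style_sheet`, and the particular choice of typefaces and greys are made up for this example. Which token types a given plug-in actually emits depends on that plug-in.

```python
from pygments.formatters import HtmlFormatter
from pygments.style import Style
from pygments.token import Comment, Keyword, Name, String

class SoberStyle(Style):
    """Hypothetical low-colour style: emphasis by typeface, not colour."""
    styles = {
        Keyword.Reserved: "bold",       # reserved words in boldface
        Name.Builtin: "italic",         # built-in identifiers in italics
        Comment: "italic #666666",      # comments in grey italics
        String: "#333333",              # string literals in dark grey
    }

def style_sheet(style=SoberStyle, selector=".highlight"):
    """Return the CSS rules a page needs to display rendered output."""
    return HtmlFormatter(style=style).get_style_defs(selector)

if __name__ == "__main__":
    print(style_sheet())
```

The same formatter also accepts the name of a pre-defined style in place of the class, so switching between ready-made and home-made style sheets is a one-argument change.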

I personally like to differentiate between source code rendering and syntax highlighting:

Source code rendering is for publishing sources, either in print or online.

Syntax highlighting is for editing within a syntax aware editor or IDE.

For source code rendering I actually don't like it all that colourful to be honest. When I am editing sources, then I may want to have more differentiation and thus more colours.

Pygments is a rendering framework designed to render an entire file or one or more chunks of source within a web page once for display, or for producing an output file for print or for pasting into a document. It is not designed as a highlighter within an editor or IDE.

I hope this makes more sense now.