Acessing array by its address

Discussion:

(too old to reply)

Anton Shepelev

2017-10-14 19:14:51 UTC

Hello, all

My generic sorting routine accepts type-specific op-
erations as the following procedural variables:

TYPE
SwapProc = PROCEDURE( ADDRESS, INTEGER, INTEGER );
CompProc = PROCEDURE( ADDRESS, INTEGER, INTEGER ): INTEGER;

So that for an array of INTEGERs the user may imple-
ments them as follows:

PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;

PROCEDURE Swap( data: ADDRESS; i, j: INTEGER );
VAR temp: INTEGER; p: PArray;
BEGIN
p := data;
temp := p^[i];
p^[i] := p^[j];
p^[j] := temp;
END Swap;

I have misgivings about this syntax for accessing an
array by its address: will not the dereference oper-
ator cause the copying of the entire array to the
stack every time, and if it will, what are the al-
ternatives?

--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

tbreeden

2017-10-16 18:01:30 UTC

Permalink

Post by Anton Shepelev
Hello, all
My generic sorting routine accepts type-specific op-
TYPE
SwapProc = PROCEDURE( ADDRESS, INTEGER, INTEGER );
CompProc = PROCEDURE( ADDRESS, INTEGER, INTEGER ): INTEGER;
So that for an array of INTEGERs the user may imple-
PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;
PROCEDURE Swap( data: ADDRESS; i, j: INTEGER );
VAR temp: INTEGER; p: PArray;
BEGIN
p := data;
temp := p^[i];
p^[i] := p^[j];
p^[j] := temp;
END Swap;
I have misgivings about this syntax for accessing an
array by its address: will not the dereference oper-
ator cause the copying of the entire array to the
stack every time, and if it will, what are the al-
ternatives?

Hi Anton,

The user will be calling the procs like this?
Compare(ADR(myarray), i, j);
or
Compare(ADR(myarray[0]), i, j);

The compiler only knows the "data:" parameter as an ADDRESS,
so it probably will copy only one word (or not copy at all, since "data:" is never
assigned).

Note though that this approach in general is extremly unsafe
since ADDRESS could point to anything. It relies on the caller
for all parameter checking and even if passed only arrays of
INTEGER all array overflow checking is lost.

It is probably a better idea to provide separate safe functions for arrays
of various commonly used
types like INTEGER, CARDINAL, REAL, maybe along with DEF file exports of useful ones
like String20 = ARRAY[0..21] OF CHAR; String80 = ARRAY[0..81] OF CHAR;, etc.

PROCEDURE SortInt(VAR Array:ARRAY OF INTEGER);
PROCEDURE SortStr20(VAR Array:ARRAY OF String20);
etc.

And, for projects, various specialized ones for RECORDS and POINTERS
unique to that project.

I've never found, within plain M2, a safe way to write a truly generic
sort routine. I guess that is why ISO M2 added a "Generic M2" extension
document. But I'm not sure anyone has ever implemented "Generic M2".

In the past, I've defined a generic sorting procedure like this:

TYPE CmpFuncPtrs = PROCEDURE(ADDRESS, ADDRESS):INTEGER;
PROCEDURE QuickSortSized(VAR Array:ARRAY OF BYTE; ArrayByteSize:CARDINAL;
ElemByteSize:CARDINAL;
Compare:CmpFuncPtrs);

and used an internal POINTER TO BYTE type to do the sort processing
byte by ElemByteSize, via callbacks to a user supplied compare function which he/she
will have to handle a type transfer between ADDRESS and POINTER TO histype in
doing the compare

BUT I definitely have misgivings about "programming in the mode of C" this way.

FWIW,
tom

Anton Shepelev

2017-10-17 12:28:07 UTC

Permalink

Post by tbreeden

Post by Anton Shepelev
My generic sorting routine accepts type-specific
operations as the following procedural vari-
TYPE
SwapProc = PROCEDURE( ADDRESS, INTEGER, INTEGER );
CompProc = PROCEDURE( ADDRESS, INTEGER, INTEGER ): INTEGER;
So that for an array of INTEGERs the user may
PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;
PROCEDURE Swap( data: ADDRESS; i, j: INTEGER );
VAR temp: INTEGER; p: PArray;
BEGIN
p := data;
temp := p^[i];
p^[i] := p^[j];
p^[j] := temp;
END Swap;
I have misgivings about this syntax for access-
ing an array by its address: will not the deref-
erence operator cause the copying of the en-
tire array to the stack every time, and if it
will, what are the alternatives?

The user will be calling the procs like this?
Compare(ADR(myarray), i, j);
or
Compare(ADR(myarray[0]), i, j);

The first.

Post by tbreeden
The compiler only knows the "data:" parameter as
an ADDRESS, so it probably will copy only one word

I forgot to add that the type PArray may be:

TYPE TArray = ARRAY OF INTEGER;
TYPE PArray = POINTER TO TArray;

So it is the strongly-typed pointer to a value of
the user-defined type that I am dereferencing. And
when I dereference a pointer to an array, how do I
know whether the compiler will not copy the full ar-
ray onto the stack?

Post by tbreeden
(or not copy at all, since "data:" is never as-
signed).

How can this happen? The user-supplied data-trans-
fer and comparison procedures quoted above must
needs work via CPU registers, so *some* data (at
least the pair of array elements specified by the
indices) it must copy to the stack.

Post by tbreeden
Note though that this approach in general is ex-
tremly unsafe since ADDRESS could point to any-
thing.

I think that sometimes we may tolerate this in ex-
change for the *huge* benifit of generic code.

Post by tbreeden
It relies on the caller for all parameter checking
and even if passed only arrays of INTEGER all ar-
ray overflow checking is lost.

I don't think so, for the dereference occur in user
code and applies to strongly-typed pointers to an
array of known size).

Post by tbreeden
It is probably a better idea to provide separate
safe functions for arrays of various commonly used
types like INTEGER, CARDINAL, REAL, maybe along
with DEF file exports of useful ones like
String20 = ARRAY[0..21] OF CHAR;
String80 = ARRAY[0..81] OF CHAR; , etc
PROCEDURE SortInt(VAR Array:ARRAY OF INTEGER);
PROCEDURE SortStr20(VAR Array:ARRAY OF String20);
etc.
And, for projects, various specialized ones for
RECORDS and POINTERS unique to that project.

Although it is the right solution for some low-leven
programs with a fixed structure, such as data com-
pressors, there are many applications where this ap-
proach will cause massive and painful copy-pasting.

--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

tbreeden

2017-10-19 02:40:49 UTC

Permalink

Hello Anton,

Your approach will work, but it still is unsafe.

Post by Anton Shepelev
I think that sometimes we may tolerate this in ex-
change for the *huge* benifit of generic code.

This is debatable.

Neither the KineSort module procedures nor the
user callback procedures can compile in checks
for correctness of the parameters "data", "left", and
"right", leaving them both at risk at runtime of
overwriting memory via programmer error or even malicious
attack.
Maintaining type-safety would have eliminated this
risk at least.

Lots of Microsoft and Linux programmers have probably
regretted the wrong decision on this.

Post by Anton Shepelev
Although it is the right solution for some low-leven
programs with a fixed structure, such as data com-
pressors, there are many applications where this ap-
proach will cause massive and painful copy-pasting.

As I mentioned, the creators of ISO Modula-2 recognized
the problem and came up with an extension document
for generic programming. It added a new special type
of module, GENERIC, that added the ability to pass a
type-parameter to the module itself, with the callback
procedures like Swap() and Compare() defined w/r/t
that type-parameter. The idea was that the compiler
would build and create code for a "refined" generic
procedure as required when compiling the caller.

or something like this. Don't know whether it
has ever been implemented.

Here is a reference to an informal paper where they
describe the idea.

ACM SIGPLAN Notices
Volume 32 Issue 11, Nov. 1997
https://dl.acm.org/citation.cfm?id=270949

It would be interesting to survey how other type-safe
languages have handled this.

regards,

tom

r***@gmail.com

2017-10-19 23:52:40 UTC

Permalink

Hi,

Post by tbreeden
As I mentioned, the creators of ISO Modula-2 recognized
the problem and came up with an extension document
for generic programming.
Don't know whether it has ever been implemented.

I don't have any Macs, but p1 Modula-2 claims this:

http://modula2.awiedemann.de/

Object Oriented Modula-2 compiler
for Apple Macintosh supporting Mach-O as command line tool or from Xcode .
Implementing ISO standards IS 10514-1,2,3
Last update: July, 17th 2017

And apparently "ISO/IEC 10514-2:1998" is "Part 2: Generics Modula-2".

tbreeden

2017-10-23 16:14:30 UTC

Permalink

Post by r***@gmail.com

Post by tbreeden
Don't know whether it has ever been implemented.

http://modula2.awiedemann.de/
Object Oriented Modula-2 compiler
for Apple Macintosh supporting Mach-O as command line tool or from Xcode .
Implementing ISO standards IS 10514-1,2,3
Last update: July, 17th 2017
And apparently "ISO/IEC 10514-2:1998" is "Part 2: Generics Modula-2".

Wow. I'm tempted to look for an old Mac just to try this out.
Maybe I can get them to port to my Powerpc Amiga. :)

Thanks for the information.

BTW, w/r/t generic programming:

Another proposal that you is very interesting is the M2R10 updated Modula-2
started by B.Kowarsch and R.Sutcliffe with sources on GitHub.com.
It has some beautiful ideas on type safe generic programming which may even get
into GNU M2 at some time in the future.

see Modula-2 R10 Blueprints
http://www.callapple.org/columns/the-northern-spy-modula-2-r10-blueprints/

tom

Chris Burrows

2017-10-24 11:58:10 UTC

Permalink

Post by tbreeden
I've never found, within plain M2, a safe way to write a truly generic
sort routine. I guess that is why ISO M2 added a "Generic M2" extension
document. But I'm not sure anyone has ever implemented "Generic M2".

I haven't tried it myself, but according to the documentation ADW (ex-Stonybrook) Modula-2 supports generic modules:

https://www.modula2.org/

Regards,
Chris Burrows
CFB Software
http://www.cfbsoftware.com/modula2

tbreeden

2017-10-26 14:48:57 UTC

Permalink

Thanks.
Better and better!

tom

Martin Brown

2017-10-17 13:24:13 UTC

Permalink

p := CAST(PArray, data); ought to do it

It isn't very generic if it assumes that you can treat the contents of
PArray as being identical to a 4 byte INTEGER in all respects.

The above comparison would fail badly half the time with CARDINAL.

Post by Anton Shepelev
PROCEDURE Swap( data: ADDRESS; i, j: INTEGER );
VAR temp: INTEGER; p: PArray;
BEGIN
p := data;
temp := p^[i];
p^[i] := p^[j];
p^[j] := temp;
END Swap;
I have misgivings about this syntax for accessing an
array by its address: will not the dereference oper-
ator cause the copying of the entire array to the
stack every time, and if it will, what are the al-
ternatives?

PROCEDURE Compare ( VAR data : ARRAY OF whatever; i, j: CARDINAL): INTEGER;

Ensures a safe reference retains array bounds and avoids copying the
array. The one thing MODULA lacks is a promise that the PROCEDURE will
not modify the contents of an ARRAY passed by reference for speed.

PROCEDURE Compare ( CONST data : ARRAY OF whatever; i, j: CARDINAL):
INTEGER;

Would enforce the rule that data must not be altered whilst allowing
read access to any element of it.

You could CAST data into any TYPE you like but woe betide you if the
input data does not genuinely point to the right sort of object.

--
Regards,
Martin Brown

Anton Shepelev

2017-10-17 14:08:12 UTC

Permalink

Post by Martin Brown

p := CAST(PArray, data); ought to do it

That line works as it stands. By "immediately" I
meant without the intermediate variable. Since
pointer CAST is only a pseudoinstruction telling the
compiler to treat an ADDRESS variable as a pointer
to a specific type, that variable should not be re-
quired. I wanted to write:

CAST( PArray, data )^[i] - CAST( PArray, data )^[j];

It is possible in Borland Pascal and C.

Post by Martin Brown
It isn't very generic if it assumes that you can
treat the contents of PArray as being identical to
a 4 byte INTEGER in all respects.

I think you misunderstood me because I forgot to in-
clude the declaration of PArray:

TArray = ARRAY[0..Size - 1] OF INTEGER;
PArray = POINTER TO TArray;

Post by Martin Brown
The above comparison would fail badly half the
time with CARDINAL.

Compare is the user-supplied procedure that my
generic unit accepts as a procedural variable. For
an array of CARDINALs it would be implemented other-
wise.

My generic unit makes no assumptions about the
structure of the underlying data and accesses it ex-
clusively via two user-supplied proceudres: Swap and
Compare. Its full code is here:

https://pastebin.com/3bEbpYgs

--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

Martin Brown

2017-10-17 14:38:00 UTC

Permalink

Post by Anton Shepelev

Post by Martin Brown

Post by Anton Shepelev
PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;

p := CAST(PArray, data); ought to do it

That line works as it stands. By "immediately" I
meant without the intermediate variable. Since
pointer CAST is only a pseudoinstruction telling the
compiler to treat an ADDRESS variable as a pointer
to a specific type, that variable should not be re-
CAST( PArray, data )^[i] - CAST( PArray, data )^[j];
It is possible in Borland Pascal and C.

It might have made a difference in the ancient days of PIM3 Logitech 4
pass ETH compilers with zero worthwhile optimisation of anything at all
but with modern compilers you don't need to micromanage code in an HLL.
It is more important to make clear your intentions - the optimiser will
remove or assign to registers any transient variables.

Post by Anton Shepelev

Post by Martin Brown
It isn't very generic if it assumes that you can
treat the contents of PArray as being identical to
a 4 byte INTEGER in all respects.

I think you misunderstood me because I forgot to in-
TArray = ARRAY[0..Size - 1] OF INTEGER;
PArray = POINTER TO TArray;

What do you have against declaring the routines as taking an open array
of TYPE ?

--
Regards,
Martin Brown

Anton Shepelev

2017-10-17 17:18:09 UTC

Permalink

Post by Martin Brown
What do you have against declaring the routines as
taking an open array of TYPE ?

My original design to write a generic sortine mod-
ule, which can be used any kind of array-like data.

--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

Chris Burrows

2017-10-17 20:44:50 UTC

Permalink

Post by Anton Shepelev

Post by Martin Brown
What do you have against declaring the routines as
taking an open array of TYPE ?

My original design to write a generic sortine mod-
ule, which can be used any kind of array-like data.

Why choose to use a type-safe language if you want to bypass its type-safety capabilities?

Anton Shepelev

2017-10-18 08:33:43 UTC

Permalink

Why choose to use a type-safe language if you want
to bypass its type-safety capabilities?

In order to avail myself of type-safety in the over-
whelming majority of cases where it does more good
than ill. Generic alrorithyms are a notable excep-
tion, for the only alternative to unsafe mechanics
(loopholes) is heavy code duplication.

--
() ascii ribbon campaign - against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

trijezdci

2018-06-10 13:33:07 UTC

Permalink

I realise this discussion is now 8 months old but I think it warrants some clarification, if only for posterity.

Post by Anton Shepelev

Post by Martin Brown
What do you have against declaring the routines as
taking an open array of TYPE ?

My original design to write a generic sortine mod-
ule, which can be used any kind of array-like data.

It seems to me that you are confusing "generic" and "dynamically typed".

The term generic programming has a very specific meaning. The terminology can be a bit misleading. A better name would be template programming because that is what it actually comes down to.

When you write a generic library, you are writing a template with a type placeholder. When you are using the generic library for a given type, a template engine built into your compiler will do template expansion by replacing the type placeholder with the actual type to create a new library specific to the given type. This is called specialisation but in reality it is simply a template expansion.

The benefit of generic programming is that you only need to design the code once. The template engine will then rewrite it for you when you use it. The disadvantage is that every time you expand a template with a type it hasn't been used with yet, you are duplicating the code. You get one copy of your library for each type you use it with. However, since each expanded library is specific to one type, it is perfectly type safe as long as the language is type safe.

What you described in your casting array example is NOT generic code.

Instead of using a template and expand it to derive specialised libraries, your approach uses a single copy of code that aims to act on multiple different types at runtime. This is dynamic typing, even though in your example it is a very crude form of dynamic typing and without any type safety whatsoever. It is not generic programming however.

As others have pointed out, ISO Modula-2 defines a generic programming extension and there are two compilers that support this. This extension is proper generic programming, a template expansion facility that specialises the code at compiler time.

If you are working with platforms where there is no M2 compiler that supports generic programming, you can easily roll your own template engine to expand placeholders and use that on your generic code as a pre-processor prior to compilation.

We do that in Modula-2 R10 where placeholders are enclosed with a leading and traling ##.

PROCEDURE SortArray ( VAR a : ARRAY OF ##ComponentType##; ascending : BOOLEAN );

You call the template engine with a type to be used for the placeholder, say

ComponentType=INTEGER

and the example above will expand to

PROCEDURE SortArray ( VAR a : ARRAY OF INTEGER; ascending : BOOLEAN );

In Modula-2 R10 the compiler supports this through a directive that can be used within the import section of a module to feed placeholder/translation pairs to the template engine and launch it (as it is an external utility).

IMPORT SuperLongInteger;

GENLIB SortSuperLongIntArraySort FROM SortArrays FOR
ComponentType = SuperLongInteger
END; (* GENLIB *)

IMPORT SuperLongIntArraySort;

But this is merely convenience. You can do this manually with a simple template expansion engine for any kind of code and any kind of language. All you need to do is make sure that the delimiters that mark the placeholders are not used otherwise in the language for which you are writing templates.

Note, it is entirely unnecessary to do type checking on generic templates as is done in ISO Modula-2 because the compiler will do type checking on the expanded source after expansion anyway.

However, in your example, you were trying to do dynamic typing. If this is your preference, you can do that, too. But you need some form of OOP with the ability to do runtime type testing.

In Modula-2 R10 we use the Oberon model. In Oberon, Wirth replaced variant record types with extensible record types. This means, you can define a new record type based on an already existing record type.

(* Root Type *)
TYPE Number = RECORD ( NIL )
...
END; (* Number *)

(* Sub Type *)
TYPE Integer = RECORD ( Number )
value : INTEGER
END; (* Integer *)

TYPE Cardinal = RECORD ( Number )
value : CARDINAL
END; (* Cardinal *)

etc etc

A type extension is compatible with the type it extends. Thus you can pass instances of both Integer and Cardinal to parameters of type Number.

A CASE statement within the procedure is then used to determine at runtime what the actual type is before handling the value field ...

PROCEDURE Foo ( n : Number );
BEGIN
CASE n OF
| Integer : (* code to handle integer values *)
| Cardinal : (* code to handle cardinal values *)
...
END (* CASE *)
END Foo;

For more detail, check this article:

https://github.com/m2sf/m2bsk/wiki/Object-Oriented-Programming

You can do this in Oberon (with some minor differences in syntax). ISO Modula-2's object oriented extension is not dynamically typed, so you would have to manually add a type field to the class, maintain it and check that.

hope this clarifies
benjamin