Discussion:
Acessing array by its address
(too old to reply)
Anton Shepelev
2017-10-14 19:14:51 UTC
Permalink
Raw Message
Hello, all

My generic sorting routine accepts type-specific op-
erations as the following procedural variables:

TYPE
SwapProc = PROCEDURE( ADDRESS, INTEGER, INTEGER );
CompProc = PROCEDURE( ADDRESS, INTEGER, INTEGER ): INTEGER;

So that for an array of INTEGERs the user may imple-
ments them as follows:

PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;

PROCEDURE Swap( data: ADDRESS; i, j: INTEGER );
VAR temp: INTEGER; p: PArray;
BEGIN
p := data;
temp := p^[i];
p^[i] := p^[j];
p^[j] := temp;
END Swap;

I have misgivings about this syntax for accessing an
array by its address: will not the dereference oper-
ator cause the copying of the entire array to the
stack every time, and if it will, what are the al-
ternatives?
--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]
tbreeden
2017-10-16 18:01:30 UTC
Permalink
Raw Message
Post by Anton Shepelev
Hello, all
My generic sorting routine accepts type-specific op-
TYPE
SwapProc = PROCEDURE( ADDRESS, INTEGER, INTEGER );
CompProc = PROCEDURE( ADDRESS, INTEGER, INTEGER ): INTEGER;
So that for an array of INTEGERs the user may imple-
PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;
PROCEDURE Swap( data: ADDRESS; i, j: INTEGER );
VAR temp: INTEGER; p: PArray;
BEGIN
p := data;
temp := p^[i];
p^[i] := p^[j];
p^[j] := temp;
END Swap;
I have misgivings about this syntax for accessing an
array by its address: will not the dereference oper-
ator cause the copying of the entire array to the
stack every time, and if it will, what are the al-
ternatives?
Hi Anton,

The user will be calling the procs like this?
Compare(ADR(myarray), i, j);
or
Compare(ADR(myarray[0]), i, j);

The compiler only knows the "data:" parameter as an ADDRESS,
so it probably will copy only one word (or not copy at all, since "data:" is never
assigned).

Note though that this approach in general is extremly unsafe
since ADDRESS could point to anything. It relies on the caller
for all parameter checking and even if passed only arrays of
INTEGER all array overflow checking is lost.

It is probably a better idea to provide separate safe functions for arrays
of various commonly used
types like INTEGER, CARDINAL, REAL, maybe along with DEF file exports of useful ones
like String20 = ARRAY[0..21] OF CHAR; String80 = ARRAY[0..81] OF CHAR;, etc.

PROCEDURE SortInt(VAR Array:ARRAY OF INTEGER);
PROCEDURE SortStr20(VAR Array:ARRAY OF String20);
etc.

And, for projects, various specialized ones for RECORDS and POINTERS
unique to that project.

I've never found, within plain M2, a safe way to write a truly generic
sort routine. I guess that is why ISO M2 added a "Generic M2" extension
document. But I'm not sure anyone has ever implemented "Generic M2".

In the past, I've defined a generic sorting procedure like this:

TYPE CmpFuncPtrs = PROCEDURE(ADDRESS, ADDRESS):INTEGER;
PROCEDURE QuickSortSized(VAR Array:ARRAY OF BYTE; ArrayByteSize:CARDINAL;
ElemByteSize:CARDINAL;
Compare:CmpFuncPtrs);

and used an internal POINTER TO BYTE type to do the sort processing
byte by ElemByteSize, via callbacks to a user supplied compare function which he/she
will have to handle a type transfer between ADDRESS and POINTER TO histype in
doing the compare

BUT I definitely have misgivings about "programming in the mode of C" this way.

FWIW,
tom
Anton Shepelev
2017-10-17 12:28:07 UTC
Permalink
Raw Message
Post by tbreeden
Post by Anton Shepelev
My generic sorting routine accepts type-specific
operations as the following procedural vari-
TYPE
SwapProc = PROCEDURE( ADDRESS, INTEGER, INTEGER );
CompProc = PROCEDURE( ADDRESS, INTEGER, INTEGER ): INTEGER;
So that for an array of INTEGERs the user may
PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;
PROCEDURE Swap( data: ADDRESS; i, j: INTEGER );
VAR temp: INTEGER; p: PArray;
BEGIN
p := data;
temp := p^[i];
p^[i] := p^[j];
p^[j] := temp;
END Swap;
I have misgivings about this syntax for access-
ing an array by its address: will not the deref-
erence operator cause the copying of the en-
tire array to the stack every time, and if it
will, what are the alternatives?
The user will be calling the procs like this?
Compare(ADR(myarray), i, j);
or
Compare(ADR(myarray[0]), i, j);
The first.
Post by tbreeden
The compiler only knows the "data:" parameter as
an ADDRESS, so it probably will copy only one word
I forgot to add that the type PArray may be:

TYPE TArray = ARRAY OF INTEGER;
TYPE PArray = POINTER TO TArray;

So it is the strongly-typed pointer to a value of
the user-defined type that I am dereferencing. And
when I dereference a pointer to an array, how do I
know whether the compiler will not copy the full ar-
ray onto the stack?
Post by tbreeden
(or not copy at all, since "data:" is never as-
signed).
How can this happen? The user-supplied data-trans-
fer and comparison procedures quoted above must
needs work via CPU registers, so *some* data (at
least the pair of array elements specified by the
indices) it must copy to the stack.
Post by tbreeden
Note though that this approach in general is ex-
tremly unsafe since ADDRESS could point to any-
thing.
I think that sometimes we may tolerate this in ex-
change for the *huge* benifit of generic code.
Post by tbreeden
It relies on the caller for all parameter checking
and even if passed only arrays of INTEGER all ar-
ray overflow checking is lost.
I don't think so, for the dereference occur in user
code and applies to strongly-typed pointers to an
array of known size).
Post by tbreeden
It is probably a better idea to provide separate
safe functions for arrays of various commonly used
types like INTEGER, CARDINAL, REAL, maybe along
with DEF file exports of useful ones like
String20 = ARRAY[0..21] OF CHAR;
String80 = ARRAY[0..81] OF CHAR; , etc
PROCEDURE SortInt(VAR Array:ARRAY OF INTEGER);
PROCEDURE SortStr20(VAR Array:ARRAY OF String20);
etc.
And, for projects, various specialized ones for
RECORDS and POINTERS unique to that project.
Although it is the right solution for some low-leven
programs with a fixed structure, such as data com-
pressors, there are many applications where this ap-
proach will cause massive and painful copy-pasting.
--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]
tbreeden
2017-10-19 02:40:49 UTC
Permalink
Raw Message
Hello Anton,

Your approach will work, but it still is unsafe.
Post by Anton Shepelev
I think that sometimes we may tolerate this in ex-
change for the *huge* benifit of generic code.
This is debatable.

Neither the KineSort module procedures nor the
user callback procedures can compile in checks
for correctness of the parameters "data", "left", and
"right", leaving them both at risk at runtime of
overwriting memory via programmer error or even malicious
attack.
Maintaining type-safety would have eliminated this
risk at least.

Lots of Microsoft and Linux programmers have probably
regretted the wrong decision on this.
Post by Anton Shepelev
Although it is the right solution for some low-leven
programs with a fixed structure, such as data com-
pressors, there are many applications where this ap-
proach will cause massive and painful copy-pasting.
As I mentioned, the creators of ISO Modula-2 recognized
the problem and came up with an extension document
for generic programming. It added a new special type
of module, GENERIC, that added the ability to pass a
type-parameter to the module itself, with the callback
procedures like Swap() and Compare() defined w/r/t
that type-parameter. The idea was that the compiler
would build and create code for a "refined" generic
procedure as required when compiling the caller.

or something like this. Don't know whether it
has ever been implemented.

Here is a reference to an informal paper where they
describe the idea.

ACM SIGPLAN Notices
Volume 32 Issue 11, Nov. 1997
https://dl.acm.org/citation.cfm?id=270949

It would be interesting to survey how other type-safe
languages have handled this.

regards,

tom
r***@gmail.com
2017-10-19 23:52:40 UTC
Permalink
Raw Message
Hi,
Post by tbreeden
As I mentioned, the creators of ISO Modula-2 recognized
the problem and came up with an extension document
for generic programming.
Don't know whether it has ever been implemented.
I don't have any Macs, but p1 Modula-2 claims this:

http://modula2.awiedemann.de/

Object Oriented Modula-2 compiler
for Apple Macintosh supporting Mach-O as command line tool or from Xcode .
Implementing ISO standards IS 10514-1,2,3
Last update: July, 17th 2017

And apparently "ISO/IEC 10514-2:1998" is "Part 2: Generics Modula-2".
tbreeden
2017-10-23 16:14:30 UTC
Permalink
Raw Message
Post by r***@gmail.com
Post by tbreeden
Don't know whether it has ever been implemented.
http://modula2.awiedemann.de/
Object Oriented Modula-2 compiler
for Apple Macintosh supporting Mach-O as command line tool or from Xcode .
Implementing ISO standards IS 10514-1,2,3
Last update: July, 17th 2017
And apparently "ISO/IEC 10514-2:1998" is "Part 2: Generics Modula-2".
Wow. I'm tempted to look for an old Mac just to try this out.
Maybe I can get them to port to my Powerpc Amiga. :)

Thanks for the information.

BTW, w/r/t generic programming:

Another proposal that you is very interesting is the M2R10 updated Modula-2
started by B.Kowarsch and R.Sutcliffe with sources on GitHub.com.
It has some beautiful ideas on type safe generic programming which may even get
into GNU M2 at some time in the future.

see Modula-2 R10 Blueprints
http://www.callapple.org/columns/the-northern-spy-modula-2-r10-blueprints/

tom
Chris Burrows
2017-10-24 11:58:10 UTC
Permalink
Raw Message
Post by tbreeden
I've never found, within plain M2, a safe way to write a truly generic
sort routine. I guess that is why ISO M2 added a "Generic M2" extension
document. But I'm not sure anyone has ever implemented "Generic M2".
I haven't tried it myself, but according to the documentation ADW (ex-Stonybrook) Modula-2 supports generic modules:

https://www.modula2.org/

Regards,
Chris Burrows
CFB Software
http://www.cfbsoftware.com/modula2
tbreeden
2017-10-26 14:48:57 UTC
Permalink
Raw Message
Thanks.
Better and better!

tom

Martin Brown
2017-10-17 13:24:13 UTC
Permalink
Raw Message
Post by Anton Shepelev
Hello, all
My generic sorting routine accepts type-specific op-
TYPE
SwapProc = PROCEDURE( ADDRESS, INTEGER, INTEGER );
CompProc = PROCEDURE( ADDRESS, INTEGER, INTEGER ): INTEGER;
So that for an array of INTEGERs the user may imple-
PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;
p := CAST(PArray, data); ought to do it

It isn't very generic if it assumes that you can treat the contents of
PArray as being identical to a 4 byte INTEGER in all respects.

The above comparison would fail badly half the time with CARDINAL.
Post by Anton Shepelev
PROCEDURE Swap( data: ADDRESS; i, j: INTEGER );
VAR temp: INTEGER; p: PArray;
BEGIN
p := data;
temp := p^[i];
p^[i] := p^[j];
p^[j] := temp;
END Swap;
I have misgivings about this syntax for accessing an
array by its address: will not the dereference oper-
ator cause the copying of the entire array to the
stack every time, and if it will, what are the al-
ternatives?
PROCEDURE Compare ( VAR data : ARRAY OF whatever; i, j: CARDINAL): INTEGER;

Ensures a safe reference retains array bounds and avoids copying the
array. The one thing MODULA lacks is a promise that the PROCEDURE will
not modify the contents of an ARRAY passed by reference for speed.

PROCEDURE Compare ( CONST data : ARRAY OF whatever; i, j: CARDINAL):
INTEGER;

Would enforce the rule that data must not be altered whilst allowing
read access to any element of it.

You could CAST data into any TYPE you like but woe betide you if the
input data does not genuinely point to the right sort of object.
--
Regards,
Martin Brown
Anton Shepelev
2017-10-17 14:08:12 UTC
Permalink
Raw Message
Post by Martin Brown
Post by Anton Shepelev
My generic sorting routine accepts type-specific
operations as the following procedural vari-
TYPE
SwapProc = PROCEDURE( ADDRESS, INTEGER, INTEGER );
CompProc = PROCEDURE( ADDRESS, INTEGER, INTEGER ): INTEGER;
So that for an array of INTEGERs the user may
PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;
p := CAST(PArray, data); ought to do it
That line works as it stands. By "immediately" I
meant without the intermediate variable. Since
pointer CAST is only a pseudoinstruction telling the
compiler to treat an ADDRESS variable as a pointer
to a specific type, that variable should not be re-
quired. I wanted to write:

CAST( PArray, data )^[i] - CAST( PArray, data )^[j];

It is possible in Borland Pascal and C.
Post by Martin Brown
It isn't very generic if it assumes that you can
treat the contents of PArray as being identical to
a 4 byte INTEGER in all respects.
I think you misunderstood me because I forgot to in-
clude the declaration of PArray:

TArray = ARRAY[0..Size - 1] OF INTEGER;
PArray = POINTER TO TArray;
Post by Martin Brown
The above comparison would fail badly half the
time with CARDINAL.
Compare is the user-supplied procedure that my
generic unit accepts as a procedural variable. For
an array of CARDINALs it would be implemented other-
wise.

My generic unit makes no assumptions about the
structure of the underlying data and accesses it ex-
clusively via two user-supplied proceudres: Swap and
Compare. Its full code is here:

https://pastebin.com/3bEbpYgs
--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]
Martin Brown
2017-10-17 14:38:00 UTC
Permalink
Raw Message
Post by Anton Shepelev
Post by Martin Brown
Post by Anton Shepelev
PROCEDURE Compare( data: ADDRESS; i, j: INTEGER ):INTEGER;
VAR p: PArray;
BEGIN
p := data; (* How to CAST immediately? *)
RETURN p^[i] - p^[j];
END Compare;
p := CAST(PArray, data); ought to do it
That line works as it stands. By "immediately" I
meant without the intermediate variable. Since
pointer CAST is only a pseudoinstruction telling the
compiler to treat an ADDRESS variable as a pointer
to a specific type, that variable should not be re-
CAST( PArray, data )^[i] - CAST( PArray, data )^[j];
It is possible in Borland Pascal and C.
It might have made a difference in the ancient days of PIM3 Logitech 4
pass ETH compilers with zero worthwhile optimisation of anything at all
but with modern compilers you don't need to micromanage code in an HLL.
It is more important to make clear your intentions - the optimiser will
remove or assign to registers any transient variables.
Post by Anton Shepelev
Post by Martin Brown
It isn't very generic if it assumes that you can
treat the contents of PArray as being identical to
a 4 byte INTEGER in all respects.
I think you misunderstood me because I forgot to in-
TArray = ARRAY[0..Size - 1] OF INTEGER;
PArray = POINTER TO TArray;
What do you have against declaring the routines as taking an open array
of TYPE ?
--
Regards,
Martin Brown
Anton Shepelev
2017-10-17 17:18:09 UTC
Permalink
Raw Message
Post by Martin Brown
What do you have against declaring the routines as
taking an open array of TYPE ?
My original design to write a generic sortine mod-
ule, which can be used any kind of array-like data.
--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]
Chris Burrows
2017-10-17 20:44:50 UTC
Permalink
Raw Message
Post by Anton Shepelev
Post by Martin Brown
What do you have against declaring the routines as
taking an open array of TYPE ?
My original design to write a generic sortine mod-
ule, which can be used any kind of array-like data.
Why choose to use a type-safe language if you want to bypass its type-safety capabilities?
Anton Shepelev
2017-10-18 08:33:43 UTC
Permalink
Raw Message
Why choose to use a type-safe language if you want
to bypass its type-safety capabilities?
In order to avail myself of type-safety in the over-
whelming majority of cases where it does more good
than ill. Generic alrorithyms are a notable excep-
tion, for the only alternative to unsafe mechanics
(loopholes) is heavy code duplication.
--
() ascii ribbon campaign - against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]
Loading...