Notes on Pascal-M models for the CX16

Posted: Sun Jun 13, 2021 4:57 am
by BruceMcF

I figured out something that I found a little cool over the past two days.

A core part of the mcode interpreter (see, for example, this pdf scan of the documentation of the mcode interpreter) is the way that relative addressing is handled. Eight of the instructions have an embedded 4-bit parameter that refers to the call level of the data, where the outer block is 0, a procedure it calls is 1, and so on, to a maximum depth of 15 below the outer block. The parameter is relative to the current call level, so a variable defined on the stack within the current level would have a parameter of "0", while if the current level is six below the outer block, a variable defined within the outer block would have a parameter of "6". Then there are one or two bytes that give the offset from the base of the stack for that level.

The entire start of the stack frame is (the stack grows down):


  • (... locals / working values of calling procedure)
  • [Space for return value if this is a function rather than a procedure]
  • Base Address for this procedure
  • Mark pointer of calling procedure
  • Program counter value of calling procedure
  • (... locals and working values of current procedure...)


The way that the Kim-1 code (zip file) did this was the way envisioned in the original language system design. There is a mark pointer that points to the marker position for the current procedure, and the start of the current procedure holds the mark pointer for the procedure that called it, and so on. So you get the mark pointer for the current level, check if the call level parameter is 0, if not, you fetch the value it points to and decrement the level, and repeat until the level count hits zero.

Needless to say, this is REALLY SLOW on the 6502 if you are 6 call levels deep and referencing a value created at the base level. But because of the way that Pascal was defined, lots of variables get created at the base level.
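To make the cost of that walk concrete, here is a minimal Python model of the mark pointer chain described above. It is only a sketch: the function names, the dictionary standing in for memory, the frame size and the example addresses are all invented for illustration, and the offset is subtracted on the assumption that the stack grows down.

```python
def resolve_by_chain(memory, mark, level, offset):
    """Follow saved mark pointers `level` times, then apply the offset.

    memory[mark] is assumed to hold the mark pointer of the calling procedure.
    Returns the resolved address and the number of hops taken.
    """
    hops = 0
    while level > 0:
        mark = memory[mark]   # fetch the caller's mark pointer
        level -= 1
        hops += 1
    return mark - offset, hops


def build_chain(depth, outer_mark=0x9F00, frame_size=0x20):
    """Build a hypothetical chain of frames `depth` calls below the outer block."""
    marks = [outer_mark - i * frame_size for i in range(depth + 1)]
    memory = {}
    for caller, callee in zip(marks, marks[1:]):
        memory[callee] = caller   # each frame remembers its caller's mark
    return memory, marks


# Six levels below the outer block, reading a variable of the outer block.
memory, marks = build_chain(6)
address, hops = resolve_by_chain(memory, marks[-1], 6, 4)
print(hex(address), hops)
```

Each hop is a 16-bit indirect fetch plus a counter decrement, which is the work the interpreter repeats on every access to an outer variable.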

So what I have sketched instead is an X-indexed stack of call-level base addresses. It is a 16-integer stack, so 32 bytes (probably in Low RAM rather than the zero page, to preserve some space for Assembly Language procedures), with the index of the current level stored in a zero page byte, LEVEL. The parameter has been shifted left by one bit by the time this code is reached, so the parameter is now an even number from 0 to 30. Add LEVEL to the parameter, put the result into X as the index, and then the base address is available to have the offset parameter subtracted from it.


LEVELS:
    BPL LDCIS       ; LDCIS = $0x --> [$0x - $90] = $7x
    LDY #1          ; Save these 8 doing this individually
    ASL             ; LDAS-STR2 are $0x-$Fx, Carry is Set
    STA Z
    BMI +           ; LOD1 / LOD2 / STO1 / STO2 are $8x - $Fx
    CMP #$20
    BPL MSTN        ; LDAA is $2x/$3x, MSTO / MSTN are $4x - $7x
+   AND #$1E        ; Level index
    CLC
    ADC LEVEL
    TAX
    LDA LEVELL,X
    SEC
    SBC (PC),Y
    STA T
    LDA LEVELH,X
    SBC #0
    STA TH
; Parse instruction - lower 5bits have done their job
    LDA #$20
    BIT Z           ; former bits 5, 6, 7 in Z, V and S flags, respectively
    BPL LDAS        ; LDAS if Bit6 clear
    BVC LOD2        ; LOD1 / LOD2 if Bit6 set and Bit5 clear
    BEQ STR1
; 8X YY. STR2 - Store 2-byte data item into memory
STR2:   LDA (S),Y
    STA (T),Y

; 7X YY, STR1 - Store 1-byte data into memory
; The STR1 instruction pulls one 16-bit item off the stack and stores the
; least significant byte into the YYth byte at the Xth level.
STR1:   LDA (S)
    STA (T)
    JSR POP2
    LDA #2
    JMP NEXTOPA

Now, while it has the characteristic verbosity of 65C02 assembly working with 16-bit values, it is MUCH faster at computing an offset for anything outside of the current procedure.
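For readers who want to see the indexing without the 65C02 details, here is a rough Python model of the level table lookup. It is a sketch under assumptions, not the interpreter's code: the class `LevelTable`, its method names, the interleaved low/high byte layout, the choice to start at the top of the table and the example addresses are all hypothetical. The opcode `0x86` is just an arbitrary byte whose low nibble is 6.

```python
TABLE_BYTES = 32  # 16 two-byte base addresses, as described in the post


class LevelTable:
    def __init__(self, outer_base):
        self.table = bytearray(TABLE_BYTES)
        # LEVEL is a byte index; assume the outer block takes the last entry.
        self.level = TABLE_BYTES - 2
        self._store(self.level, outer_base)

    def _store(self, index, address):
        self.table[index] = address & 0xFF         # low byte
        self.table[index + 1] = (address >> 8) & 0xFF  # high byte

    def enter(self, base):
        """Record the base address of a newly called procedure."""
        if self.level == 0:
            raise OverflowError("more than 15 levels below the outer block")
        self.level -= 2
        self._store(self.level, base)

    def leave(self):
        """Drop back to the caller's entry on return."""
        self.level += 2

    def resolve(self, opcode, offset):
        """One add and one indexed fetch, however far out the variable is."""
        x = ((opcode << 1) & 0x1E) + self.level    # shifted parameter + LEVEL
        base = self.table[x] | (self.table[x + 1] << 8)
        return (base - offset) & 0xFFFF            # stack grows down


table = LevelTable(0x9F00)
for depth in range(1, 7):
    table.enter(0x9F00 - depth * 0x20)

print(hex(table.resolve(0x86, 4)))   # variable in the outer block
print(hex(table.resolve(0x80, 4)))   # variable in the current procedure
```

The lookup cost is the same for a parameter of 0 and a parameter of 6, which is the point of trading the pointer chain for a small indexed table.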

Now, in the m-code interpreter for the Kim-1, the base address is a dummy, because all offsets are computed by the interpreter relative to the position of the mark pointer. And with this "call LEVEL" stack, the mark pointer on the stack is also redundant. So that means that the program counter to return to is the only value needed for the stack frame.

If those two redundant values are removed, this means that the base address is the address of the stack pointer at the time the stack frame is being created. Just store that into the call LEVEL stack.

And THAT means that there is no need for dummy base address or mark pointer positions on the main system stack. It can just be:


  • (... locals / working values of calling procedure)
  • [Space for return value if this is a function rather than a procedure]
  • Program counter value of calling procedure
  • (... locals and working values of current procedure...)


... saving four bytes per procedure call.
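The arithmetic behind that saving can be checked with a few lines of Python. This is only a bookkeeping sketch: the names are made up, each frame entry is assumed to be one 16-bit value, and the optional return value slot is left out because it is the same in both layouts.

```python
WORD = 2  # bytes per 16-bit value

# Per-call frame headers from the two layouts listed above.
ORIGINAL_FRAME = ["base address", "mark pointer of caller", "program counter of caller"]
SLIM_FRAME = ["program counter of caller"]


def header_bytes(frame, depth=1):
    """Stack bytes spent on frame headers after `depth` nested calls."""
    return depth * WORD * len(frame)


saved_per_call = header_bytes(ORIGINAL_FRAME) - header_bytes(SLIM_FRAME)
saved_at_max_depth = header_bytes(ORIGINAL_FRAME, 15) - header_bytes(SLIM_FRAME, 15)
print(saved_per_call, saved_at_max_depth)
```

Against this, the level table itself is a fixed 32 bytes, so on these assumptions the slimmer frame uses less memory overall once the call depth reaches eight.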



Posted: Mon Jun 14, 2021 1:59 am
by rje

Very nice.  I fully appreciate leveraging the CPU's addressing mode to boost performance of the higher-level language...



Posted: Wed Jun 16, 2021 8:54 am
by BruceMcF


On 6/14/2021 at 9:59 AM, rje said:

    Very nice.  I fully appreciate leveraging the CPU's addressing mode to boost performance of the higher-level language...

... but man would it be a lot shorter and faster if I could leverage the 65816's addressing modes.



Posted: Wed Jun 16, 2021 10:44 am
by paulscottrobson

With something like 6502 Pascal there's a reality that while the runtime will be fine to do a great deal, significant chunks of anything speed orientated (e.g. action games) will require assembler linked in.



Posted: Wed Jun 16, 2021 12:28 pm
by BruceMcF


1 hour ago, paulscottrobson said:

    With something like 6502 Pascal there's a reality that while the runtime will be fine to do a great deal, significant chunks of anything speed orientated (e.g. action games) will require assembler linked in.

Yes, the mcode has a Call Assembly Procedure bytecode.



Posted: Tue Jun 22, 2021 11:44 am
by BruceMcF

Getting there ... maybe, since these are just sketches. But I have sketches for every Version 1 bytecode except the one that does the For loop and the one that does the Case statement, and then mostly the I/O standard procedures to go.

I haven't started on the loader, but it is going to be the easiest part. Plus, half of my classes for the semester ended in week 14, and since then I have been catching up on grading for the other half, which end in week 19, another two weeks away. So I may be able to do the loader in a couple of longer programming stretches over a coming weekend, with actual assembly and debugging and the whole shebang.