Notes on Pascal-M models for the CX16
Posted: Sun May 23, 2021 10:00 am
A little while back, I was researching 6502 bytecode interpreters while working on "SweetCX16", an implementation of Steve Wozniak's venerable "Sweet16" virtual machine for the Apple II, and I came across Pascal-M, an implementation of Pascal for the early KIM-1 8-bit 6502 system, built on a bytecode virtual machine (VM). The last update I knew of was version 2 being ported to run on FreePascal.
Recently the original V1 Pascal-M was restored (recently as in 2020! ... so there may have been a bit of a "lockdown project" to it like medievalist armorer Tod Cutler's "Lockdown Longbow" series).
Now, Pascal-M is a lot more primitive than UCSD's bytecode Pascal. It's all upper case, no file I/O in version 1 (version 2 adds it), and of course it runs on a SLOW virtual machine. So why is this interesting as a target?
One thing that is interesting about this is the porting process. Basically, you write the bytecode interpreter, including a loader for compiled bytecode programs. Then you load the already-compiled compiler and use it to compile the compiler AGAIN from source. If the compiler you started with and the compiler you just created are exactly the same, you can be pretty sure that your bytecode interpreter works.
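That self-compilation check comes down to a byte-for-byte comparison of two bytecode images. A minimal Python sketch, assuming both images have already been read into memory (the function name and the sample bytes are made up for illustration, not real Pascal-M output):

```python
from typing import Optional

def first_mismatch(original: bytes, rebuilt: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    # One image may simply be a prefix of the other.
    if len(original) != len(rebuilt):
        return min(len(original), len(rebuilt))
    return None

# Made-up byte strings standing in for the shipped and the recompiled compiler.
shipped = bytes([0x10, 0x22, 0xBB, 0x00, 0x20])
rebuilt_ok = bytes([0x10, 0x22, 0xBB, 0x00, 0x20])
rebuilt_bad = bytes([0x10, 0x22, 0xBA, 0x00, 0x20])

print(first_mismatch(shipped, rebuilt_ok))
print(first_mismatch(shipped, rebuilt_bad))
```

Reporting the first differing offset, rather than just yes/no, gives a starting point for finding which opcode the interpreter got wrong.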
And while the interpreter should be written in assembly, a lot of it is straightforward. You write a bunch of primitives that do relatively straightforward things within the VM system model, each of them ending with a call to NEXTOP to execute the next bytecode. And while the Pascal-M documentation is rudimentary, it is originally based on the P2 VM p-code version of Pascal (which actually was NOT a bytecode) ... which is well documented, since in some computer science programs in some countries, students still "rebuild a Pascal compiler" as part of their first compilers class.
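The "bunch of primitives, each handing off to the next opcode" structure can be modelled in a few lines of Python. This is only a toy: the three opcodes below (0x01 push, 0x02 add, 0xFF halt) are hypothetical and are NOT m-code, and the real thing would be 6502 assembly with a NEXTOP routine in place of the loop.

```python
# Each primitive does its work and returns the address of the next opcode,
# which plays the role of the jump to NEXTOP in the assembly version.

def op_push(stack, code, pc):
    stack.append(code[pc + 1])      # one operand byte follows the opcode
    return pc + 2

def op_add(stack, code, pc):
    right = stack.pop()
    left = stack.pop()
    stack.append((left + right) & 0xFFFF)   # keep results 16-bit
    return pc + 1

def op_halt(stack, code, pc):
    return None                     # None tells the loop to stop

# Hypothetical opcode assignments, for illustration only.
PRIMITIVES = {0x01: op_push, 0x02: op_add, 0xFF: op_halt}

def run(code):
    """Fetch an opcode, dispatch to its primitive, repeat until halt."""
    stack, pc = [], 0
    while pc is not None:
        pc = PRIMITIVES[code[pc]](stack, code, pc)
    return stack

print(run(bytes([0x01, 2, 0x01, 3, 0x02, 0xFF])))
```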
And the loader could literally be written in Basic.
While I have by no means started on the loader yet, I've got sketches for 35 bytecodes ... including the 8 that have an embedded parameter of 0-15 in the code, so occupy the first 128 opcode values of the instruction set ... and have 21 to go. Now, that doesn't mean I'm 60% of the way done, because some of the ones I have not yet started on pack a wallop, like the opcode for the CASE statement, which has the entire CASE jump table as its operand. However, I may be over 25% of the way through the first draft of the interpreter in a combined few days of playing around with it.
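To make the "8 codes with an embedded 0-15 parameter" arithmetic concrete (8 x 16 = 128 opcode values), here is a small Python sketch of decoding such a byte. The bit layout is an assumption on my part: it puts the instruction family in the upper bits and the parameter in the low nibble, which the actual m-code encoding may or may not do.

```python
def decode(opcode):
    """Split an opcode byte into (family, embedded_parameter).

    Assumed layout: opcodes $00-$7F carry a 0-15 parameter in the low
    nibble; opcodes $80 and up have no embedded parameter.
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    if opcode < 0x80:
        return opcode >> 4, opcode & 0x0F
    return opcode, None

print(decode(0x35))   # family 3 with parameter 5, under the assumed layout
print(decode(0xBB))   # CAP, no embedded parameter
```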
Now, no doubt Pascal-M would be very slow ... m-code is not optimized for the 6502 to any degree other than being a bytecode, where Niklaus Wirth's original p-code was based on 30-bit-wide codes. But once the VM interpreter is written, there will be another working "high-ish for the 70s and 80s" language for the CX16.
Another appealing aspect of Pascal-M as a target is that almost the whole of Low RAM is available to it, and since it is a bytecode, the code density will be pretty good. The bytecode interpreter can actually be run out of a High RAM segment, so if any springboarding to other parts of High RAM can be done out of Golden RAM, the system will have 36K+ free for program code, stack, and heap.
Of course, being a VM, the virtual hardware can be redesigned in software. There is actually a bit of flexibility here, because there are three logical things to work with: the procedure CODE and the procedure stack data, both of which grow up, and the heap, which grows down.
That is, the m-code VM treats CODE and data, or "STORE", as different things, so if one is careful, it is possible to have in excess of 64K total addressable content, because CODE addresses and STORE addresses don't necessarily have to refer to the same memory location.
One model to look at is a "Big CODE" model, where the interpreter is placed into the high side of Low RAM, nestled somewhere up right before the I/O page at $9F00 ... and the CODE is run out of the High RAM window. Then the Heap and the data part of the Stack have 30K+ to work with, while the code can extend for up to 64K. The amount of functionality that could be included in a program is increased by the fact that the Pascal system is designed to be able to load and unload distinct segments. So, for example, a main program could load a set-up/initialization segment, and then free it up to load the procedures that are going to use the initialized data.
Another to look at is a "Big HEAP" model, where the bytecode operands that store things to general memory treat "STORE" addresses from $8000 up as coming from one of up to four High RAM segments, so the Heap extends from a virtual $FFFF down to $8000, while the stack and program extend from $0801 until the start of the interpreter or $7FFF, whichever comes first.
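The "Big HEAP" address mapping can be sketched in Python as follows. The 8K bank size and the $A000 window are the CX16's High RAM layout; the choice of which bank numbers hold the heap (FIRST_HEAP_BANK) is purely hypothetical, as is the function name.

```python
BANK_SIZE = 0x2000        # CX16 High RAM is banked in 8K segments
WINDOW_BASE = 0xA000      # the 8K window where the selected bank appears
HEAP_BOTTOM = 0x8000      # virtual STORE addresses from here up are banked
FIRST_HEAP_BANK = 1       # hypothetical: first of the four heap banks

def translate(store_addr):
    """Map a 16-bit virtual STORE address to (bank, physical address).

    Addresses below $8000 are plain Low RAM and return bank None.
    """
    if not 0 <= store_addr <= 0xFFFF:
        raise ValueError("STORE address must be 16-bit")
    if store_addr < HEAP_BOTTOM:
        return None, store_addr
    offset = store_addr - HEAP_BOTTOM
    bank = FIRST_HEAP_BANK + offset // BANK_SIZE
    return bank, WINDOW_BASE + offset % BANK_SIZE

for addr in (0x0801, 0x8000, 0xA000, 0xFFFF):
    bank, phys = translate(addr)
    print(f"${addr:04X} -> bank {bank}, ${phys:04X}")
```

The 32K from $8000 to $FFFF divides into exactly four 8K segments, which is where "up to four High RAM segments" comes from.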
Indeed, although I need to get further into the project than I am now to see if it's possible, it might be possible to combine both of those. The VM codes that pull data from code generally push it onto the stack, and the access to data in the Heap is generally to and from the stack or the direct MOV instruction. So it may be that the trickiest part of combining a "HighRAM CODE" and "HighRAM Heap" model is already done once the High RAM Heap is developed, including the various cases that MOV must handle when moving data from one place to another, especially moving data between different parts of the Heap when they are not in the same HighRAM segment.
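One of the MOV cases mentioned above, a block that straddles a segment boundary, can be handled by cutting the range into pieces that each stay inside one region. A rough Python sketch, assuming the same virtual layout as the "Big HEAP" model (Low RAM below $8000, 8K segments above); the function name is hypothetical, and a real MOV would need to split the source and the destination ranges, not just one of them:

```python
BANK_SIZE = 0x2000     # 8K High RAM segments
HEAP_BOTTOM = 0x8000   # virtual addresses from here up are banked

def split_move(store_addr, length):
    """Cut [store_addr, store_addr + length) into (start, count) pieces
    that never cross the Low RAM/heap boundary or a segment boundary."""
    if store_addr < 0 or length < 0 or store_addr + length > 0x10000:
        raise ValueError("range must lie within the 16-bit STORE space")
    pieces = []
    while length > 0:
        if store_addr < HEAP_BOTTOM:
            limit = HEAP_BOTTOM
        else:
            segment = (store_addr - HEAP_BOTTOM) // BANK_SIZE
            limit = HEAP_BOTTOM + (segment + 1) * BANK_SIZE
        count = min(length, limit - store_addr)
        pieces.append((store_addr, count))
        store_addr += count
        length -= count
    return pieces

# A 4-byte record that starts 2 bytes before a segment boundary.
print([(hex(start), count) for start, count in split_move(0x9FFE, 4)])
```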
As some of us will recall quite well from Basic in the 80s, sometimes all that is needed to get something to run "fast enough" is to identify the worst bottlenecks and move them to assembly. So one of the appeals of starting with m-code is that it DOES have a "call assembled procedure" code. That's one of the 35 I have already sketched:
Quote
; BB X X, CAP Call assembly procedure
; The CAP instruction is used to call an assembly-language routine previously
; loaded into a specified address. The actual action of this instruction may
; be considered to be system-dependent.
; NOTE: The assembly procedure must end with RTS
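CAP: LDY #1 ; Y indexes the first operand byte after the opcode
LDA (PC),Y ; low byte of the routine address
STA T
INY
LDA (PC),Y ; high byte of the routine address
STA TH ; assumes TH is T+1, so T/TH form the pointer for JMP (T)
JSR TCALL ; the routine's RTS returns to the next line
LDA #2 ; two operand bytes, passed to NEXTCODE in A
JMP NEXTCODE
TCALL: JMP (T) ; indirect jump to the loaded routine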
Now, to be clear, I am not reckoning that Pascal-M on the CX16 will "take over and reform the world!!!" ... I am not even expecting that it would ever be heavily used by anybody. However, it's still an appealing project to take something that ran on the KIM-1, MOS Technology's first 6502 computer ... pre-PET, pre-Apple II ... and port it to the CX16.
Oh, by the way, that single KIM-1 card cannot possibly run Pascal-M ... it only has 1K of RAM! ... you would need something like a 32K RAM expansion card: