Notes on Pascal-M models for the CX16

BruceMcF · Post by **BruceMcF** » Sun May 23, 2021 10:00 am

A little while back, while researching 6502 bytecode interpreters for the "SweetCX16" implementation of Steve Wozniak's venerable "Sweet16" virtual machine for the Apple II, I came across Pascal-M, an implementation of Pascal for the early KIM-1 8-bit 6502 system based on a bytecode virtual machine (VM). The last update I knew of was version 2 being ported to run on FreePascal.

Recently the original V1 Pascal-M was restored (recently as in 2020! ... so there may have been a bit of a "lockdown project" to it like medievalist armorer Tod Cutler's "Lockdown Longbow" series).

Now, Pascal-M is a lot more primitive than UCSD's bytecode Pascal. It's all upper case, no file I/O in version 1 (version 2 adds it), and of course it runs on a SLOW virtual machine. So why is this interesting as a target?

One thing that is interesting about this is the porting process. Basically, you write the bytecode interpreter, including a loader to load a compiled bytecode program. Then you can load the already-compiled compiler and use it to compile the compiler AGAIN from the source. If the compiler you started with and the compiler you just created are exactly the same, you can be pretty sure that your bytecode interpreter works.
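The "compile the compiler and compare" check can be sketched off-target in a few lines of Python. This is only an illustration: the function name `first_difference` is made up here, and it assumes both object files have already been read into memory as bytes.

```python
from typing import Optional


def first_difference(original: bytes, rebuilt: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    # One file is a prefix of the other: they differ where the shorter ends.
    if len(original) != len(rebuilt):
        return min(len(original), len(rebuilt))
    return None


def bootstrap_ok(original: bytes, rebuilt: bytes) -> bool:
    """True when the recompiled compiler matches the one we started with."""
    return first_difference(original, rebuilt) is None
```

Reporting the first differing offset, rather than just pass/fail, gives a starting point for working out which bytecode went wrong.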

And while the interpreter should be written in assembly, a lot of it is straightforward. You write a bunch of primitives that do relatively straightforward things within the VM system model, each of them ending with a call to NEXTOP to execute the next bytecode. And while the Pascal-M documentation is rudimentary, it is originally based on the P2 VM p-code version of Pascal (which actually was NOT a bytecode) ... which is well documented, since in some computer science programs in some countries, students still "rebuild a Pascal compiler" as part of their first compilers class.

And the loader could literally be written in Basic.

While I have by no means started on the loader yet, I've got sketches for 35 bytecodes ... including the 8 that have an embedded parameter of 0-15 in the code, so occupy the first 128 opcode values of the instruction set ... and have 21 to go. Now, that doesn't mean I'm 60% of the way done, because some of the ones I have not yet started on pack a wallop, like the opcode for the CASE statement, which has the entire CASE jump table as its operand. However, I may be over 25% of the way through the first draft of the interpreter in a combined few days of playing around with it.
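As a rough sketch of how those 8 parameterized bytecodes could fill the first 128 opcode values: 8 families times 16 parameter values is 128. The bit layout below (family in bits 4-6, parameter in the low nibble) is an assumption for illustration, not taken from the Pascal-M documentation, and `decode_opcode` is a made-up name.

```python
def decode_opcode(opcode: int):
    """Split an opcode byte into (kind, code, embedded_parameter).

    Assumed layout: opcodes $00-$7F carry a 0-15 parameter in the low
    nibble and one of 8 instruction families in bits 4-6. Opcodes from
    $80 up have no embedded parameter.
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    if opcode < 0x80:
        return ("short", opcode >> 4, opcode & 0x0F)
    return ("long", opcode, None)
```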

Now, no doubt Pascal-M would be very slow ... m-code is not optimized for the 6502 to any degree other than being a bytecode, where Niklaus Wirth's original p-code was based on 30-bit-wide codes. But once the VM interpreter is written, there will be another working "high-ish for the 70s and 80s" language for the CX16.

Another appealing aspect of Pascal-M as a target is that almost the whole of Low RAM is available to it, and since it is a bytecode, the code density will be pretty good. The bytecode interpreter can actually be run out of a High RAM segment, so if any springboarding to other parts of High RAM can be done out of Golden RAM, the system will have 36K+ free for program code, stack, and heap.

Of course, being a VM, the virtual hardware can be redesigned in software. There is actually a bit of flexibility here, because there are three logical things to work with ... the procedure CODE and the procedure stack data, both of which grow up, and the heap, which grows down.

That is, the m-code VM treats CODE and data, or "STORE", as different things, so if one is careful, it is possible to have in excess of 64K total addressable content, because CODE addresses and STORE addresses don't necessarily have to refer to the same memory location.

One model to look at is a "Big CODE" model, where the interpreter is placed into the high side of Low RAM, nestled right up before the I/O page at $9F00 ... and the CODE is run out of the High RAM window. Then the Heap and the data part of the Stack have 30K+ to work with while the code can extend for up to 64K. The amount of functionality that could be included in a program is increased by the fact that the Pascal system is designed to be able to load and unload distinct segments so, for example, a main program could load a set-up/initialization segment, and then free it up to load the procedures that are going to use the initialized data.

Another to look at is a "Big HEAP" model, where the bytecode operands that store things to general memory treat "STORE" addresses from $8000 up as coming from one of up to four High RAM segments, so the Heap extends from a virtual $FFFF to $8000, while the stack and program extends from $0801 until the start of the interpreter or $7FFF, whichever comes first.
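A minimal sketch of the address translation this "Big HEAP" model implies, assuming the CX16's 8K High RAM window at $A000-$BFFF. The starting bank number (`first_bank`) and the function name are hypothetical choices for illustration.

```python
WINDOW_BASE = 0xA000      # start of the CX16 High RAM window
BANK_SIZE = 0x2000        # each High RAM bank is 8K
HIGH_STORE_BASE = 0x8000  # virtual STORE addresses from here up are banked


def translate_store(addr: int, first_bank: int = 1):
    """Map a virtual STORE address to (bank, physical address).

    bank is None for addresses below $8000, which map straight to Low RAM.
    first_bank is an assumed value: the bank that holds virtual $8000.
    """
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("STORE address must be 16 bits")
    if addr < HIGH_STORE_BASE:
        return (None, addr)
    offset = addr - HIGH_STORE_BASE
    return (first_bank + offset // BANK_SIZE, WINDOW_BASE + offset % BANK_SIZE)
```

With these assumptions the 32K from $8000 to $FFFF covers exactly four banks, so virtual $FFFF lands on the last byte of the fourth bank.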

Indeed, although I need to get further into the project than I am now to see if it's possible, it might be possible to combine both of those. The VM codes that pull data from code generally push it onto the stack, and the access to data in the Heap is generally to and from the stack or the direct MOV instruction. So it may be that the trickiest part of combining a "HighRAM CODE" and "HighRAM Heap" model is already done once the High RAM Heap is developed, including the various cases that MOV must handle when moving data from one place to another, especially moving data between different parts of the Heap when they are not in the same HighRAM segment.

As some of us will recall quite well from Basic in the 80s, sometimes all that is needed to get something to run "fast enough" is to identify the worst bottlenecks and move them to assembly. So one of the appeals of starting with m-code is that it DOES have a "call assembled procedure" code. That's one of the 35 I have already sketched:

Quote

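; BB X X, CAP Call assembly procedure
; The CAP instruction is used to call an assembly-language routine previously
; loaded into a specified address. The actual action of this instruction may
; be considered to be system-dependent.
; NOTE: The assembly procedure must end with RTS
CAP:   LDY #1         ; operand starts one byte past the opcode at (PC)
   LDA (PC),Y     ; low byte of the routine address
   STA T
   INY
   LDA (PC),Y     ; high byte of the routine address
   STA TH         ; TH must be T+1 so that JMP (T) sees a full pointer
   JSR TCALL      ; no indirect JSR on the 6502, so call through a trampoline
   LDA #2         ; two operand bytes follow the opcode
   JMP NEXTCODE
TCALL:   JMP (T)      ; the routine's RTS returns to just after JSR TCALL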

Now, to be clear, I am not reckoning that Pascal-M on the CX16 will "take over and reform the world!!!" ... I am not even expecting that it would ever be heavily used by anybody. However, it's still an appealing project to take something that ran on the KIM-1, MOS Technology's first 6502 computer ... pre-PET, pre-Apple II ... and port it to the CX16.

Oh, by the way, a single KIM-1 card cannot possibly run Pascal-M ... it only has 1K of RAM! ... you would need something like a 32K RAM expansion card.

BruceMcF · Post by **BruceMcF** » Thu May 27, 2021 5:43 am

"Hits back of own neck" ~ I was judging the size of the compiler by the size of the object file ... but the "19K compiler" is not a binary, it's a text based object file format.

That is, the object file is in a text format of lines mostly starting with "P" and then a digit to pick which kind of object line it is, followed by bytes in hexadecimal format to convert to the binary in the CODE space. So the programs are under half the size that I was thinking. A compiler program with a 19K filesize is less than 10K. Indeed, it would almost certainly be less than 8K, since some of the object lines are patches to generate the code address of forward references, and some are static data which could be stored at the top of the data storage area, with the heap growing down from the bottom of the static data.

So I might go with a "SMALL" model of interpreter sitting where it loads at $0801, stack growing up from the end of the interpreter, heap growing downward from $9DFF, a 256 byte "mark stack" at $9E00-$9EFF, and CODE at $A000-$BFFF. Then a "MEDIUM" model could have three High RAM banks for programs and CODE addresses in the range of $A000-$FFFF. If the Compiler is somewhere in the range of a single High RAM Segment, then 24K would probably be bigger than any Pascal-M program I would be likely to write.

That also saves time in bringing the system up for the first time, since the first version of the object loader program could be a Basic program that gives a menu of m-code object files and when you pick one, it loads it and builds the program in the High RAM segment, sets up the zero page, and then loads the interpreter, which starts right into a warm start and starts executing the program at "Virtual" $A000. A loader in Forth or Assembly would be faster, but that would work for getting started.

snake · Post by **snake** » Thu May 27, 2021 8:13 pm

Have you seen this P65PAS?

https://github.com/t-edson/P65Pas

BruceMcF · Post by **BruceMcF** » Thu May 27, 2021 11:23 pm

4 hours ago, snake said:

Have you seen this P65PAS?

https://github.com/t-edson/P65Pas

No, but since Pascal-M would be hosted on the CX16, they are entirely different beasts. I am not playing around with writing an m-code interpreter because I have a desire to program the CX16 in Pascal, but as part of developing language tools hosted on the CX16.

rje · Post by **rje** » Fri May 28, 2021 1:28 pm

I've been struggling with an interpreter from the software-engineer side of things. Tokenizers are easy, VMs are easy, compilers... are tricky for me.

rje · Post by **rje** » Fri May 28, 2021 1:29 pm

By the way, the concept is totally cool and I'm on board.

BruceMcF · Post by **BruceMcF** » Wed Jun 09, 2021 8:46 am

Also, I never did much programming in Pascal, and I misread the KIM-1 documentation on the m-code and the Pascal model. With the much clearer documentation of the FreePascal code for a generic m-code interpreter, I see that I was implementing the stack and heap the wrong way around. That is not a problem to fix up, though: it is actually easier doing it correctly, since with the (S) in zero page pointing to the bottom of the stack, (S),Y can easily reference all of the bytes in the built in operands -- at most 16 bytes for operations on a pair of sets, which are 8-byte membership bitmaps for up to 64 items.
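For anyone who has not met Pascal sets as bitmaps, a small sketch of the idea: each set is 8 bytes, one bit per possible member, so a two-set operation touches 16 bytes of stack. The bit ordering here (item n in byte n/8, bit n mod 8) and the function names are assumptions for illustration, not taken from the m-code documentation.

```python
SET_BYTES = 8  # 8 bytes * 8 bits = up to 64 members


def make_set(items) -> bytes:
    """Build an 8-byte membership bitmap from item numbers 0..63."""
    bits = bytearray(SET_BYTES)
    for item in items:
        if not 0 <= item < SET_BYTES * 8:
            raise ValueError("set members must be in 0..63")
        bits[item >> 3] |= 1 << (item & 7)
    return bytes(bits)


def contains(s: bytes, item: int) -> bool:
    """Membership test: is the item's bit set?"""
    return bool(s[item >> 3] & (1 << (item & 7)))


def union(a: bytes, b: bytes) -> bytes:
    """Set union is a bytewise OR of the two bitmaps."""
    return bytes(x | y for x, y in zip(a, b))


def intersection(a: bytes, b: bytes) -> bytes:
    """Set intersection is a bytewise AND of the two bitmaps."""
    return bytes(x & y for x, y in zip(a, b))
```

On the 6502 the union and intersection become an 8-iteration loop of ORA or AND through (S),Y, which is why having both operands within reach of the Y index matters.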

With the relocatable bytecode allowing for an ability to load a program segment and then recover its space and load another one in other versions of P2 Pascal, I am now leaning toward keeping the HighRAM page available for an external code segment.

The procedures and functions that a program refers to are converted into an index number by the compiler, and are called by index number. The simplest way to implement that is to set aside two pages to hold vectors, and shift the index byte, using the carry bit to select between the two pages and the shifted lower 7 bits of the index as the index to the address of the procedure/function.
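The shift-and-carry lookup can be modelled off-target as below. This is a sketch only: the two vector pages are represented as 256-byte arrays, the stored addresses are assumed to be little-endian as usual on the 6502, and the names `vector_slot` and `lookup` are made up.

```python
def vector_slot(index: int):
    """Model one ASL on the index byte: (carry, shifted low 7 bits).

    The carry (old bit 7) picks the vector page, and the shifted value
    is the byte offset of the 2-byte vector inside that page.
    """
    if not 0 <= index <= 0xFF:
        raise ValueError("procedure index must fit in one byte")
    carry = index >> 7
    offset = (index << 1) & 0xFF
    return carry, offset


def lookup(index: int, pages) -> int:
    """Fetch the 16-bit routine address for a procedure index."""
    page, offset = vector_slot(index)
    table = pages[page]
    return table[offset] | (table[offset + 1] << 8)
```

Two pages of 2-byte vectors give 256 entries, which matches a one-byte procedure index exactly.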

So the idea is that if the address of the routine is below $8000, it's a routine in the base compiled code. If the address of the routine is $8000 or above, it's a routine at a location in a HighRAM page. Requiring it to be loaded at an aligned 8-byte boundary means that 15 bits can refer to a procedure stored anywhere in the first 256K of potential HighRAM (2^15 x 8 bytes; reaching a full 1MB would take a 32-byte boundary). Only the procedure and function call and return operations have to worry about whether it is a base or extended routine, and keeping track of which it is at present only requires one byte in zero page, which holds #0 for a core procedure/function or the HighRAM segment number for an extended one.
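Working the arithmetic through for 8-byte alignment, as a sketch: 15 bits of slot number times 8 bytes reaches 2^18 bytes, which is 256K. The decoding below is one possible reading of the scheme, not a settled design. It assumes the low 15 bits count 8-byte slots linearly through HighRAM, that the first usable bank is 1, and that banks appear in the 8K window at $A000. The name `decode_routine` is made up.

```python
ALIGN = 8             # routines start on 8-byte boundaries
BANK_SIZE = 0x2000    # 8K per HighRAM bank
WINDOW_BASE = 0xA000  # where a bank appears in the address space
REACH = 0x8000 * ALIGN  # bytes addressable through a 15-bit slot number


def decode_routine(word: int, first_bank: int = 1):
    """Map a 16-bit vector entry to (segment, address).

    segment 0 means base compiled code and the address is used directly.
    Otherwise the low 15 bits are an 8-byte slot number into HighRAM.
    first_bank is an assumed value for the bank holding slot 0.
    """
    if not 0 <= word <= 0xFFFF:
        raise ValueError("vector entry must be 16 bits")
    if word < 0x8000:
        return (0, word)
    linear = (word & 0x7FFF) * ALIGN
    return (first_bank + linear // BANK_SIZE, WINDOW_BASE + linear % BANK_SIZE)
```

Returning 0 for base code lines up with keeping a single zero page byte that is 0 for a core routine and the segment number otherwise.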

AFAIU, it's not Pascal-M that actually implemented the loadable code segments, but it was done in another P2 based Pascal, so there should be a scaffolding already in place for pursuing that approach: most importantly, already existing example P2 Pascal code.

As far as using HighRAM for data, I am thinking that a RAMdisk is a pretty good approach. In this case, there ARE more developed versions of Pascal-M that have more file functions than Version 1, which has console input and output, a single stored input and a single storable output ... essentially a paper punch or cassette level of mass storage. That makes for an easy to implement RAMdisk at the start, where the program input file is loaded into HighRAM before loading the executable, and the file to save the resulting output file can be selected up front and the saving done when returning from executing the compiled Pascal program. So the program loader is the only part that needs to know how to work with actual CX16 file I/O and incorporating file support into the compiler can be postponed until later.

rje · Post by **rje** » Wed Jun 09, 2021 2:17 pm

Bruce, are you storing your work on Github? Because there is a non-zero chance that folks might see if there are bits that they could help with. Or just appreciate your work.

rje · Post by **rje** » Wed Jun 09, 2021 2:20 pm

And, while the two projects are not related, I am wondering if your Sweeter16 code can assist your work here.

No, I don't have anything in particular in mind; I'm just thinking that these sorts of tools work best when you program bottom-up, starting with utilities that do simple things, then more complex things that use those simple things...

...which you've already mentioned, in a way, in your initial post.

BruceMcF · Post by **BruceMcF** » Wed Jun 09, 2021 3:32 pm

57 minutes ago, rje said:

Bruce, are you storing your work on Github? Because there is a non-zero chance that folks might see if there are bits that they could help with. Or just appreciate your work.

When I have something that can be assembled, I definitely will put it up on Github ... right now and for the next couple of weeks, I am still at the puttering around phase. I was able to borrow from xForth for the integer multiply and divide, but using a zero page indirect "LDA (S) : LDY #1 : ORA (S),Y : STA (S),Y : JSR DROP2" stack makes it very much its own thing versus either SweetCX16 or x4th.