- Improve the overall performance of your code.
- Reduce the size of your code footprint.
Let's consider an example. Imagine a program that declares a structure describing the properties of a sprite, and then defines an array, sprites, of 64 elements of this structure in memory:
Code: Select all
struct sprite {
    unsigned int id;
    unsigned char type;
    unsigned int x;
    unsigned int y;
};
struct sprite sprites[64];
Each element takes 7 bytes (a 2-byte unsigned int id, a 1-byte type, and 2 bytes each for x and y), so the array occupies 7 * 64 = 448 bytes. Since this is larger than 256 bytes, the compiler cannot reach every element using absolute indexed addressing mode, because the index register is only 8 bits wide. Instead, it has to use indirect indexed addressing mode, for example lda ($04),y, which occupies the accumulator and the y register and needs 2 consecutive zeropage addresses to hold a 16-bit pointer.
When you write code that indexes your Array of Structures (AoS), the arithmetic the compiler has to generate to point the zeropage pointer at the right element can become long and complex, which results in code overhead.
Let us create some totally useless code that loops over the sprites array and, for each element selected by index, copies the values x and y into the global variables x and y:
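As a rough check of the sizes involved, the sketch below models the structure with fixed-width types. It assumes a 16-bit unsigned int and an 8-bit unsigned char, as on the 6502 target, and uses #pragma pack so that a desktop compiler does not insert padding. The names sprite_model, SPRITE_COUNT and sprites_total_size are illustrative only and do not come from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPRITE_COUNT 64

/* Model of the assumed 6502 layout: 2-byte unsigned int,
   1-byte unsigned char, no padding between members. */
#pragma pack(push, 1)
struct sprite_model {
    uint16_t id;    /* offset 0 */
    uint8_t  type;  /* offset 2 */
    uint16_t x;     /* offset 3 */
    uint16_t y;     /* offset 5 */
};
#pragma pack(pop)

/* Total number of bytes occupied by the whole array. */
static size_t sprites_total_size(void) {
    /* Holds only under the packing assumption above. */
    assert(sizeof(struct sprite_model) == 7);
    return sizeof(struct sprite_model) * SPRITE_COUNT;
}
```

Under these assumptions each element is 7 bytes and the array is 448 bytes, which is more than an 8-bit index register can span.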
Code: Select all
unsigned int x;
unsigned int y;
for (unsigned int index = 0; index < 64; index++) {
    x = sprites[index].x;
    y = sprites[index].y;
}
First, the compiler prepares the zeropage locations that hold the pointer used by indirect indexed addressing mode.
Since the size of the structure sprite is 7 bytes, the compiler needs to calculate the offset 7 * index. It does this with bit shifts and additions: $zp = ((((index << 1) + index) << 1) + index). For example, if index == 1, shifting left gives 2, adding index gives 3, shifting left again gives 6, and adding index once more gives 7. As another example, with index == 5: 5 << 1 gives 10, adding 5 gives 15, shifting left gives 30, and adding 5 gives 35 (which is 7 * 5). This generates the code below:
Load index (a memory address, but we use a label here), shift it one bit to the left, and store the result in $02/$03. This has to be done for both the low and the high byte of index:
Code: Select all
lda.z index
asl
sta.z $02
lda.z index+1
rol
sta.z $02+1
Add index to the value in $02/$03:
Code: Select all
clc
lda.z $02
adc.z index
sta.z $02
lda.z $02+1
adc.z index+1
sta.z $02+1
Shift $02/$03 one more bit to the left:
Code: Select all
asl.z $02
rol.z $02+1
Add index once more, which gives 7 * index:
Code: Select all
clc
lda.z $02
adc.z index
sta.z $02
lda.z $02+1
adc.z index+1
sta.z $02+1
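The shift-and-add sequence above can be modeled in C as follows. This is only a sketch of the arithmetic, not the compiler's output; the function name mul7 is made up for illustration, and the range check reflects the 64-element array from the example.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply a 16-bit index by 7 using only shifts and additions,
   following the same four steps as the assembly: shift, add, shift, add. */
static uint16_t mul7(uint16_t index) {
    assert(index < 64);            /* the sprites array has 64 elements */
    uint16_t zp = index;
    zp = (uint16_t)(zp << 1);      /* 2 * index */
    zp = (uint16_t)(zp + index);   /* 3 * index */
    zp = (uint16_t)(zp << 1);      /* 6 * index */
    zp = (uint16_t)(zp + index);   /* 7 * index */
    return zp;
}
```

Calling mul7(5) walks through the values 10, 15, 30 and 35, matching the worked example in the text.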
Now the compiler adds the base address of the sprites array to this offset and stores the result in $02/$03, which then points at sprites[index].
Code: Select all
lda.z $02
clc
adc #<sprites
sta.z $02
lda.z $02+1
adc #>sprites
sta.z $02+1
Finally, the compiler uses the pointer in $02/$03 to load the unsigned integer x from the sprite array element, and stores it in the memory location of the global variable x.
Code: Select all
ldy #3 // Offset value x in the structure sprite
lda ($02),y
sta.z x
iny
lda ($02),y
sta.z x+1
The value y is loaded the same way, from offset 5 in the structure:
Code: Select all
ldy #5 // Offset value y in the structure sprite
lda ($02),y
pha
iny
lda ($02),y
sta.z y+1
pla
sta.z y
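The pointer arithmetic and the two 16-bit loads above can be modeled in C roughly as follows. This is a sketch under the assumptions used in the text (7-byte elements, x at offset 3, y at offset 5, low byte first); sprite_memory, load16, sprite_x and sprite_y are hypothetical names, not part of the original program.

```c
#include <assert.h>
#include <stdint.h>

#define SPRITE_SIZE  7   /* bytes per element, as in the text */
#define SPRITE_COUNT 64
#define OFFSET_X     3   /* byte offset of x inside an element */
#define OFFSET_Y     5   /* byte offset of y inside an element */

/* Raw bytes standing in for the sprites array. */
uint8_t sprite_memory[SPRITE_SIZE * SPRITE_COUNT];

/* Read a little-endian 16-bit value from ptr[y] and ptr[y + 1],
   like two "lda ($02),y" loads with an "iny" in between. */
static uint16_t load16(const uint8_t *ptr, uint8_t y) {
    uint8_t lo = ptr[y];
    uint8_t hi = ptr[y + 1];
    return (uint16_t)(lo | (hi << 8));
}

/* Pointer = base address + 7 * index, then load x at offset 3. */
static uint16_t sprite_x(uint16_t index) {
    assert(index < SPRITE_COUNT);
    const uint8_t *ptr = sprite_memory + SPRITE_SIZE * index;
    return load16(ptr, OFFSET_X);
}

/* Same pointer, but load y at offset 5. */
static uint16_t sprite_y(uint16_t index) {
    assert(index < SPRITE_COUNT);
    const uint8_t *ptr = sprite_memory + SPRITE_SIZE * index;
    return load16(ptr, OFFSET_Y);
}
```

The C version hides the cost: on the 6502 every line of this pointer setup becomes several instructions, which is the overhead this article is about.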
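The multiplication by 7 is what makes this code long. If you pad the structure with one unused byte, its size becomes 8 bytes, a power of two:
Code: Select all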
struct sprite {
    unsigned int id;
    unsigned char type;
    unsigned int x;
    unsigned int y;
    unsigned char unused1;
};
The offset 8 * index can now be calculated with three left shifts of index, without any additions:
Code: Select all
lda.z index
asl
sta.z $02
lda.z index+1
rol
sta.z $02+1
asl.z $02
rol.z $02+1
asl.z $02
rol.z $02+1
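The effect of the padding byte can be checked with a small C model. As before, this sketch assumes a 2-byte unsigned int and uses #pragma pack to keep a desktop compiler from adding its own padding; sprite_padded and offset_padded are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

#pragma pack(push, 1)
struct sprite_padded {
    uint16_t id;
    uint8_t  type;
    uint16_t x;
    uint16_t y;
    uint8_t  unused1;  /* padding byte: brings the size to 8 */
};
#pragma pack(pop)

/* With an 8-byte element, the byte offset is index shifted left by 3. */
static uint16_t offset_padded(uint16_t index) {
    assert(index < 64);
    return (uint16_t)(index << 3);
}
```

The trade-off is memory: the padded array takes 8 * 64 = 512 bytes instead of 448, in exchange for a shorter offset calculation.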
However, before you decide to use AoS, first consider whether you can use SoA (Structure of Arrays) instead, a choice which was discussed here.
Hopefully this has been an interesting read. See you in the next article.
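For comparison, a Structure of Arrays layout for the same sprite data might look like the sketch below. This is a general illustration, not the code from the linked discussion; the array names and read_position are assumptions, and how a particular compiler addresses 16-bit elements may differ.

```c
#include <assert.h>
#include <stdint.h>

#define SPRITE_COUNT 64

/* Structure of Arrays: one array per field instead of one array of structs.
   Each array is at most 64 * 2 = 128 bytes, so it stays under 256 bytes. */
uint16_t sprite_id[SPRITE_COUNT];
uint8_t  sprite_type[SPRITE_COUNT];
uint16_t sprite_x[SPRITE_COUNT];
uint16_t sprite_y[SPRITE_COUNT];

/* Copy the position of one sprite; no multiplication by the struct size. */
static void read_position(uint8_t index, uint16_t *x, uint16_t *y) {
    assert(index < SPRITE_COUNT);
    *x = sprite_x[index];
    *y = sprite_y[index];
}
```

Because each array is smaller than 256 bytes, an 8-bit index register can in principle reach every element without building a zeropage pointer first.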
Sven