Fast arctangent
Posted: Wed Aug 23, 2023 8:56 pm
When programming in assembly, there are constant trade-offs between memory, speed,
and accuracy. The amount of memory available in banked RAM gives the
Commander X16 the option of using large lookup tables to quickly get results
which would otherwise take a long time to derive in code.
FASTATAN uses one bank of RAM to compute the arctangent of a pair of 16 bit
signed integer coordinates. The result is 8, 9, or 10 bits wide, depending on
which subroutine is called.
To use these arctangent functions, the X and Y coordinates must be 16 bit
signed integers stored in zero page. The X register must hold the zero page
address of the X coordinate, and the Y register must hold the zero page
address of the Y coordinate.
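As a rough illustration of the input format, here is a small Python sketch that splits a signed 16 bit coordinate into the two bytes you would place in zero page. The helper name is made up for this example, and the low-byte-first order is an assumption (it is the usual 6502 convention, but it is not stated above).

```python
def to_zero_page_bytes(value):
    """Encode a signed 16-bit integer as (low, high) two's-complement bytes."""
    if not -32768 <= value <= 32767:
        raise ValueError("coordinate must fit in a signed 16-bit integer")
    word = value & 0xFFFF          # two's-complement wrap for negative values
    return word & 0xFF, word >> 8  # low byte first (assumed byte order)


for coord in (32767, 0, -32767):
    low, high = to_zero_page_bytes(coord)
    print(f"{coord:6d} -> low ${low:02X}, high ${high:02X}")
```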
The result is in bigrees (256 bigrees = 360 degrees = 2 pi radians).
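If you want to check values on a PC, the unit conversion can be sketched in Python as below. The function names are only for illustration; the one fact used is the 256 bigrees per full turn given above.

```python
import math

BIGREES_PER_TURN = 256  # 256 bigrees = 360 degrees = 2 pi radians


def bigrees_to_degrees(b):
    return b * 360.0 / BIGREES_PER_TURN


def bigrees_to_radians(b):
    return b * 2.0 * math.pi / BIGREES_PER_TURN


def degrees_to_bigrees(d):
    # Wrap into one turn first, so negative angles map to 0..255.999...
    return (d % 360.0) * BIGREES_PER_TURN / 360.0


print(bigrees_to_degrees(0x20))
print(degrees_to_bigrees(-135))
```

So one bigree is about 1.4 degrees, and the two extra bits of the 10 bit result bring the step down to about 0.35 degrees.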
For the 10 bit result, the 8 most significant bits are in the accumulator.
The 9th and 10th bits are returned in bits 7 and 6 of the X register, and the
same two bits are also returned in bits 1 and 0 of the Y register.
For the 9 bit result, the 8 most significant bits are in the accumulator. The
9th bit is returned in bit 7 of the X register, and also in bit 0 of the Y
register.
For the 8 bit result, the result is in the accumulator and the X and Y
registers will contain 00. The 8 and 9 bit results are rounded-off versions
of the 10 bit result, so those routines take slightly longer.
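The register layout described above can be modelled in Python like this. It is only a sketch for checking values off-target: the function name is hypothetical, and it assumes the unused low bits of X are zero. It reads the extra bits from X; the copy in the low bits of Y could be used the same way.

```python
def angle_in_bigrees(a, x, bits=10):
    """Return the angle in bigrees (as a float) from the A and X register values."""
    if bits not in (8, 9, 10):
        raise ValueError("bits must be 8, 9, or 10")
    extra = bits - 8  # number of fractional bits: 0, 1, or 2
    # The fractional bits sit at the top of X (bit 7, then bit 6).
    frac = (x & 0xFF) >> (8 - extra) if extra else 0
    return (a & 0xFF) + frac / (1 << extra)


print(angle_in_bigrees(0x20, 0xC0))          # A=$20, X bits 7 and 6 set
print(angle_in_bigrees(0x20, 0x80, bits=9))  # A=$20, X bit 7 set
```

Having the extra bits at the top of X means they can be shifted straight into a carry chain, while the copy at the bottom of Y is handy as a table index.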
If the input coordinates are (0000,0000), the result will be zeros in the
accumulator, X, and Y.
***note***
The subroutines in this bank use zero page addresses F0-FF for their
calculations. These addresses are used by BASIC, so their contents must be
saved before the first call to an arctangent function (during
initialization), and restored when your program shuts down. My earlier
FASTMATH code has the same pair of save and restore subroutines, so if you
are already using it, save and restore only once, with either the FASTMATH
code or the FASTATAN code, *not both*.
All the subroutines are listed below. The time in cycles includes the JSR
instruction which calls each subroutine. Remember to select the correct RAM
bank before calling.
ADDR: CONTENTS
A000: save zero page F0-FF into banked RAM. 253 cycles.
A003: load saved zero page data from banked RAM to F0-FF. 253 cycles.
A006: ATAN2(Y,X), 10 bit result. Average time: 304 cycles. Max: 384 cycles.
A01E: ATAN2(Y,X), 9 bit result. Average time: 321 cycles. Max: 401 cycles.
A036: ATAN2(Y,X), 8 bit result. Average time: 310 cycles. Max: 390 cycles.
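To put the cycle counts in perspective, this Python snippet converts them to microseconds. The 8 MHz clock is an assumption (the X16's standard speed); change CPU_HZ if your machine runs at a different rate.

```python
CPU_HZ = 8_000_000  # assumed clock speed, not stated in the table above


def cycles_to_us(cycles, hz=CPU_HZ):
    return cycles * 1_000_000 / hz


timings = [("10 bit", 304, 384), ("9 bit", 321, 401), ("8 bit", 310, 390)]
for name, avg, worst in timings:
    print(f"{name}: avg {cycles_to_us(avg):.1f} us, max {cycles_to_us(worst):.1f} us")
```

Under that assumption, even the worst case is around 50 microseconds per call.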
examples:
x=32767, y=32767: result $20.0
x=0, y=32767: result $40.0
x=-32767, y=-32767: result $A0.0
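The examples above can be cross-checked against a floating point reference in Python. This is not how FASTATAN works internally (it uses lookup tables), and the rounding here is an assumption, so the lowest bit may differ from the real routine for some inputs.

```python
import math


def atan2_bigrees(x, y, bits=10):
    """Floating-point reference: angle of (x, y) in bigrees, rounded to `bits` bits."""
    if x == 0 and y == 0:
        return 0.0  # (0000,0000) is defined to return zeros
    steps = 1 << bits  # 1024, 512, or 256 steps per full turn
    turn = math.atan2(y, x) / (2 * math.pi)  # fraction of a turn, -0.5 to 0.5
    # Round to the nearest step, wrap negatives into 0..steps-1, scale to bigrees.
    return (round(turn * steps) % steps) * 256 / steps


for x, y in [(32767, 32767), (0, 32767), (-32767, -32767)]:
    print(f"x={x}, y={y}: {atan2_bigrees(x, y)} bigrees")
```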