Self-Modifying Code on a Commodore VIC-20
Posted: August 1, 2021 | Author: ThomasPowell | Filed under: retro | Tags: basic, commodore, ram, vic-20
Note: All code listings are in lower case so that they are pastable into the VICE emulator. Otherwise, you will get graphics/uppercase PETSCII characters on paste.
Examining the structure of how the BASIC code is stored
User program RAM occupies locations 4096 to 7679 (decimal) on an unexpanded VIC-20; 7680 is the first address past it. The storage format of a BASIC program can be dumped with the following BASIC:
for i=4096 to 7680 - fre(1): ? i,chr$(peek(i));peek(i): next i
I’ve taken the extra step of adding a slightly more sophisticated version of the above at line 10000 in the code below, so that I can RUN 10000 to dump memory locations with paging, skipping control and non-printable characters.
10 print "hi"
20 n=peek(4104)
30 x=peek(4105)
40 if n >= 90 then n=65
50 n=n+1
60 x=int(26*rnd(1)+65)
70 poke 4104,n
80 poke 4105,x
90 goto 10
9999 end
10000 b=4096:i=b
10010 e=7680-fre(0)
10020 c=0
10030 ls=20
10040 ? i,
10050 ch=peek(i)
10060 ? ch;
10070 if(ch>=32 and ch<=127)or(ch>=160 and ch<=254)then ? chr$(ch);
10075 ?
10080 if c>ls then ? "continue";: input wt$: c=0
10090 c=c+1
10100 i=i+1
10110 if i>e then end
10120 goto 10040
In the dump, you’ll notice that we start with a null character (0) at location 4096, followed by 12, 16, 10 and 0. The 12 and 16, at locations 4097 and 4098, are a pointer to the memory location of the next line of code (in “little endian” order, so 16 * 256 + 12 = 4108).
The next bytes, at locations 4099 and 4100, are 10 and 0. This is the line number for that line of code (again, in little endian format).
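The little-endian arithmetic is easy to check off the machine. Below is a minimal Python sketch (meant for a modern computer, not the VIC-20) that decodes the four header bytes from the dump. The helper name `word_at` and the small `memory` dictionary are purely illustrative.

```python
# Header bytes of line 10 as seen in the dump, keyed by memory address.
memory = {4097: 12, 4098: 16, 4099: 10, 4100: 0}

def word_at(mem, addr):
    """Read a 16-bit little-endian value: low byte first, then high byte."""
    return mem[addr] + 256 * mem[addr + 1]

next_line_addr = word_at(memory, 4097)  # pointer to the next BASIC line
line_number = word_at(memory, 4099)     # this line's number

print(next_line_addr, line_number)
```

The low byte comes first in memory, so it is the second byte of each pair that gets multiplied by 256.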
Once you get past these two 2-byte numbers, you reach the code 153, which is the VIC-20 BASIC keyword code for the PRINT statement. All syntactically significant tokens (keywords and symbols) are reduced to a single byte (and the TAB and SPC functions actually include their left parenthesis as part of this code). The VIC-20 Programmer’s Reference Guide lists these values (for individual characters, some are simply their PETSCII codes).
You’ll notice that space (32) and double quote (34) are explicitly expressed, as are the individual digits of any number literals.
At the very end of the line is a 0/null again to terminate the line. (Fun part of this experiment: Setting a byte in the middle of the line to 0 makes the rest of the line unreadable by the BASIC interpreter!)
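Putting the pieces together, the stored form of line 10 can be rebuilt by hand. The Python sketch below (again run off the VIC-20) assembles the bytes for a simple print line. The function name `encode_print_line` is hypothetical, and the sketch assumes the text contains only the letters A to Z, whose unshifted PETSCII codes match their ASCII uppercase codes.

```python
PRINT_TOKEN = 153  # BASIC keyword code for PRINT, as described above

def encode_print_line(addr, line_number, text):
    """Build the stored bytes for: <line_number> print "<text>".

    Assumes text is letters A-Z only (illustrative simplification).
    """
    # token, space (32), opening quote (34), letters, closing quote, terminator
    body = [PRINT_TOKEN, 32, 34] + [ord(c) for c in text.upper()] + [34, 0]
    # The next line starts after 2 pointer bytes, 2 line-number bytes, and the body.
    next_addr = addr + 4 + len(body)
    header = [next_addr % 256, next_addr // 256,
              line_number % 256, line_number // 256]
    return header + body

line = encode_print_line(4097, 10, "hi")
print(line)
```

The first two bytes of the result should agree with the pointer seen in the dump, since the line occupies locations 4097 through 4107.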
Modifying the code
For an easy first attempt at this, I’m going to just change locations 4104 and 4105, which hold the two letters in
10 print "hi"
In the code below, I’m cycling the original H through the alphabet (65-90) and replacing the original I with random letters:
20 n=peek(4104)
30 x=peek(4105)
40 if n >= 90 then n=65
50 n=n+1
60 x=int(26*rnd(1)+65)
70 poke 4104,n
80 poke 4105,x
90 goto 10
If you BREAK out of the program (Esc key in the VICE emulator) after running it and list the first few lines, you’ll see that the initial 10 print "hi" line now contains whichever two letters were poked into it last.
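To see what the pokes do to the stored line without an emulator, here is a small Python simulation. It is a sketch under stated assumptions: the `peek` and `poke` helpers are stand-ins written for this example, the `program` bytes are the stored form of line 10 only, and Python's random generator will not reproduce the VIC-20's rnd sequence.

```python
import random

BASE = 4097  # address of the first byte held in `program`
# Stored bytes of: 10 print "hi" (pointer, line number, token, text, terminator)
program = bytearray([12, 16, 10, 0, 153, 32, 34, 72, 73, 34, 0])

def peek(addr):
    """Stand-in for BASIC's peek, limited to the bytes in `program`."""
    return program[addr - BASE]

def poke(addr, value):
    """Stand-in for BASIC's poke, limited to the bytes in `program`."""
    program[addr - BASE] = value

def step(rng):
    """One pass through lines 20-80 of the listing, followed literally."""
    n = peek(4104)
    if n >= 90:
        n = 65
    n = n + 1
    x = int(26 * rng.random() + 65)  # a code from 65 to 90
    poke(4104, n)
    poke(4105, x)

rng = random.Random(1)
for _ in range(5):
    step(rng)
    print(chr(peek(4104)) + chr(peek(4105)))
```

Only the two bytes between the quotes change; the pointer, line number, token, and terminator are untouched, which is why the line stays valid. Following the listing literally, the wrap from Z lands on B rather than A, because n is reset to 65 and then incremented.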
This is obviously a very trivial exercise in self-modifying code, but any modification that goes beyond 1:1 in-place replacement requires more planning. The lines of a program are variable in length, which means that inserting code requires shifting the subsequent code in memory. Shifting code in memory, in turn, requires updating every pointer that pointed to the original locations. The next exercise will probably be adding code to the end of the program rather than trying to insert it in the middle.
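As a rough illustration of the bookkeeping involved, the Python sketch below inserts bytes into the middle of a line and adjusts the next-line pointers. It is not how the BASIC ROM does it, and it makes several assumptions: the function names are hypothetical, the insertion address falls inside a line's text rather than on a line boundary, and the interpreter's own pointers (such as where variables start) are ignored. The second line in the sample data is `20 goto 10`, using 137 as the GOTO keyword code.

```python
BASE = 4097  # address of the first line's pointer on an unexpanded VIC-20

def word(mem, addr):
    """Read a little-endian 16-bit value at a VIC-20 address."""
    i = addr - BASE
    return mem[i] + 256 * mem[i + 1]

def set_word(mem, addr, value):
    """Write a little-endian 16-bit value at a VIC-20 address."""
    i = addr - BASE
    mem[i], mem[i + 1] = value % 256, value // 256

def insert_bytes(mem, addr, new_bytes):
    """Insert new_bytes at addr and fix the next-line pointers."""
    # Walk the chain first, remembering where each line's pointer lives.
    links = []
    cur = BASE
    while word(mem, cur) != 0:  # a zero pointer marks the end of the program
        links.append(cur)
        cur = word(mem, cur)
    # Every pointer whose target lies past the insertion point moves up.
    shift = len(new_bytes)
    for link in links:
        target = word(mem, link)
        if target > addr:
            set_word(mem, link, target + shift)
    # Finally shift the bytes themselves by splicing in the new ones.
    i = addr - BASE
    mem[i:i] = bytes(new_bytes)

# 10 print "hi" followed by 20 goto 10, then the end-of-program pointer (0, 0)
mem = bytearray([12, 16, 10, 0, 153, 32, 34, 72, 73, 34, 0,
                 21, 16, 20, 0, 137, 32, 49, 48, 0,
                 0, 0])

insert_bytes(mem, 4106, [33])  # put "!" (33) just before the closing quote
print(word(mem, 4097), word(mem, 4109))
```

Even this one-byte insertion touches both lines' pointers, which is part of why appending to the end of the program looks like the easier next step.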