Note: All code listings are in lower case so that they are pastable into the VICE emulator. Otherwise, you will get graphics/uppercase PETSCII characters on paste.
Examining the structure of how the BASIC code is stored
User program RAM is in locations 4096 to 7680 (decimal) on a VIC 20. The storage format of the basic programs can be dumped with the following BASIC:
for i=4096 to 7680 - fre(1): ? i,chr$(peek(i));peek(i): next i
I’ve taken the extra step up adding a slightly more sophisticated version of the above at line 10000 in the below code so that I can RUN 10000
to dump memory locations with paging and skipping control and non-printable characters.
10 print "hi"
20 n=peek(4104)
30 x=peek(4105)
40 if n >= 90 then n=65
50 n=n+1
60 x=int(26*rnd(1)+65)
70 poke 4104,n
80 poke 4105,x
90 goto 10
9999 end
10000 b=4096:i=b
10010 e=7680-fre(0)
10020 c=0
10030 ls=20
10040 ? i,
10050 ch=peek(i)
10060 ? ch;
10070 if(ch>=32 and ch<=127)or(ch>=160 and ch<=254)then ? chr$(ch);
10075 ?
10080 if c>ls then ? "continue";: input wt$: c=0
10090 c=c+1
10100 i=i+1
10110 if i>e then end
10120 goto 10040
You’ll notice in the above that we start with a null character (0) followed by 12, 16, 10 and 0. 12 and 16 are a pointer to the the memory location of the next line of code (in “little endian” order, so 16 * 256 + 12 = 4108)
The next bytes, at location 4099 and 4100, are 10 and 0. This is the line number for that line of code (again, in little endian format).
Once you get past these 2 2 byte numbers, you have a code…. 153: 153 is the VIC 20 BASIC Keyword Code for the PRINT statement. All syntactically significant tokens (keywords and symbols) are reduced to a single byte (and TAB and SPC functions actually include their left parenthesis as part of this code). The VIC-20 Programmer’s Reference Guide lists out these values (some of these are just their PETSCII codes if individual characters):
You’ll notice that space (32) and double quote (34) are explicitly expressed, as are the individual digits of any number literals.
At the very end of the line is a 0/null again to terminate the line. (Fun part of this experiment: Setting a byte in the middle of the line to 0 makes the rest of the line unreadable by the BASIC interpreter!)
Modifying the code
For an easy first attempt at this, I’m going to just change location 4105 and 4106, which are the letters in HI
10 print "hi"
In the below code, I’m cycling the original H
through the alphabet (65-90) and setting the original I
with random values:
20 n=peek(4104)
30 x=peek(4105)
40 if n >= 90 then n=65
50 n=n+1
60 x=int(26*rnd(1)+65)
70 poke 4104,n
80 poke 4105,x
90 goto 10
If you BREAK
out of the program (Esc key in VICE emulator) after running and list the first few lines, you’ll see that the initial PRINT
statement’s string has indeed changed:
What’s Next?
This is obviously a very trivial exercise of self-modifying code, but any modifications that require anything aside from 1:1 in-place replacement requires more planning: The lines of a program are variable in length, which means that inserting code requires shifting subsequent code in memory. Also, shifting code in memory requires updating all pointers that pointed to the original locations. The next exercise will probably be adding code to the end of the program rather than trying to insert it in the middle.