COURS206.TXT

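  ******************************************
  *                                        *
  *   68000 ASSEMBLER COURSE ON ATARI ST   *
  *                                        *
  *   by The Fierce Rabbit (from 44E)      *
  *                                        *
  *   Second series                        *
  *                                        *
  *   Course number 6                      *
  *                                        *
  ******************************************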

  SELF-MODIFYING CODE

  Another simple technique that makes programming much easier:
  self-modifying programs. Like all the topics discussed so far,
  this one is not complicated but requires a bit of attention.
  I must admit, however, that the first time I encountered it in
  a listing, it took me many hours to understand! The main
  difficulty lies not so much in the subject itself as in choosing
  how to explain it. I hope this explanation will satisfy you!

  It is easy to imagine an addition with variables, for example
  A=1 and B=2 in an operation like A+B=C. We can just as easily
  imagine the values of A and B changing during the program to
  become, for example, A=2 and B=3, which leaves our operation
  A+B=C just as valid. But how do we make this operation A+B=C
  suddenly become A-B=C, or even A/B=C?

  That is where the difference lies between a high-level language
  and assembler. We saw in the first courses that the assembler
  only translates instructions into numbers. Unlike compilers,
  which 'arrange' instructions, the assembler merely translates,
  instruction by instruction. We therefore end up with a sequence
  of numbers sitting in the 'tube'. Just as we wrote into the tube
  to modify the values of variables, it is quite possible to write
  into the tube to modify the numbers that are in fact
  instructions. Caution is obviously needed: the numbers we write
  must be recognized by the 68000 as a valid new instruction and
  not as just anything, which would lead to an error. Let's look
  at a simple, concrete example. We have a list of letters, each
  coded as a word, and we want to display these letters one after
  the other.

  Here is a program that performs this operation.

           INCLUDE   "B:\START.S"
           LEA       TABLE,A6       in A6 because GEMDOS doesn't touch it
  START    MOVE.W    (A6)+,D0       retrieves the word
           CMP.W     #$FFFF,D0      is it the end flag?
           BEQ       END            yes, bye bye
           MOVE.W    D0,-(SP)       no, so pass it on the stack
           MOVE.W    #2,-(SP)       to display it
           TRAP      #1
           ADDQ.L    #4,SP
           MOVE.W    #7,-(SP)       waits for a key press
           TRAP      #1
           ADDQ.L    #2,SP
           BRA       START          and start again
  END      MOVE.W    #0,-(SP)
           TRAP      #1
  *--------------------------------*
           SECTION DATA
  TABLE    DC.W      65,66,67,68,69,70,$FFFF
           SECTION BSS
           DS.L      100
  STACK    DS.L      1
           END

  Now imagine that this display is in a subroutine, and that we
  want to display one letter on each call of this subroutine. We
  wait for a key press: if it is 'space', we quit; otherwise we
  jump to the routine, which displays a character and then
  returns. Here is a first attempt:

           INCLUDE   "B:\START.S"
  START    MOVE.W    #7,-(SP)
           TRAP      #1
           ADDQ.L    #2,SP
           CMP.W     #" ",D0
           BEQ       END
           BSR       DISPLAY
           BRA       START

  END      MOVE.W    #0,-(SP)
           TRAP      #1
  *------------------------------*
  DISPLAY  LEA       TABLE,A6       table address
           MOVE.W    (A6)+,D0       retrieves the word
           MOVE.W    D0,-(SP)       pass it on the stack
           MOVE.W    #2,-(SP)       to display it
           TRAP      #1
           ADDQ.L    #4,SP
           RTS                      then returns
  *--------------------------------*
           SECTION DATA
  TABLE    DC.W      65,66,67,68,69,70,$FFFF
           SECTION BSS
           DS.L      100
  STACK    DS.L      1
           END

  Assemble and run the program. Observation: with each keystroke
  you get an 'A', but never the other letters! Obviously, because
  each time we jump into our DISPLAY subroutine, it reloads the
  table address. The character retrieved is therefore always the
  first one. To avoid this, we need a pointer that advances
  through the table. In our example, it would have been enough to
  place LEA TABLE,A6 at the beginning of the program. With A6
  modified by no one else, it would have worked... until the 7th
  keystroke, by which time A6 has moved past the six letters of
  the table! Besides, we are here to learn, so we will consider
  the case where, outside the routine, all the registers are
  modified! It is therefore impossible to keep A6 as a pointer.
  Here is the modified DISPLAY routine:

  DISPLAY  MOVEA.L   TAB_PTR,A0
           MOVE.W    (A0)+,D0
           CMP.W     #$FFFF,D0
           BNE       .HERE
           LEA       TABLE,A0
           MOVE.L    A0,TAB_PTR
           BRA       DISPLAY
  .HERE    MOVE.L    A0,TAB_PTR
           MOVE.W    D0,-(SP)
           MOVE.W    #2,-(SP)
           TRAP      #1
           ADDQ.L    #4,SP
           RTS

  In addition, we must add, after the INCLUDE (and thus before the
  START label):
           LEA       TABLE,A0
           MOVE.L    A0,TAB_PTR
  and in the BSS section:

  TAB_PTR  DS.L      1
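
  The two set-up lines can also be written differently. Here is a
  sketch of two alternatives (they are not the method used in the
  rest of this course). Either store the address with a single
  instruction:

           MOVE.L    #TABLE,TAB_PTR same effect, A0 is left untouched

  or, assuming TAB_PTR is moved from the BSS section to the DATA
  section, let the assembler fill it in, so that no set-up code is
  needed at all:

  TAB_PTR  DC.L      TABLE          already holds the table address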

  A little analysis after these changes! First of all, we happily
  note that it works! At the beginning we set up a pointer:

           LEA       TABLE,A0     puts the table address in A0
           MOVE.L    A0,TAB_PTR    and saves it in TAB_PTR

  We now have in the tube, at the label TAB_PTR, a long word, and
  this long word is the address of the beginning of the table.
  Then, in the routine, we retrieve this address. A small remark
  is necessary here, because confusion is frequent. If we have:

  IMAGE    INCBIN    "A:\HOUSE.PI1"

  and we want to work with this image, we will write:

           LEA       IMAGE,A0

  A0 will then point to the image. On the other hand, if we have:

  IMG_PTR  DC.L      IMAGE

  that is, a label on a long word holding the address of the
  image, then by doing LEA IMG_PTR,A0 we do not get the address
  of the image in A0 but the address of the address of the image!
  To retrieve a pointer to the image directly, you have to do:

           MOVEA.L   IMG_PTR,A0

  However, to retrieve the address of the table it would also have been
  possible to do:

           MOVEA.L   #TABLE,A0
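
  To keep these forms apart, here they are side by side with what
  A0 receives each time. This is only a recap sketch using the
  IMAGE and IMG_PTR labels from above; the use of A1 in the last
  two lines is a choice made for the illustration:

           LEA       IMAGE,A0       A0 = address of the image
           MOVEA.L   #IMAGE,A0      A0 = address of the image as well
           LEA       IMG_PTR,A0     A0 = address of the long word IMG_PTR
           MOVEA.L   IMG_PTR,A0     A0 = content of IMG_PTR = image address
           LEA       IMG_PTR,A1     or in two steps: A1 = address of IMG_PTR
           MOVEA.L   (A1),A0        then A0 = the long word found there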

  Having said that, let's continue our exploration. TAB_PTR
  therefore holds the address of the beginning of the table. After
  waiting for a key press, we jump into the routine. We transfer
  the address contained in TAB_PTR into A0, then we retrieve the
  word found in the tube at that address and put it in D0. Since
  we did this with (A0)+, A0 now points to the next word in the
  table. We test whether the word retrieved is $FFFF, which would
  indicate the end of the table. If it is not, we jump to .HERE
  and save the new value of A0 in TAB_PTR.

  If the word retrieved is $FFFF, we reload TAB_PTR with the
  address of the start of the table, and off we go again!

  This pointer system, very frequently used, is simple to use
  and quite handy! However, let's consider another method, more
  twisted! First of all, let's remove the DISPLAY routine and replace
  it with the following:

  DISPLAY  MOVEA.L   #TABLE,A0
           MOVE.W    (A0)+,D0
           MOVE.W    D0,-(SP)
           MOVE.W    #2,-(SP)
           TRAP      #1
           ADDQ.L    #4,SP
           RTS

  Reassemble and run. Quite obviously it no longer works, since at
  each call of the routine we reload A0 with the address of TABLE,
  so the word retrieved will always be the first one of the table.
  Let's go under MONST with Alt+D and scroll down to the DISPLAY
  label. There we find MOVEA.L #TABLE,A0 and so on. Exit with
  Control+C, then reassemble, but before clicking on 'assemble',
  let's take a look at the options. By default, DEBUG INFO is set
  to Extended. This means that the names of the labels will be
  incorporated into the program, which lets us find them again
  when we are under MONST. Choose the NONE option for DEBUG INFO,
  assemble, and return under MONST.

  Surprise: the names of the labels have disappeared and are
  replaced by numbers. This is logical since, in any case, the
  assembler translates our source into numbers. Let's find our
  DISPLAY routine. It is a bit harder since its label is no longer
  visible! To locate it, we can look near the beginning (after
  the start) for CMP.W #$20,D0, which is the comparison with the
  space bar after the key press. Then come a BEQ towards the end
  and the BSR towards our routine. Note the address shown next to
  the BSR and let's go there. The first line of our routine is
  MOVEA.L #$XXXXXXX,A0, XXXXXXX being the address of the table.
  I remind you that on a 68000 the program can be anywhere in
  memory, so this address will differ from one machine to
  another. For me, it is $924C6. I activate window 3 with Alt+3,
  then with Alt+A I ask the window to position itself on this
  address. MONST shows me, in the center, the ASCII codes of the
  letters from my table ($41, $42, etc.) and, to the right, these
  letters as text.

  When this display routine runs, it will therefore put (for me)
  $924C6 in A0, this address being the one pointing to the 'A' of
  the table. What I would like is for it to point to the 'B' the
  next time around. For that I would need:
           MOVEA.L   #$924C6,A0     for the 'A'

  and then
           MOVEA.L   #$924C8,A0     for the 'B'.

  The letters being stored as words in my table, an advance of 2
  is required!

  Let's return to window 2. Next to this MOVEA.L, look at the
  address at which it is located (left column) and note it; also
  note the address of the following instruction (MOVE.W (A0)+,D0).
  Then activate window 3 and go to the address of the MOVEA.L.

  In my case, since I had:
           MOVEA.L   #$924C6,A0     I find 207C 0009 24C6

  I deduce that these 3 words constitute the encoding of my
  MOVEA.L instruction, since the address of the word that follows
  them is the address of the following instruction. Now, within
  this encoding I find the address of my table. With a little
  imagination, I can easily conceive that it is possible to write
  directly into the 'tube' and, for example, modify the word
  whose current value is 24C6. If I add 2 to it, my instruction
  becomes 207C 0009 24C8, which is MOVEA.L #$924C8,A0 and which
  makes me point to the second word of the table!
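
  Written out by hand, and assuming the encoding observed above,
  the instruction is equivalent to the following data. This is
  only an illustration of the layout; the offsets are counted in
  bytes from the start of the instruction:

           DC.W      $207C          offset 0: opcode of MOVEA.L #...,A0
           DC.L      TABLE          offset 2: upper word ($0009 for me)
  *                                 offset 4: lower word ($24C6 for me)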

  Here is the self-modifying version of the DISPLAY routine.

  DISPLAY  MOVEA.L   #TABLE,A0        this address gets patched
           MOVE.W    (A0),D0
           CMP.W     #$FFFF,D0
           BNE       .HERE
           MOVE.L    #TABLE,DISPLAY+2 end flag: restore the address
           BRA       DISPLAY
  .HERE    ADD.W     #2,DISPLAY+4     advance the address by one word
           MOVE.W    D0,-(SP)
           MOVE.W    #2,-(SP)
           TRAP      #1
           ADDQ.L    #4,SP
           RTS

  Note: TAB_PTR is no longer needed, and neither are the
  LEA TABLE,A0 and MOVE.L A0,TAB_PTR lines at the beginning.

  Assemble with NONE in DEBUG INFO, then go under MONST, step through and watch the line

           MOVEA.L   #TABLE,A0    change!

  Let's explain very clearly what happens.

  We place the address of TABLE in A0 and then retrieve the word.
  Let's assume first of all that it is not $FFFF; we then jump to
  .HERE. There we must add 2 to the address so that, next time,
  it points to the second letter of the table. We have seen that,
  once encoded, the MOVEA.L line occupies 3 words, that is 6
  bytes. The addition of 2 must therefore apply to the 3rd word.
  This word begins at byte 4 (counting from 0), which is why we
  give DISPLAY+4 as the destination of the addition.

  If we had retrieved $FFFF, we would have had to reinitialize our
  MOVEA.L line with

           MOVE.L    #TABLE,DISPLAY+2

  Why +2? Because the address of the table is a long word and, in
  the encoding of the instruction, it starts at the second word.
  We must therefore skip a single word, that is 2 bytes.
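
  One caution about the ADD.W on DISPLAY+4: it only touches the
  lower word of the address. If the table happened to straddle a
  64 KB boundary (a lower word of $FFFE followed by $0000), the
  carry into the upper word would be lost. A variant that avoids
  this, given here as a sketch and not as the routine used in this
  course, adds to the whole long word instead:

  .HERE    ADDQ.L    #2,DISPLAY+2   advances the full 32-bit address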

  In the same vein, it is entirely possible to modify a program
  more deeply. Here is a striking example (see listing number 4).

  Knowing that the instruction RTS (Return from Subroutine) is
  coded as $4E75 and that the instruction NOP (No Operation) is
  coded as $4E71, we can change where the routine ends by writing
  either a NOP or an RTS at a given place. NOP does nothing at
  all: nothing changes when it executes, but it does consume a
  little time. It will therefore also be useful for producing
  short waits (very handy for graphic effects, for example).
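
  Without reproducing that listing, here is a minimal sketch of
  the idea. The labels ROUTINE and EXIT and the use of D1 and D2
  as counters are choices made for this illustration only, and the
  skeleton assumes the same START.S and STACK as the programs
  above:

           INCLUDE   "B:\START.S"
           MOVEQ     #0,D1
           MOVEQ     #0,D2
           MOVE.W    #$4E75,EXIT    writes an RTS at EXIT
           BSR       ROUTINE        only the first half runs
           MOVE.W    #$4E71,EXIT    writes a NOP at EXIT
           BSR       ROUTINE        this time both halves run
           MOVE.W    #0,-(SP)
           TRAP      #1
  *--------------------------------*
  ROUTINE  ADDQ.W    #1,D1          first half, always executed
  EXIT     NOP                      this word is patched at run time
           ADDQ.W    #1,D2          second half, reached only via a NOP
           RTS
  *--------------------------------*
           SECTION BSS
           DS.L      100
  STACK    DS.L      1
           END

  Stepping through it under MONST, the word at EXIT should read
  4E75 during the first call and 4E71 during the second, and D2
  should only be incremented by the second call.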

  Follow the execution of this program under MONST to see the
  modifications happening. A more complex case:

           MOVE.W    #23,D0
           MOVE.W    #25,D1
  VARIANT  ADD.W     D0,D1
           MULU.W    #3,D1
           SUB.W     #6,D1
           MOVE.W    D1,D5

  After assembling this little piece of program, go under MONST
  and take a look at window 3. By pointing at VARIANT and
  comparing the addresses next to the instructions, we deduce
  that:

           ADD.W     D0,D1     is converted to $D240
           MULU.W    #3,D1     is converted to $C2FC $0003
           SUB.W     #6,D1     is converted to $0441 $0006

  If we now take:
           MOVE.W    #23,D0
           MOVE.W    #25,D1
  VARIANT  MULU.W    D0,D1
           SUB.W     #8,D1
           ADD.W     #4,D0
           MOVE.W    D1,D5

  We assemble, go under MONST:

           MULU.W    D0,D1     is converted to $C2C0
           SUB.W     #8,D1     is converted to $0441 $0008
           ADD.W     #4,D0     is converted to $0640 $0004

  So, if in a program using this 'routine' I do:

           LEA       VARIANT,A0
           MOVE.W    #$D240,(A0)+
           MOVE.L    #$C2FC0003,(A0)+
           MOVE.L    #$04410006,(A0)+

  I will get the first version:
           ADD.W     D0,D1
           MULU.W    #3,D1
           SUB.W     #6,D1

  whereas if I do:

           LEA       VARIANT,A0
           MOVE.W    #$C2C0,(A0)+
           MOVE.L    #$04410008,(A0)+
           MOVE.L    #$06400004,(A0)+

  I will get the second version!

  Try it with the following program, stepping through it under
  MONST. Note: this program has no proper end, so exit with
  Control+C:

           LEA       VARIANT,A0
           MOVE.W    #$D240,(A0)+
           MOVE.L    #$C2FC0003,(A0)+
           MOVE.L    #$04410006,(A0)+

           LEA       VARIANT,A0
           MOVE.W    #$C2C0,(A0)+
           MOVE.L    #$04410008,(A0)+
           MOVE.L    #$06400004,(A0)+

           MOVE.W    #23,D0
           MOVE.W    #25,D1
  VARIANT  MULU.W    D0,D1
           SUB.W     #8,D1
           ADD.W     #4,D0
           MOVE.W    D1,D5
           END
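
  Both versions above occupy the same 10 bytes at VARIANT (one
  word followed by two long words). A shorter version still has to
  overwrite all 10 bytes, otherwise the tail of the previous
  version would remain behind it. Here is a sketch of a
  hypothetical third version, reduced to ADD.W D0,D1 and padded
  with NOP ($4E71):

           LEA       VARIANT,A0
           MOVE.W    #$D240,(A0)+       ADD.W D0,D1
           MOVE.L    #$4E714E71,(A0)+   NOP, NOP
           MOVE.L    #$4E714E71,(A0)+   NOP, NOP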

  Remarks: It is entirely possible to envision more than 2
  versions of the same part of the program. If the sizes of these
  versions differ, it does not matter, because it is always
  possible to pad with NOPs. The applications of this kind of
  'trick
