                   ------------------------------
                         CHAPTER number 2:

                   * FIRST STEPS IN ASSEMBLY *
                   ------------------------------


           *** SOFTWARE USE OF THE ASSEMBLER ***
           ----------------------------------------------

- As we saw in the previous chapter, programming in assembly language
  requires three tools: an EDITOR, an ASSEMBLER, and a LINKER.

1) The EDITOR:
   ----------
  .The EDITOR is where you enter your listing:
   You write your program using the functions of the editor.
   The editor saves and loads your listings.
   The saved files are ASCII files (like the files in this digital
   book) and can be modified at will with the editor.
   Such a file is called SOURCE CODE.
   The functions of the editor vary with the package (DEVPAC,
   METACOMCO, PROFIMAT...) and simplify the input of the text (listing).

  .A file produced by an editor cannot be executed!
   It must first be ASSEMBLED and then LINKED.

2) The ASSEMBLER:
   ------------
  .ASSEMBLY is the second step:
   The Assembler translates the ASCII codes (text) of the listing and
   converts (encodes) them into BINARY (the computer recognizes binary
   directly, unlike ASCII text).
   Once assembled, the listing is saved on the diskette as a BINARY
   file called OBJECT CODE.

3) The LINKER:
   ---------
  .LINKING is the last step:
   The OBJECT CODE is loaded and the LINKER integrates the MACRO
   INSTRUCTIONS that make up the library, if the source file requires
   them.

     --> Let me explain:
         --------------
  .In assembly language, you can create MACRO-INSTRUCTIONS (see
   previous chapter).
  .A MACRO is nothing more than a new (parameterizable) instruction
   which the LINKER includes every time it encounters its name in the
   OBJECT CODE.

   An example:
   -------
  .You create a MACRO that displays 'HELLO' on the screen:
   this MACRO actually delimits a PROGRAM whose function is to display
   'HELLO' on the screen, and you name it 'WRITE':

       Beginning of the Macro WRITE
          .
          .
          .
          the listing that displays 'HELLO' on the screen
          .
       End of the Macro

   Each time the LINKER encounters 'WRITE' (the name of the MACRO) in
   the OBJECT CODE, it rewrites the 'WRITE' program segment in place
   of the MACRO.

  .A MACRO is therefore not a subroutine: it only makes the listing
   more readable.
  .A macro is rewritten in full each time it is used: we lose a little
   memory, but it runs faster than a subroutine, since no call and
   return are needed.
  .Macros make the listing more readable because you no longer have to
   type the ROUTINE corresponding to the name of the Macro: the Linker
   takes care of that.
  .You can therefore create a LIBRARY of MACROS (i.e. files defining
   MACROS) to call on when you need them in your listing...
   The LINKER loads the MACROS used in the listing (if the listing
   contains any) and rewrites them in full.
   It is enough to indicate (at the beginning of the listing) the name
   of the file that contains the MACROS used, or to define those MACROS
   at the beginning of the listing.    (WE WILL SEE THIS IN DETAIL...)

  .We will create MACROS for the functions of GEM, XBIOS, BIOS and VDI
   (MACROS that call the desired functions and pass the necessary
   parameters, for speed and convenience).

                          --------------

  .The LINKER also creates the BASE PAGE, which contains the
   information the operating system needs to load and execute the
   LINKED program.
   This is why LINKING is essential even if the listing contains no
   Macros. (I will go into detail when I talk about GEMDOS)

  .The resulting file is saved and can be executed.

SUMMARY:
------
   * EDITOR *          * ASSEMBLER *           * LINKER *

   |listing|  ------->  |object code|  ------->  |executable prg|


                *** SOME BASIC NOTIONS ***
                -------------------------------------

  Here are some VERY IMPORTANT notions:

1) BINARY:
   -----------
  .Usually, for calculations, we use the decimal system, i.e.
   this system is composed of 10 digits: 0,1,2,3,4,5,6,7,8,9
   :This is BASE 10.
   To express a quantity greater than 9, we carry one unit into the
   next position to the left and obtain a number: 10,11,12,13,14...

  .The BINARY system (or BASE 2) consists of 2 digits: 0 and 1
   Thus, to express a quantity greater than 1 (i.e. >1 in the decimal
   system!!) we carry one unit to the left in the same way.

   This is how we count in BINARY:
   0,1,10,11,100,101,110,111,1000,1001,1010,1011...

  .We will assume that every component of the computer's memory is
   represented (or coded) in BINARY.

  .The 0s and 1s of the BINARY system held in the components of the
   computer are called BITS.

  .A BIT can be cleared (off): 0    OR    set (on): 1

  .A BIT is the smallest unit that the computer can recognize and
   modify.

  .We define a group of 8 BITS as a BYTE (do not confuse 'BYTE' with
   'bit'!)
   We define a group of 2 BYTES as a WORD (so 16 BITS)
   We define a group of 2 WORDS as a LONG WORD (so 4 BYTES or 32 BITS)

   These groupings are a convention and apply to consecutive BITS
   (that follow each other) in memory.

CONSEQUENCES: The internal components of the computer can be expressed
------------  in BITS, BYTES, WORDS or LONG WORDS.

SUMMARY:
-------
  - A BIT takes the value 1 (set) or 0 (cleared)
  - A BYTE is a group of 8 consecutive BITS in memory
  - A WORD is a group of 2 consecutive BYTES in memory
  - A LONG WORD is a group of 2 consecutive WORDS in memory

   To simplify, we could say that the computer's memory is composed of
   a multitude of small boxes (BITS) that hold either 1 or 0 depending
   on the actions of the Microprocessor, and that can be grouped (by
   convention) into BYTES, WORDS or LONG WORDS.
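
   Counting in BINARY and the sizes of the groupings can be checked
   with a few lines of Python. This is only an illustrative sketch: the
   names count_in_binary, BITS_PER_BYTE, BITS_PER_WORD and
   BITS_PER_LONG_WORD are invented for the example and belong to no
   assembler.

```python
def count_in_binary(limit):
    """Return the numbers 0 .. limit-1 written in BINARY, as strings."""
    return [format(n, "b") for n in range(limit)]


# Sizes of the groupings defined in this chapter, expressed in BITS.
BITS_PER_BYTE = 8
BITS_PER_WORD = 2 * BITS_PER_BYTE        # a WORD is 2 BYTES
BITS_PER_LONG_WORD = 2 * BITS_PER_WORD   # a LONG WORD is 2 WORDS

# The first twelve numbers, counted in BINARY.
print(",".join(count_in_binary(12)))
print(BITS_PER_BYTE, BITS_PER_WORD, BITS_PER_LONG_WORD)
```

   Each new power of two (2, 4, 8...) adds one digit on the left, just
   as each new power of ten does in the decimal system.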

EXAMPLE (excerpts from memory):
-------
   0                                 is a BIT
   01011101                          is a BYTE or 8 BITS
   0101110111010110                  is a WORD or 16 BITS
   01011101110101101000100100101101  is a LONG WORD or 32 BITS

2) HEXADECIMAL:
   -------------
  .HEXADECIMAL is a BASE 16 system; its 16 digits are:
   0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F

   Thus, to express a quantity greater than F (i.e. 15 in BASE 10), we
   carry one unit to the left:

   0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,10,11,12,13,14,15,16,17,18,19,
   1A,1B,1C,1D,1E,1F,20,21,22...

   As you can see, the real benefit of this system is that a large
   number is written with a minimum of digits.

  .A BYTE is represented in HEXA. by 2 digits
  .A WORD is represented in HEXA. by 4 digits
  .A LONG WORD is represented in HEXA. by 8 digits

3) NOTATIONS:
   ----------
  - In an ASM listing, numbers written in BINARY take the prefix ' % ',
    those written in HEXADECIMAL the prefix ' $ '.
    (BASE 10 needs no prefix)

    Once the program is assembled and linked, every number is stored
    in binary, whichever notation was used in the listing (it is
    usually displayed in hexadecimal).

EXAMPLE:  %01001011  is a number written in BINARY (and a BYTE)
-------   $1F0A      is a number written in HEXA. (and a WORD)
          101001     is a number written in DECIMAL !

4) REMARKS:
   ----------
  - The leftmost BIT of a BYTE, a WORD or a LONG WORD is called the
    M.S.B (Most Significant Bit).

  - This bit gives the sign (positive/negative) of a number.
    It is set for a negative number.

  - It is only used to sign a number!

EXAMPLE:  $FFFFFFFF          is a L-W  that is -1 in decimal
-------   %1111111111111111  is a WORD that is -1 in decimal
          %11111110          is a BYTE that is -2 in decimal
     and  %00000010          is a BYTE that is 2:
          to invert the sign of the BYTE, we have 'extended' the
          leftmost 1 up to the MSB.

   :Very theoretical examples, don't panic: this is not really useful
   in practice (frankly useless, I assure you...) because there are
   instructions that perform all sorts of operations on BITS, BYTES,
   WORDS and L-W... (and no one forces you to work in the Binary
   system!)
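
   The ' % ' and ' $ ' notations and the role of the MSB can be tried
   out in Python. This is a minimal sketch: parse_asm_number and
   as_signed are hypothetical helper names made up for this example,
   and the sketch assumes that the sign is read the usual way for this
   kind of machine (a set MSB makes the value negative).

```python
def parse_asm_number(text):
    """Convert '%...' (BINARY), '$...' (HEXA.) or plain DECIMAL text to an int."""
    if text.startswith("%"):
        return int(text[1:], 2)
    if text.startswith("$"):
        return int(text[1:], 16)
    return int(text, 10)


def as_signed(value, bits):
    """Read an unsigned value of the given width as a signed number."""
    if value & (1 << (bits - 1)):        # MSB set: the number is negative
        return value - (1 << bits)
    return value


print(parse_asm_number("%01001011"))                         # 75
print(parse_asm_number("$1F0A"))                             # 7946
print(as_signed(parse_asm_number("$FFFFFFFF"), 32))          # -1
print(as_signed(parse_asm_number("%1111111111111111"), 16))  # -1
print(as_signed(parse_asm_number("%11111110"), 8))           # -2
print(as_signed(parse_asm_number("%00000010"), 8))           # 2
```

   The same pattern of BITS therefore gives a different number
   depending on the width (BYTE, WORD, L-W) it is read with.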

  - According to our definitions (the MSB is only used to sign the
    BYTE, the WORD or the LONG WORD), therefore:

    * -2^7  <= BYTE < 2^7              -128 <= BYTE < 128
    * -2^15 <= WORD < 2^15           -32768 <= WORD < 32768
    * -2^31 <= L-W  < 2^31      -2147483648 <= L-W  < 2147483648

    This, on the other hand, is something to remember:
    overflows (BYTE too large...) cause errors.

5) RECAPITULATION:
   ---------------
  .A component of the computer's structure: REGISTER, VARIABLE...
   (terms to be explained soon) can take the form of a BIT, BYTE, WORD
   or LONG WORD.
   In theory... We will study the exceptions: in reality the choice is
   more limited.

CONSEQUENCES: .A VARIABLE defined as a WORD therefore cannot have a
------------   value greater than 32767. (for example)

   NB: To hold a larger value, the variable would have to be a LONG
   --  WORD (provided the value fits in a L-W; otherwise it would
       cause an error, since the computer recognizes no component
       larger than the L-W). (in theory...)

REMARKS: .A Variable that fits in a BYTE, i.e. whose value is <128,
-------   also fits in a WORD or a L-W.
          But you must pay attention to what the component (here the
          variable) may be defined as (BIT, BYTE, WORD, L-W...).

         .For each component of the computer's internal structure we
          will therefore indicate, with precision, the different forms
          it can take (BIT, BYTE, WORD, L-W...).

          This is why you need to know the different components that
          can be used for programming (Registers, variables...) and
          also the forms they can take (BIT, BYTE...).
          This is why Assembler is very strict in its programming.

6) IMPORTANT:
   ----------
   Groups such as BYTES, WORDS and L-W are composed of BITS; these
   BITS are numbered as follows (in BYTES, WORDS, L-W, REGISTERS,
   VARIABLES...):

   From RIGHT to LEFT.

   31 30  .  .  .  .  .  .    15  .  .  .  9  8  7  6  5  4  3  2  1  0
   [*][ ][.][.][.][ ][ ][ ][ ][*][.][.][.][ ][ ][*][ ][ ][ ][ ][ ][ ][*]

  - A BYTE is therefore numbered from right to left from 0 to 7
  - A WORD, from 0 to 15
  - A L-W, from 0 to 31

  - Bit 0 is the LEAST SIGNIFICANT BIT (or LSB)
  - Bit 7 (15 for a WORD, 31 for a L-W) is the MOST SIGNIFICANT BIT
    (or MSB)

THUS:
----
             <----------- MOST SIGNIFICANT BITS

        31<-- ...[][][][][][][][][][][][][]... -->0

                   LEAST SIGNIFICANT BITS ---------->

THEREFORE: A WORD is composed of 2 BYTES, one of lower WEIGHT (bits 0
---------  to 7) and one of higher WEIGHT (bits 8 to 15)

           A L-W is composed of 2 WORDS, one of lower WEIGHT (bits 0
           to 15) and one of higher WEIGHT (bits 16 to 31)

                    THIS NOTION IS FUNDAMENTAL
                    -----------------------------


                       *** THE MEMORY ***
                       ------------------

  - We assume that memory is a sequence of numbers coded in BINARY.
    :We now know that these numbers can be grouped into BYTES, WORDS
    and L-W.

  - Programming in assembly language lets you change the contents of
    memory.
    Memory changes whenever the computer performs a task (a
    calculation, a search...). In that case the computer organizes its
    memory itself, and this organization determines which action
    takes place; the user does not intervene.
    This happens all the time: a computer is never at rest (it checks
    whether the disk has been changed, refreshes the screen fifty
    times a second...).

    The user can also change the computer's memory: this is the
    purpose of programming.
    The result is a planned (programmed) state of memory, so that a
    given action is performed.
    This is done by using the instructions of the language in use or
    by directly modifying a portion of memory.
    This last operation is very easy in ASM, with a precision down to
    the BIT.

    We can therefore move a BYTE, a WORD or a L-W (as we have defined
    them) in memory. (where that is possible...)
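
   The limits of point 4, the right-to-left numbering of BITS and the
   split into higher and lower WEIGHT of point 6 can be sketched in
   Python. This is an illustration only: fits_signed, bit, split_word
   and split_long are names invented for the example, and the values
   used are arbitrary.

```python
def fits_signed(value, bits):
    """True if value satisfies -2^(bits-1) <= value < 2^(bits-1)."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def bit(value, n):
    """Return bit number n of value (bit 0 is the rightmost one, the LSB)."""
    return (value >> n) & 1


def split_word(word):
    """Split a WORD into (higher-WEIGHT BYTE, lower-WEIGHT BYTE)."""
    return (word >> 8) & 0xFF, word & 0xFF


def split_long(long_word):
    """Split a L-W into (higher-WEIGHT WORD, lower-WEIGHT WORD)."""
    return (long_word >> 16) & 0xFFFF, long_word & 0xFFFF


print(fits_signed(127, 8), fits_signed(128, 8))   # True False

word = 0b1010101110110110                          # arbitrary example value
high, low = split_word(word)
print(format(high, "08b"), format(low, "08b"))     # 10101011 10110110
print(bit(word, 0), bit(word, 15))                 # 0 1  (LSB, then MSB)
```

   Shifting right by n positions brings bit number n down to position
   0, which is why the numbering starts at 0 on the right.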

    To find our way around the vast memory of our computer, memory is
    divided into portions and each portion is designated by an ADDRESS
    (like addresses on a street...)

    In theory, it would be possible to place a BIT, BYTE, WORD or L-W
    at any ADDRESS, but in reality it is not.

  - The user has access to only part of the memory. (we will see
    which part)

  - The parity (even or odd) of ADDRESSES must be taken into account:
    you cannot put just anything at any address.

IN FACT: If we draw the structure of memory SCHEMATICALLY, it looks
-------  like a strip of limited length (with a start and an end) and
         a WIDTH of 16 BITS, whose parts are numbered, i.e. addressed,
         every 8 BITS (at each BYTE).

                  etc...
     bits:   15       7      0
           .|        |        |
        x-2 |--------|--------|
          x |10001011|--------|  BYTE: 10001011 at address x
        x+2 |--------|10001101|  BYTE: 10001101 at address x+3
        x+4 |--------|--------|
        x+6 |00101100|10110011|  WORD: 0010110010110011
          . |--------|--------|        at address x+6
          . |--------|--------|
       x+12 |11011111|01110100|  L-W: 110111110111010001011011-
          . |01011011|01110100|       01110100
          . |--------|--------|       at address x+12
          . |--------|--------|
       x+20 |--------|-------1|  BIT: 1 at address x+21
          . |--------|--------|
          . |HIGH-WT | LOW-WT |
          . |--------|--------|
             15       7      0
                  etc...

     (DIRECTION of INCREASING addresses: from top to bottom)

                +-------+
                |MEMORY |   (Example of organization)
                +-------+

  - Take a good look at this example: it is easy to understand, and
    you must have this diagram in mind whenever you program in ASM.

  - You can see that:

    *          Memory is addressable at the BYTE level             *
    *      ----------------------------------------------          *
    *  :Between 2 consecutive addresses X and X+1 (or X-1), there  *
    *   are 8 BITS, i.e. one BYTE.                                 *

             bits n°     7  6  5  4  3  2  1  0
       X-1 ----------> [ ][ ][ ][ ][ ][ ][ ][ ]
       X   ----------> [ ][ ][ ][ ][ ][ ][ ][ ]
       X+1 ----------> [ ][ ][ ][ ][ ][ ][ ][ ]

  * Memory can be pictured as a WELL into which we DROP data: BITS,
    BYTES, WORDS, L-W.
    The width of this well is one WORD (so 16 BITS).
    Its depth depends on the size of the memory.

    The aim of the game: drop our data into the well without deforming
    it.

    i.e.: suppose you drop a WORD, 1010101110110110 (= 2 bytes, the
    high-weight one on the left and the low-weight one on the right),
    at an even address x.

                      1010101110110110
                             |
                            \|/
                     |--------|--------|
     EVEN ADDRESSES  |--------|--------|     ODD ADDRESSES
     --------------- |--------|--------|     -----------------
          bit nr°    15       7       0

    you obtain:

     EVEN ADDRESSES x|10101011|10110110|     ODD ADDRESSES
     ------------ x+2|--------|--------|     -----------------
                     |--------|--------|
          bit nr°    15       7       0

   - The WORD has been placed at address x (EVEN)
     . The HIGH-WEIGHT Byte is at address x   (EVEN)
     . The LOW-WEIGHT Byte  is at address x+1 (ODD)

    however, if you drop this word at an odd address:

                      10101011 10110110
                             |
                            \|/
                     |--------|--------|
     EVEN ADDRESSES  |--------|--------|     ODD ADDRESSES
     --------------- |--------|--------|     -----------------
          bit nr°    15       7       0

    you obtain:

                  x-3|--------|--------|
     EVEN ADDRESSES x-1|--------|10101011|x    ODD ADDRESSES
     ---------------   |10110110|--------|x+2  -----------------
                       |--------|--------|
          bit nr°      15       7       0

   - The WORD has been placed at address x (ODD)
     . The HIGH-WEIGHT Byte is at an ODD address:  x
     . The LOW-WEIGHT Byte  is at an EVEN address: x+1

    In this case, the rules of our 'game' are no longer respected.
    If we had to drop another WORD at address x-3 in our well, its
    HIGH-WEIGHT Byte could not hold and would tumble down to x-1!

    CAUTION, in reality this would not happen; I am only simplifying
    my explanation...

    In fact, to be able to move (place) WORDS in memory, you must make
    sure that the destination address is EVEN!
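
    The 'well' can be imitated with a small Python model. This is a
    sketch under stated assumptions: the Memory class and its methods
    are invented for the illustration, and the Python exception only
    stands in for whatever the real machine does when a WORD is sent
    to an odd address.

```python
class Memory:
    """A toy memory: one cell of 8 BITS per address."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def write_byte(self, address, value):
        # A BYTE may be placed at an even or an odd address.
        self.cells[address] = value & 0xFF

    def write_word(self, address, value):
        # A WORD must be placed at an EVEN address.
        if address % 2 != 0:
            raise ValueError("WORD dropped at odd address %d" % address)
        self.cells[address] = (value >> 8) & 0xFF   # HIGH-WEIGHT Byte at x
        self.cells[address + 1] = value & 0xFF      # LOW-WEIGHT Byte at x+1


memory = Memory(16)
memory.write_word(6, 0b1010101110110110)
print(format(memory.cells[6], "08b"), format(memory.cells[7], "08b"))
# prints: 10101011 10110110
```

    Calling memory.write_word(7, ...) in this model raises the error
    instead of storing the WORD, which mirrors the rule stated above.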

    For our example, the WORD must therefore be placed in memory (in
    our 'well') in this form:

                    +----------------+
                    |1010101110110110|
                    +----------------+
                             |
                            \|/
                     |--------|--------|
     EVEN ADDRESSES x|10101011|10110110|x+1  ODD ADDRESSES
     --------------- |--------|--------|     -----------------
          bit nr°    15       7       0

    .The HIGH-WEIGHT Byte (10101011) is at address x, which is EVEN
    .The LOW-WEIGHT Byte (10110110) is at address x+1, which is ODD

  * The same applies to a L-W:
    --------------------------------------------
    Here is how to place a L-W; here too the destination address must
    be EVEN, to avoid the same complications as in our previous
    example.

                    +-----------------+
                    |1110001001100100-|
                    |1010110011010001 |
                    +-----------------+
                             |
                            \|/
                     |--------|--------|
     EVEN ADDRESSES  |--------|--------|     ODD ADDRESSES
     --------------- |--------|--------|     -----------------
                    x|11100010|01100100|x+1
                  x+2|10101100|11010001|x+3
                     |--------|--------|
          bit nr°    15       7       0

    .The HIGH-WEIGHT Word (1110001001100100) is at address x, which
     is EVEN
    .The LOW-WEIGHT Word (1010110011010001) is at address x+2, which
     is EVEN

  * For a BYTE or a BIT
    -----------------------
   - Parity no longer matters!

    Thus, for a Byte, we have:
    --------------
                          01000110
                             |
                            \|/
                     |--------|--------|
     EVEN ADDRESSES x|01000110|--------|x+1  ODD ADDRESSES
     --------------- |--------|--------|     -----------------
          bit nr°    15       7       0

    .The Byte (01000110) is at an address x, EVEN

    OR:
    --
                          01000110
                             |
                            \|/
                     |--------|--------|
     EVEN ADDRESSES x-1|--------|01000110|x  ODD ADDRESSES
     ---------------   |--------|--------|   -----------------
          bit nr°      15       7       0

    .The Byte (01000110) is at an address x, ODD

    For a BIT, we have:
    ------------
                             0
                             |
                            \|/
                     |--------|--------|
     EVEN ADDRESSES x|-------0|--------|x+1  ODD ADDRESSES
     --------------- |--------|--------|     -----------------
          bit nr°    15       7       0

    .The BIT (0) is at an address x, EVEN

    OR:
    --
                             0
                             |
                            \|/
                     |--------|--------|
     EVEN ADDRESSES x-1|--------|-------0|x  ODD ADDRESSES
     ---------------   |--------|--------|   -----------------
          bit nr°      15       7       0

    .The BIT (0) is at an address x, ODD


                       *** SUMMARY ***
                       ----------------------

  * A BIT can be at an EVEN address or at an ODD address

  * A BYTE can be at an EVEN address or at an ODD address

  * A WORD is always located at an EVEN address in memory

  * A L-W is always located at an EVEN address in memory

    (Not following this rule leads to a computer crash!)

  * STRUCTURE of MEMORY:
    --------
                  etc...
     bits:   15       7      0
           .|        |        |
        x-2 |--------|--------| x-1
          x |00001011|--------|  BYTE: 00001011 at address x
        x+2 |--------|10110010|  BYTE: 10110010 at address x+3
        x+4 |--------|--------|        (odd)
        x+6 |10001011|10110110|  WORD: 1000101110110110
          . |--------|--------|        at address x+6 (even)
          . |--------|--------|
       x+12 |01101100|00000000|  L-W: 011011000000000010111010-
          . |10111010|01110111|       01110111
          . |--------|--------|       at address x+12 (even)
          . |--------|--------|
       x+20 |--------|-------1|  BIT: 1 at address x+21
          . |--------|--------|        (odd)
          . |HIGH-WT | LOW-WT |
          . |--------|--------|
             15       7      0
                  etc...

     EVEN Addresses on the left         ODD Addresses on the right
     ---------------                    -----------------

     (DIRECTION of INCREASING addresses: from top to bottom)

                +-------+
                |MEMORY |
                +-------+

             Example of memory organization.
             -------------------------------------
             (fictitious Bytes, Words, L-W and Bits)


                *** ORGANIZATION OF MEMORY ***
                ----------------------------------

  - We distinguish:

    .The program space, containing the instructions of your programs
     coded in BINARY.

    .The data space, containing the program's data and the memory
     areas that you have initialized. (in BIN)

    .These 2 zones are in RAM (Random Access Memory); one cannot
     write to ROM (Read-Only Memory), which holds all the information
     the computer needs to function correctly.
     RAM is the memory the programmer can write to (it can, of
     course, also be erased).

     Approximately 512 KB of RAM is available on a 520ST (one
     Kilobyte = 1024 Bytes) and 1 Megabyte, or 1024 KB, on a 1040ST.

                     -----------------------

  - You are now ready to begin programming proper: remember
    everything that has been said here (note it down somewhere if you
    find it useful) and read it again if your head is spinning,
    because in the next chapter we start on the serious stuff.


  PIECHOCKI Laurent
  8,impasse Bellevue                        Continue in DEBUTS.DOC
  57980 TENTELING                           ----------