------------------------------
CHAPTER 2:
* FIRST STEPS IN ASSEMBLY *
------------------------------
*** USING THE ASSEMBLER SOFTWARE ***
----------------------------------------------
- As we saw in the previous chapter, programming in assembly language is only possible with: an EDITOR, an ASSEMBLER, and a LINKER.
1) The EDITOR: .It is the EDITOR that allows you to enter your listing:
----------
You will write your program using the functions of the editor. The editor will save and load your listings.
The saved files are ASCII files (like the files in this digital book) and can be modified at will with the editor.
Such a file is called SOURCE CODE.
The functions of the editor vary with the utility (DEVPAC, METACOMCO, PROFIMAT...) and simplify the input of the text (listing).
.A file from an editor cannot be executed!
It will first have to be ASSEMBLED and then LINKED.
2) The ASSEMBLER .ASSEMBLY constitutes the second step:
------------
The ASCII codes (text) of the listing are translated by the ASSEMBLER and converted (encoded) into BINARY (binary is understood directly by the computer, unlike ASCII text).
Once processed (ASSEMBLED), the file (listing) is saved on the diskette as a BINARY file called OBJECT CODE.
3) The LINKER .LINKING is the last step:
---------
The OBJECT CODE is loaded and the LINKER combines it with any other OBJECT CODE (library routines, for example) that the program requires.
This is also a good place to introduce MACRO INSTRUCTIONS, although they are actually expanded one step earlier, by the ASSEMBLER.
Let me explain:
---------------
.In assembly language, you can create MACRO-INSTRUCTIONS
(see previous chapter).
.A MACRO is nothing more than a new (parameterizable) instruction
whose body will be inserted by the ASSEMBLER every time it encounters its name
in the SOURCE CODE.
An example:
.You create a MACRO that displays 'HELLO' on the screen:
This MACRO actually delimits a PROGRAM SEGMENT whose function
is to display 'HELLO' on the screen, and you name it 'WRITE':
          Beginning of the Macro
   WRITE  .
          .
          .  the listing that displays
          .  'HELLO' on the screen
          .
          End of the Macro
Each time the ASSEMBLER encounters 'WRITE' (the name of the MACRO)
in the SOURCE CODE, it will write out the 'WRITE' program segment in place of the name.
.A MACRO is therefore not a subroutine: it only makes the listing
more readable.
.A macro is written out in full each time it is used: so we lose
a little more memory, but we avoid the call and return of a subroutine, which makes it faster.
.Macros make the listing more readable because
you no longer have to type the ROUTINE corresponding to the name of the
Macro: the ASSEMBLER takes care of that.
.You can therefore create a LIBRARY of MACROS (i.e. files
defining MACROS) to call on when you need them in your listing...
The ASSEMBLER takes care of loading the MACROS used in the listing
(if the listing contains any...) and writing them out in full.
It is sufficient to indicate (at the beginning of the listing) the name of the file that
contains the MACROS used, or to define the MACROS used
at the beginning of the listing.
(WE WILL SEE THIS IN DETAIL...)
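To make the idea of expansion concrete, here is a small sketch in Python (not in assembly, which we have not started yet). It only models the "copy the body in place of the name" behaviour described above. The names MACROS and expand_macros are invented for this illustration and belong to no real tool.

```python
# Hypothetical model: a listing is a list of lines, and a macro is a name
# mapped to the lines of its body. Nothing here comes from a real assembler.
MACROS = {
    "WRITE": [
        "; ...first line of the routine that displays 'HELLO'...",
        "; ...second line of the routine...",
        "; ...last line of the routine...",
    ],
}

def expand_macros(listing, macros):
    """Return a new listing where each macro name is replaced by its body."""
    expanded = []
    for line in listing:
        name = line.strip()
        if name in macros:
            expanded.extend(macros[name])   # the whole body is copied again
        else:
            expanded.append(line)
    return expanded

source = ["WRITE", "; some other work", "WRITE"]
result = expand_macros(source, MACROS)
print(len(source), "lines before expansion,", len(result), "lines after")
```

The body appears twice in the result because the name appears twice in the source: this is the memory cost mentioned above, paid in exchange for not having to jump to a subroutine and come back.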
.As for us, we will create MACROS for the functions of GEM, XBIOS,
BIOS and VDI (MACROS that take care of calling the desired functions and passing the necessary parameters, for speed and convenience)
--------------
.The LINKER will also create the program HEADER, which contains
the information about the program that the operating system needs in order to load and execute
the LINKED program (it is from this header that the system sets up the program's BASE PAGE).
This is why LINKING is essential even if
the listing uses nothing from a LIBRARY.
(I will go into detail when I talk about GEMDOS)
.The resulting file is saved and can be
executed.
SUMMARY:
------
* EDITOR * * ASSEMBLER * * LINKER *
|listing| -------> |object code| -------> |executable prg|
*** SOME BASIC NOTIONS ***
-------------------------------------
Here are some VERY IMPORTANT notions:
1) BINARY:
-----------
. Usually, for calculations, we use the decimal system,
i.e. a system composed of 10 digits: 0,1,2,3,4,5,6,7,8,9
:This is BASE 10.
If we want to express a quantity that exceeds 9, we start the units again at 0
and add one to the position on the left, which gives a number with several digits: 10,11,12,13,14...
.The BINARY system (or BASE 2), consists of 2 digits: 0 and 1
Thus, if we want to express a quantity that exceeds 1
(i.e. >1 in the decimal system!!) we do the same: back to 0, and one more in the position on the left.
Therefore:
This is how we count in BINARY: 0,1,10,11,100,101,110,111,1000,1001,1010,1011...
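If you want to check this counting for yourself, here is a minimal sketch in Python. The function name to_binary is made up for the example; it builds the BINARY digits by dividing by 2 again and again, which is one common way of doing the conversion by hand.

```python
def to_binary(n):
    """Build the base-2 digits of a non-negative integer by repeated division."""
    if n == 0:
        return "0"
    digits = ""
    while n > 0:
        digits = str(n % 2) + digits   # each remainder is the next digit, right to left
        n //= 2
    return digits

# Count from 0 to 11 and show each value in BINARY.
counting = [to_binary(n) for n in range(12)]
print(",".join(counting))
```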
.We will assume that each component of the computer's memory is represented
(or coded) in BINARY.
.The 0 and 1 of the BINARY system present in the components of
the computer's structure are called BITS.
.A BIT can be zero (or off):0 OR activated (or on):1
--
A BIT is the smallest structure that the computer can recognize
and modify.
.We define a group of 8 BITS as an OCTET (The OCTET is also called BYTE
in English, do not confuse it with 'bit'!)
We define a group of 2 OCTETS as a WORD (so 16 BITS)
We define a group of 2 WORDS as a LONG WORD (so 4 OCTETS or 32 BITS)
These groupings are arbitrary and deal with consecutive BITS
(that follow each other) in memory.
CONSEQUENCES: The internal components of the computer can be expressed
------------ in BITS, OCTETS, WORDS or LONG WORDS.
SUMMARY:
- A BIT takes on the value of either 1 (activated) or 0 (turned off)
---
- An OCTET is a group of 8 consecutive BITS in memory
-----
- A WORD is a group of 2 consecutive OCTETS in memory
---
- A LONG WORD is a group of 2 WORDS consecutive in memory
--------
To simplify, we could say that the computer's memory is composed
of a multitude of small boxes (BITS) that can take either 1 or 0 as values depending on the actions of the Microprocessor and can be grouped (arbitrarily) in the form of OCTETS, WORDS or LONG WORDS.
EXAMPLE:
---           { 0 is a BIT
excerpts from { 01011101 is an OCTET or 8 BITS
memory        { 0101110111010110 is a WORD or 16 BITS
              { 01011101110101101000100100101101 is a LONG WORD or 32 BITS
2) HEXADECIMAL:
-------------
.HEXADECIMAL is a BASE 16 system, the 16 digits of this system
are: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
Thus, to express a quantity that exceeds F (i.e. 15 in
BASE 10), we carry one to the left: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,10,11,12,
13,14,15,16,17,18,19,1A,1B,1C,1D,1E,1F,20,21,22...
As you can see, the real usefulness of this system is that a large number is represented by few digits: each HEXA. digit stands for exactly 4 BITS, hence:
.An OCTET is represented in HEXA. by 2 digits
.A WORD is represented in HEXA. by 4 digits
.A LONG WORD is represented in HEXA. by 8 digits
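These digit counts can be checked with a short Python sketch. The helper to_hex is a name invented for the example; it relies only on the fact that 4 BITS fit in one HEXA. digit, and it reuses the three bit patterns from the memory excerpts of the BINARY section.

```python
def to_hex(value, bits):
    """Format an unsigned value in hexadecimal, one digit per group of 4 bits."""
    digits = bits // 4
    return format(value, "0{}X".format(digits))

octet = 0b01011101
word = 0b0101110111010110
long_word = 0b01011101110101101000100100101101

for name, value, bits in (("OCTET", octet, 8),
                          ("WORD", word, 16),
                          ("LONG WORD", long_word, 32)):
    print(name, "->", to_hex(value, bits))
```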
3) NOTATIONS:
----------
- In an ASM listing, numbers written in BINARY are
represented with the prefix ' % ', those written in
HEXADECIMAL with the prefix ' $ '.
(The use of BASE 10 does not need to be indicated)
Whatever notation is used in the listing, once assembled all numbers are stored in
BINARY.
EXAMPLE: %01001011 is a number written in BINARY (and an OCTET)
$1F0A is a number written in HEXA. (and a WORD)
101001 is a number written in DECIMAL !
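As an illustration only, the prefix convention can be sketched in Python like this. The function parse_number is hypothetical (it is not part of any assembler) and handles only the three notations mentioned above.

```python
def parse_number(text):
    """Convert a literal written as %binary, $hexadecimal or plain decimal."""
    text = text.strip()
    if text.startswith("%"):
        return int(text[1:], 2)     # ' % ' prefix: BASE 2
    if text.startswith("$"):
        return int(text[1:], 16)    # ' $ ' prefix: BASE 16
    return int(text, 10)            # no prefix: BASE 10

for literal in ("%01001011", "$1F0A", "101001"):
    print(literal, "=", parse_number(literal), "in decimal")
```

Note that 101001 without a prefix is simply one hundred and one thousand and one, even though it looks like a BINARY pattern.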
4) REMARKS:
----------
- The leftmost BIT of an OCTET, a WORD or a LONG WORD is called the
M.S.B (Most Significant Bit), or highest-weight bit.
- For a signed number, this bit gives the sign (positive/negative).
It is set for a negative number.
- In a signed number, it is only used for the sign!
EXAMPLE: $FFFFFFFF is a L-W that is -1 in decimal
----     %1111111111111111 is a WORD that is -1 in decimal
         %11111110 is an OCTET that is -2 in decimal and
         %00000010 is an OCTET that is 2: to invert the sign of an OCTET,
         we invert all of its BITS and add 1; here, this amounts to
         'extending' the 1 up to the MSB.
:Very theoretical examples, don't panic, this is not really useful
in practice (frankly useless, I assure you...) because there
are instructions that perform all sorts of operations on
BITS, OCTETS, WORDS and L-W... (and no one forces you to work in
the Binary system!)
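For the curious, the examples above can be verified with a few lines of Python. This is a sketch under one assumption: negative numbers use the usual "invert every bit and add 1" convention (two's complement), which is what the values -1 and -2 above correspond to. The names as_signed and negate are invented for the example.

```python
def as_signed(value, bits):
    """Read an unsigned bit pattern of the given width as a signed number."""
    if value & (1 << (bits - 1)):      # MSB set: the number is negative
        return value - (1 << bits)
    return value

def negate(value, bits):
    """Invert every bit, add 1, and keep only 'bits' bits."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask

print(as_signed(0xFFFFFFFF, 32))            # the L-W from the example
print(as_signed(0b1111111111111111, 16))    # the WORD from the example
print(as_signed(0b11111110, 8))             # the OCTET from the example
print(format(negate(0b00000010, 8), "08b")) # sign of 2 inverted
```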
- According to our definitions (the MSB is only used to sign
the octet, the word or the long word),
therefore: * -2^7 <= OCTET < 2^7
             -128 <= OCTET < 128
           * -2^15 <= WORD < 2^15
             -32768 <= WORD < 32768
           * -2^31 <= L-W < 2^31
             -2147483648 <= L-W < 2147483648
This, on the other hand, is something to remember: Overflows (OCTET too large...)
cause errors.
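Here is a small Python sketch of these limits and of what "too large" means. It is only a model: signed_range, fits and wrap are names made up for the illustration, and wrap simply shows what remains when a value is squeezed into too few BITS. How the real machine reacts to an overflow is not modelled here.

```python
def signed_range(bits):
    """Return (lowest, highest) signed value that fits in 'bits' bits."""
    half = 1 << (bits - 1)
    return -half, half - 1

def fits(value, bits):
    """True if the signed value can be stored without overflow."""
    low, high = signed_range(bits)
    return low <= value <= high

def wrap(value, bits):
    """Keep only the low 'bits' bits and read them back as a signed number."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value

for name, bits in (("OCTET", 8), ("WORD", 16), ("L-W", 32)):
    print(name, signed_range(bits))

print(fits(127, 8), fits(128, 8))   # 128 no longer fits in an OCTET
print(wrap(127 + 1, 8))             # what is left of 128 in 8 BITS
```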
5) RECAPITULATION:
---------------
.A component of the computer's structure: REGISTER,VARIABLE... (terms
to be explained soon) can take the form of a BIT, OCTET, WORD or LONG WORD.
In theory... We will study the exceptions: The choice is more limited in
reality.
CONSEQUENCES: .A VARIABLE defined as a WORD cannot therefore
------------- have a value greater than 32767. (for example)
NB: For a larger value, the variable would have to be a LONG WORD (if its
--- value can be contained in a L-W; otherwise that
would cause an error since the computer does not recognize
any component whose size exceeds the L-W.)
(in theory...)
REMARKS: .A Variable that can be contained in an OCTET,
---------- so whose value is <128 can also be contained in a
WORD or a L-W.
But it is necessary to pay attention to what the component
(here the variable) can be defined as (BIT, OCTET,
WORD, L-W...).
.Therefore, for each component of the computer's internal structure,
we will have to indicate the different forms it can take (BIT, OCTET, WORD, L-W...).
We will do this with precision.
This is why it is necessary to know the different forms that can
be used for programming (Registers, variables...)
and also by what these forms are defined (BIT, OCTET...).
This is why Assembler is very strict in its programming.
6) IMPORTANT:
----------
Sets such as OCTETS, WORDS, L-W are composed of BITS, these BITS
are numbered like this (in OCTETS, WORD, L-W, REGISTERS, VARIABLES...):
From RIGHT to LEFT.
31 30 . . . . . . 15 . . . 9 8 7 6 5 4 3 2 1 0
[*][ ][.][.][.][ ][ ][ ][ ][*][.][.][.][ ][ ][*][ ][ ][ ][ ][ ][ ][*]
- An OCTET is therefore numbered from right to left from 0 to 7
- A WORD, from 0 to 15
- A L-W, from 0 to 31
- Bit 0 is the lowest-weight BIT (or LSB: Least Significant Bit)
- Bit 7 (15 for a WORD, 31 for a L-W) is the highest-weight BIT
(or MSB: Most Significant Bit)
THUS:
------ <----------- MOST SIGNIFICANT BITS
31<-- ...[][][][][][][][][][][][][]... -->0
LEAST SIGNIFICANT BITS ---------->
THEREFORE: A WORD is composed of 2 OCTETS, one of lower WEIGHT (bits 0 to 7) and
----- one of higher WEIGHT: (bits 8 to 15)
A L-W is composed of 2 WORDS, one of lower WEIGHT (bits 0 to 15) and
one of higher WEIGHT (bits 16 to 31)
THIS NOTION IS FUNDAMENTAL
-----------------------------
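Since this notion is fundamental, here is a short Python sketch that puts the numbering to work. All the function names (bit, low_octet, high_octet, low_word, high_word) are invented for the illustration; they only use shifts and masks to pick out the BITS described above.

```python
def bit(value, n):
    """Return BIT number n of value (bit 0 is the rightmost one)."""
    return (value >> n) & 1

def low_octet(word):
    return word & 0xFF              # bits 0 to 7

def high_octet(word):
    return (word >> 8) & 0xFF       # bits 8 to 15

def low_word(long_word):
    return long_word & 0xFFFF       # bits 0 to 15

def high_word(long_word):
    return (long_word >> 16) & 0xFFFF   # bits 16 to 31

word = 0b0010110010110011
long_word = 0b11011111011101000101101101110100

print(bit(word, 0), bit(word, 15))
print(format(high_octet(word), "08b"), format(low_octet(word), "08b"))
print(format(high_word(long_word), "04X"), format(low_word(long_word), "04X"))
```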
*** THE MEMORY ***
------------------
- It is assumed that memory is a sequence of numbers coded in BINARY.
:We now know that these numbers can be grouped into
OCTETS, WORD, L-W.
- Programming in assembly language allows you to change the contents
of the memory.
Memory is changed when the computer performs a task (a calculation,
a search of any kind...).
In this case, the computer itself takes care of organizing its memory, and what it does follows from that organization; the user does not intervene. This happens at every moment: a computer is therefore never at rest (it checks whether the disk has been changed, redraws
the screen fifty times a second).
The computer's memory can also be changed by the user: this is the
purpose of programming. The result is a planned (programmed) state
of the memory, so that such and such an action is performed.
This is possible by using instructions specific to the language used or by modifying directly a portion of the memory. This last operation is carried out very easily in ASM and with a precision at the level of the BIT.
We can therefore Move an OCTET, a WORD, a L-W (which we have defined)
in the memory.(where it is possible...)
In order to navigate in the vast memory of our computer, we have divided the
memory and named each portion of the memory by an ADDRESS (like addresses on a street...)
Theoretically, it would be possible to place a BIT, OCTET, WORD or L-W at
any ADDRESS, but in reality this is not possible.
- The user only has access to a part of the memory.
(we will see which part)
- It is necessary to take into account whether an ADDRESS is even or odd:
one cannot put just anything at any address.
IN FACT: If we draw a SCHEMATIC of the structure of memory, it looks like
a band of limited length (start and end) and a WIDTH of 16 BITS, whose parts are numbered and addressed every 8 BITS (at each OCTET).
                   etc...
 bits:             15       7      0
 -----           .|        |        |
               x-2|--------|--------|
   |             x|10001011|--------|  OCTET: 10001011 at address x
   |           x+2|--------|10001101|  OCTET: 10001101 at x+3 octets
   |           x+4|--------|--------|
   |           x+6|00101100|10110011|  WORD: 0010110010110011
   |             .|--------|--------|        at x+6 octets
  \|/            .|--------|--------|
   '          x+12|11011111|01110100|  L-W: 110111110111010001011011-
 DIRECTION       .|01011011|01110100|       01110100
 of INCREASING   .|--------|--------|       at x+12 octets
 addresses       .|--------|--------|
 -----------  x+20|--------|-------1|  BIT: 1 at address x+21 octets
   |             .|--------|--------|
   |             .|HIGH OCT|LOW OCT |
  \|/            .|--------|--------|
   '               15       7      0
                   ETC...
+-------+
|MEMORY |: (Example of organization)
+-------+
- Take a good look at this example: it is very easy to understand, and this diagram
must be in your head every time you program in ASM.
- You can see that:
* Memory is addressable at the OCTET level *
----------------------------------------------
* :Between 2 neighbouring addresses X and X+1 or X-1, there are *
* 8 BITS, i.e. one OCTET. *
bits n° 7 6 5 4 3 2 1 0
X-1 ---------->
[ ][ ][ ][ ][ ][ ][ ][ ]
X ---------->
[ ][ ][ ][ ][ ][ ][ ][ ]
X+1 ---------->
[ ][ ][ ][ ][ ][ ][ ][ ]
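The memory diagram can be rebuilt as a small Python model, as a sketch under stated assumptions. Memory is a row of OCTETS, one per address, and the address x of the diagram is taken to be 0. A WORD is stored high octet first, as drawn in the diagram, and the model refuses a WORD at an odd address (this is how the remark on even and odd addresses is interpreted here). The names memory, write_octet, write_word and write_long are invented for the example.

```python
memory = bytearray(32)   # 32 OCTETS, addresses 0 to 31; "x" is taken as 0

def write_octet(address, value):
    """Store one OCTET at any address."""
    memory[address] = value & 0xFF

def write_word(address, value):
    """Store a WORD, high octet first, at an even address (assumed rule)."""
    if address % 2:
        raise ValueError("a WORD must start at an even address")
    memory[address] = (value >> 8) & 0xFF   # high octet at the even address
    memory[address + 1] = value & 0xFF      # low octet at the following odd one

def write_long(address, value):
    """Store a L-W as two WORDS, high WORD first."""
    write_word(address, (value >> 16) & 0xFFFF)
    write_word(address + 2, value & 0xFFFF)

# The same data as in the diagram
write_octet(0, 0b10001011)                           # OCTET at x
write_octet(3, 0b10001101)                           # OCTET at x+3
write_word(6, 0b0010110010110011)                    # WORD at x+6
write_long(12, 0b11011111011101000101101101110100)   # L-W at x+12

for address in range(0, 16, 2):
    print(address, format(memory[address], "08b"), format(memory[address + 1], "08b"))
```

Each printed line corresponds to one row of the diagram: the octet at the even address on the left, the octet at the odd address on the right.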
* Memory can be represented as a WELL into which we
DROP data: BITS, OCTET, WORDS, L-W:
The width of this well is a WORD (so 16 BITS)
The depth depends on the size of the memory.
The purpose of the game: Drop our data into the well without deforming
the data.
Let's say: You drop a WORD into an even address and it will effectively be composed of two OCTETS at even and odd addresses. So, putting information in memory has to be