CHAPTER 3 Macro - Assembler
Here we are yet again. Chapter 3 of the huge Devpac Docs typed by the Animal house and edited by Sewer Rat for Doc Disk Number 8.

CHAPTER 3 MACRO ASSEMBLER

INTRODUCTION

GenST is a powerful, fast, full specification assembler, available instantly from within the editor or as a stand alone program. It converts the text typed or loaded into the editor, optionally together with files read from disk, into a binary file suitable for immediate execution or linking, or into a memory image for immediate execution from the editor.

INVOKING THE ASSEMBLER FROM THE EDITOR

The assembler is invoked from the editor by clicking on Assemble from the Program menu, or by pressing Alt-A. A dialogue box will appear which looks like this (almost):

    -------------------------------------------------
                    Assembly Options

    Program type    Exec        GST           DRI
    Symbols case    Dependent   Independent
    Debug info      None        Normal        Extended
    List            None        Screen        Printer   Disk
    Output to       None        Memory max 10_k
                    Disk |__________________________

                Cancel          Assemble
    -------------------------------------------------

PROGRAM TYPE

This lets you select between executable, GST or DRI format output. The differences between these are detailed later.

SYMBOLS CASE

This lets you select whether labels are case dependent or not. If case Dependent is selected then Test and test would be different labels; if case Independent is selected then they would be the same.

DEBUG INFO

If you wish to debug your program using your original symbols you can select Normal or Extended debug modes. The advantage of extended debug is that up to 22 characters of each symbol are included in the debug information, whereas normal mode restricts symbols to eight characters.

LIST

Selecting Printer will divert the listing to the current printer port; selecting Disk will send the listing to a file based on the source filename, but with the extension .LST.

OUTPUT TO

This lets you select where the output file is to be created.

    None     the output is 'thrown away', ideal for syntax checking a program
    Memory   the output is assembled into a buffer, allowing it to be run or debugged instantly from the editor without having to create a disk file
    Disk     a file will be created. The selection of the name of this file can be left to the assembler, using rules described shortly.

The first two options may be specified in the source file using the OPT directive.

Having selected your required options you should click on the Assemble button (or press Return) to start the assembly. At the end of assembly you should press any key to return to the editor. If any errors occurred the cursor will be positioned on the first offending line.

STAND ALONE ASSEMBLER

If the .TTP version of the assembler is invoked without a command line, you will be asked for one, conforming to the rules below; pressing Return alone will abort. At the end of assembly there will be a pause, and pressing any key will exit the program. If a command line has been supplied the assembler will not wait for a key, as it assumes it has been run from a CLI or batch file.

COMMAND LINE FORMAT

The command line should be of the form

    mainfile <-options> [-options]

The mainfile should be the name of the file requiring assembly and, if no extension is specified, defaults to .S. Options should follow this, denoted by a - sign then an alphabetic character.
Allowed options are shown below together with equivalent OPT directives:

    B    no binary file should be created
    C    case insensitive labels (OPT C-)
    D    debug (OPT D+)
    L    GST linkable code (OPT L+)
    L2   DRI linkable code (OPT L2)
    O    specify output filename (follows immediately after O)
    P    specify listing filename (should follow immediately after P), defaults to source filename with extension .LST
    Q    pause for key press after assembly
    X    extended debugging (OPT X+)

The default is to create an executable binary file with the name based on the source file and output file type, no listing, with case sensitive labels.

For example,

    test -b

assembles test.s with no binary output file.

    test -om:test.prg -p

assembles test.s into a binary file m:test.prg and a listing file test.lst.

    test -L2dpprn:

assembles test.s into DRI linkable code with debug and a listing to the parallel port. (A listing to the serial port can be obtained by specifying AUX: as the listing name.)

OUTPUT FILENAME

GenST has certain rules regarding the calculation of the output filename, using a combination of that specified at assembly time (either in the Disk: filename field in the dialogue box or using the -O option on the command line) and the Output directive:

    If an output filename is explicitly given at assembly time then
        name = explicit filename
    else if the Output directive has not been used then
        name = source filename + .PRG, .BIN or .O
    else if the Output directive specifies an extension then
        name = source filename + extension in Output
    else
        name = name in Output

ASSEMBLY PROCESS

GenST is a two-pass assembler. During the first pass it scans all the text in memory, and from disk if required, building up a symbol table. If syntax errors are found on the first pass these will be reported and assembly will stop at the end of the first pass. Otherwise, during the second pass the instructions are converted into bytes, a listing may be produced if required and a binary file can be created on the disk.
During the second pass any further errors and warnings will be shown, together with a full listing and symbol table if required.

During assembly, screen output can be paused by pressing Ctrl-S; pressing Ctrl-Q will resume it. Assembly may be aborted by pressing Ctrl-C, although doing so will make any binary file being created invalid, as it will be incomplete and should not be executed.

ASSEMBLY TO MEMORY

To reduce development time GenST can assemble programs to memory, allowing immediate execution or debugging from the editor. To do this a program buffer is used, the size of which is specified in the Assembly Options dialogue box. If no debug option is specified the size given can be just a little larger than the output program, but if either form of debug is required a much larger buffer may be needed.

A program running from memory is just like any other normal GEMDOS program and should terminate using either the pterm or pterm0 GEMDOS calls, for example

    clr.w   -(a7)
    trap    #1

Programs may self-modify if required, as a re-executed program will be in its original state.

The program buffer size and current assembly options can be made the default on re-loading the editor if Save Preferences is used.

BINARY FILE TYPES

There are six types of binary files which may be produced by GenST, for different types of applications. They are distinguished by the extension on the filename:

    .PRG   GEM-type application, i.e. one that uses windows
    .TOS   TOS-type application, i.e. one that doesn't use windows
    .TTP   TOS-type application that requires a command line
    .ACC   desk accessory program file
    .BIN   non-executable file suitable for linking with GST-format files and libraries
    .O     non-executable file suitable for linking with DRI-format files and libraries

It can also assemble executable code directly to memory when using the integrated version, allowing very fast edit-assemble-debug-run times.

The first three are double-clickable, can be run from the Desktop and are known as executable.
They differ in the initialisation performed before execution. With .PRG files the screen is cleared to the Desktop's pattern, while with the other two the screen clears to white, the flashing cursor appears and the mouse is disabled. When you double-click a .TTP file the Desktop will prompt you for a command line to pass to it.

.ACC files are executable but cannot be double-clicked on from the Desktop. They will only run successfully when executed by the AES during the boot sequence of the machine.

.BIN and .O files cannot be run immediately, but have to be read into a linker, usually with other sections, and are known as linkable object modules. There are two different linker formats on the ST: .BIN files are GST format, .O files are DRI format. The difference between these is discussed later in this chapter.

The above extensions are not absolute rules; for example, if you have a TOS type program you may give it a .PRG extension and use the Install Application function from the Desktop, but it's usually much easier to use the normal extensions. One exception is for programs that are to be placed in an AUTO folder so they execute during the boot sequence. They have to be TOS type programs, but need the extension .PRG for the boot sequence to find them.

** Certain versions of the French ST ROMs do not recognise .TTP files from the Desktop, so they have to be renamed .TOS then installed as TOS Takes Parameters.

TYPES OF CODE

Unlike most 8-bit operating systems, but like most 16-bit systems, an executable program under GEMDOS will not be loaded at a particular address but, instead, at an address depending on the exact free memory configuration at that time.

To get around the problem of absolute addressing the ST file format includes relocation information, allowing GEMDOS to relocate the program after it has loaded it but before running it. For example the following program segment

            move.l  #string,a0
            .
            .
            .
    string  dc.b    'Press any key',0

places the absolute address of string into a register, even though at assembly time the real address of string cannot possibly be known. Generally the programmer may treat addresses as absolute even though the real addresses will not be known to him, while the assembler (or linker) will look after the necessary relocation information.

** For certain programs, normally games or cross machine development, an absolute address may be required; for this reason the ORG directive is supported.

The syntax of the assembler will now be described.

ASSEMBLER STATEMENT FORMAT

Each line that is to be processed by the assembler should have the following format:

    Label   Mnemonic  Operand(s)  Comment
    Start   move.l    d0,(a0)+    store the result

Exceptions to this are comment lines, which are lines starting with an asterisk or semi-colon, and blank lines, which are ignored. Each field has to be separated from the others by white space - any number or mixture of space and tab characters.

LABEL FIELD

The label should normally start at column 1, but if a label is required to start at another position then it should be followed immediately by a colon (:). Labels are allowed on all instructions, but are prohibited on some assembler directives, and absolutely required on others. A label may start with the characters A-Z, a-z, or underline (_), and may continue with a similar set together with the addition of the digits 0-9 and the period (.).

Labels starting with a period are local labels, described later. Macro names and register equate symbols may not have periods in them, though macro names may start with a period. By default the first 127 characters of labels are significant, though this can be reduced if required. Labels should not be the same as register names, or the reserved words SR, CCR or USP.

By default labels are case-sensitive though this may be changed.
Some example legal labels are:

    test, Test, TEST, _test, _test.end, test5, _test5

Some illegal labels are:

    5test, _&e, test>

There are certain reserved symbols in GenST, denoted by starting with two underline characters. These are __LK, __RS and __G2.

MNEMONIC FIELD

The mnemonic field comes after the label field and can consist of 68000 assembler instructions, assembler directives or macro calls. Some instructions and directives allow a size specifier, separated from the mnemonic by a period. Allowed sizes are .B for byte, .W for word, .L for long and .S for short. Which size specifiers are allowed in each case depends on the particular instruction or directive.

GenST is case-insensitive to mnemonic and directive names, so Move is the same as move and the same as mOvE, for example.

OPERAND FIELD

For those instructions or directives which require operands, this field contains one or more parameters, separated by commas. GenST is case-insensitive regarding register names, so they may be in either or mixed case.

COMMENT FIELD

Any white space not within quotes found after the expected operand(s) is treated as a delimiter before the start of the comment, which will be ignored by the assembler.

EXAMPLES OF VALID LINES

            move.l  d0,(a0)+    comment is here
    loop    TST.W   d0
    lonely.label
            rts
    * this is a complete line of comment
    ; and so is this
      indented: link A6,#-10    make room
    a_string: dc.b 'spaces allowed in quotes'    a string

EXPRESSIONS

GenST allows complex expressions and supports full operator precedence, parentheses and logical operators.

Expressions are of two types, absolute and relative, and the distinction is important. Absolute expressions are constant values which are known at assembly time. Relative expressions are program addresses which are not known at assembly time, as the GEMDOS loader can put the program where it likes in memory.
Some instructions and directives place restrictions on which types are allowed and some operators cannot be used with certain type combinations.

OPERATORS

The operators available, in decreasing order of precedence, are:

    monadic minus (-) and plus (+)
    bitwise not (~)
    shift left (<<) and shift right (>>)
    bitwise And (&), Or (!), and Xor (^)
    multiply (*) and divide (/)
    addition (+) and subtraction (-)
    equality (=), less than (<), greater than (>)

The comparison operators are signed and return 0 if false or -1 ($FFFFFFFF) if true. The shift operators take the left hand operand and shift it by the number of bits given in the right hand operand; vacated bits are filled with zeros.

This precedence can be overridden by the use of parentheses ( and ). With operators of equal precedence, expressions are evaluated from left to right. Spaces in expressions (other than those within quotes as ASCII constants) are not allowed, as they are taken as the separator before the comment.

All expression evaluation is done using 32-bit signed-integer arithmetic, with no checking of overflow.

NUMBERS

Absolute numbers may be in various forms:

    decimal constants, e.g. 1029
    hexadecimal constants, e.g. $12f
    octal constants, e.g. @730
    binary constants, e.g. %1100010
    character constants, e.g. 'X'

$ is used to denote hex numbers, % for binary, @ for octal and single ' or double " quotes for character constants.

CHARACTER CONSTANTS

Whichever quote is used to mark the start of a string must also be used to denote its end, and quotes themselves may be used in strings delimited with the same quote character by having it occur twice. Character constants can be up to 4 characters in length and evaluate to right-justified longs with null-padding if required.
For example, here are some character constants and their ASCII and hex values:

    "Q"       Q       $00000051
    "hi"      hi      $00006869
    "Test"    Test    $54657374
    "it's"    it's    $69742773
    'it''s'   it's    $69742773

Strings used in DC.B statements follow slightly different justification rules, detailed with the directive later.

Symbols used in expressions will be either relative or absolute, depending on how they were defined. Labels within the source will be relative, while those defined using the EQU directive will be the same type as the expression to which they are equated.

The use of an asterisk (*) denotes the value of the program counter at the start of the instruction or directive and is always a relative quantity.

ALLOWED TYPE COMBINATIONS

The table below summarises, for each operator, the results of the various type combinations of parameters and which combinations are not allowed. An R denotes a relative result, an A denotes absolute and a * denotes that the combination is not allowed and will produce an error message if attempted.

                         A op A   A op R   R op A   R op R
    Shift operators         A        *        *        *
    Bitwise operators       A        *        *        *
    Multiply                A        *        *        *
    Divide                  A        *        *        *
    Add                     A        R        R        *
    Subtract                A        *        R        A
    Comparisons             A        *        *        A

ADDRESSING MODES

The available addressing modes are shown in the table below. Please note that GenST is case-insensitive when scanning addressing modes, so D0 and a3 are both valid registers.

    FORM        MEANING                                         EXAMPLE
    Dn          data register direct                            D3
    An          address register direct                         A5
    (An)        address register indirect                       (A1)
    (An)+       address register indirect with post-increment   (A5)+
    -(An)       address register indirect with pre-decrement    -(A0)
    d(An)       address register indirect with displacement     20(A7)
    d(An,Rn.s)  address register indirect with index            4(A6,D4.L)
    d.W         absolute short address                          $0410.W
    d.L         absolute long address                           $12000.L
    d(PC)       program counter relative with offset            NEXT(PC)
    d(PC,Rn.s)  program counter relative with index             NEXT(PC,A2.W)
    #d          immediate data                                  #26

    n denotes register number from 0 to 7
    d denotes a number
    R denotes index register, either A or D
    s denotes size, either W or L; when omitted defaults to W

When using address register indirect with index the displacement may be omitted, for example

    move.l  (a3,d2.l),d0

will assemble to the same as

    move.l  0(a3,d2.l),d0

SPECIAL ADDRESSING MODES

    CCR    condition code register
    SR     status register
    USP    user stack pointer

In addition to the above, SP can be used in place of A7 in any addressing mode, e.g. 4(SP,D3.W).

The data and address registers can also be denoted by use of the reserved symbols R0 through R15. R0 to R7 are equivalent to D0 to D7, R8 to R15 are equivalent to A0 to A7. This is included for compatibility with other assemblers.

LOCAL LABELS

GenST supports local labels, that is labels which are local to a particular area of the source code. These are denoted by starting with a period and are attached to the last non-local label, for example:

    len1    move.l  4(sp),a0
    .loop   tst.b   (a0)+
            bne.s   .loop
            rts

    len2    move.l  4(sp),a0
    .loop   tst.b   -(a0)
            bne.s   .loop
            rts

There are two labels called .loop in this code segment, but the first is attached to len1, the second to len2.

The local labels .W and .L are not allowed, to avoid confusion with the absolute addressing syntax.

SYMBOLS AND PERIODS

Symbols which include the period character can cause problems with GenST due to absolute short addressing. The Motorola standard way of denoting absolute short addresses causes problems as periods are considered to be part of a label, best illustrated by an example:

    move.l  vector.w,d0

where vector is an absolute value, such as a system variable.
This would generate an undefined label error, as the label would be scanned as vector.w. To get around this, the expression, in this case a symbol, may be enclosed in brackets, e.g.

    move.l  (vector).w,d0

though the period may still be used after numeric expressions, e.g.

    move.l  $402.w,d0

** GenST version 1 also supported the use of \ instead of a period to denote short word addressing and this is still supported in this version, but it is not recommended due to the potential for \W and \L to be mistaken for macro parameters.

INSTRUCTION SET

WORD ALIGNMENT

All instructions with the exception of DC.B and DS.B are always assembled on a word boundary. Should you require a DC.B explicitly on a word boundary, the EVEN directive should be used before it. Although all instructions that require it are word-aligned, labels with nothing following them are not word-aligned and can have odd values. This is best illustrated by an example:

            nop             this is always word aligned
            dc.b    'odd'
    start
            tst.l   (a0)+
            bne.s   start

The above code would not produce the required result, as start would have an odd value. To help in finding such instructions the assembler will produce an error if it finds an odd destination in a BSR or BRA operand. Note that such checks are not made on any other instructions, so it is recommended that you precede such labels with EVEN directives if you require them to be word-aligned. A common error is deliberately not to do this, as you know the preceding string is an even number of bytes long. All will be well until the day you change the string...

INSTRUCTION SET EXTENSIONS

The complete 68000 instruction set is supported and certain shorthands are automatically accepted, detailed below. A complete description of the instruction set including syntax and addressing modes can be found in any 68000 reference guide or in the supplied pocket guide.

CONDITION CODE

The alternate condition codes HS and LO are supported in Bcc, DBcc and Scc instructions, equivalent to CC and CS, respectively.

BRANCH INSTRUCTIONS

To force a short branch use Bcc.B or Bcc.S; to force a word branch use Bcc.W; or, to leave the choice to the optimiser, use Bcc. Bcc.L is supported for compatibility with GenST 1, with a warning, as it is, strictly speaking, a 68020 instruction.

A BRA.S to the immediately following instruction is not allowed and is converted, with a warning, to a NOP. A BSR.S to the immediately following instruction is not allowed and will produce an error.

BTST INSTRUCTION

BTST is unique among bit-test instructions in supporting PC-relative addressing modes.

CLR INSTRUCTION

CLR An is not allowed; use SUB.L An,An instead (though note that the flags are not affected).

CMP INSTRUCTION

If the source is immediate then CMPI is used, else if the destination is an address register then CMPA is used, else if both addressing modes are post-incremented then CMPM is used.

DBcc INSTRUCTION

DBRA is accepted as an equivalent to DBF.

ILLEGAL INSTRUCTION

This generates the op-code word $4AFC.

LINK INSTRUCTION

If the displacement is positive or not even, a warning is given.

MOVE FROM CCR INSTRUCTION

This is a 68010 and upwards instruction, converted with a warning to MOVE from SR.

MOVEQ INSTRUCTION

If the data is in the range 128-255 inclusive a warning will be given. It may be disabled by specifying a long size on the instruction.

ASSEMBLER DIRECTIVES

Certain pseudo-mnemonics are recognised by GenST. These assembler directives, as they are called, are not (normally) decoded into opcodes, but instead direct the assembler to take certain actions at assembly time. These actions have the effect of changing the object code produced, or the format of the listing. Directives are scanned exactly like executable instructions and some may be preceded by a label (for some it is obligatory) and may be followed by a comment.
If you put a label on a directive for which it is not relevant, the result will be undefined but will usually result in the label being ignored.

Each directive will now be described in turn. Please note that the case of a directive name is not important, though they are generally shown in upper case. The use of angled brackets ( < > ) in descriptions denotes optional items; ellipses ( ... ) denote repeated items.

ASSEMBLY CONTROL

END

This directive signals that no more text is to be examined on the current pass of the assembler. It is not obligatory.

INCLUDE filename

This directive will cause source code to be taken from a file on disk and assembled exactly as though it were present in the text. The directive must be followed by a filename in normal GEMDOS format. If the filename has a space in it the name should be enclosed in single or double quotes. A drive specifier, directory and extension may be included as required, e.g.

    include b:constants/header.s

Include directives may be nested as deeply as memory allows, and if any error occurs when trying to open the file or read it, assembly will be aborted with a fatal error.

If no drive or path name is specified, that of the main source file will be used when trying to open the file.

** The more memory the better: GenST will read the whole of the file in one go if it can and not bother to re-read the file during pass 2.

INCBIN filename

This takes a given binary file and includes it, verbatim, into the output file. Suggested uses include screen data, sprite data and ASCII files.

OPT option <,option>...

This allows control over various options within GenST. Each one is denoted by a single character, normally followed by a + or - sign. Multiple options may be specified, separated by commas. The allowed options are:

OPTION C - CASE SENSITIVITY AND SIGNIFICANCE

By default, GenST is sensitive to label case and labels are significant to 127 characters.
This can be overridden, using C- for case-insensitivity or C+ for case-sensitivity. The significance may be specified by giving a decimal number between the C and the sign, for example C16+ denotes case-sensitive labels with 16 character significance. This option may be used at any time in a program but normally only makes sense at the very beginning of a source file.

OPTION D - DEBUGGING INFORMATION

The GEMDOS binary file format supports the inclusion of a symbol table at the end, which may be read by debuggers such as MonST and can be extremely useful when debugging programs. By default this is switched off but it may be activated with D+ and de-activated with D-. Only the first 8 characters of all relative labels are written to the file, and they will be upper-cased if GenST is in case-insensitive mode. The 8 character limit is due to the DRI standard file format and may be improved on by using the extended debug option, described below.

OPTION L - LINKER MODE

The default for GenST is to produce executable code, but L+ will make it produce GST linkable code, L2 will make it produce DRI linkable code, and L- will make it revert to executable. This directive must be the very first line, in the first text file.

OPTION M - MACRO EXPANSIONS

When an assembly listing is produced, macro calls are shown in the same form as in the source. If you wish the instructions within macros to be listed, use M+, while M- will disable the option. You can use this directive as often as required.

OPTION O - OPTIMISING

GenST is capable of optimising certain statements to faster and smaller versions. By default all optimising is off but each type can be enabled and disabled as required. This option has several forms:

OPT O1+ will optimise backward branches to short if within range, and can be disabled with O1-.

OPT O2+ will optimise address register indirect with displacement addressing modes to address register indirect, if the displacement evaluates to zero, and can be disabled with O2-.
For example,

    move.l  next(a0),d3

will be optimised to

    move.l  (a0),d3

if the value of next is zero.

OPT O+ will turn all optimising on.

OPT O- will turn all optimising off.

OPT O1-, OPT O2- will disable the relevant optimisation.

OPT OW- will disable the warning messages generated by each optimisation; OW+ will enable them.

If any optimising has been done during an assembly, the number of optimisations made and bytes saved will be shown at the end of assembly.

OPTION P - POSITION INDEPENDENT CHECKS

With this enabled with P+, GenST will check that all code generated is position-independent, generating errors on any lines which require relocation. It can be disabled with P- and defaults to off.

OPTION S - SYMBOL TABLE

When a listing is turned on, a symbol table will be produced at the end. If you wish to change this, S- will disable it, while S+ will re-enable it. If you use this directive more than once the last one will be taken into account.

OPTION T - TYPE CHECKING

GenST can often spot programming errors as it checks the types of certain expressions. For some applications or styles of programming this can be more of a hindrance than a help, so T- will turn the checks off and T+ will turn them back on. For example the program segment

    main    bsr     initialise
            lea     main(a6),a0
            move.l  (main).w,a0

will normally produce an error, as main is a relative expression whereas the assembler expects an absolute expression in both cases. However, if this code is designed to run on another 68000 machine this may be perfectly valid, so the type checking should be disabled.

OPTION W - WARNINGS

If you wish to disable the warnings that GenST can produce, you can do so with W-. To re-enable them use W+. This directive can be used as often as required.

OPTION X - EXTENDED DEBUG

This is a special version of option D which uses the HiSoft Extended Debug format to generate debugging information with symbols of up to 22 character significance.
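
As a rough sketch of how several of the options described above might be combined at the head of a source file, consider the small program below. The label names, the message text and the particular choice of options are assumptions made for illustration; the GEMDOS function numbers used (9 to print a string, 0 to terminate) are standard. Because P+ is on, the string is reached with a PC-relative mode rather than an absolute address, which would need relocation.

```asm
* hypothetical example: labels and option choice are illustrative only
        opt     d+,o+,ow-,p+    debug symbols, optimise without warnings, PIC checks

start   pea     message(pc)     push address of string, PC-relative
        move.w  #9,-(sp)        GEMDOS function 9 prints a null-terminated string
        trap    #1
        addq.l  #6,sp           remove the six bytes pushed above
        clr.w   -(sp)           GEMDOS function 0 terminates the program
        trap    #1

message dc.b    'Hello',13,10,0
```

Replacing the first instruction with move.l #message,-(sp) would use an absolute address and so, under P+, should be reported as an error.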

OPTION SUMMARY

The defaults are shown in brackets after each option's description:

    C    case-sensitivity and significance (C127+)
    D    include debugging info (D-)
    L-   produce executable code (default)
    L+   produce GST linkable code
    L2   produce DRI linkable code
    M    expand macros in listing (M-)
    O    optimising control (O-)
    P    position independent code checks (P-)
    S    symbol table listing
    T    type checking (T+)
    W    warning control (W+)
    X    extended debug (X-)

For example,

    opt m+,s+,w-

will turn macro expansion on, enable the symbol table list and turn warnings off.

<label> EVEN

This directive will force the program counter to be even, i.e. word-aligned. As GenST automatically word-aligns all instructions (except DC.Bs and DS.Bs), it should not be required very often, but it can be useful for ensuring buffers and strings are word-aligned when required.

CNOP offset,alignment

This directive will align the program counter using the given offset and alignment. An alignment of 2 means word-aligned, an alignment of 4 means long-word-aligned and so on. The alignment is relative to the start of the current section. For example,

    cnop 1,4

aligns the program counter a byte past the next long-word boundary.

<label> DC.B expression<,expression> ...
<label> DC.W expression<,expression> ...
<label> DC.L expression<,expression> ...

These directives define constants in memory. They may have one or more operands, separated by commas. The constants will be aligned on word boundaries for DC.W and DC.L. No more than 128 bytes can be generated by a single DC directive.

DC.B treats strings slightly differently to those in normal expressions. While the rules described previously about quotation marks still apply, no padding of the bytes will occur, and the length of any string can be up to 128 bytes.

Be very careful of spaces in any DC directives, as a space is the delimiter before a comment. For example, the line

    dc.b 1,2,3 ,4

will only generate 3 bytes - the ,4 will be taken as a comment.
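
The rules above for DC strings, quotes and stray spaces can be sketched in a few lines. The values are arbitrary and chosen only for illustration; the byte counts in the comments follow from the rules as stated in this chapter.

```asm
* illustrative data only; the values carry no meaning
        dc.b    1,2,3,4         four bytes
        dc.b    1,2,3 ,4        three bytes, the ,4 is read as a comment
        dc.b    'it''s',0       five bytes: i t ' s and a null, no padding
        dc.b    "it's",0        the same five bytes, using the other quote
        even                    17 bytes so far, so step to an even address
        dc.w    $1234,$5678     two words
```

The EVEN here is shown for clarity; DC.W is aligned on a word boundary in any case.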

<label> DS.B expression
<label> DS.W expression
<label> DS.L expression

These directives will reserve memory locations and the contents will be initialised to zeros. If there is a label then it will be set to the start of the area defined, which will be on a word boundary for DS.W and DS.L directives. There is no restriction on the size, though the larger the area the longer it will take to save to disk.

For example, all of these lines will reserve 8 bytes of space, in different ways:

    ds.b 8
    ds.w 4
    ds.l 2

<label> DCB.B number,value
<label> DCB.W number,value
<label> DCB.L number,value

This directive allows constant blocks of data to be generated of the size specified; number specifies how many times the value should be repeated.

FAIL

This directive will produce the error user error. It can be used for such things as warning the programmer if an incorrect number of parameters has been passed to a macro.

OUTPUT filename

This directive sets the normal output filename, though it can be overridden by specifying a filename at the start of assembly. If filename starts with a period then it is used as an extension and the output name is built up as described previously.

__G2 (reserved symbol)

This is a reserved symbol that can be used, with the IFD conditional, to detect whether GenST is being used to assemble a file. The value of this symbol depends on the version of the assembler and is always absolute.

REPEAT LOOPS

It is often useful to be able to repeat one or more instructions a particular number of times, and the repeat loop construct allows this.

<label> REPT expression
        ENDR

Lines to be repeated should be enclosed within REPT and ENDR directives and will be repeated the number of times specified in the expression. If the expression is zero or negative then no code will be generated. It is not possible to nest repeat loops.
For example,

    REPT    512/4           copy a sector quickly
    move.l  (a0)+,(a1)+
    ENDR

** Program labels should not be defined within repeat loops, to prevent label defined twice errors.

LISTING CONTROL

LIST

This will turn the assembly listing on during pass 2, to whatever device was selected at the start of the assembly (or to the screen if None was initially chosen). All subsequent lines will be listed until an END directive is reached, the end of the text is reached, or a NOLIST directive is encountered.

Greater control over listing sections of program can be achieved using LIST+ or LIST- directives. A counter is maintained, the state of which dictates whether listing is on or off. A LIST+ directive adds one to the counter and a LIST- subtracts one. If the counter is zero or positive then listing is on; if it is negative then listing is off. The default starting value is -1 (i.e. listing off) unless a listing was specified when the assembler was invoked, when it is set to zero. This system allows a considerable degree of control over listing, particularly for include files. The normal LIST directive sets the counter to 0, NOLIST sets it to -1.

NOLIST

This will turn off any listing during pass 2.

When a listing is requested onto a printer or to disk, the output is formatted into pages, with a header at the top of every page. The header itself consists of a line containing the program title, date, time and page number, then a line showing the program title, then a line showing the sub-title, then a blank line. The date will be printed in the form DD/MM/YY, unless the assembler is running on a US Atari ST, when it is MM/DD/YY. Between pages a form-feed character (ASCII FF, value 12) is issued.

PLEN expression

This will set the page length of the assembly listing and defaults to 60. The expression must be between 12 and 255.

LLEN expression

This will set the line width of the assembly listing and defaults to 132. The value of the expression must be between 38 and 255.
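
The LIST+ and LIST- counter described above is perhaps easiest to see around an include file. The sketch below is only an illustration: constants.s is a hypothetical file name and the page sizes are arbitrary values within the permitted ranges.

```asm
* hypothetical fragment: constants.s is an assumed file name
        plen    66              page length, allowed range 12 to 255
        llen    80              line width, allowed range 38 to 255

        list-                   counter goes down by one
        include constants.s
        list+                   counter goes back up by one
```

If listing was on (counter 0), the counter is -1 while the include file is read, so the file is not listed, and listing resumes afterwards. If listing was off (counter -1), it goes to -2 and back to -1, so listing stays off throughout. A plain LIST or NOLIST would instead force the counter to 0 or -1 regardless of its earlier state.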
TTL string

This will set the title printed at the top of each page to the given string, which may be enclosed in single quotes. The first TTL directive will set the title of the first printed page. If no title is specified the current include file name will be used.

SUBTTL string

Sets the sub-title printed at the top of each page to the given string, which may be enclosed in single quotes. The first such directive will set the sub-title of the first printed page.

SPC expression

This will output the number of blank lines given in the expression to the assembly listing, if active.

PAGE

Causes a new page in the listing to be started.

LISTCHAR expression<,expression> ...

This will send the characters specified to the listing device (except the screen) and is intended for doing things such as setting condensed mode on printers. For example, on Epsons and compatibles the line

        listchar 15

will set the printer to 132-column mode.

FORMAT parameter<,parameter> ...

This allows exact control over the listed format of a line of source code. Each parameter controls a field in the listing and must consist of a digit from 0 to 2 inclusive followed by a + (to enable the field) or a - (to disable it):

        0       line number, in decimal
        1       section name/number and program counter
        2       hex data words, up to 10 words unless the printer is less than 80 characters wide, when up to three words are listed

LABEL DIRECTIVES

label EQU expression

This directive will set the value and type of the given label to the result of the expression. The expression may not include forward references or external labels. If there is any error in the expression, the assignment will not be made. The label is compulsory and must not be a local label.

label = expression

Alternate form of the EQU statement.

label EQUR register

This directive allows a data or address register to be referred to by a user name, supplied as the label to this directive. This is known as a register equate. A register equate must be defined before it is used.
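A minimal sketch of EQU, its = form and EQUR working together is given below. All of the names (bufsize, nlongs, src, dst, count, loop) are invented for this illustration, and the loop simply copies bufsize bytes a long word at a time.

```asm
bufsize equ     512             absolute constant
nlongs  =       bufsize/4       alternate form of EQU, no forward reference
src     equr    a0              register equates, defined before use
dst     equr    a1
count   equr    d7

        move.w  #nlongs-1,count
loop    move.l  (src)+,(dst)+
        dbra    count,loop
```

Using register equates this way means that a later change of register allocation only touches the three EQUR lines and not the body of the loop.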
label SET expression

This is similar to EQU, but the assignment is only temporary and can be changed with a subsequent SET directive. Forward references cannot be used in the expression. It is especially useful for counters within macros, for example, using a line

zcount  set     zcount+1

(assuming zcount is set to 0 at the start of the source). At the start of pass 2 all SET labels are made undefined, so their values will always be the same on both passes.

label REG register-list

This allows a symbol to be used to denote a register list within MOVEM instructions, reducing the likelihood of having the list at the start of a routine different from the list at the end of the routine. A label defined with REG can only be used in MOVEM instructions.

<label> RS.B expression
<label> RS.W expression
<label> RS.L expression

These directives let you set up lists of constant labels, which is very useful for data structures and global variables and is best illustrated by a couple of examples.

Let's assume you have a data structure which consists of a long word, a byte and another long word, in that order. To make your code more readable and easier to update should the structure change, you could use lines such as

        rsreset
d_next  rs.l    1
d_flag  rs.b    1
d_where rs.l    1

and then you could access them with lines like

        move.l  d_next(a0),a1
        move.l  d_where(a0),a2
        tst.b   d_flag(a0)

As another example, let's assume you are referencing all your variables off register A6 (as done in GenST and MonST). You could define them with lines such as

onstate rs.b    1
start   rs.l    1
end     rs.l    1

and you could then reference them with lines such as

        move.b  onstate(a6),d1
        move.l  start(a6),d0
        cmp.l   end(a6),d0

Each such directive uses its own internal counter, which is reset to 0 at the beginning of each pass. Every time the assembler comes across the directive it sets the label according to the current value (with word alignment if it is .W or .L) then increments it according to the size and magnitude of the directive.
If the above definitions were the first RS directives, onstate would be 0, start would be 2 and end would be 6.

RSRESET

This directive will reset the internal counter used by the RS directive.

RSSET expression

This allows the RS counter to be set to a particular value.

__RS (reserved symbol)

This is a reserved symbol having the current value of the RS counter.

CONDITIONAL ASSEMBLY

Conditional assembly allows the programmer to write a comprehensive source program that can cover many conditions. Assembly conditions may be specified through the use of arguments, in the case of macros, and through the definition of symbols in EQU or SET directives. Variations in these can then cause assembly of only those parts necessary for the specified conditions.

There is a wide range of directives concerned with conditional assembly. Each conditional block must start with one of the IF directives and end with an ENDC directive. Conditional blocks may be nested up to 65535 levels.

Labels should not be placed on IF or ENDC directives as the directives will be ignored by the assembler.

IFEQ expression
IFNE expression
IFGT expression
IFGE expression
IFLT expression
IFLE expression

These directives will evaluate the expression, compare it with zero and then turn conditional assembly on or off depending on the result. The conditions correspond exactly to the 68000 condition codes. For example, if the label DEBUG had the value 1, then with the following code,

        IFEQ    DEBUG
logon   dc.b    'enter a command:',0
        ENDC
        IFNE    DEBUG
        opt     d+              labels please
logon   dc.b    'Yeah, gimme man:',0
        ENDC

the first conditional would turn assembly off as 1 is not EQ to 0, while the second conditional would turn it on as 1 is NE to 0.

** IFNE corresponds to IF in assemblers with only one conditional directive.

The expressions used in these conditional statements must evaluate correctly.
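The comparison against zero means that a range check is written by subtracting the limit inside the expression. The sketch below illustrates this with IFGT and IFGE; the symbols maxlen, buflen and bigbuf and their values are invented for the example, and FAIL is used in the same way as in the INC macro example later in this chapter.

```asm
maxlen  equ     80              hypothetical limit
buflen  equ     64              hypothetical buffer size

* buflen-maxlen is -16, which is not GT 0, so FAIL is skipped
        IFGT    buflen-maxlen
        fail    buffer too long
        ENDC

* buflen-32 is 32, which is GE 0, so the DS line is assembled
        IFGE    buflen-32
bigbuf  ds.b    buflen
        ENDC
```

Both symbols are defined with EQU before the conditionals, so the expressions have the same value on both passes.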
IFD label
IFND label

These directives allow conditional control depending on whether a label is defined or not. With IFD, assembly is switched on if the label is defined, whereas with IFND assembly is switched on if the label is not defined. These directives should be used with care, otherwise different object code could be generated on pass 1 and pass 2, which will produce incorrect code and generate phasing errors. Both directives also work on reserved symbols.

IFC 'string1','string2'

This directive will compare two strings, each of which must be surrounded by single quotes. If they are identical assembly is switched on, else it is switched off. The comparison is case sensitive.

IFNC 'string1','string2'

This directive is similar to the above, but only switches assembly on if the strings are not identical. This may at first appear somewhat useless, but when one or both of the parameters are macro parameters it can be very useful, as shown in the next section.

ELSEIF

This directive toggles conditional assembly from on to off, or vice versa.

ENDC

This directive will terminate the current level of conditional assembly. If there are more IFs than ENDCs an error will be reported at the end of assembly.

IIF expression instruction

This is a short form of the IFNE directive allowing a single instruction or directive to be assembled conditionally. No ENDC should be used with IIF directives.

MACRO OPERATIONS

GenST fully supports Motorola-style macros, which together with conditional assembly allow you to greatly simplify assembly-language programming and improve the readability of your code.

A macro is a way for a programmer to specify a whole sequence of instructions or directives that are used together very frequently. A macro is first defined, then its name can be used in a macro call like a directive with up to 36 parameters.

label MACRO

This starts a macro definition and causes GenST to copy all following lines to a macro buffer until an ENDM directive is encountered.
Macro definitions may not be nested.

ENDM

This terminates the storing of a macro definition, after a MACRO directive.

MEXIT

This prematurely stops the current macro expansion and is best illustrated by the INC example given later.

NARG (reserved symbol)

This is not a directive but a reserved symbol. Its value is the number of parameters passed to the current macro, or 0 if used when not within any macro. If GenST is in case sensitive mode then the name should be all upper case.

Once a macro has been defined with the MACRO directive it can be invoked by using its name as a directive, followed by up to 36 parameters. In the macro itself the parameters may be referred to by using a backslash character (\) followed by an alphanumeric (1-9, A-Z or a-z), which will be replaced with the relevant parameter when expanded, or with nothing if no parameter was given. There is also the special macro parameter \0, which is the size appended to the macro call and defaults to W if none is given. If a macro parameter is to include spaces or commas then the parameter should be enclosed between < and > symbols; in this case a > symbol may be included within the parameter by specifying >>.

A special form of macro expansion allows the conversion of a symbol to a decimal or hexadecimal sequence of digits, using the syntax \<symbol> or \$<symbol>, the latter denoting hex expansion. The symbol must be defined and absolute at the time of expansion.

The parameter \@ can be useful for generating unique labels with each macro call and is replaced when the macro is expanded by the sequence _nnn, where nnn is a number which increases by one with every macro call. It may be expanded to up to 5 digits for very large assemblies.

A true \ may be included in a macro definition by specifying \\.

A macro call may be spread over more than one line, particularly useful for macros with large numbers of parameters.
This can be done by ending a macro call with a comma, then starting the next line with an & followed by tabs or spaces, then the continuation of the parameters.

In the assembly listing the default is to show just the macro call and not the code produced by it. However, macro expansion listings can be switched on and off using the OPT M directive described previously.

Macro calls may be nested as deeply as memory permits, allowing recursion if required.

Macro names are stored in a separate symbol table to normal symbols so will not clash with similarly named routines, and may start with a period.

MACRO EXAMPLES

EXAMPLE 1 - CALLING THE BDOS

As the first example, the general GEMDOS calling sequence for the BDOS is:

        put the word parameters on the stack
        invoke a TRAP #1
        correct the stack afterwards

A macro to follow these specifications could be

call_gemdos     MACRO
        move.w  #\1,-(a7)       function
        trap    #1
        lea     \2(a7),a7       correct stack
        ENDM

The directives are in capitals only to make them stand out: they don't have to be. If you wanted to call this macro to use GEMDOS function 2 (print a character) the code would be

        move.w  #'X',-(a7)
        call_gemdos 2,4

When this macro call is expanded, \1 is replaced with 2 and \2 is replaced with 4. \0, if it occurred in the macro, would be W as no size is given on the call. So the above call would be assembled as:

        move.w  #2,-(a7)
        trap    #1
        lea     4(a7),a7

EXAMPLE 2 - AN INC INSTRUCTION

The 68000 does not have the INC instruction found on other processors, but the same effect can be achieved using an ADDQ #1 instruction. A macro may be used to do this, like so:

inc     MACRO
        IFC     '','\1'
        fail    missing parameter!
        MEXIT
        ENDC
        addq.\0 #1,\1
        ENDM

An example call would be

        inc.l   a0

which would expand to

        addq.l  #1,a0

The macro starts by comparing the first parameter with an empty string and causing an error message to be issued using FAIL if it is equal. The MEXIT directive is used to leave the macro without expanding the rest of it.
Assuming there is a non-null parameter, the next line does the ADDQ instruction, using the \0 parameter to get the correct size.

EXAMPLE 3 - A FACTORIAL MACRO

Although unlikely to be used as it stands, this macro defines a label to be the factorial of a number. It shows how recursion can work in macros. Before showing the macro, it is useful to examine how the same thing would be done in a high level language such as Pascal.

        function factor(n:integer):integer;
        begin
          if n>0 then
            factor:=n*factor(n-1)
          else
            factor:=1
        end;

The macro definition for this uses the SET directive to do the multiplication n*(n-1)*(n-2) etc. in this way:

* parameter 1 = label, parameter 2 = 'n'
factor  MACRO
        IFND    \1
\1      set     1               set if not yet defined
        ENDC
        IFGT    \2
        factor  \1,\2-1         work out next level down
\1      set     \1*(\2)         n = n * factor(n-1)
        ENDC
        ENDM

* a sample call
        factor  test,3

The net result of the previous code is to set test to 3! (3 factorial). The reason the second set has (\2) instead of just \2 is that the parameter will not normally be just a simple expression, but a list of numbers separated by minus signs, so it could assemble to

test    set     test*5-1-1-1    (i.e. test*5-3)

instead of the correct

test    set     test*(5-1-1-1)  (i.e. test*2)

EXAMPLE 4 - CONDITIONAL RETURN INSTRUCTION

The 68000 lacks the conditional return instruction found on other processors, but macros can be defined to implement them using the \@ parameter. For example, a return if EQ macro could look like:

rtseq   MACRO
        bne.s   \@
        rts
\@
        ENDM

The \@ parameter has been used to generate a unique label every time the macro is called, so will generate in this case labels such as _002 and _017.

EXAMPLE 5 - NUMERIC SUBSTITUTION

Suppose you have a constant containing the version number of your program and wish it to appear as ASCII in a message:

showname MACRO
        dc.b    \1,'\<version>',0
        ENDM
        .
        .
version equ     42
        showname <'Real Ale Search Program v'>

will expand to the line

        dc.b    'Real Ale Search Program v','42',0

Note the way the string parameter is enclosed in < >s as it contains spaces.

EXAMPLE 6 - COMPLEX MACRO CALL

Suppose your program needs a complicated table structure which can have a varying number of fields. A macro can be written to use only those parameters that are specified, for example:

table_entry     MACRO
        dc.b    .end\@-*        length byte
        dc.b    \1              always
        IFNC    '\2',''
        dc.w    \2,\3           2nd and 3rd together
        ENDC
        dc.l    \4,\5,\6,\7
        IFNC    '\8',''
        dc.b    '\8'            text
        ENDC
        dc.b    \9
.end\@  dc.b    0
        ENDM

* a sample call
        table_entry $42,,,t1,t2,t3,t4,
&       <Enter Name:>,%0110

This is a non-trivial example of how macros can make a programmer's life so much easier when dealing with complex data structures. In this case the table consists of a length byte (calculated in the macro using \@), a byte that is always present, two optional words, four longs, an optional string, a byte, then a zero byte. The code produced in this example would be:

        dc.b    .end_001-*
        dc.b    $42
        dc.l    t1,t2,t3,t4
        dc.b    'Enter Name:'
        dc.b    %0110
.end_001 dc.b   0

OUTPUT FILE FORMATS

GenST is very flexible in terms of output file formats. These are detailed in this section together with notes on the advantages and disadvantages of each. Certain directives take different actions, depending on what output file format is specified. The exact details of using each format will now be described.

EXECUTABLE FILES

These are directly executable, for example by double clicking from the desktop. The file may include relocation information and/or symbolic information. Normal file extensions for this type of file are .PRG, .TOS, .TTP and .ACC.

ADVANTAGES      true BSS sections, reduced development time.
DISADVANTAGES   messy if more than one programmer.

GST LINKABLE FILES

When writing larger programs, or when writing assembly language modules for use from a high level language, a programmer needs to generate a linkable file.
The GST link format is supported by most of the high level languages produced in Britain as well as others, for example HiSoft Basic, Lattice C, Prospero Fortran and Prospero Pascal. GST format files normally have the extension .BIN.

ADVANTAGES      great degree of freedom: imported labels can be used practically anywhere, including within arbitrary expressions; libraries can be created from the assembler; the import method means the assembler can detect type conflicts.
DISADVANTAGES   library format means selective library linking can be slow; true GEMDOS sections not supported as standard (though LinkST can create true BSS sections).

DRI LINKABLE FILES

This is the original linker format for the Atari ST, created by Digital Research originally for CP/M 68k. It is supported, often via a conversion utility, by the majority of US high level languages. DRI format files normally have the extension .O.

ADVANTAGES      selective libraries are faster to link than GST format; GEMDOS sections are fully supported.
DISADVANTAGES   very restrictive on use of imported labels; object files twice as big as executable files; 8 character limit on symbols.

CHOOSING THE RIGHT FILE FORMAT

If you wish to link with a high level language there isn't usually much choice: you have to use whichever format is supported by the language.

If you are writing entirely in assembly language then the normal choice has to be executable: it is fast to assemble, no linking is required, and it allows assembly to memory for decreased development time.

If you are writing a larger program, say bigger than 32k of object code, or writing a program as a team, then linkable code often makes most sense. We recommend GST-linkable over DRI because of the much greater flexibility of the format.

OUTPUT FILE DIRECTIVES

This section details those directives whose actions depend on the output file format chosen.
The file format itself can be chosen by one of the following methods: command line options using GENST2.TTP; clicking on the radio buttons in the Assembly Options dialogue box from the editor; or with the OPT L directive at the beginning of the source file.

Markers are used to denote those sections specific to a file format, viz

        *EXEC*  Executable code, also code assembled to memory
        *GST*   GST-linkable code
        *DRI*   DRI-linkable code

MODULES AND SECTIONS

MODULE modulename

This defines the start of a new module. The module name should be contained within quotes if it contains spaces. There is a default module called ANON_MODULE so the directive is not obligatory.

*EXEC* This directive is ignored.

*DRI* This directive is ignored.

*GST* This directive allows assembly-language library files to be created using multiple modules. Each module is like a self-contained program segment, with its own imports and exports. Relative labels are local to their own modules, so you can use two labels with the same name in different modules with no danger of a clash. Absolute labels are global to all modules, ideal for constants and the like.

SECTION sectionname

This defines a switch to the named section. The program may consist of several sections, which will be concatenated together with other sections of the same name in the final executable file. By default assembly starts in the TEXT section. You may switch to any section during the assembly.

*EXEC* Allowed section names are TEXT, the normal program area; DATA, for initialised data; and BSS, a special area of memory reserved by the GEMDOS program loader. It is initialised to zero and takes up no space within the disk file. When in a BSS section no code generating instructions are allowed except the DS directive. Using a BSS section for global variables can save valuable disk space.

*DRI* The rules described above for executable files apply.

*GST* There are no rigid rules about section names.
Sections with the same names from different files will be concatenated by the linker. The default ordering of sections is the order they are first used in.

IMPORTS AND EXPORTS

With both linkable types of program it is crucial to be able to import and export symbols, both relative symbols (i.e. program references) and absolute symbols (i.e. constants). The GST format distinguishes between these types whereas the DRI format does not. The GST format allows the assembler to type check, often finding programming errors that would otherwise be missed.

XDEF export<,export>...

This defines labels for export to other programs (or modules). If any of the labels specified are not defined an error will occur. It is not possible to export local labels.

*EXEC* This directive is ignored.

*DRI* Note that all symbols will be truncated (without warning) before exporting. OPT C8 is therefore recommended.

XREF import<,import>...
XREF.L import<,import>...

This defines labels to be imported from other programs or modules. If any of the labels specified are defined an error will occur. The normal XREF statement should be used to import a relative label (i.e. program reference), while XREF.L should be used to import absolute labels (i.e. constants). Importing a label more than once will not produce an error.

*EXEC* This directive is ignored.

*DRI* The DRI format does not actually need to know the type of imports but it is recommended that both forms of XREF are used to allow the assembler to type check. If you do not type your imports you should turn type checking off using OPT T-. DRI labels are only significant to the first 8 characters.

*GST* Care should be taken to import labels of the correct type, otherwise the relocation information will not be correct.

USING IMPORTS IN EXPRESSIONS

*EXEC* There are no imports!

*DRI* Imports may be used in expressions but only one import per expression is allowed.
The result of an expression containing an import must be of the form import+number or import-number. Imports can be combined with arbitrarily complex expressions, so long as the complex expression lexically precedes it, for example

        move.l  3+(1<<count+5)+import

*GST* Imports may be used in expressions, with up to ten per expression. They may only be added to or subtracted from each other, though they can be combined with arbitrarily complex expressions, so long as the complex expression lexically precedes them, for example

        move.l  3+(1<<count+5)+import1-import2

Where exactly an expression involving an import can be used depends on the file format. The following table shows which are allowed.

        Expression  GST  DRI   Example
        PC-byte     Y    N     move.w import(pc,d3.w)
                               bsr.s import
        PC-word     Y    Y(1)  move.w import(pc),a0
                               bsr import
        byte        Y    N     move.b #import,d0
        word        Y    Y     move.w import(a3),d0
        long        Y    Y     move.l import,d0

(1) so long as it is not a reference to a different section in the same program, which is not allowed.

*DRI* *GST* Note that a reference to a symbol in a different section is regarded as an import and subject to the above rules.

COMMENT comment string

*EXEC* This directive is ignored.

*DRI* This directive is ignored.

*GST* The directive passes the following string, exactly as entered, into the .BIN file, and it will be shown by the linker.

ORG expression

This will make the assembler generate position dependent code and set the program counter to the given value. Normal GEMDOS programs do not need an ORG statement even if position dependent. It is included to allow code to be generated for the ROM port or for other 68000 machines. More than one ORG statement is allowed in a source file but no padding of the file is done.

*EXEC* It should be used with great care, as the binary file generated will probably not execute correctly when double clicked. The binary file produced has the standard GEMDOS header at the front, but no relocation information.
*DRI* This directive is not allowed; absolute code generation is an option in the linker.

*GST* This sends the ORG directive to the linker, which will pad the file with zeros to the given address.

*NOTE* This directive is very unlikely to make sense when assembling to memory.

OFFSET <expression>

This switches code generation to a special section to generate absolute labels. The optional expression sets the program counter for the start of this section. No bytes are written to the disk and the only code generating directive allowed is DS. Labels defined in this section will be absolute. For example, to define some of the system variables of the ST:

        offset  $400
etv_timer   ds.l    1       will be $400
etv_critic  ds.l    1       $404
etv_term    ds.l    1       $408
ext_extra   ds.l    5       $40C
memvalid    ds.l    1       $420
memcntlr    ds.w    1       $424

__LK (reserved symbol)

This is a reserved symbol that can be used to detect which output mode is specified. The value of this symbol is always absolute and is one of the following:

        0       executable
        1       GST linkable
        2       DRI linkable

Other values are reserved for future expansion.

DRI DEBUG OPTION

Normally only explicitly XDEF'ed labels are included in the symbol table within the output file. However the format allows what it calls local labels (not to be confused with GenST local labels), which are not true exports and cannot be referred to in any other modules but will be included in the symbol table in the output file for debugging purposes. OPT D+ will cause all relative labels to be output as DRI local labels.

WRITING GST LIBRARIES

When using multiple MODULEs to generate a GST format library file, care must be taken with backward references to imports. Within a library file, higher level routines should come first and lower level routines last. For example, the following source file skeleton will not link when used as a selective library.
        MODULE  low_level
        XDEF    low_output
low_output      etc

        MODULE  high_level
        XDEF    high_output
        XREF    low_output
high_output     etc

This is because the second module references a label defined in an earlier module, which is not allowed. The corrected version is:

        MODULE  high_level
        XDEF    high_output
        XREF    low_output
high_output     etc

        MODULE  low_level
        XDEF    low_output
low_output      etc

SIMPLE FILE FORMAT EXAMPLES

This section shows a (non-functional and incomplete) example of the use of each file format.

Executable

        SECTION TEXT
start   lea     string(pc),a0
        move.l  a0,save_str
        bsr     printstring
        bra     quit

        SECTION DATA
string  dc.b    'Enter Your Name',0

        SECTION TEXT
printstring
        move.l  a0,-(sp)
        move.w  #9,-(sp)
        trap    #1
        addq.l  #6,sp
        rts

        SECTION BSS
save_str ds.l   1
        END

DRI Linkable

        XREF.L  quit

        SECTION TEXT
start   move.l  #string,a0
        move.l  a0,save_str
        bsr     printstring
        bra     quit

        SECTION DATA
string  dc.b    'Enter Your Name',0

        SECTION TEXT
printstring
        move.l  a0,-(sp)
        move.w  #9,-(sp)
        trap    #1
        addq.l  #6,sp
        rts

        SECTION BSS
save_str ds.l   1
        END

Note the way the first instruction has been changed, as a PC-relative reference is not allowed between sections.
GST Linkable

        MODULE  TESTPROG
        COMMENT needs work
        XREF.L  quit

        SECTION TEXT
start   lea     string(pc),a0
        move.l  a0,save_str
        bsr     printstring
        bra     quit

        SECTION DATA
string  dc.b    'Enter Your Name',0

        SECTION TEXT
printstring
        move.l  a0,-(sp)
        move.w  #9,-(sp)
        trap    #1
        addq.l  #6,sp
        rts

        SECTION BSS
save_str ds.l   1
        END

DIRECTIVE SUMMARY

Assembly Control
        END             terminate source code
        INCLUDE         read source file from disk
        INCBIN          read binary file from disk
        OPT             option control
        EVEN            ensure PC even
        CNOP            align PC arbitrarily
        DC              define constant
        DS              define space
        DCB             define constant block
        FAIL            force assembly error

Repeat Loop
        REPT            start repeat block
        ENDR            end repeat block

Listing Control
        LIST            enable list
        NOLIST          disable list
        PLEN            set page length
        LLEN            set line length
        TTL             set title
        SUBTTL          set sub-title
        PAGE            start new page
        LISTCHAR        send control character
        FORMAT          define listing format

Label Directives
        EQU             define label value
        EQUR            define register equate
        SET             define label value temporarily
        REG             define register list
        RS              reserve space
        RSRESET         reset RS counter
        RSSET           set RS counter

Conditional Assembly
        IFEQ            assemble if zero
        IFNE            assemble if non-zero
        IFGT            assemble if greater than
        IFGE            assemble if greater than or equal to
        IFLT            assemble if less than
        IFLE            assemble if less than or equal to
        IFD             assemble if label defined
        IFND            assemble if label not defined
        IFC             assemble if strings same
        IFNC            assemble if strings different
        ELSEIF          switch assembly state
        IIF             immediate IF

Macros
        MACRO           define macro
        ENDM            end macro definition

Output File Directives
        MODULE          start new module
        SECTION         switch section
        XDEF            define label for export
        XREF            define label for import
        COMMENT         send linker comment
        ORG             set absolute code generation
        OFFSET          define offset table

Reserved Symbols
        NARG            number of macro parameters
        __G2            internal version number
        __RS            RS counter
        __LK            output file type
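As a closing sketch, the reserved symbol __LK can be combined with the conditional directives so that one source file copes with the difference between the file format examples above. The fragment below is an untested illustration only: it reuses the names start and string from those examples and relies on the documented __LK values (0 executable, 1 GST linkable, 2 DRI linkable).

```asm
        SECTION TEXT
start
* __LK-2 is zero only when DRI linkable output is selected
        IFEQ    __LK-2
        move.l  #string,a0      no PC-relative reference between sections
        ENDC
* for executable or GST linkable output __LK-2 is non-zero
        IFNE    __LK-2
        lea     string(pc),a0
        ENDC

        SECTION DATA
string  dc.b    'Enter Your Name',0
```

Testing __LK-2 with IFEQ and IFNE keeps the two cases mutually exclusive, so exactly one of the two instructions is assembled whichever output format is chosen.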