FASM PREPROCESSOR GUIDE
0. About this document
I wrote this document because I saw many people asking questions on the FASM
board because they didn't understand the idea of the preprocessor or some
particular feature of it. (I don't discourage you from asking anything on the
forum; not understanding something is no shame, and if your question isn't too
hard someone will surely answer it.)
If you can't understand something in this tutorial, please tell me about it in
my tutorial's thread on the FASM board, or by email if you like.
1. What is preprocessor
A preprocessor is a program (or usually a part of the compiler) which modifies
your source code before it is compiled. For example, if you use some piece of
code very often, you can give it a name and tell the preprocessor to replace
the name with that piece of code each time.
Another example: when you want to simulate an instruction which doesn't
exist, you can have the preprocessor automatically replace it with a set of
instructions with the same effect.
The preprocessor scans the source and replaces some things with others. But
how do you tell it what it should preprocess? For this, "preprocessor
directives" are used. We will discuss them now.
The preprocessor doesn't know anything about instructions, compiler directives
etc.; it has its own set of directives and just ignores parts of the source
not meant for it.
2. Basic
First I describe basic preprocessing, which is done on the file before any
other preprocessing.
2.1. Comment ;
Like in most assemblers, comments in FASM start with a semicolon (;).
Everything from it up to the end of the line is ignored and removed from the
source.
For example, the source
;fill 100h bytes at EDI with zeros
xor eax,eax ;zero value in eax
mov ecx,100h/4
rep stosd
will after preprocessing become
xor eax,eax
mov ecx,100h/4
rep stosd
NOTE: ; can also be understood as a preprocessor operator which deletes the
text behind it up to the end of the line.
NOTE: A line that contains only a comment won't be deleted as my example
suggests; it will become an empty line. I skipped the empty line because of
the text structure. This will be important in the next chapter.
2.2. Line Break \
If a line seems too long to you, you can "break" it with the \ symbol (or
preprocessor operator). If a line ends with the \ symbol, the next line will
be appended to the current line.
Example:
db 1,2,3,\
4,5,6,\
7,8,9
will be preprocessed to
db 1,2,3,4,5,6,7,8,9
Of course, \ inside strings or inside comments doesn't concatenate lines:
inside a string it is taken as a string character (like everything except the
ending quote), and comments are deleted up to the end of the line without
inspecting what's inside them.
There can't be anything behind \ on the line, except for blank space and a
comment.
In the previous chapter I mentioned that a line which contains only a comment
won't be deleted, it will just become an empty line. That means that code
like this:
db 1,2,3,\
; 4,5,6,\ - commented
7,8,9
will after preprocessing become
db 1,2,3,
7,8,9
and so it will cause an error. To solve a situation like this, you must put
the line break before the comment:
db 1,2,3,\
\; 4,5,6 - validly commented
7,8,9
which will become
db 1,2,3,7,8,9
like we wanted.
2.3. Directive include
Its syntax:
include <quoted string - file name>
It inserts a text file into the source. It allows you to break the source into
several files. Of course, the inserted text will be preprocessed too. The file
path and name should be quoted (enclosed in '...' or "...").
Examples:
include 'file.asm'
include 'HEADERS\data.inc'
include '..\lib\strings.asm'
include 'C:\config.sys'
You can also access environment variables by enclosing their names between %
characters:
include '%FASMINC%\win32a.inc'
include '%SYSTEMROOT%\somefile.inc'
include '%myproject%\headers\something.inc'
include 'C:\%myprojectdir%\headers\something.inc'
TODO: 1.52 paths system (someone could describe it for me...)
2.4. Strings preprocessing
You may have a problem including ' in a string declared using ' or " in a
string declared using ". To do this you must place the character twice in the
string; in that case it won't end the string and begin the next one, as you
may think, but will include the character in the string literally.
For example:
db 'It''s okay'
will generate a binary containing the string It's okay.
It's the same for ".
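A short sketch of the doubling rule for both quote characters (the label names msg1 to msg4 are only illustrative):

```fasm
msg1 db 'It''s okay',0        ; doubled ' inside '...'  -> It's okay
msg2 db "She said ""hi""",0   ; doubled " inside "..."  -> She said "hi"
msg3 db "It's okay",0         ; ' needs no doubling inside "..."
msg4 db 'She said "hi"',0     ; " needs no doubling inside '...'
```

Picking the other quote character as the delimiter is usually the more readable option when the text contains only one kind of quote.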
3. Equates
3.1. Directive "equ"
The simplest preprocessor command. Its syntax:
<name1> equ <name2>
This command tells the preprocessor to replace every following <name1> with
<name2>.
Example: the source
count equ 10 ;this is preprocessor command
mov ecx,count
is preprocessed to
mov ecx,10
Example:
mov eax,count
count equ 10
mov ecx,count
is preprocessed to
mov eax,count
mov ecx,10
because the preprocessor only replaces count after the equ directive.
Even this works:
10 equ 11
mov ecx,10
preprocessed to
mov ecx,11
Also note that name1 can be any symbol. A symbol is just a set of characters
terminated by a blank character (space, tab, end of line), a comment (;), a
line break (\) or an operator (including assembly-time operators, not only the
preprocessor's operators). It can't be an operator or a special symbol (like ,
or } etc.).
name2 can be anything, not only one symbol; everything up to the end of the
line is taken. It can even be empty, in which case <name1> is replaced by
blank space.
Example:
10 equ 11,12,13
db 10
to
db 11,12,13
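One detail worth a small sketch: according to the FASM manual, symbolic constants that appear in the value of an equ are replaced at the moment the equate is defined. This makes it possible to build up a list step by step. The name items is just an illustration:

```fasm
items equ 1
items equ items,2     ; "items" on the right is replaced now -> 1,2
items equ items,3     ; -> 1,2,3
db items              ; should be preprocessed to: db 1,2,3
```

Note that the name in front of equ is not replaced, which is why the same symbol can be redefined this way.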
3.2. Directive restore
You can also tell the preprocessor to stop replacing a particular equate. This
is done with the restore directive:
restore <name>
where <name> is some equate. After this command <name> will no longer be
replaced as specified by equ.
Example:
mov eax,count
count equ 10
mov eax,count
restore count
mov eax,count
will become
mov eax,count
mov eax,10
mov eax,count
Note that replacements are "stacked": if you define two equates for one
symbol and then restore it (once), the first one will be used again.
Example:
mov eax,count
count equ 1
mov eax,count
count equ 2
mov eax,count
count equ 3
mov eax,count
restore count
mov eax,count
restore count
mov eax,count
restore count
mov eax,count
to
mov eax,count
mov eax,1
mov eax,2
mov eax,3
mov eax,2
mov eax,1
mov eax,count
If you try to restore a non-existing equate, nothing will happen.
Example:
mov eax,count
restore count
mov eax,count
to
mov eax,count
mov eax,count
4. Simple macros without arguments
4.1. Simple macro definition
You can create your own instruction or directive using "macro":
macro <name>
{
<body>
}
After the preprocessor finds the macro directive, it defines the macro, which
means each following occurrence of a line starting with <name> will be
replaced by <body>. <name> can be one symbol; <body> can be anything except },
which denotes the end of the macro body.
Example:
macro a
{
push eax
}
xor eax,eax
a
to
xor eax,eax
push eax
Example:
macro a
{
push eax
}
macro b
{
push ebx
}
b
a
to
push ebx
push eax
Of course, a macro doesn't have to be laid out like in my example; you can use
this too:
macro push5 {push dword 5}
push5
to
push dword 5
Or
macro push5 {push dword 5
}
with the same result. You are free to format macros as you like.
4.2. Nested macros
You can nest macros. That means that if you redefine a macro, the last
definition is used. But if you use the original macro inside the new one, it
will still work. Look at this example:
macro a {mov ax,5}
macro a
{
a
mov bx,5
}
macro a
{
a
mov cx,5
}
a
to
mov ax,5
mov bx,5
mov cx,5
Or this example:
macro a {1}
a
macro a {
a
2}
a
macro a {
a
3}
a
to
1
1
2
1
2
3
4.3. Directive purge (macro undefinition)
You can also undefine a macro, like you undefined an equate. This is done by
the purge directive followed by the macro name:
a
macro a {1}
a
macro a {2}
a
purge a
a
purge a
a
to
a
1
2
1
a
If you try to purge a non-existing macro, nothing will happen.
4.4. Macros behaviour
A macro name will be replaced by the macro body not only if the line starts
with the macro, but everywhere an instruction mnemonic (like add, mov) is
accepted. This is because the main purpose of macros is to simulate
instructions. The only exception is an instruction prefix: a macro is not
accepted after an instruction prefix.
Example:
macro CheckErr
{
cmp eax,-1
jz error
}
call Something
a: CheckErr ;here macro name is preceded by label definition, but it
;will be replaced
to
call Something
a:
cmp eax,-1
jz error
Example:
macro stos0
{
mov al,0
stosb
}
stos0 ;this is place for instruction, will be replaced
here: stos0 ;this is place for instruction too
db stos0 ;this is not place for instruction and so won't be replaced
to
mov al,0
stosb
here:
mov al,0
stosb
db stos0
You can also "overload" an instruction with a macro: because the preprocessor
doesn't know about instructions, it allows a macro name to be an instruction
mnemonic.
macro pusha
{
push eax ebx ecx edx ebp esi edi
}
macro popa
{
pop edi esi ebp edx ecx ebx eax
}
These two save 4 bytes of stack for every pusha because they don't push ESP.
But overloading instructions isn't very good practice, because someone reading
your code may get fooled if he doesn't know the instruction is overloaded.
You can also overload an assembly-time directive:
macro use32
{
align 4
use32
}
macro use16
{
align 2
use16
}
5. Macros with fixed number of arguments
5.1. Macro with one argument
You can also define a macro argument. The argument is represented by a symbol,
which will be replaced in the macro body by the passed argument.
macro <name> <argument> { <body> }
Example:
macro add5 where
{
add where,5
}
add5 ax
add5 [variable]
add5 ds
add5 ds+2
to
add ax,5
add [variable],5
add ds,5 ;there is no such instruction, but it's not the task of the preprocessor
;to check it. It will be preprocessed to this form, and will
;throw an error at the assembling stage.
add ds+2,5 ;like previous, but this is also syntactically wrong, so it'll
;throw an error at the parsing stage.
(of course those comments won't be in the preprocessed file :)
5.2. Macros with more arguments
Macros can have more arguments, separated with commas (,):
macro movv where,what
{
push what
pop where
}
movv ax,bx
movv ds,es
movv [var1],[var2]
preprocessed to
push bx
pop ax
push es
pop ds
push [var2]
pop [var1]
If several arguments have the same name, the first one is used :).
If you specify fewer arguments than you listed in the macro declaration, the
value of the unspecified arguments is blank:
macro pupush a1,a2,a3,a4
{
push a1 a2 a3 a4
pop a4 a3 a2 a1
}
pupush eax,dword [3]
to
push eax dword [3]
pop dword [3] eax
If you want to include a comma (,) in a macro argument, you must enclose the
argument in angle brackets (< and >).
macro safe_declare name,what
{
if used name
name what
end if
}
safe_declare var1, db 5
safe_declare array5, <dd 1,2,3,4,5>
safe_declare string, <db "hi, i'm stupid string",0>
to
if used var1
var1 db 5
end if
if used array5
array5 dd 1,2,3,4,5
end if
if used string
string db "hi, i'm stupid string",0
end if
You can use < and > in the macro body too, of course:
macro a arg {db arg}
macro b arg1,arg2 {a <arg1,arg2,3>}
b <1,1>,2
is preprocessed to
db 1,1,2,3
5.3. Directive "local"
You may want to declare some label inside macro body:
macro pushstr string
{
call behind ;pushes address of string and jumps to behind
db string,0
behind:
}
but if you use this macro twice, the label "behind" will be defined twice,
which is an error. You can solve this by making the label "behind" local to
the macro. For that the preprocessor directive "local" is used:
local <name>
It must be inside the macro body. It makes all following occurrences of <name>
inside the macro body local to the macro. Thus, if the macro is used twice
macro pushstr string
{
local behind
call behind
db string,0
behind:
}
pushstr 'aaaaa'
pushstr 'bbbbbbbb'
call something
this won't cause any problems. This is done by replacing behind with
behind?XXXXXXXX, where XXXXXXXX is some hexadecimal number generated by the
preprocessor. The last example can, for instance, be preprocessed to
call behind?00000001
db 'aaaaa',0
behind?00000001:
call behind?00000002
db 'bbbbbbbb',0
behind?00000002:
call something
Note that you can't directly access names containing ?, as it is a special
symbol for FASM; for this reason it is used in local names. For example,
aa?bb is treated as symbol aa, special symbol ? and symbol bb.
If you want more local labels, you don't have to use several local
directives; you can list them all in one local directive, separated by commas
(,):
macro pushstr string ;does same job as previous macro
{
local addr,behind
push addr
jmp behind
addr db string,0
behind:
}
It is always good to start all macro-local label names with two dots (..),
which means they won't change the current global label. For example, this
macro without the dots:
macro pushstr string
{
local behind
call behind
db string,0
behind:
}
MyProc:
pushstr 'aaaa'
.a:
will be preprocessed to
MyProc:
call behind?00000001
db 'aaaa',0
behind?00000001:
.a:
and so it will create the label behind?00000001.a instead of MyProc.a. But
names that start with two dots (..) do not change the current global label,
so in the following case MyProc.a would be declared:
macro pushstr string
{
local ..behind
call ..behind
db string,0
..behind:
}
MyProc:
pushstr 'aaaa'
.a:
5.4. Operator # (symbol concatenation)
Another feature of FASM's macro language is manipulation of symbols. This is
done with the symbol concatenation operator #, which concatenates two symbols
into one; for example a#b will become ab, and aaa bbb#ccc ddd will become
aaa bbbccc ddd.
This operator can be used only inside a macro body, and concatenation is done
after replacing macro arguments, so you can use it to create a symbol from a
macro argument.
Example:
macro string name, data
{
local ..start
..start:
name db data,0
sizeof.#name = $ - ..start
}
string s1,'macros are stupid'
string s2,<'here i am',13,10,'rock you like a hurricane'>
to
..start?00000001:
s1 db 'macros are stupid',0
sizeof.s1 = $ - ..start?00000001
..start?00000002:
s2 db 'here i am',13,10,'rock you like a hurricane',0
sizeof.s2 = $ - ..start?00000002
so for every string defined by the macro, the symbol "sizeof.<name of string>"
gets defined.
This operator can also concatenate quoted strings:
macro debug name
{
db 'name: '#name,0
}
debug '1'
debug 'barfoo'
to
db 'name: 1',0
db 'name: barfoo',0
This is useful when passing an argument from one macro to another:
macro pushstring string
{
local ..behind
call ..behind
db string,0
..behind:
}
macro debug string
{
push MB_OK
push 0 ;empty caption
pushstring 'debug: '#string ;"pushstring" takes one argument
push 0 ;no parent window
call [MessageBox]
}
Note that you can't use # in the arguments of local, because local is
processed before #. For that reason, code like this won't work:
macro a arg
{
local name_#arg
}
a foo
5.5. Operator `
There is also the ` operator, which converts the symbol following it to a
quoted string. This operator can be used only inside a macro.
Example:
macro proc name
{
name:
log `name ;log can be some macro which takes string as argument
}
proc DummyProc
to
DummyProc:
log 'DummyProc'
Or a slightly more complicated example using #:
macro proc name
{
name:
log 'entering procedure: '#`name
}
proc DummyProc
retn
proc Proc2
retn
to
DummyProc:
log 'entering procedure: DummyProc'
retn
Proc2:
log 'entering procedure: Proc2'
retn
6. Macros with group argument
6.1. Declaring macro with group argument
Macros can have a so-called "group argument". It allows a non-fixed number of
arguments. The group argument is enclosed in square brackets ([ and ]) in the
macro definition:
macro name arg1,arg2,[grouparg]
{
<body>
}
The group argument must be the last argument in the macro definition. A group
argument can contain multiple values, like:
macro name arg1,arg2,[grouparg] {}
name 1,2,3,4,5,6
Here the values of the group argument (grouparg) are 3, 4, 5 and 6; 1 and 2
are the values of arg1 and arg2.
6.2. Directive common
To work with group arguments, you use some preprocessor directives. These
directives can be used only in the body of a macro with a group argument. The
first such directive is common. It means that after this directive the group
argument name in the macro body will be replaced by all the arguments at once:
macro string [grp]
{
common
db grp,0
}
string 'aaaaaa'
string 'line1',13,10,'line2'
string 1,2,3,4,5
to
db 'aaaaaa',0
db 'line1',13,10,'line2',0
db 1,2,3,4,5,0
6.3. Directive forward
But you can also work with the arguments in a group argument separately. For
this the forward preprocessor directive is used. The part of the macro body
after the forward directive is preprocessed once for each argument in the
group argument:
macro a arg1,[grparg]
{
forward
db arg1
db grparg
}
a 1,'a','b','c'
a -1,10,20
to
db 1
db 'a'
db 1
db 'b'
db 1
db 'c'
db -1
db 10
db -1
db 20
forward is the default for macros with group arguments, so the previous macro
can as well be
macro a arg1,[grparg]
{
db arg1
db grparg
}
6.4. Directive reverse
reverse is the same as forward, but processes the arguments in the group
argument from last to first:
macro a arg1,[grparg]
{
reverse
db arg1
db grparg
}
a 1,'a','b','c'
to
db 1
db 'c'
db 1
db 'b'
db 1
db 'a'
6.5. Combining group control directives
These three directives divide the macro into blocks. Each block is processed
after the previous one. For example:
macro a [grparg]
{
forward
f_#grparg: ;symbol concatenation operator #, see chapter 5.4
common
db grparg
reverse
r_#grparg:
}
a 1,2,3,4
to
f_1:
f_2:
f_3:
f_4:
db 1,2,3,4
r_4:
r_3:
r_2:
r_1:
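As a practical sketch of combining blocks, here is a macro that pushes its arguments in reverse order and then calls a procedure, in the style of the stdcall convention. The names stdcall_sketch, target and MyProc are hypothetical, and the macro assumes at least one argument is given (with none, an empty push would be produced):

```fasm
macro stdcall_sketch target,[arg]
{
  reverse
    push arg        ; one push per argument, last argument first
  common
    call target     ; emitted once, after all pushes
}

stdcall_sketch MyProc,1,2,3
; expected result of preprocessing:
;   push 3
;   push 2
;   push 1
;   call MyProc
```

The reverse block is repeated for every value in the group, while the common block appears only once, so the call lands after the last push.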
6.6. Behavior of directive local inside macro with group argument
There is one more very nice feature of labels local to a macro (listed with
the local preprocessor directive, see chapter 5.3). If the local directive is
used inside a forward or reverse block, then a unique label is defined for
each argument in the group, and the same labels are used for their arguments
in the following forward or reverse blocks. Example:
macro string_table [string]
{
forward ;table of pointers to strings
local addr ;declare label for this string as local
dd addr ;pointer to string
forward ;strings
addr db string,0
}
string_table 'aaaaa','bbbbbb','5'
to
dd addr?00000001
dd addr?00000002
dd addr?00000003
addr?00000001 db 'aaaaa',0
addr?00000002 db 'bbbbbb',0
addr?00000003 db '5',0
Another example, with a reverse block:
macro a [x]
{
forward
local here
here db x
reverse
dd here
}
a 1,2,3
to
here?00000001 db 1
here?00000002 db 2
here?00000003 db 3
dd here?00000003
dd here?00000002
dd here?00000001
so labels are used with the same arguments in both forward and reverse blocks.
6.7. Macro with multiple group arguments
You can also have several group arguments. In that case the macro definition
won't look like
macro a [grp1],[grp2]
because then it would be unclear which arguments belong to which group.
Instead you declare them like:
macro a [grp1,grp2]
Here every odd argument belongs to grp1 and every even one to grp2.
Example:
macro a [grp1,grp2]
{
forward
l_#grp1:
forward
l_#grp2:
}
a 1,2,3,4,5,6
to
l_1:
l_3:
l_5:
l_2:
l_4:
l_6:
Another example:
macro ErrorList [name,value]
{
forward
ERROR_#name = value
}
ErrorList \
NONE,0,\
OUTOFMEMORY,10,\
INTERNAL,20
to
ERROR_NONE = 0
ERROR_OUTOFMEMORY = 10
ERROR_INTERNAL = 20
Of course there can be more than 2 group arguments:
macro a [g1,g2,g3]
{
common
db g1
db g2
db g3
}
a 1,2,3,4,5,6,7,8,9,10,11
to
db 1,4,7,10
db 2,5,8,11
db 3,6,9
7. Preprocessor conditionals
In fact, there is no preprocessor conditional syntax in FASM (too bad). But
the assembly directive if can be used in conjunction with the preprocessor to
achieve the same results as preprocessor conditionals (though this way it
wastes more time and memory).
As you know, if is an assembly-time statement. That means the condition is
checked after preprocessing, and that allows some special conditional
operators to work.
I won't describe its assembly-time behavior (conditional operators like &, |
etc.); read FASM's docs for this. I will describe only the operators that are
used with the preprocessor here.
7.1. Operator eq
The simplest is eq. It just compares two symbols to see if they are the same.
The value of abcd eq abcd is true, the value of abcd eq 1 is false, etc. It is
useful for comparing a symbol that will be preprocessed, like:
STRINGS equ ASCII
if STRINGS eq ASCII
db 'Oh yeah',0
else if STRINGS eq UNICODE
du 'Oh yeah',0
else
display 'unknown string type'
end if
after preprocessing it will be
if ASCII eq ASCII
db 'Oh yeah',0
else if ASCII eq UNICODE
du 'Oh yeah',0
else
display 'unknown string type'
end if
so the first condition (ASCII eq ASCII) is true, and only db 'Oh yeah',0 will
get assembled.
Other case:
STRINGS equ UNICODE ;only difference here, UNICODE instead of ASCII
if STRINGS eq ASCII
db 'Oh yeah',0
else if STRINGS eq UNICODE
du 'Oh yeah',0
else
display 'unknown string type'
end if
after preprocessing it will be
if UNICODE eq ASCII
db 'Oh yeah',0
else if UNICODE eq UNICODE
du 'Oh yeah',0
else
display 'unknown string type'
end if
now the first condition (UNICODE eq ASCII) will be false, the second one
(UNICODE eq UNICODE) will be true, and so du 'Oh yeah',0 will get assembled.
A better use of this is checking macro arguments, like
macro item type,value
{
if type eq BYTE
db value
else if type eq WORD
dw value
else if type eq DWORD
dd value
else if type eq STRING
db value,0
end if
}
item BYTE,1
item STRING,'aaaaaa'
to
if BYTE eq BYTE
db 1
else if BYTE eq WORD
dw 1
else if BYTE eq DWORD
dd 1
else if BYTE eq STRING
db 1,0
end if
if STRING eq BYTE
db 'aaaaaa'
else if STRING eq WORD
dw 'aaaaaa'
else if STRING eq DWORD
dd 'aaaaaa'
else if STRING eq STRING
db 'aaaaaa',0
end if
so only these two commands will get assembled:
db 1
db 'aaaaaa',0
eq (like all other preprocessor operators) can also work with blank
arguments. That means, for example, that the condition in "if eq" is true and
the one in "if 5 eq" is false, etc.
Example macro:
macro mov dest,src,src2
{
if src2 eq
mov dest,src
else
mov dest,src
mov src,src2
end if
}
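To see the blank-argument test at work, here is the same macro together with two uses and the code each should end up assembling. This is a sketch; the mov inside the body is the original instruction, as explained in chapter 4.2:

```fasm
macro mov dest,src,src2
{
  if src2 eq
    mov dest,src
  else
    mov dest,src
    mov src,src2
  end if
}

mov eax,ebx        ; src2 is blank: "if eq" is true, gives  mov eax,ebx
mov eax,ebx,ecx    ; "if ecx eq" is false, gives  mov eax,ebx / mov ebx,ecx
```

In the two-operand case the else branch contains an incomplete line, but it is never assembled because the condition is true.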
7.2. Operator eqtype
Another operator is eqtype. It compares whether symbols are of the same type.
The types are:
individual quoted strings (those not being a part of numerical expression)
floating point numbers
any numerical expression (note that any unknown word will be treated as
label, so it also will be seen as such expression),
addresses - the numerical expressions in square brackets (with size
operators and segment prefixes)
instruction mnemonics
registers
size operators
near/far operators,
use16/use32 operators
blank space
Example of a macro which allows the SHL instruction with a memory variable as
count, like shl ax,[myvar]:
macro shl dest,count
{
if count eqtype [0] ;if count is memory variable
push cx
mov cl,count
shl dest,cl
pop cx
else ;if count is of another type
shl dest,count ;just use original shl
end if
}
shl ax,5
byte_variable db 5
shl ax,[byte_variable]
to
if 5 eqtype [0]
push cx
mov cl,5
shl ax,cl
pop cx
else
shl ax,5
end if
byte_variable db 5
if [byte_variable] eqtype [0]
push cx
mov cl,[byte_variable]
shl ax,cl
pop cx
else
shl ax,[byte_variable]
end if
and so, due to conditions, it will be assembled to
shl ax,5
byte_variable db 5
push cx
mov cl,[byte_variable]
shl ax,cl
pop cx
Note that shl ax,byte [variable] wouldn't work with this macro, because the
condition byte [variable] eqtype [0] isn't true; read further.
eqtype doesn't work only with two operands. It just compares whether the types
of the operands on the left side are the same as the types of the operands on
the right side of eqtype. For example, the condition in
"if eax 4 eqtype ebx name" is true (name is a label and thus a number too).
Example of extending the mov instruction so that it allows moving between
memory variables:
macro mov dest,src
{
if dest src eqtype [0] [0]
push src
pop dest
else
mov dest,src
end if
}
mov [var1],5
mov [var1],[var2]
will be preprocessed to
if [var1] 5 eqtype [0] [0] ;false
push 5
pop [var1]
else
mov [var1],5
end if
if [var1] [var2] eqtype [0] [0] ;true
push [var2]
pop [var1]
else
mov [var1],[var2]
end if
and assembled to
mov [var1],5
push [var2]
pop [var1]
Anyway, a better (more readable) way to write such a macro is to use the &
operator (not described in this document, see FASM documentation), like:
macro mov dest,src
{
if (dest eqtype [0]) & (src eqtype [0])
push src
pop dest
else
mov dest,src
end if
}
The above example using eqtype with four arguments was meant only to
demonstrate the possibilities; & should be used if possible.
Note that currently you can use incomplete expressions as arguments of
eqtype; it is sufficient if the parser recognizes their type. But this is
undocumented behavior, so I won't describe it further.
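Another common use of eqtype, sketched here with a hypothetical macro named put, is telling quoted strings apart from numbers by comparing the argument against an empty string:

```fasm
macro put value
{
  if value eqtype ''
    db value,0      ; quoted string: store it zero-terminated
  else
    dd value        ; anything else: store it as a dword
  end if
}

put 'text'          ; assembles as  db 'text',0
put 1234            ; assembles as  dd 1234
```

Only the type matters here, so any quoted string on the right side of eqtype would do; the empty string is simply the shortest one.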
7.3. Operator "in"
FASM also includes another operator, in. It can be used where you would
otherwise use several eq comparisons:
macro mov a,b
{
if (a eq cs) | (a eq ds) | (a eq es) | (a eq fs) |\
(a eq gs) | (a eq ss)
push b
pop a
else
mov a,b
end if
}
Instead of many eq comparisons joined with |, you can use the in operator. It
compares the symbol on the left side with the symbols in the list on the right
side. The symbol list must be enclosed in angle brackets (< and >), and the
symbols inside the list are separated with commas (,).
macro mov a,b
{
if a in <cs,ds,es,fs,gs,ss>
push b
pop a
else
mov a,b
end if
}
in also works with more symbols on both sides (like eq):
if dword [eax] in <[eax], dword [eax], ptr eax, dword ptr eax>
8. Structures
Structures are almost the same as macros. You declare them with the struc
directive:
struc <name> <arguments> { <body> }
The difference is that when you use a structure in code, it must be preceded
by some label (the name of the structure instance). For example
struc a {db 5}
a
doesn't work. Structure is only recognized when preceded by name, like:
struc a {db 5}
name a
will, like macro, get preprocessed to
db 5
The reason for the name is that the name (the one before the structure) will
be prepended to every symbol inside the structure body that starts with a dot
(.). For example:
struc a {.local:}
name1 a
name2 a
will be
name1.local:
name2.local:
This way you can define something like the structures you know from other
languages. Example:
struc rect left,right,top,bottom ;has arguments, like macros
{
.left dd left
.right dd right
.top dd top
.bottom dd bottom
}
r1 rect 0,20,10,30
r2 rect ?,?,?,?
to
r1.left dd 0
r1.right dd 20
r1.top dd 10
r1.bottom dd 30
r2.left dd ?
r2.right dd ?
r2.top dd ?
r2.bottom dd ?
You can also use a nice trick which lets you omit arguments (0 will be used
instead):
struc ymmud arg
{
.member dd arg+0
}
y1 ymmud 0xACDC
y2 ymmud
to
y1.member dd 0xACDC+0
y2.member dd +0
As described in 5.2, if an argument is not specified its value is blank inside
the macro/structure body. We also used the fact that + is both a binary
(two-operand) and a unary (one-operand) operator.
NOTE: Often a macro or structure called struct (not struc) is defined, which
declares a structure or extends structure declarations. Don't confuse struct
with struc.
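Since the members end up as ordinary labels, they can be used like any other variable. A minimal sketch, with the hypothetical names point, origin and corner:

```fasm
struc point x,y
{
  .x dd x
  .y dd y
}

origin point 0,0        ; defines origin.x and origin.y
corner point 640,480    ; defines corner.x and corner.y

mov eax,[corner.x]      ; members are ordinary labels
add eax,[corner.y]
```

Each use of the structure gets its own set of member labels, prefixed with the name written in front of it.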
9. Fixes
While FASM was evolving, it long lacked one very useful feature: the ability
to declare a macro inside a macro, i.e. to have the result of expanding a
macro become a macro definition. Something like this hypothetical code:
macro declare_macro_AAA
{
macro AAA
{
db 'AAA',0
} ;end of "AAA" declaration
} ;end of "declare_macro_AAA" declaration
The problem here is that when the macro declare_macro_AAA is read by the
preprocessor, the first } found is taken as its end, not the one we wanted. It
is similar with other preprocessor symbols and operators (for example #, `,
forward, local, etc.): they were processed while expanding the outer macro, so
they couldn't be used in the inner macro declaration.
9.1. Explanation of fixes
In time, another preprocessor directive was added. It does the same job as
equ, but BEFORE other preprocessing (except the things listed in chapter 2;
these are done in some pre-preprocessing stage, but that is internal stuff,
not of much interest). This directive is fix.
It has the same syntax as equ (<symbol> fix <anything>), but replacing fixed
symbols in a line is done before any other preprocessing (except the things
listed in chapter 2, again).
Preprocessing is done line by line, left to right, so if we have the code
a equ 1
b equ a
a b
Then preprocessing happens like this:
Preprocessing line 1:
a - Preprocessor finds unknown word, skips it.
equ - "equ" is second word of line, so it remembers "a" equals rest of line ("1") and deletes line
Preprocessing line 2:
b - Preprocessor finds unknown word, skips it.
equ - "equ" is second word of line, so it first replaces known equates in rest of line ("a" becomes "1"), remembers "b" equals "1" and deletes line
Preprocessing line 3:
a - Preprocessor replaces "a" with "1"
b - Preprocessor replaces "b" with "1"
So it becomes:
1 1
But if we have
a fix 1
b fix a
a b
then it looks like:
Fixing line 1: No symbols to be fixed
Preprocessing line 1:
a - Preprocessor finds unknown word, skips it.
fix - "fix" is second word of line, so it remembers "a" is fixed to rest of line ("1") and deletes line
Fixing line 2: "a" is fixed to "1", so line becomes "b fix 1"
Preprocessing line 2:
b - Preprocessor finds unknown word, skips it.
fix - "fix" is second word of line, so it remembers "b" is fixed to rest of line ("1") and deletes line
Fixing line 3: "a" is fixed to "1", "b" is fixed to "1" so line becomes "1 1"
Preprocessing line 3:
1 - Preprocessor finds unknown word, skips it.
1 - Preprocessor finds unknown word, skips it.
This was only an example to show how fixing works; usually it isn't used in
this manner.
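The FASM manual illustrates a more typical use of fix: giving a short name to a preprocessor directive. In the sketch below the file name macros.inc is only a placeholder:

```fasm
incl fix include     ; "incl" is replaced before directives are recognised
incl 'macros.inc'    ; so this line is handled as: include 'macros.inc'
```

The same definition written with equ would not give this result, because equates are replaced only after the line has been searched for preprocessor directives.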
9.2. Using fixes for nested macro declaration
Now back to declaring a macro inside a macro. First, we need to know how
macros are preprocessed. You can quite easily work it out yourself: on macro
declaration the macro body is saved, and when the macro is expanded the
preprocessor replaces the line with the macro usage by the macro body,
internally declares equates to handle the arguments, and continues with
preprocessing of the macro body. (Of course it is more complicated, but this
is enough for understanding fixes.)
So where was the problem with declaring a macro inside a macro? The first time
the preprocessor found "}" inside the macro body it took it as the end of the
macro body declaration, so there wasn't any way to include "}" in a macro
body. So we can easily fix :) this:
macro a
{
macro b
%_
display 'Never fix before something really needs to be fixed'
_%
}
%_ fix {
_% fix }
a
b
Now preprocessing looks like this (simplified):
1. Preprocessor loads declaration of macro "a"
2. Preprocessor loads declaration of fixes "%_" and "_%"
3. Preprocessor expands macro "a"
4. Preprocessor loads macro "b" declaration ("_%" and "%_" are fixed in each
line before being handled by the rest of the preprocessor)
5. Preprocessor expands macro "b"
Here you see how important the placement of the fix declarations is, because
the macro body is fixed too before it's loaded by the preprocessor. For
example, this won't work:
%_ fix {
_% fix }
macro a
{
macro b
%_
display 'Never fix before something really needs to be fixed, here you see it'
_%
}
a
b
Because "%_" and "_%" will be fixed before loading macro "a", loading of the
macro body will end at the "_%" fixed to "}", and the second "}" will remain
there.
NOTE: The character "%" isn't a special character for FASM's preprocessor, so
you can use it just like any normal character, like "a" or "9". It has a
special meaning AFTER preprocessing, and only when it is the only character in
the whole word ("%", not "anything%anything").
We also need to fix the other macro-related operators:
%_ fix {
_% fix }
%local fix local
%forward fix forward
%reverse fix reverse
%common fix common
%tostring fix `
Only # is a special case. You can fix it, but there is an easier way. Every
time the preprocessor finds multiple #s, it removes one, so it is something
like this (this won't really work):
etc...
###### fix #####
##### fix ####
#### fix ###
### fix ##
## fix #
So instead of using a symbol fixed to "#" you can just use "##" etc.
9.3. Using fixes for moving parts of codes
You can also use fixes to move parts of code. In assembly programming this is
useful especially when you break code into modules and want to have data and
code grouped in separate segments/sections, but defined in one file.
Right now this part of the tutorial is TODO. I hope I will write it soon; for
now you can look at the macro library of JohnFound's Fresh, file
INCLUDE\MACRO\globals.inc.
I know fixes are confusing, and you have to learn the inner workings of the
preprocessor, but they give you much coding power. Privalov wanted FASM to be
as powerful as possible, even at the price of comprehensibility.
FINAL WORDS
Don't forget to read the FASM documentation. Almost everything from this
tutorial is there, maybe written in a way that is a little harder to learn
from but better as a reference. It is not so long, nor hard to remember - 99%
of FASM users learned it from these docs and from the forum.