CHAPTER 3 - Labels & Addresses & Variables
Okay, let's get to variables. In the previous chapter I wrote that "variable" is a
general term for a space which stores some value. Registers are variables, for
example. But there is a limited number of registers (VERY limited, some 8 plus a few
special ones), and this is nearly always not enough. For this reason memory (RAM,
random access memory) is used.
NOTE: when someone says "variable" he almost always means a variable in memory.
The problem is that you have to know WHERE in memory a value is stored.
A position in memory (called an "address") is given by a number. But it is quite hard
to remember this number (address) for every variable.
Another problem with addresses is that when you change your program, an address
can change too, and then you would have to correct this number everywhere
it is used. For this reason addresses are represented by "labels".
A label is just a word (not a string, it is not enclosed in apostrophes)
which, in your program, represents an address in memory. When you compile your
program, the compiler replaces each label with the proper address. A label consists of
letters ("a" to "z", "A" to "Z"), digits ("0" to "9"), underscores
("_") and dots ("."). But the first character of a label can't be a digit or a dot.
A label also can't have the same name as a directive or an instruction
mnemonic. Labels are case sensitive in FASM ('a' is NOT the same as 'A').
(Reminder: an address is a number which gives a position in memory.)
Examples of labels:
a    |is a label|
A    |is a label, different from "a"|
mov  |is not a label, because "mov" is an instruction mnemonic|
A name is also not a label if it starts with a dot (labels starting with a
dot have a special meaning in FASM, which you will learn later), if it starts
with a digit, or if it contains a space.
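As a rough illustration of these naming rules, here is a small sketch in FASM syntax. All names in it are made up for this example. The invalid names appear only in comments, so the fragment itself should still assemble.

```asm
; Valid label names (hypothetical examples), defined with the "label" directive
label a                 ; a label
label A                 ; a different label, because labels are case sensitive
label _name             ; an underscore may come first
label name_2.old        ; letters, digits, underscores and dots after the first character

; Names that would NOT work as ordinary labels (kept in comments on purpose):
;   2nd_name            ; starts with a digit
;   my name             ; contains a space
;   mov                 ; same as an instruction mnemonic
;   .name               ; a leading dot has a special meaning in FASM
```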
You can define a label using the directive "label". This directive should be
followed by the label itself (the label name). For example:
label name   |is a label definition, it defines label "name"|
label _name  |is a label definition, it defines label "_name"|
label label  |is not a label definition, because "label" can't be the name of a
              label, as described in the previous paragraph|
A label definition (the directive "label" followed by a label name) defines a
label that represents the address of the data defined behind it. A label is a
placeholder for some address, that is, a placeholder for some number, because an
address is a number. In FASM you can use a label the same way as any other
number (not really, but it doesn't matter too much for you now).
A shorter way to define a label is just writing the label name followed by a colon,
but we won't use this way for some time.
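To compare the two ways of defining a label side by side, here is a minimal sketch. The names come from the examples above, and the data values are arbitrary assumptions.

```asm
label name              ; long form: directive "label" followed by the label name
db 1                    ; "name" represents the address of this byte
_name:                  ; short form: label name followed by a colon
db 2                    ; "_name" represents the address of this byte
```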
3.2. Variable definition
Now we can return to the problem with variables: how to define a variable in
memory. The program you create (the compiled program, in machine code) is loaded into
memory at execution time, where the processor executes it instruction by
instruction. Look at this program:
mov al,10
db 'this is a string'
This program will probably crash, because after the processor executes
mov al,10 it reaches the string. But in a program there is no difference between
a string and instructions in machine code. Both are translated into an array of numeric
values (bytes). There is no way the processor can tell whether a numeric value is
the translation of a string or the translation of an instruction. In this example,
the processor will execute the instructions whose numeric representation (in machine
code) is the same as the ASCII representation of the string "this is a string".
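To see that code and data really end up as the same kind of bytes, consider this sketch. It assumes 16-bit code as in a DOS .COM program, and it is meant only to be assembled and inspected, not run.

```asm
mov al,10               ; assembles to the two bytes B0h 0Ah
db 0B0h,0Ah             ; the very same two bytes, this time written as data
db 'this'               ; four bytes: 74h 68h 69h 73h (ASCII codes)
db 74h,68h,69h,73h      ; the same four bytes written as numbers
```

The first two lines produce identical output, and so do the last two. Once the file is assembled, nothing in it records which bytes came from an instruction and which came from a db directive.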
Now look at this:
mov al,10
int 20h
db 'this is a string'
This program will not crash, because before reaching the bytes defined by the string
the processor reaches the instruction int 20h, which ends execution of the program. So
the bytes defined by the string will not be executed, they will just take up some space.
This is the way you can define a variable: define some data at a place where
the processor won't try to execute it (behind int 20h in this case).
So code with a byte-sized variable of value 105 could look like this:
mov al,10
int 20h
db 105
The last line defines a byte variable containing 105.
Now how to access the variable? First we must know the address of the variable. For this
we can use a label (described above, reread it if you have forgotten):
label my_first_variable
db 105
So we already know the address of the variable. It is represented by the label
my_first_variable. Now how to access it? You may think it is, for example,
mov al,my_first_variable
but no! Remember I told you that a label (my_first_variable in this
case) stands for the address of the variable. So this instruction would move the address of
the variable to the al register, not the variable's contents. To access the
contents of a variable (or the contents of any memory location) you must enclose its
address in brackets ([ and ]). So to access the contents
of our variable and copy its value to al we use
mov al,[my_first_variable]
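Put together, a complete program reading the variable might look like the sketch below. It assumes a DOS .COM file assembled with FASM. The org 256 line follows the convention used for .COM programs in this tutorial.

```asm
org 256
mov al,[my_first_variable]   ; al = contents of the variable (105)
int 20h                      ; end the program before the data is reached
label my_first_variable      ; the label stands for the address of the data behind it
db 105                       ; the byte variable itself
```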
Now we will define two variables:
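The definitions could look like the sketch below. The name variable2 and both values are assumptions made for illustration. Only variable1 is named in the text that follows. In a real program these lines would sit behind int 20h, where the processor won't execute them.

```asm
label variable1
db 100                  ; first byte variable (value chosen arbitrarily)
label variable2
db 200                  ; second byte variable (value chosen arbitrarily)
```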
So to copy the value of variable1 to al we use
mov al,[variable1]
To set the value of variable1 (exactly: to set the value of the variable which is stored
at the address represented by variable1) to 10 we could try
mov [variable1],10
but this will cause an error (try it if you want). The problem is that you know that
you are changing the variable at address variable1. But what is the size of the variable?
In the previous two cases the byte size could be determined because you used the
al register, which is byte-sized, so the compiler decided that the variable at
variable1 is byte-sized too, because you can't move between operands of different
sizes. But in this case the value 10 can be of any size, so the compiler can't decide
the size of the memory variable. To solve this we use "size operators". For now we
will talk about two size operators: byte and word. You can put a size operator before an
instruction operand to let the compiler know what size the variable has:
mov byte [variable1],10
Another way to do this is
mov [variable1], byte 10
In this case the compiler knows that the moved value 10 is byte-sized, so it decides
that the variable is byte-sized too (because a byte-sized value can only be moved to a
byte-sized variable).
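Here is a small sketch that puts the size operators into one program. The variable names and initial values are assumptions, and dw is used here to reserve a word.

```asm
org 256
; mov [variable1],10        ; would be rejected: the size of the operand is unknown
mov byte [variable1],10     ; stores one byte with value 10
mov [variable1],byte 10     ; same effect, size given on the value instead
mov word [variable2],10     ; stores a word (two bytes: 0Ah, 00h)
int 20h
label variable1
db 0                        ; room for one byte
label variable2
dw 0                        ; room for one word
```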
But it would be hard to always remember and always write the size of a variable when
you access it. For this reason you can assign the size of the variable to the label when
you define it. Just write the size operator behind the label name in the definition:
label variable1 byte
or
label variable1 word
Now every time you use [variable1] it will have the same meaning as
byte [variable1] (or word [variable1] in the second case). So
mov [variable1],10 will work: in the first case it will
store the value 10 to the byte at address variable1, in the second case it
will store it to the word.
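A minimal sketch of labels that carry their own size could look like this. The name variable2 is an assumption, added so that both cases fit into one program.

```asm
org 256
mov [variable1],10      ; byte store, the size comes from the label
mov [variable2],10      ; word store, the size comes from the label
int 20h
label variable1 byte
db 0
label variable2 word
dw 0
```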
NOTE: You can't move a value between operands of different sizes. For example, this
won't compile, because a word-sized value can't be stored to a byte:
label variable1 word
mov byte [variable1], word 10
NOTE: You can't access two memory locations in one instruction (except for
some special instructions). This is wrong, it won't be compiled:
mov [variable1],[variable2]
This will cause you some problems in the beginning, but it will force you
to write faster code, and that is the biggest reason to code in assembly.
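The usual way around the two-memory-operands limit is to go through a register. Here is a sketch, assuming two byte variables with made-up names and values:

```asm
org 256
; mov [variable2],[variable1]  ; not allowed: two memory operands in one instruction
mov al,[variable1]      ; first load the source byte into a register
mov [variable2],al      ; then store the register into the destination
int 20h
label variable1 byte
db 100
label variable2 byte
db 0
```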
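NOTE: A size operator assigned to a label at definition has lower priority than
a size operator written before the access to the variable in an instruction. So with
label variable word
the instruction
mov byte [variable],10
will access a BYTE, while an access without a size operator, such as
mov [variable],10, will access a WORD.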
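Both cases side by side, as a sketch (the program frame around the two instructions is an assumption):

```asm
org 256
mov [variable],10       ; uses the label's size: writes a word (0Ah, 00h)
mov byte [variable],10  ; the explicit operator wins: writes only one byte
int 20h
label variable word
dw 0
```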
I think you noticed that having two lines to define one variable is a little too
much. There is a shorter way to define a variable:
variable1 db 100
is the same as
label variable1 byte
db 100
Notice that the size of the variable is defined too. In general, if a data definition
(the db or dw directive) is preceded by a label, then
it will define this label too, and assign the size of the defined data as the size of
the label.
It can be used with words too:
variable2 dw 100
Some examples of using variables, here with a variable named character_to_write:
character_to_write db 'a'
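A sketch of how such a variable might be used in a whole program follows. It assumes a DOS .COM file and uses DOS function 2 of int 21h (write the character in dl), which is an assumption on my side and not something taken from this chapter.

```asm
org 256
mov ah,2                        ; DOS function 2: write the character held in dl
mov dl,[character_to_write]     ; load the contents of the variable
int 21h                         ; call DOS
int 20h                         ; end the program
character_to_write db 'a'       ; byte variable, defined behind int 20h
```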
3.3. Addresses and basics of segmentation
Now we will discuss addresses a little more. I have told you that an address is a number
(!) which gives some position in memory. You have learnt how to represent this
number with labels, so numeric addresses were maintained by the compiler. But you
still don't know anything about the format of this number. I will try to explain
it a little in this chapter.
As you probably know, data in memory is stored in "bits", which can have the value
0 or 1. You can consider memory as a (one-dimensional) array of bits. 8
consecutive bits make one byte. An address is the number (index, position in the array)
of a byte. For example, address "0" is the address of the first bit of memory (or the
address of the first byte), address "1" is the address of the ninth bit (or the address
of the second byte) of memory, etc. It is easiest to think of memory as a
(one-dimensional) array of bytes.
An address in .COM files is a word-sized number, so moving an address into a
byte-sized register, for example mov al,var1, is wrong. It may work if
var1 is less than 256, so that it fits into a
byte-sized register, but in general, store addresses in word-sized variables;
we will talk about them a little later.
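As a sketch of the difference, assuming a .COM program where var1 ends up at an address above 255:

```asm
org 256
; mov al,var1           ; rejected here: the address of var1 does not fit into a byte
mov ax,var1             ; fine: ax is word-sized, just like the address
int 20h
var1 db 0
```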
Now some examples of addresses. Check this file:
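Such a file might look like the sketch below. The values are arbitrary assumptions, and the name variable2 is assumed by analogy with the other two.

```asm
; no org directive here, so addresses are counted from 0
variable1 db 10         ; at address 0
variable2 db 20         ; at address 1
variable3 db 30         ; at address 2
```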
Here the address represented by variable1 is 0, variable2
stands for 1, and variable3 is 2.
OK, this looks nice but it is not true at all. The problem is that there are
usually more programs loaded in memory at the same time (operating system, mouse
driver, your program etc.). Done this way, a program would have to know
WHERE in memory it will be loaded so that it can access its variables. For this
reason addresses are "relative". It means that for every program that is
loaded, some region of memory called a "segment" is reserved. All addresses in
memory accessed by this program are then relative to the beginning of this area.
So address 0 doesn't mean the first byte of memory, but the first byte of the segment.
How does this work? The processor has a few special registers (segment registers) which
hold the address of a segment (the address of the first byte of the segment; a segment
is a consecutive region of memory reserved for one program). Every time you
access memory in your program, the contents of a segment register are added
to the address given by you, so
mov al,[0] accesses the first byte of your segment.
NOTE: I have told you that memory addresses in .COM programs are words. That
means they can be in the range 0 to 65535. So the maximal size of one segment is
65536 bytes. This can be "tricked" by changing the contents of segment registers,
but don't care about this now.
NOTE: A segment is a region in memory. But the term "segment" is often used for the
address of the beginning of this region. Sad but true.
So an absolute address in memory has two parts: the segment (exactly: the address of the
beginning of the segment) and the second part, a word-sized value called the "offset",
which is the address relative to the segment (to the address of the beginning of the
segment). In other words, an offset is an address relative to a segment, or an address
"inside" a segment (the first definition is more exact, but the second is easier to
comprehend).
NOTE: (important) I said labels represent the address of a variable. In fact,
labels in FASM represent the offset of a variable. That is why it is called "flat"
(you will comprehend this later (much much later :))
I won't get deeper into segment registers or how the address of the beginning of a
segment is stored in them (there IS a difference). Take segment registers as some
kind of black box for now: it works and we can ignore it now.
3.4. 'org' directive explained
When your program is loaded, it often needs some external info from the program that
ran it. The best example is command line arguments, or it may need to know WHO
ran it etc. This info must be, of course, stored in the same segment as the
program. For .COM files this data (passed to your program by the program that
ran it) is stored in the first 256 bytes of the segment. So your program is loaded
from offset 256.
NOTE: the 256-byte structure at the beginning of the segment of a .COM program is
called the "PSP", which stands for "program segment prefix".
Now imagine this .COM program:
mov al,[variable1]
int 20h
variable1 db 0
(notice: no org 256 directive). The instruction mov al,[variable1] takes 3 bytes,
int 20h takes 2 bytes, so variable1 will stand for offset 5. So the instruction
becomes mov al,[5]. This instruction accesses the
6th byte of the segment (the first byte is at offset 0). But I already told you that
the first 256 bytes of the segment hold some information, and your program
is loaded behind them, from offset 256. So you don't want variable1
to be 5, you want it to be 256+5. And this is what org 256
does. It sets the "origin" of file addresses. org 256 tells FASM
to add 256 to the offset held by every label defined behind this directive (before
the next org directive). And this is exactly what we want in .COM files.
So the code above won't access the variable you want, it will access something in
the PSP (the first 256 bytes of the segment). To make it work properly use:
org 256
mov al,[variable1]
int 20h
variable1 db 0
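The offsets can be written out next to each line. This is a sketch of the arithmetic described above, with the byte counts taken from the text:

```asm
org 256                 ; labels below are counted from 256
mov al,[variable1]      ; offsets 256..258 (3 bytes)
int 20h                 ; offsets 259..260 (2 bytes)
variable1 db 0          ; offset 261 = 256+5
```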
org affects labels at the time of definition (for example
label variable byte or variable db 0), not when
they are used (like at mov al,[variable]). That means that if you
change the address "origin" with the org directive after defining some
label, then the label will still hold the same value before and behind that org directive.
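A small sketch of this behaviour follows. The names first and second are hypothetical, and the fragment is meant to be assembled and inspected, not run.

```asm
org 0
first db 0              ; defined while the origin is 0, so first = 0
org 256
second db 0             ; defined behind "org 256", so second = 256
mov ax,first            ; still loads 0, the later org did not change this label
mov bx,second           ; loads 256
```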
I won't tell you about the data contained in the PSP, you don't have to care about