CHAPTER 2 - First program
You may be asking why I am fooling around with creating text files when you
want to learn assembly. But text files are just "arrays" of bytes. You
didn't just learn how to create a text file, you learnt how to define a file
containing any data you want. And this is what a runnable program is: a
special "data" file, an array of numeric values called "machine code". You
only have to know the meaning of these values :). Of course it is very hard
to remember all the values and their meanings, and this is what an assembler
is for. It translates programs from a human-readable language to machine
code. So you only have to learn this human-readable language :)
For now we will deal with DOS .COM programs (occasionally called "memory
images"; you will learn why later, once you get into things). These are the
simplest executable (runnable) files under DOS and Windows.
So let's create our first .COM file, which won't do anything.
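A minimal sketch of such a program, assuming FASM syntax and assuming its two
lines are the org 256 directive and the int 20h instruction that this chapter
goes on to discuss:

```asm
org 256         ; directive: goes at the beginning of every .COM source file
int 20h         ; instruction: ends the program
```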
Compile this to a .COM file and run it. Nothing should happen. Now let's look
at what those two lines mean. (This will be funny...)
The first line, org 256, is a directive. I won't explain what this directive
does now. Just put this line at the beginning of every .COM file! It doesn't
define any data, and it doesn't do anything you can notice right now. We will
get to this later.
The second line, int 20h, is an "instruction". An instruction is a command for
the processor, stored in the created file as one or more bytes. When you run a
.COM executable file, the processor walks through it, decodes the instructions
from machine code, and does what these instructions tell it to. The
instruction int 20h says that this is the end of execution of the file. So the
first instruction in this file tells the processor to stop execution, and the
executable file does nothing, as you saw.
(By the way - int 20h is NOT really a processor instruction which ends
execution of a .COM program. It instructs the processor to call a system
procedure. The system procedure is chosen by the number following int, in
this case 20h (it IS a number: 32 written in hexadecimal), which selects the
procedure that ends a .COM program. int can be followed by another number,
and then another system procedure will be called. But for now we can abstract
from this, forget about it, and take int 20h as the instruction to stop the
program.)
So "machine code" is a set of "instructions". Be careful to distinguish
between directives and instructions. A directive is a command for the
compiler, telling it how to define data and what data to define. Instructions
are defined data which encode what the processor will do when you execute the
program. For example, db 0,0 is a directive which defines two zero bytes, but
those bytes are an instruction if they get executed, because two zero bytes
have a special meaning for the processor (don't worry about what that meaning
is). org 256 is a directive, not an instruction, because it doesn't define
any data. You will get the hang of this with practice.
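To make the difference concrete, here is a small sketch that ends the program
without writing any instruction at all. It assumes FASM syntax and relies on
the fact that int 20h is stored in the file as the two bytes CDh and 20h; the
db directive defines those same two bytes, and the processor executes them as
int 20h.

```asm
org 256
db 0CDh,20h     ; directive: defines bytes CDh and 20h, the machine code of int 20h
```

The resulting file should be byte-for-byte identical to one produced from a
source containing org 256 followed by int 20h.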
int 20h is simple, it doesn't need any arguments (= parameters, or values
which change its effect). But what if some instruction DOES need arguments?
For this reason the processor has its own "variables" (a variable is a
general term for a space which stores some value). These variables are called
"registers". The first registers we'll learn are al, ah, bl, bh, cl, ch, dl
and dh, which are byte sized (they contain a value in the range 0 to 255).
(By the way, int 20h depends on a register too (CS, which we haven't met yet
and which DOS has already set up for a .COM program), but, again, we can
abstract from this. And, in fact, the value 20h is an instruction argument
too, but we abstracted from this before. This is what I was talking about
when I wrote that "it will be funny".)
Now how do we set the value of a register? There is an instruction which does
this, for example mov al,10. This instruction sets the value of the al
register to 10. mov stands for "move". The destination of the "moving"
follows mov (separated by spaces), in this case the al register. Then comes
the source of the moving, separated by a comma (,); in this case it is the
number 10. So this instruction "moves" the value 10 into register al.
Another example of moving is mov al,bl. This copies the value in the bl
register to the al register. It won't change the value in the bl register.
The source of mov always stays unchanged.
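Both kinds of mov can be put together in a short sketch. It assumes FASM
syntax; the value 7 is an arbitrary choice for illustration. The program
prints nothing, so the comments only describe what the registers hold after
each instruction.

```asm
org 256
mov bl,7        ; bl = 7 (arbitrary example value)
mov al,10       ; al = 10
mov al,bl       ; al = 7 now, bl still holds 7
int 20h         ; end the program
```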
NOTE: You will often (always?) see people talking about the "mov
instruction". But mov is not an instruction, and int is not an instruction
either. mov al,bl or int 20h is an instruction, for example. mov or int is
called an "instruction mnemonic". But accept this: everyone calls it an
"instruction", and you probably will too after some time (and I probably
will too, sorry :). The arguments of an instruction (the part of the
instruction without the mnemonic, like al,10 in mov al,10) are called
"instruction operands" (or "instruction arguments"). The term "instruction
mnemonic" is not so important for now.
Now let's move on to using registers. We will use the int 21h instruction,
which can do MANY things depending on the value in the ah register. We won't
learn the meaning of all values; for now we will talk only about value 2. If
value 2 is in the ah register when the int 21h instruction is executed, then
the character in dl (more exactly: the character whose ASCII code is in dl)
is written to the screen (console).
NOTE: if you are using Windows and start the program from a file manager
(like Total Commander), you will see a window appear for a very short time
and then disappear. Our character is displayed in this window, but you
probably won't be able to notice it. You must open a shell (cmd on XP,
command on older windozes) and run your program from it. Anyway, if you can't
handle this, forget about assembly for a while and learn to use your
operating system. And then, don't forget to return to assembly!
Okay, so let's look at a program which writes the character "a":
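A minimal sketch of this program, assuming FASM syntax and built only from
the instructions this chapter describes (ah = 2 selects character output for
int 21h, dl holds the character):

```asm
org 256
mov ah,2        ; ah = 2: int 21h will write a character
mov dl,'a'      ; dl = ASCII code of the character "a"
int 21h         ; write the character in dl to the screen
int 20h         ; end the program
```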
mov ah,2 sets the value of the ah register to 2; this should be clear.
mov dl,'a' moves the "a" character into the dl register. (In fact, there is
no such thing as "character a" in assembly. You may have noticed I wrote that
registers contain numeric values. Nothing about characters. The way this
works is that the compiler translates a character enclosed in apostrophes
into its numeric (ASCII) code, which int 21h recognizes as the code for this
character. In assembly, character "a" means the ASCII code for character
"a".)
int 21h, in this case, when ah contains value 2, writes the character in dl.
int 20h is there because we can't forget to stop execution. Otherwise the
program will most probably crash.
NOTE: in assembly, a character enclosed in apostrophes is the same as the
ASCII code for this character.
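As a small illustration of this note, the sketch below writes "a" without
using a character literal at all. It assumes FASM syntax and uses the fact
that the ASCII code of "a" is 97 (61h in hexadecimal).

```asm
org 256
mov ah,2        ; ah = 2: int 21h will write a character
mov dl,97       ; 97 is the ASCII code of "a", same as writing mov dl,'a'
int 21h         ; writes "a"
int 20h         ; end the program
```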
So writing multiple characters ("ab") is:
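A minimal sketch of the "ab" program, assuming FASM syntax and the same
int 21h character output used for the single character:

```asm
org 256
mov ah,2        ; ah = 2: int 21h will write a character
mov dl,'a'      ; first character
int 21h         ; writes "a"
mov dl,'b'      ; second character, ah is left untouched
int 21h         ; writes "b"
int 20h         ; end the program
```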
We don't have to set ah to 2 again for the second int 21h, because value 2
will remain in ah from the previous setting. The value in dl will remain too,
so a program that sets dl to 'a' once and then executes int 21h three times
will write "aaa".
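The "aaa" case described above can be sketched as follows, again assuming
FASM syntax. Neither ah nor dl is set again between the calls, because both
keep their values.

```asm
org 256
mov ah,2        ; ah = 2: int 21h will write a character
mov dl,'a'      ; dl is set only once
int 21h         ; writes "a"
int 21h         ; writes "a" again, ah and dl are unchanged
int 21h         ; writes "a" a third time
int 20h         ; end the program
```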