CHAPTER 4 - Endian encodings & word registers
4.1. Endian encodings
You should already have a fairly exact idea about byte variables. You already
know that they are 8 bits large (not so important now) and that they can
contain a numeric value from 0 to 255. About word variables you know that they
are 16 bits long and that they contain a value from 0 to 65535.
Whether you noticed it or not, a word is the same size as two bytes. Now let's
think about how to store a value in two bytes. Each byte can contain a value
from 0 to 255. Combining them gives 256*256, that is 65536, possible values.
But how is such a value stored in these bytes? Let's say one of the bytes
(byte #1) contains 0. Then the other byte (byte #2) can hold a value from 0 to
255, so we can store values 0 to 255 in our word. When byte #1 contains 1, we
can store another 256 numbers, 256 to 511. When byte #1 contains 2, we can
store another 256 numbers, 512 to 767, etc. So in total it is 256*256, as I
said, 65536 values. It is like in decimal numbers: every digit has a value
from 0 to 9, and the "true" value of a digit depends on its position. The last
digit holds 0 to 9, the digit before it holds 10*(0 to 9), the one before that
100*(0 to 9), etc. It is the same in words: one of the bytes holds a value 0
to 255, the other holds 256*(0 to 255). The one which holds 0..255 is called
the "low order byte", the other (which holds 256*(0..255)) is called the
"high order byte".
terms: low order byte, high order byte
Examples (word value = high order byte : low order byte)
0 = 0 : 0
1 = 0 : 1
255 = 0 : 255
256 = 1 : 0
257 = 1 : 1
511 = 1 : 255
512 = 2 : 0
513 = 2 : 1 (513 / 256 = 2, 513 mod 256 = 1)
65535 = 255 : 255 (65535 / 256 = 255, 65535 mod 256 = 255)
One last problem remains: the order of these bytes (i.e. which comes first,
the low order byte or the high order byte?). This differs between computers.
On IBM PCs (and compatibles) the low order byte is first and the high order
byte follows. For a word variable called variable,
byte [variable] is the low order byte and
byte [variable + 1] is the high order byte. (The addition of 1 to the offset
of variable is done by the compiler: the value of variable is constant, so
variable + 1 is constant as well. It means the next byte after the offset of
variable; I think this is clear enough not to need any more explanation.)
NOTE: When the low order byte is first, it is called "little endian encoding";
when the high order byte is first, it is called "big endian encoding". These
terms are not important right now, especially for a beginner asm coder.
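The byte order described above can be checked with a small program. This is
only a sketch: it assumes FASM syntax and a DOS .COM program (org 256, ending
with int 20h), and the label name value is made up for this example. It
prints the two bytes of a word variable in the order they sit in memory.

```asm
; Sketch only: assumes FASM and a DOS .COM program.
; "value" is a made-up label used just for this demonstration.
org 256

mov ah,2                 ; int 21h function 2: write character in dl
mov dl,byte [value]      ; first byte in memory: low order byte, 41h = 'A'
int 21h
mov ah,2                 ; set ah again to be safe
mov dl,byte [value + 1]  ; next byte: high order byte, 42h = 'B'
int 21h
int 20h                  ; end program

value dw 4241h           ; high order byte 42h, low order byte 41h
```

On a little endian machine such as the PC this should write "AB", because
the low order byte (41h) is stored first.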
4.2. Word registers
Besides byte registers (like al, ah,
dl...), the processor of course has some word registers too. You know that a
word is a combination of two bytes, and it is the same for registers: word
registers are combinations of byte registers.
The first word registers we'll learn are ax, bx, cx and dx.
ax is a combination of ah and al:
al is the low order byte,
ah is the high order byte. The same goes for bx = bh:bl, cx = ch:cl,
dx = dh:dl.
If you wanted to "emulate" a (nonexistent) register
ex in memory, it would look like this:
label ex word
el db 0
eh db 0
el is the low order byte, so it comes first.
terms: word register
NOTE: The letters a, b, c, d stand for "accumulator", "base", "counter" and
"data"; they have nothing to do with alphabetical order. The real order of
these registers is ax, cx, dx, bx, but this is not important until you want to
generate or change machine code yourself.
word registers: ax, bx, cx, dx
Now, if you want to set the value in register
ax to 52 you use
mov ax,52
but you could also use
mov ah,0
mov al,52
Likewise, to set
dx to 12345 you use
mov dx,12345
but it could be (there is no reason to do it this way in real coding, this is
just to demonstrate the word to byte:byte relation)
mov dh,48
mov dl,57
because 48 is equal to 12345 / 256 and 57 is 12345 modulo 256 (modulo is the
remainder after division).
NOTE: You know that an instruction operand can be a number (numeric constant),
like "0", "256", "12345" etc. But every assembler I know allows you to put an
expression as an operand. During compilation the expression is evaluated and
"replaced" by its result. So
mov dx,(1 + 5) is the same as
mov dx,6. Or better, the code above can be written as
mov dh,12345 / 256
mov dl,12345 mod 256
(/ is the operator for division,
mod is the operator which
returns the remainder after division (modulo). You don't have to know these
operators now, but you should already know something about expressions.)
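To actually see the word to byte:byte relation on screen, here is a minimal
sketch. It assumes FASM and a DOS .COM program (org 256, int 20h to exit); the
value 4869h is chosen only because its two bytes are printable characters. It
fills bh and bl with compile-time expressions and then prints both halves.

```asm
; Sketch only: assumes FASM and a DOS .COM program.
org 256

mov bh,4869h / 256     ; high order byte: 48h = 'H'
mov bl,4869h mod 256   ; low order byte:  69h = 'i'
                       ; bx now holds 4869h, same as after mov bx,4869h

mov ah,2               ; int 21h function 2: write character in dl
mov dl,bh
int 21h                ; writes 'H'
mov ah,2
mov dl,bl
int 21h                ; writes 'i'
int 20h                ; end program
```

This should write "Hi". Replacing the first two instructions with a single
mov bx,4869h should give the same result.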
The processor also has other word registers: sp, bp, si and
di. But you can't directly access the byte parts of
these registers, you must access the whole word. This is a limitation of the
processor, and there is nothing you can do about it. For example, if you want
to set the high order byte of
si to 17 you must (?) do it like this:
mov ax,si
mov ah,17
mov si,ax
So first you copy the value of si to
ax. The high order byte of
ax can be directly accessed (it is the
ah register), so you
set it. The low order byte remains. Then you copy the value back from ax to
si. The high order byte is now changed to 17 and the low order byte remains
unchanged.
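The copy-through-ax technique described above can be tried out with visible
output. This is a sketch under assumptions: FASM, a DOS .COM program (org 256,
int 20h to exit), and the byte values 41h and 42h used instead of 17 only
because they are printable. Note that it overwrites ax along the way.

```asm
; Sketch only: assumes FASM and a DOS .COM program.
org 256

mov si,0041h    ; si: high order byte 0, low order byte 41h = 'A'

mov ax,si       ; copy whole word to ax
mov ah,42h      ; change only the high order byte (42h = 'B')
mov si,ax       ; copy back: si is now 4241h

mov bx,si       ; copy to bx so both bytes can be read as bh and bl
mov ah,2        ; int 21h function 2: write character in dl
mov dl,bh
int 21h         ; writes 'B' (the new high order byte)
mov ah,2
mov dl,bl
int 21h         ; writes 'A' (the untouched low order byte)
int 20h         ; end program
```

It should write "BA", showing that only the high order byte of si changed.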
sp always has a special function,
bp usually has a special function (in code generated by most (all?)
compilers), and si and
di can be used whenever you want.
This means you shouldn't change sp and
bp unless you
know what you are doing.
4.3. String output using int 21h/ah=9
This should be part of chapter 3 about addresses, but you need to know the
dx register, which is explained here.
Here we will talk about another usage of
int 21h. You should already know that when
ah contains 2,
int 21h writes the character in
dl. But if we want to display some longer text,
we must set
dl for every character, and this is a bad method. Wouldn't it
be better if we just stored the string we want to display somewhere in the
file (like we did in chapter 1) and then displayed it from there?
For this we can use
int 21h with value 9 in ah and the
address of the string in the
dx register.
But another problem comes up: how to determine the length of the string, i.e.
the number of characters to display from the given address. There are several
methods for this; we will talk about the simplest one, the one used by
int 21h/ah=9. Some special character is simply reserved as the end-of-string
marker. For int 21h/ah=9 it is the character "$". So to store the string
"Hello World", you define "Hello World$", where "$" means end of string.
Example of such a string:
db 'Hello World$'
A program that passes the address of this string to int 21h/ah=9 will display
"Hello World".
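A complete program for this could look roughly as follows. It is a sketch
that assumes FASM and a DOS .COM program (org 256, int 20h to exit); the label
name text is an assumption made for this example.

```asm
; Sketch only: assumes FASM and a DOS .COM program.
; "text" is an assumed label name for the string.
org 256

mov ah,9        ; int 21h function 9: write '$'-terminated string
mov dx,text     ; dx = address (offset) of the string
int 21h
int 20h         ; end program

text db 'Hello World$'
```

The "$" itself is not written; it only tells int 21h/ah=9 where to stop.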
This method of marking the end of a string has a limitation: you can't
display the character "$". For example, displaying the string
db 'It costed 50$, maybe more$'
will of course display only "It costed 50". This case can be solved by
splitting the string into two parts:
db 'It costed 50$'
db ', maybe more$'
The first part (first
int 21h/ah=9) will write "It costed 50", then
int 21h/ah=2 will write "$", and the second int 21h/ah=9
will write ", maybe more". We won't care about this limitation for now;
this was just to improve the explanation.
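The workaround can be sketched as a whole program. It assumes FASM and a DOS
.COM program (org 256, int 20h to exit); the label names part1 and part2 are
made up for this example.

```asm
; Sketch only: assumes FASM and a DOS .COM program.
; "part1" and "part2" are made-up label names.
org 256

mov ah,9        ; write first part, up to its '$' marker
mov dx,part1
int 21h

mov ah,2        ; write the '$' character itself with function 2
mov dl,'$'
int 21h

mov ah,9        ; write second part
mov dx,part2
int 21h

int 20h         ; end program

part1 db 'It costed 50$'
part2 db ', maybe more$'
```

It should write "It costed 50$, maybe more". ah is set again before every
int 21h to make each call independent of the previous one.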
Now more about
int 21h/ah=9. As you may have already realized, it will display
every character (to be exact: every character whose ASCII code is in a byte)
from the address in
dx up to the first character "$" after that address.
NOTE: Some ASCII codes in the range 0 to 31 have a special meaning for int
21h/ah=9. All codes in this range have characters assigned to them (smiling
faces, diamonds etc.), but for the special ones int 21h/ah=9 doesn't display
the character and does something else instead. For example, the character
with ASCII code 7 will cause a short beep. Displaying the string
'Beep',7,'$' should write "Beep" and then beep.
Other common values are 13 and 10. 13 causes the cursor to return to the
first column of the current row. 10 causes the cursor to move one row down (if
the bottom of the screen is reached, the screen is scrolled). So the
combination of the two moves the cursor to the first column of the next row.
These two should (but don't always) work in either order, but you should
always put 13 first. Together these two characters are often called EOL (end
of line). Try displaying this string:
db 'Line 1',13,10,'Line 2$'
It should write:
Line 1
Line 2
NOTE: ASCII code 13 is called CR (carriage return) and code 10 is called LF
(line feed).
Another example on addresses (previous chapter), but with word registers.
Check for yourself whether you understood chapter 3. Suppose the program
contains these definitions:
text db 'Hello World$'
address_of_text dw text
and loads dx with mov dx,[address_of_text] before calling int 21h/ah=9.
Here we load the
dx register with the contents of the
address_of_text variable, which holds the value text,
and as we know,
text is a placeholder for the offset of the 'Hello World$'
string. So the word-sized variable
address_of_text holds the offset
of that string, and thus loading
dx with the contents of
address_of_text loads it with the offset of the string we want to
write. I hope you got it.
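Put together as a whole program, the idea might look like the sketch below.
It assumes FASM and a DOS .COM program (org 256, int 20h to exit). Note the
square brackets: mov dx,[address_of_text] loads the word stored in the
variable, while mov dx,address_of_text would load the offset of the variable
itself.

```asm
; Sketch only: assumes FASM and a DOS .COM program.
org 256

mov ah,9                  ; int 21h function 9: write '$'-terminated string
mov dx,[address_of_text]  ; dx = contents of the variable = offset of text
int 21h
int 20h                   ; end program

text db 'Hello World$'
address_of_text dw text   ; word variable holding the offset of text
```

It should write "Hello World", exactly as if mov dx,text had been used.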