A CRASH COURSE IN PROTECTED MODE


                                                      By Adam Seychell


     After my release of DOS32 V1.2 a lot of people were asking for basic
help in protected mode programming. If you already know what a selector is
then there is probably no need for you to read this file.
 
     Ok you know all about the 8086 ( or a 386 in real mode ) architecture
and what to know about this fantastic protected mode stuff. I'll start off
saying that I think real mode on the 386 is like driving a car that is stuck
in first gear. There is the potential of a lot of power but it is not being
used. It really degrades the 386 processor and was not designed to normally
operate in this mode. Even the Intel data book states  "Real mode is required
primarily to set up the processor for Protected Mode operation".




                      SEGMENTATION OF THE INTEL 80x86 

     A segment is a block of memory that starts at a fixed base address and
has a set length.  As you should already know that *every* memory reference
by the CPU requires both a SEGMENT value and a OFFSET value to be specified.
The OFFSET value is the location relative to the base address of the segment.
The SEGMENT value contains the information for the segment. I am going to
explain very basically how this SEGMENT value is interpreted by the 80386 to
give the parameters of segments. 


   In protected mode this SEGMENT value is interpreted completely different
than in real mode and/or Virtual 86 mode. The SEGMENT values are now called
"selectors". You'll see why when finished reading this file. So whenever you
load a segment register you are loading it with a selector.



The Selector is word length and contains three different fields.

Bits  0..1       Request Privilege level        ( just set this to zero )

Bit   2          Table Indicator        0 = Global Descriptor Table
                                        1 = Local Descriptor Table

Bits 3..15      The INDEX field value of the desired descriptor in the GDT   
                This index value points to a descriptor in the table. 









                     The  GLOBAL DESCRIPTOR TABLE  (GDT)

    The Global Descriptor Table ( GDT ) is a table of DESCRIPTORS and it is
stored in memory. The address of this table is given in a special 386
register called the global descriptor table register. There always must be a
GDT when in protected mode because it is in this table where all of the
segments are defined.
    Each DESCRIPTOR ( stored in the GDT ) contains the complete information
about a segment. It is a description of the segment.  Each Descriptor is 64
bits long and contains many different fields. I'll explain the fields later. 
The INDEX field  ( stored in bits 3..15 of any segment register ) selects a
descriptor to use for the type of segment wanted. So the only segments the
programmer can use are the available descriptors in the GDT.


Example:
    Suppose you what to access location 012345h in your data segment and
you were told that the descriptor for your data segment is descriptor
number 6 in the Global Descriptor Table.  Assume that the Global Descriptor
Table has already been set up and built for you  ( example, as in DOS32).

Solution:
    We need to load a segment register (SS,DS,FS,GS,ES)  with a value so
that it will select (or index ) descriptor number 6 of the GDT. Then
reference the address with a instruction that will use this loaded
segment register.

One of the segment registers (FS,DS,GS,SS,CS or ES) must be loaded with the
following three fields,

        Request Privilege level ( Bits 0..1 ) = 0  (always)
        Table Indicator ( bit 2 )  = 0  
        Index  ( bits 3..15 ) = 6

             
     mov  ax,0000000110000b      ;load DS with the selector value      
     mov  ds,ax          
     mov  byte ptr DS:[ 012345h ],0     ; Using the DS segment register



    The 386 has hardware for a complete multitasking system. There are 
several different types of descriptors available in the GDT for managing
multitasking. You don't need to know about all the different descriptors just
to program in protected mode. Just the info above is enough.  All you need to
know to program in protected mode is what descriptors are available to you
and what are the selector values to these descriptors.  The base address of
the segment may also be known. See the file DOS32.DOC for obtaining the
selector values.

    There are two groups of descriptors
1) CODE/DATA descriptors which are used for any code and data segments.

2) SYSTEM descriptors are used for the multitasking system of the 386. These
type of descriptors will never need to be used for programming applications.



          Format of a code and data descriptor

BITS                 description if the field
----------------------------------------------------------------
0..15                SEGMENT LIMIT 0...15
16..39               SEGMENT BASE 0..23
40                  (A)   accessed bit                           
41..43              (TYPE) 0 = SEE BELOW
44                  (0)  0 = code/data descriptor 1 = system descriptor
45..46              (DPL) Descriptor Privilege level
47                  (P)  Segment Present bit
48..50              SEGMENT LIMIT 16..19
51..52              (AVL)   2 bits available for the OS
53                  zero for future processors
54                  Default Operation size used by code descriptors only
55                  Granularly:  1 = segment limit = limit field *1000h
                                0 = segment limit = limit field
56..63              SEGMENT BASE 24..31      

format of TYPE field
          bit  2    Executable (E)     0 = Descriptor is data type    
               1    Expansion Direction (ED) 0 = Expand up
                                             1 = Expand up
               0    Writeable (W)       W = 0 data segment is read only
                                        W = 1 data segment is R/W     


          bit  2    Executable (E)  1 = Descriptor is code type  
               1    Conforming (C)   ( I don't understand )
               0    Readable (R)        R = 0 Code segment execute only
                                        R = 1 Code segment is Readable


   I'd better stop here, I am confusing myself. As you can see there is
more to a segment that just it's base address and limit.
    The three descriptors that are available in DOS32  all have limits of
0ffffffffh (4GB). This means that the offsets can be any value. 
For example, the instruction    XOR EAX,ES:[0FFFFFFFFh]  is allowed.
   If you happen to load an invalid selector value into one of the segment
registers then the 386 will report an General Protection exception
( interrupt 13 ).  In protected mode this exception is also used for many
other illegal operations.
     


                   The LINEAR ADDRESS and PHYSICAL ADDRESS

     All the address translations described above is done by the 386
segmentation unit. The segmentation unit looks up the descriptor tables,
segment selectors, offsets and then outputs a 32bit linear address. This
linear address is calculated by the segmentation unit in the following
manner.

Linear address = base address of segment + offset. 
The segment base address is found in the Base address field ( bits 16..39 &
56..63 ) of a descriptor which is located in the Global Descriptor Table. 
The index field ( bits 3..15) of the segment register selects the descriptor
to use. 


An example of the linear address of the instruction.

          MOV       EAX, ES:[EDX*8+012345678h]

where EDX = 100h and ES equals a selector wich points to a descriptor with
the base field equal to 02000000h.

The linear address = 2000000h + ( 100h*8 + 012345678h ) = 014345E78h


     Just to make things even more complicated the 386 has an second memory
managing unit called the Paging Unit. The linear address calculated above may
still not be the physical RAM location the 386 is addressing. This linear
address has yet to go through another stage, the paging unit. If you are
having trouble with what I've said so far then you may want to take a coffee
break before continuing because this is even worse.

     The 32bit linear address is directed to the paging unit. The paging unit
divides the linear address into three sections.

bits 0..12         Offset in a page
bits 13..23        Points to the page entry in the page table
bits 24..31        Points to the directory entry in the page directory table

The page table contains 1024 double word entires. In each of these entries
is the physical address of a 4KB page. The page directory also contains 1024
double word entries. Each of these directory entries hold a physical address
of a page table. See intel Documentation for the exact format of the page
table entries and directory table entries.  The physical base address of the
DIRECTORY TABLE is in control register 3 ( CR3 ) 

  This means if every entry in the directory table is used then there would
be 1024 page tables available. Because each page table hold 1024 page
addresses then there would be a total of 1024*1024 4KB pages.
The addresses in the page tables are all physical addresses. i.e the output
of the CPU address bus pins. The paging unit can be enabled or disabled
depending on wether bit 31 of CR0 is set. If the paging in disabled then the
linear address simply equals the physical address. If paging is enabled the
the linear address is translated by the paging unit to form a physical
address.  Please note that it is possible to set up the page tables and
directory such that the linear address equals the physical address. This is
what DOS32 does for the first 1MB of linear address space.
If you can not understand my hopeless attempt of describing the paging unit
then the diagram below might help.













            The 32 bit linear address from the segmentation unit
  -------------------------------------------------------------------------
  |   BITS  0..12             |    BITS  13..23       |    BITS 24..31     |
  |                           |                       |                    |
  -------------------------------------------------------------------------
               |                                  |                   |
               |------------------------          |                   |
                                        |         |                   |
                                        |         |                   |
                                        |         |                   |
        A Page in memory (4KB)          |         |                   |
     --------------------------         |         |                   |    
     |                        |         |         |                   |
     |                        |         |         |                   |
     |                        |         |         |                   |
     |                        |         |         |                   |
     |                        |         |         |                   |
     |                        |         |         |                   |
     |                        |         |         |                   |
     |                        | <-------          |                   |
     |                        |                   |                   |
     |                        |                   |                   |
     |                        |                   |                   |
|---> -------------------------                   |                   |
|                                                 |                   |
|                                                 |                   |
|                                                 |                   |
|            PAGE  TABLE          Offset          |                   |
|    --------------------------                   |                   |
|    |                        |    4092           |                   |
|    --------------------------                   |                   |
|    .                        .                   |                   |
|    .                        .                   |                   |
|    .                        .                   |                   |
|    --------------------------                   |                   |
|    |                        |     12            |                   |
|    --------------------------                   |                   |
|---<|   address of page      |     8   <---------                    |
     --------------------------                                       |    
     |                        |     4                                 |
     --------------------------                                       |
     |                        |     0                                 |
|--> --------------------------                                       |
|                                                                     |
|                                                                     |
|                                                                     |
|                                                                     |
|                                                                     |
|         DIRECTORY TABLE          Offset                             |
|    --------------------------                                       |
|    |                        |    4092                               |
|    --------------------------                                       |
|    .                        .                                       |
|    .                        .                                       |
|    .                        .                                       |
|    --------------------------                                       |
|---<| address of page table  |     12  <-----------------------------
     --------------------------                                       
     |                        |     8   
     --------------------------                   
     |                        |     4             
     --------------------------                   
     |                        |     0             
|--> --------------------------                   
|
|
|                           
|         ----------------------------------------------
--------<-|               CR3                          |
          ----------------------------------------------







   This was only meant to be a rough introduction to the protected mode
segmentation mechanism of the 80386+. I hope I did not make this sound
too complicated so that you have been put off with the whole idea of
protected mode programming. If you want to know more then I suggest you buy a
book on the 80386. The "Intel Programmers Reference guide" is the most
detailed book around. The info here is only meant to give you an idea of how
protected mode works.
   Please note that DOS32 does *ALL* of the setting up needed for protected
mode.  Don't worry if you couldn't understand half the stuff I was talking
about. You don't have to know about any stupid things like the descriptor
format, selector index fields, privilege levels ,paging, tables, ect , ect. 
What you do need to know are the selector values for all the descriptors that
are available to your program. Then the segment registers can simply be
loaded with these known selector values. DOS32 uses only 3 descriptors ( or
segments ) as described in DOS32.DOC.

