#### What's cool about x86 assembly language?

Allan B. Cruse, Emeritus Professor, Computer Science and Mathematics, University of San Francisco

### A 'background' story

A few years ago the authors of a book about the Linux kernel asked me to review a first draft of their revised edition for 'technical accuracy'.

In Chapter 3 they made a sweeping statement which, to me, seemed unlikely to be true:

"page-faults are not allowed in kernel mode"

I realized that I knew enough about Linux, and about CPU behavior, that I could quickly write some x86 assembly language to test this claim

#### An x86 'introductory overview'

If I explain my idea for the 'test', and show you a few of its details, I can cover the basic topics Professor Pacheco had requested:

- registers and addressing modes
- integer (and other) operations
- instruction encoding
- some x86 historical notes

#### Historical note

- The x86 processor is a 'work-in-progress'
  - It has evolved over more than three decades
  - Its features have responded to 'market forces'
  - It has fiercely maintained 'backward compatibility'
- So today's x86 implements an immense legacy
  - Earliest 16-bit **real-mode** still is x86 'startup' mode
  - It can switch to 16-bit or 32-bit protected-mode
  - It can emulate real-mode within virtual-8086 mode
  - It can now be switched to 64-bit **enhanced mode**

## **Design-evolution influences**

- Faster
- Cheaper
- Smaller
- Stronger
- Safer
- Cooler



[Figure: two x86 processors, one introduced in 1985 and one introduced in 2009]

Yet always keeping 'backward compatibility'

## The original 8086 registers

Each of these registers is 16 bits wide



Descriptive names are used for these 'nonorthogonal' registers

#### The 80386 registers

Here the 'Extended' registers are 32 bits wide



For 'backward compatibility', the former 16-bit registers are implemented within these 'Extended' 32-bit registers (i.e., as the least-significant 16 bits in each)

#### Lab machines use Intel's i7 CPU

Here the 'Revised' registers are 64 bits wide



For 'backward compatibility', the former 32-bit registers are implemented within these 'Revised' 64-bit registers (i.e., as the least-significant 32 bits in each). In addition, there are 8 more 64-bit general registers (named R8, R9, R10, ... R15).

#### Sub-registers

Each general-purpose register can, by using different assembly language names, be treated in programs as having a width of 64 bits, 32 bits, 16 bits, or 8 bits.

**EXAMPLE:** The accumulator register can be called **AL**, or **AX**, or **EAX**, or **RAX**.



Similarly for the other general-purpose registers (e.g., **B**, **C**, and **D**).
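The aliasing described above can be modeled in a few lines of Python. This is only a sketch: the `MASKS` table and the `read_subregister` helper are hypothetical names invented for illustration (they belong to no assembler or library), and the sample value is arbitrary. It shows only that each narrower name exposes the least-significant bits of the same 64-bit register.

```python
# Hypothetical model: each name for the accumulator is a mask over
# the low-order bits of the full 64-bit register value.
MASKS = {
    "rax": 0xFFFF_FFFF_FFFF_FFFF,  # all 64 bits
    "eax": 0xFFFF_FFFF,            # least-significant 32 bits
    "ax": 0xFFFF,                  # least-significant 16 bits
    "al": 0xFF,                    # least-significant 8 bits
}

def read_subregister(rax_value, name):
    """Return the value visible through the given accumulator name."""
    return rax_value & MASKS[name]

rax = 0x1122334455667788  # arbitrary sample contents
for name in ("rax", "eax", "ax", "al"):
    print(f"{name.upper():>3} = {read_subregister(rax, name):#x}")
```

The same masks would apply to the B, C, and D registers under their corresponding names.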

## Little-Endian vs Big-Endian

• A signature feature of x86 processors (as well as Intel's various peripheral-device controllers) is their use of the "little-endian" convention when representing multi-byte data-values



• A programming advantage of x86 "little-endian" data-storage convention is that operands with differing widths can be addressed in a uniform manner, by referencing their earliest address.
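A small Python model may make the 'uniform addressing' point concrete. The `memory` bytearray and the two helper functions are assumptions made for illustration; they mimic little-endian storage in software rather than touching real hardware.

```python
def store_little_endian(memory, address, value, width):
    """Write `value` as `width` bytes, least-significant byte first."""
    memory[address:address + width] = value.to_bytes(width, "little")

def load_little_endian(memory, address, width):
    """Read `width` bytes starting at `address` as a little-endian integer."""
    return int.from_bytes(memory[address:address + width], "little")

memory = bytearray(16)  # hypothetical 16-byte memory, initially all zero

# A 4-byte value is stored with its low-order byte at the lowest address.
store_little_endian(memory, 0, 0x11223344, 4)
print(memory[0:4].hex())  # byte order in memory: 44 33 22 11

# A small value stored as a quadword reads back unchanged at every
# narrower width, using the same (earliest) address each time.
store_little_endian(memory, 4, 0x2A, 8)
for width in (1, 2, 4, 8):
    print(width, load_little_endian(memory, 4, width))
```

On a big-endian machine the narrower reads would need a different address for each width, since the low-order byte would sit at the highest address.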

#### Address-Spaces

An unusual feature of the x86 architecture is its use of different address-spaces for accesses to memory-locations versus I/O device-registers



#### x86 Privilege Levels



#### code-segment selector in CS register
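A segment selector is a 16-bit value: bits 0-1 hold a privilege level (for the selector in CS, this is the current privilege level), bit 2 is the table indicator (0 = GDT, 1 = LDT), and bits 3-15 are an index into that descriptor table. The sketch below unpacks those fields. The function name is hypothetical, and the two sample selectors (0x10 and 0x33) are assumed example values of the kind often seen for kernel and user code on 64-bit Linux; check your own system before relying on them.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": (selector >> 3) & 0x1FFF,  # bits 3-15: descriptor-table index
        "ti": (selector >> 2) & 1,          # bit 2: 0 = GDT, 1 = LDT
        "rpl": selector & 3,                # bits 0-1: privilege level (0-3)
    }

# Assumed example values, for illustration only.
for cs in (0x10, 0x33):
    print(f"CS={cs:#06x} ->", decode_selector(cs))
```

When the selector is the one held in CS, the low two bits tell you which privilege ring the processor is currently executing in.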



## Earliest instruction-categories

- Data Transfer
- Control Transfer
- Arithmetical/Logical
- Processor Control
- Bit-manipulation
- String-manipulation

Some useful general rules apply to these categories

### Arithmetical/Logical

- ADD, SUB, ADC, SBB, CMP, NEG
- AND, OR, XOR, TEST, NOT
- MUL, IMUL, DIV, IDIV
- INC, DEC
- CBW, CWD, CWDE, CDQ
- AAA, AAS, AAM, AAD, DAA, DAS

#### Data Transfer

- MOV, XCHG, XLAT, BSWAP
- PUSH, POP, PUSHA, POPA
- LEA, LDS, LES, LSS, LFS, LGS
- LAHF, SAHF, PUSHF, POPF
- IN, OUT

#### **Control Transfer**

- JMP, CALL, RET
- JC/JNC, JS/JNS, JZ/JNZ, JP/JNP, JO/JNO
- JA/JAE, JB/JBE, JL/JLE, JG/JGE
- LOOP, LOOPE/LOOPZ, LOOPNE/LOOPNZ
- JCXZ/JECXZ
- INT, IRET, INTO, INT3

#### **Processor Control**

- CLI, STI
- CLD, STD
- CLC, STC, CMC
- NOP, HLT, WAIT, ESC, LOCK
- CS:, DS:, ES:, SS:, FS:, GS:
- INVD, WBINVD, INVLPG

#### **Bit-manipulation**

- SHL, SHR, SAL, SAR
- ROL, ROR, RCL, RCR
- BT, BTS, BTR, BTC
- SHLD, SHRD

# String Manipulation

- MOVS
- CMPS
- SCAS
- STOS
- LODS

- INS/OUTS
- REP/REPE/REPZ
- REPNE/REPNZ

#### The FLAGS register



- DF = Direction Flag
- IF = Interrupt Flag
- TF = Trap Flag
- IOPL = I/O Privilege Level
- CF = Carry Flag
- PF = Parity Flag
- AF = Auxiliary Flag
- ZF = Zero Flag
- SF = Sign Flag
- OF = Overflow Flag

Status Flags automatically get modified by Arithmetic and Logic instructions

Control Flags are only modified by an explicit use of a specific instruction
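To see what "automatically get modified" means for the status flags, here is a minimal Python sketch of an 8-bit ADD. The function name `add8_flags` is hypothetical, and the code is a software model under the usual flag definitions, not a trace of real hardware.

```python
def add8_flags(a, b):
    """Model an 8-bit ADD: return (result, status flags) for a + b."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                            # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),         # even number of 1-bits
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),             # carry out of bit 3
        "ZF": int(result == 0),                             # result is zero
        "SF": result >> 7,                                  # copy of bit 7
        "OF": int((a ^ result) & (b ^ result) & 0x80 != 0)  # signed overflow
    }
    return result, flags

print(add8_flags(0x7F, 0x01))  # signed overflow: 127 + 1
print(add8_flags(0xFF, 0x01))  # unsigned carry: 255 + 1
```

The first call sets OF and SF but not CF; the second sets CF and ZF but not OF, which is why signed and unsigned conditional jumps test different flags.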

#### Assembly Language Statement Format



Example 1: An unlabeled double-operand data-transfer instruction-statement

mov %rdx, %rax # copies value from RDX into RAX

Example 2: An unlabeled single-operand arithmetical instruction-statement

inc %rbx # increases the value in RBX by +1

### x86 Machine-Instruction Format

x86 instructions may be of varying lengths, between 1 and 15 bytes inclusive
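As a concrete case, the two statements from Examples 1 and 2 each encode to three bytes: a REX.W prefix (0x48), a one-byte opcode, and a ModR/M byte. The Python sketch below builds one valid encoding for each. The helper names and the `REG64` table are hypothetical, only the eight original registers are handled (R8-R15 would need extra REX bits), and an assembler is free to pick a different but equivalent encoding.

```python
# Register numbers used in the ModR/M byte (original eight registers only).
REG64 = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
         "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}

REX_W = 0x48  # REX prefix with W=1: 64-bit operand size

def encode_mov_reg_reg(src, dst):
    """Encode AT&T 'mov %src, %dst' using opcode 0x89 (MOV Ev, Gv)."""
    modrm = 0xC0 | (REG64[src] << 3) | REG64[dst]  # mod=11: register-direct
    return bytes([REX_W, 0x89, modrm])

def encode_inc_reg(reg):
    """Encode AT&T 'inc %reg' using opcode 0xFF with /0 in ModR/M."""
    modrm = 0xC0 | (0 << 3) | REG64[reg]
    return bytes([REX_W, 0xFF, modrm])

print(encode_mov_reg_reg("rdx", "rax").hex())  # mov %rdx, %rax
print(encode_inc_reg("rbx").hex())             # inc %rbx
```

Longer instructions add further optional pieces (legacy prefixes, a SIB byte, a displacement, an immediate), which is how the length can grow toward the 15-byte limit.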



#### The One-Byte Opcode-Table

Row labels give the high hex digit of the opcode byte; column labels give the low hex digit.

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ADD<br>Eb, Gb | ADD<br>Ev, Gv | ADD<br>Gb, Eb | ADD<br>Gv, Ev | ADD<br>AL, Ib | ADD<br>rAX, Iz | PUSH<br>ES<sup>i64</sup> | POP<br>ES<sup>i64</sup> | OR<br>Eb, Gb | OR<br>Ev, Gv | OR<br>Gb, Eb | OR<br>Gv, Ev | OR<br>AL, Ib | OR<br>rAX, Iz | PUSH<br>CS<sup>i64</sup> | 2-byte escape<br>(Table A-3) |
| 1 | ADC<br>Eb, Gb | ADC<br>Ev, Gv | ADC<br>Gb, Eb | ADC<br>Gv, Ev | ADC<br>AL, Ib | ADC<br>rAX, Iz | PUSH<br>SS<sup>i64</sup> | POP<br>SS<sup>i64</sup> | SBB<br>Eb, Gb | SBB<br>Ev, Gv | SBB<br>Gb, Eb | SBB<br>Gv, Ev | SBB<br>AL, Ib | SBB<br>rAX, Iz | PUSH<br>DS<sup>i64</sup> | POP<br>DS<sup>i64</sup> |
| 2 | AND<br>Eb, Gb | AND<br>Ev, Gv | AND<br>Gb, Eb | AND<br>Gv, Ev | AND<br>AL, Ib | AND<br>rAX, Iz | SEG=ES<br>(Prefix) | DAA<sup>i64</sup> | SUB<br>Eb, Gb | SUB<br>Ev, Gv | SUB<br>Gb, Eb | SUB<br>Gv, Ev | SUB<br>AL, Ib | SUB<br>rAX, Iz | SEG=CS<br>(Prefix) | DAS<sup>i64</sup> |
| 3 | XOR<br>Eb, Gb | XOR<br>Ev, Gv | XOR<br>Gb, Eb | XOR<br>Gv, Ev | XOR<br>AL, Ib | XOR<br>rAX, Iz | SEG=SS<br>(Prefix) | AAA<sup>i64</sup> | CMP<br>Eb, Gb | CMP<br>Ev, Gv | CMP<br>Gb, Eb | CMP<br>Gv, Ev | CMP<br>AL, Ib | CMP<br>rAX, Iz | SEG=DS<br>(Prefix) | AAS<sup>i64</sup> |
| 4 | INC<sup>i64</sup> eAX<br>REX<sup>o64</sup> | INC<sup>i64</sup> eCX<br>REX.B<sup>o64</sup> | INC<sup>i64</sup> eDX<br>REX.X<sup>o64</sup> | INC<sup>i64</sup> eBX<br>REX.XB<sup>o64</sup> | INC<sup>i64</sup> eSP<br>REX.R<sup>o64</sup> | INC<sup>i64</sup> eBP<br>REX.RB<sup>o64</sup> | INC<sup>i64</sup> eSI<br>REX.RX<sup>o64</sup> | INC<sup>i64</sup> eDI<br>REX.RXB<sup>o64</sup> | DEC<sup>i64</sup> eAX<br>REX.W<sup>o64</sup> | DEC<sup>i64</sup> eCX<br>REX.WB<sup>o64</sup> | DEC<sup>i64</sup> eDX<br>REX.WX<sup>o64</sup> | DEC<sup>i64</sup> eBX<br>REX.WXB<sup>o64</sup> | DEC<sup>i64</sup> eSP<br>REX.WR<sup>o64</sup> | DEC<sup>i64</sup> eBP<br>REX.WRB<sup>o64</sup> | DEC<sup>i64</sup> eSI<br>REX.WRX<sup>o64</sup> | DEC<sup>i64</sup> eDI<br>REX.WRXB<sup>o64</sup> |
| 5 | PUSH<sup>d64</sup><br>rAX/r8 | PUSH<sup>d64</sup><br>rCX/r9 | PUSH<sup>d64</sup><br>rDX/r10 | PUSH<sup>d64</sup><br>rBX/r11 | PUSH<sup>d64</sup><br>rSP/r12 | PUSH<sup>d64</sup><br>rBP/r13 | PUSH<sup>d64</sup><br>rSI/r14 | PUSH<sup>d64</sup><br>rDI/r15 | POP<sup>d64</sup><br>rAX/r8 | POP<sup>d64</sup><br>rCX/r9 | POP<sup>d64</sup><br>rDX/r10 | POP<sup>d64</sup><br>rBX/r11 | POP<sup>d64</sup><br>rSP/r12 | POP<sup>d64</sup><br>rBP/r13 | POP<sup>d64</sup><br>rSI/r14 | POP<sup>d64</sup><br>rDI/r15 |
| 6 | PUSHA<sup>i64</sup>/<br>PUSHAD<sup>i64</sup> | POPA<sup>i64</sup>/<br>POPAD<sup>i64</sup> | BOUND<sup>i64</sup><br>Gv, Ma | ARPL<sup>i64</sup><br>Ew, Gw<br>MOVSXD<sup>o64</sup><br>Gv, Ev | SEG=FS<br>(Prefix) | SEG=GS<br>(Prefix) | Operand Size<br>(Prefix) | Address Size<br>(Prefix) | PUSH<sup>d64</sup><br>Iz | IMUL<br>Gv, Ev, Iz | PUSH<sup>d64</sup><br>Ib | IMUL<br>Gv, Ev, Ib | INS/INSB<br>Yb, DX | INS/INSW/INSD<br>Yz, DX | OUTS/OUTSB<br>DX, Xb | OUTS/OUTSW/OUTSD<br>DX, Xz |
| 7 | Jcc<sup>f64</sup> Jb<br>O | Jcc<sup>f64</sup> Jb<br>NO | Jcc<sup>f64</sup> Jb<br>B/NAE/C | Jcc<sup>f64</sup> Jb<br>NB/AE/NC | Jcc<sup>f64</sup> Jb<br>Z/E | Jcc<sup>f64</sup> Jb<br>NZ/NE | Jcc<sup>f64</sup> Jb<br>BE/NA | Jcc<sup>f64</sup> Jb<br>NBE/A | Jcc<sup>f64</sup> Jb<br>S | Jcc<sup>f64</sup> Jb<br>NS | Jcc<sup>f64</sup> Jb<br>P/PE | Jcc<sup>f64</sup> Jb<br>NP/PO | Jcc<sup>f64</sup> Jb<br>L/NGE | Jcc<sup>f64</sup> Jb<br>NL/GE | Jcc<sup>f64</sup> Jb<br>LE/NG | Jcc<sup>f64</sup> Jb<br>NLE/G |
| 8 | Immediate Grp 1<sup>1A</sup><br>Eb, Ib | Immediate Grp 1<sup>1A</sup><br>Ev, Iz | Immediate Grp 1<sup>1A</sup><br>Eb, Ib<sup>i64</sup> | Immediate Grp 1<sup>1A</sup><br>Ev, Ib | TEST<br>Eb, Gb | TEST<br>Ev, Gv | XCHG<br>Eb, Gb | XCHG<br>Ev, Gv | MOV<br>Eb, Gb | MOV<br>Ev, Gv | MOV<br>Gb, Eb | MOV<br>Gv, Ev | MOV<br>Ev, Sw | LEA<br>Gv, M | MOV<br>Sw, Ew | Grp 1A<sup>1A</sup><br>POP<sup>d64</sup> Ev |
| 9 | NOP<br>PAUSE (F3)<br>XCHG r8, rAX | XCHG<br>rCX/r9 with rAX | XCHG<br>rDX/r10 with rAX | XCHG<br>rBX/r11 with rAX | XCHG<br>rSP/r12 with rAX | XCHG<br>rBP/r13 with rAX | XCHG<br>rSI/r14 with rAX | XCHG<br>rDI/r15 with rAX | CBW/<br>CWDE/<br>CDQE | CWD/<br>CDQ/<br>CQO | CALLF<sup>i64</sup><br>Ap | FWAIT/<br>WAIT | PUSHF/D/Q<sup>d64</sup><br>Fv | POPF/D/Q<sup>d64</sup><br>Fv | SAHF | LAHF |
| A | MOV<br>AL, Ob | MOV<br>rAX, Ov | MOV<br>Ob, AL | MOV<br>Ov, rAX | MOVS/B<br>Xb, Yb | MOVS/W/D/Q<br>Xv, Yv | CMPS/B<br>Xb, Yb | CMPS/W/D<br>Xv, Yv | TEST<br>AL, Ib | TEST<br>rAX, Iz | STOS/B<br>Yb, AL | STOS/W/D/Q<br>Yv, rAX | LODS/B<br>AL, Xb | LODS/W/D/Q<br>rAX, Xv | SCAS/B<br>AL, Yb | SCAS/W/D/Q<br>rAX, Xv |
| B | MOV<br>AL/R8L, Ib | MOV<br>CL/R9L, Ib | MOV<br>DL/R10L, Ib | MOV<br>BL/R11L, Ib | MOV<br>AH/R12L, Ib | MOV<br>CH/R13L, Ib | MOV<br>DH/R14L, Ib | MOV<br>BH/R15L, Ib | MOV<br>rAX/r8, Iv | MOV<br>rCX/r9, Iv | MOV<br>rDX/r10, Iv | MOV<br>rBX/r11, Iv | MOV<br>rSP/r12, Iv | MOV<br>rBP/r13, Iv | MOV<br>rSI/r14, Iv | MOV<br>rDI/r15, Iv |
| C | Shift Grp 2<sup>1A</sup><br>Eb, Ib | Shift Grp 2<sup>1A</sup><br>Ev, Ib | RETN<sup>f64</sup><br>Iw | RETN<sup>f64</sup> | LES<sup>i64</sup><br>Gz, Mp<br>VEX+2byte | LDS<sup>i64</sup><br>Gz, Mp<br>VEX+1byte | Grp 11<sup>1A</sup> - MOV<br>Eb, Ib | Grp 11<sup>1A</sup> - MOV<br>Ev, Iz | ENTER<br>Iw, Ib | LEAVE<sup>d64</sup> | RETF<br>Iw | RETF | INT 3 | INT<br>Ib | INTO<sup>i64</sup> | IRET/D/Q |
| D | Shift Grp 2<sup>1A</sup><br>Eb, 1 | Shift Grp 2<sup>1A</sup><br>Ev, 1 | Shift Grp 2<sup>1A</sup><br>Eb, CL | Shift Grp 2<sup>1A</sup><br>Ev, CL | AAM<sup>i64</sup><br>Ib | AAD<sup>i64</sup><br>Ib | | XLAT/<br>XLATB | ESC | ESC | ESC | ESC | ESC | ESC | ESC | ESC |

In row D, ESC = escape to coprocessor instruction set.
|                                 | Eb, 1                                                                                                               | Ev, 1                                                               | Eb, CL                                                                                      | Ev, CL                                                                                              | lb                                                                              | lb                                              |                                                                                                 | XLATB                                                                     |                                                                   |                                 |                            |                                                   |                                 |                                    |                                                                                   |                                                                                                                                                                                                                                                 |  |  |  |  |  |  |
| E LC                            | OOPNE <sup>f64</sup> /                                                                                              | LOOPE <sup>f64</sup> /                                              | LOOP <sup>f64</sup>                                                                         | JrCXZ <sup>f64</sup> /                                                                              |                                                                                 | N                                               | (                                                                                               | DUT                                                                       | CALL <sup>f64</sup>                                               |                                 | JMP                        |                                                   | I                               | N                                  | 0                                                                                 | UT                                                                                                                                                                                                                                              |  |  |  |  |  |  |
|                                 | .OOPNZ <sup>f64</sup><br>Jb                                                                                         | LOOPZ <sup>f64</sup><br>Jb                                          | Jb                                                                                          | Jb                                                                                                  | AL, Ib                                                                          | eAX, Ib                                         | lb, AL                                                                                          | lb, eAX                                                                   | Jz                                                                | near <sup>f64</sup><br>Jz       | far <sup>i64</sup><br>Ap   | short <sup>f64</sup><br>Jb                        | AL, DX                          | eAX, DX                            | DX, AL                                                                            | DX, eAX                                                                                                                                                                                                                                         |  |  |  |  |  |  |
| F                               | LOCK<br>(Prefix)                                                                                                    |                                                                     | REPNE                                                                                       | REP/REPE                                                                                            | HLT                                                                             | CMC                                             |                                                                                                 | Grp 3 <sup>1A</sup>                                                       | CLC                                                               | STC                             | CLI                        | STI                                               | CLD                             | STD                                | INC/DEC                                                                           | INC/DEC                                                                                                                                                                                                                                         |  |  |  |  |  |  |
|                                 | (FIGHX)                                                                                                             |                                                                     | (Prefix)                                                                                    | (Prefix)                                                                                            |                                                                                 |                                                 | Eb                                                                                              | Ev                                                                        |                                                                   |                                 |                            |                                                   |                                 |                                    | Grp 4 <sup>1A</sup>                                                               | Grp 5 <sup>1A</sup>                                                                                                                                                                                                                             |  |  |  |  |  |  |

Note: The 0x0F byte is the 'escape' to the Two-Byte Opcode Table.

Note: The bytes 0xD8, 0xD9, ... 0xDF are 'escapes' to co-processor opcode tables.

## CISC string-instruction example

msg: .asciz "Hello, World!"

# for computing the length of a 'null-terminated' character-string

.equ MAXLEN, 65535

strlen:

| xor         | %al, %al         | # null-byte in register AL    |
|-------------|------------------|-------------------------------|
| mov         | \$msg, %edi      | # EDI = string's address      |
| mov         | \$MAXLEN, %ecx   | # ECX = maximum length        |
| cld         |                  | # use forward processing      |
| repne scasb |                  | # scan for the final byte     |
| sub         | \$msg, %edi      | # subtract initial from final |

# now EDI will contain the number of bytes scanned
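For readers more comfortable with C, the scan above can be sketched as an explicit loop. This is only an illustration of what `cld` followed by `repne scasb` does when AL holds zero; the function name `scan_for_null` is hypothetical and does not appear in the original demo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mirrors CLD + REPNE SCASB with AL = 0.
 * Returns the number of bytes scanned, counting the null byte
 * itself when one is found within 'maxlen' bytes. */
size_t scan_for_null(const char *start, size_t maxlen)
{
    assert(start != NULL);

    const char *p = start;      /* plays the role of EDI */
    size_t count = maxlen;      /* plays the role of ECX */

    while (count != 0) {        /* REPNE stops when ECX reaches zero ...   */
        count--;
        if (*p++ == '\0')       /* ... or when SCASB sees a byte equal AL  */
            break;
    }
    return (size_t)(p - start); /* same idea as: sub $msg, %edi */
}
```

Called on "Hello, World!" with a limit of 65535, this returns 14: the thirteen visible characters plus the terminating null byte, since SCASB advances EDI past the byte it has just compared.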

# Now, what is a 'page-mapping'?



## And, what is a 'task-switch'?



## So, what is a 'page-fault'?

- If the CPU tries to access a memory-address that is not currently "mapped" -- or one for which it lacks the appropriate privilege -- the CPU detects this as an illegal memory-reference, known as a 'page-fault exception'
- The CPU will automatically jump to a routine in kernel space, known as an 'exception handler', after first saving a tiny amount of information about the situation that led to this 'exception'
- For more details we need some background...

#### page-fault context-information

The kernel's stack



NOTE: The CS register's value includes the 2-bit 'Current Privilege Level' (CPL)
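As a rough sketch, the information the CPU saves on the kernel's stack for a page-fault in 64-bit mode can be pictured as a C structure. The type name `pf_frame` and the helper `saved_cpl` are hypothetical names chosen for illustration; the field order assumes the standard 64-bit exception frame with an error code, lowest address first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical picture of the stack at entry to the page-fault handler
 * (64-bit mode).  The first field is what RSP points at. */
struct pf_frame {
    uint64_t error_code;   /*  0(%rsp): page-fault error code          */
    uint64_t rip;          /*  8(%rsp): address of faulting instruction */
    uint64_t cs;           /* 16(%rsp): CS selector image (low 16 bits) */
    uint64_t rflags;       /* 24(%rsp) */
    uint64_t rsp;          /* 32(%rsp): interrupted stack pointer       */
    uint64_t ss;           /* 40(%rsp) */
};

/* The offset of the saved CS image within the frame. */
_Static_assert(offsetof(struct pf_frame, cs) == 16, "CS image sits at 16(%rsp)");

/* The low two bits of the saved CS give the privilege level that was
 * current when the fault happened. */
unsigned saved_cpl(const struct pf_frame *frame)
{
    assert(frame != NULL);
    return (unsigned)(frame->cs & 0x3);
}
```

A saved CS value of 0x33 yields privilege level 3 (user mode), while 0x10 yields privilege level 0 (kernel mode).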

# Basic idea for my page-fault tester

# my global variables
| unsigned short | selector; | # global variable for storing 16-bit value    |
|----------------|-----------|-----------------------------------------------|
| unsigned long  | oldisr14; | # global variable for storing 64-bit constant |

# my Interrupt Service Routine 'front-end' for handling 'page-fault' exceptions

isr\_entry:

| mov | 16(%rsp), %rax | # data-transfer from memory to register |
|-----|----------------|-----------------------------------------|
| mov | %ax, selector  | # data-transfer from register to memory |
| jmp | *oldisr14      | # indirect control-transfer via memory  |

# NOTE: this simple code-fragment ignores a few crucial issues...

#### Essential issues to confront

- Another 'page-fault' is likely to happen quickly
  - So any saved 'selector' value will get 'overwritten'

- Any interrupt-routine must preserve registers
  - With my 'handler' the RAX register gets 'clobbered'
- All SMP interrupt-routines must be 'reentrant'
  - Multiple CPUs could access 'selector' in parallel

#### 1. use an array of storage-cells

#define MAXNUM 255

| unsigned short | selector[ MAXNUM ]; | # enough space for lots of 16-bit values   |
|----------------|---------------------|--------------------------------------------|
| unsigned long  | oldisr14;           | # to store address of the original handler |
| unsigned long  | pgfaults = 0;       | # keep count of page-fault occurrences     |

isr\_entry:

| mov  | pgfaults, %rbx            | # setup count as array-index in RBX    |
|------|---------------------------|----------------------------------------|
| mov  | 16(%rsp), %rax            | # copy CS-selector-image into RAX      |
| mov  | %ax, selector( , %rbx, 2) | # and save it into the next array cell |
| incq | pgfaults                  | # increment count for the next fault   |
| jmpq | *oldisr14                 | # transfer to the normal fault-handler |
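The operand `selector( , %rbx, 2)` uses the x86 'scaled-index' addressing mode: displacement + base + index × scale, with the base register omitted here. The sketch below spells out that arithmetic in C; the function name `effective_address` is hypothetical, and the code assumes a flat address space where pointers convert cleanly to integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes the address an x86 memory operand of the
 * form  disp(base, index, scale)  refers to. */
uintptr_t effective_address(uintptr_t disp, uintptr_t base,
                            uintptr_t index, unsigned scale)
{
    /* The hardware only supports these four scale factors. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return disp + base + index * scale;
}
```

With the array's address as the displacement, no base, the fault count as the index, and a scale of 2 (the size of one `unsigned short`), the result is the address of the array cell `selector[ pgfaults ]`.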

## 2. Preserve the 'working' registers

#define MAXNUM 255

| unsigned short | selector[ MAXNUM ]; | # enough space for lots of 16-bit values   |
|----------------|---------------------|--------------------------------------------|
| unsigned long  | oldisr14;           | # to store address of the original handler |
| unsigned long  | pgfaults = 0;       | # keep count of page-fault occurrences     |

isr\_entry:

| push | %rax                      | # save the RAX register-value          |
|------|---------------------------|----------------------------------------|
| push | %rbx                      | # save the RBX register-value          |
| mov  | pgfaults, %rbx            | # setup count as array-index in RBX    |
| mov  | **32**(%rsp), %rax        | # copy CS-selector-image into RAX      |
| mov  | %ax, selector( , %rbx, 2) | # and save it into the next array cell |
| incq | pgfaults                  | # increment count for the next fault   |
| pop  | %rbx                      | # recover the RBX register-value       |
| pop  | %rax                      | # recover the RAX register-value       |
| jmpq | *oldisr14                 | # transfer to the normal fault-handler |

### Stack after pushing RAX and RBX

The kernel's stack



NOTE: The CS register's value includes the 2-bit 'Current Privilege Level' (CPL)

# 3. Increment 'pgfaults' atomically

#define MAXNUM 255

| unsigned short | selector[ MAXNUM ]; | # enough space for lots of 16-bit values   |
|----------------|---------------------|--------------------------------------------|
| unsigned long  | oldisr14;           | # to store address of the original handler |
| unsigned long  | pgfaults = 0;       | # keep count of page-fault occurrences     |

isr\_entry:

| push | %rax                      | # save the RAX register-value          |
|------|---------------------------|----------------------------------------|
| push | %rbx                      | # save the RBX register-value          |
| mov  | \$1, %rbx                 | # amount of the 'pgfaults' increment   |
| xadd | %rbx, pgfaults            | # use XADD for SMP-safe update         |
| mov  | 32(%rsp), %rax            | # copy CS-selector-image into RAX      |
| mov  | %ax, selector( , %rbx, 2) | # and save it into the next array cell |
| pop  | %rbx                      | # recover the RBX register-value       |
| pop  | %rax                      | # recover the RAX register-value       |
| jmpq | *oldisr14                 | # transfer to the normal fault-handler |
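The 'exchange-and-add' idea can also be sketched in portable C11, where `atomic_fetch_add` adds to a shared counter and hands back the value it held beforehand, so each caller gets its own array index. The names `pgfaults_demo` and `claim_slot` are hypothetical. On x86-64, compilers typically translate this call into an XADD carrying the LOCK prefix; it is that prefix which guarantees the read-modify-write is atomic with respect to other processors, so a strictly SMP-safe assembly version would be written `lock xadd`.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the 'pgfaults' counter (starts at zero). */
static atomic_ulong pgfaults_demo;

/* Atomically increment the counter and return its PREVIOUS value,
 * which the caller can use as a private array index. */
unsigned long claim_slot(void)
{
    return atomic_fetch_add(&pgfaults_demo, 1UL);
}
```

Two CPUs calling `claim_slot()` at the same moment are guaranteed to receive different return values, which is exactly the property the handler needs before it stores into the array.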

#### But now a 'new' issue arises

- After a certain number of page-faults occur, our 'selector[]' storage-array will overflow!
- Can you think of a way to solve that problem? (There is more than one way you could do it)
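One possible answer, sketched here in C rather than assembly, is to let the index 'wrap around' so the array always holds the most recent selectors. This is only an illustration under stated assumptions: the names `selector_ring`, `fault_count` and `record_selector` are hypothetical, and the sketch ignores the SMP and register-preservation issues discussed earlier.

```c
#include <assert.h>
#include <stddef.h>

#define MAXNUM 255

/* Hypothetical circular log of the most recent MAXNUM selectors. */
static unsigned short selector_ring[MAXNUM];
static unsigned long  fault_count;

/* Store one selector, overwriting the oldest entry once the array is full. */
void record_selector(unsigned short cs)
{
    size_t idx = fault_count % MAXNUM;   /* index wraps back to zero */
    assert(idx < MAXNUM);                /* can never run past the array */
    selector_ring[idx] = cs;
    fault_count++;
}
```

The trade-off is that early entries are lost once the count exceeds MAXNUM; stopping the recording when the array is full keeps the earliest entries instead.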

#### From our 'faultcpl.c' demo

    #define MAXNUM 255      // maximum number of saved selectors

    const long      max = MAXNUM;
    unsigned long   oldisr14;
    unsigned long   pgfaults = 0;
    unsigned short  selector[ MAXNUM ];

    //---- INTERRUPT SERVICE ROUTINE
    void isr_entry( void );
    asm("   .text                               ");
    asm("   .type   isr_entry, @function        ");
    asm("isr_entry:                             ");
    asm("   push    %rax                        ");
    asm("   push    %rbx                        ");
    asm("   mov     pgfaults, %rax              ");
    asm("   cmp     max, %rax                   ");
    asm("   jge     bypass                      ");
    asm("   mov     $1, %rbx                    ");
    asm("   xadd    %rbx, pgfaults              ");
    asm("   mov     32(%rsp), %rax              ");
    asm("   mov     %ax, selector(, %rbx, 2)    ");
    asm("bypass:                                ");
    asm("   pop     %rbx                        ");
    asm("   pop     %rax                        ");
    asm("   jmpq    *oldisr14                   ");
    //----

#### Output from '/proc/faultcpl' demo

Below are shown the code-segment selectors for the first 255 page-faults:

| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|--|
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0010 | 0010 | 0010 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0010 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0010 | 0010 | 0010 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0010 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0010 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
| 0010 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 | 0033 |  |
|      |      |      |      |      |      |      |      |      |      |      |      |      |      |      |  |

Observe that 10 of these page-faults occurred while executing in kernel-mode
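To see why 0033 means user-mode and 0010 means kernel-mode, a selector can be split into its three fields: a 13-bit descriptor-table index, a 1-bit table indicator, and a 2-bit privilege level. The sketch below does this in C; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical breakdown of a 16-bit segment selector. */
struct selector_fields {
    unsigned index;   /* bits 15..3: descriptor-table index  */
    unsigned table;   /* bit 2: 0 = GDT, 1 = LDT             */
    unsigned rpl;     /* bits 1..0: privilege level          */
};

struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.rpl   = sel & 0x3u;
    f.table = (sel >> 2) & 0x1u;
    f.index = (unsigned)sel >> 3;
    assert(f.rpl <= 3 && f.table <= 1);   /* fields stay within range */
    return f;
}

/* A saved CS image with privilege level 0 means the fault occurred
 * while the CPU was executing kernel code. */
int was_kernel_mode(uint16_t saved_cs)
{
    return (saved_cs & 0x3u) == 0;
}
```

For 0x0033 the fields are index 6, GDT, privilege level 3; for 0x0010 they are index 2, GDT, privilege level 0.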

## So what's the "coolest" thing?

- Knowing x86 assembly language gives you the power to investigate all capabilities of the CPUs being used in today's most common platforms
- It's like having Galileo's telescope: you can see whether 'conventional wisdom' is correct or not
- And you can investigate new x86 features as soon as they are implemented, without waiting until software tools or programming languages have been developed to offer support for them

# Model Specific Registers (MSR)

- As enhancements get added to the x86 CPU, additional registers are needed for controls.
- So Intel devised a scheme for accessing up to 4-billion so-called MSRs by means of just two privileged instructions: rdmsr and wrmsr.
- EXAMPLE: # reading from a 64-bit MSR

| mov   | \$0x19C, %ecx | # load MSR's ID-number into ECX       |
|-------|---------------|---------------------------------------|
| rdmsr |               | # read the Model Specific Register    |
| mov   | %eax, msr\_lo | # EAX holds least-significant 32-bits |
| mov   | %edx, msr\_hi | # EDX holds most-significant 32-bits  |
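Because `rdmsr` returns the 64-bit register in two 32-bit halves, code that wants the whole value has to glue them back together. A minimal C sketch of that step follows; the function name `msr_value` is hypothetical, and the parameter names simply echo the `msr_lo` and `msr_hi` variables used above.

```c
#include <stdint.h>

/* Combine the halves returned by RDMSR: EDX supplies bits 63..32,
 * EAX supplies bits 31..0. */
uint64_t msr_value(uint32_t msr_lo, uint32_t msr_hi)
{
    return ((uint64_t)msr_hi << 32) | msr_lo;
}
```

The cast to `uint64_t` before shifting matters: shifting a 32-bit value left by 32 places is undefined in C.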

#### Intel-x86 Processor Modulation

This is a fairly recent Intel-x86 enhancement that allows monitoring and controlling the temperature inside the CPU, to prevent circuit-damage due to 'overheating'



Figure 14-5. Processor Modulation Through Stop-Clock Mechanism

#### Intel-x86 Thermal Status register

| Bits  | Field                           |
|-------|---------------------------------|
| 63:32 | Reserved                        |
| 31    | Reading Valid                   |
| 30:27 | Resolution in Deg. Celsius      |
| 26:23 | Reserved                        |
| 22:16 | Digital Readout                 |
| 15:12 | Reserved                        |
| 11    | Power Limit Notification Log    |
| 10    | Power Limit Notification Status |
| 9     | Thermal Threshold #2 Log        |
| 8     | Thermal Threshold #2 Status     |
| 7     | Thermal Threshold #1 Log        |
| 6     | Thermal Threshold #1 Status     |
| 5     | Critical Temperature Log        |
| 4     | Critical Temperature Status     |
| 3     | PROCHOT# or FORCEPR# Log        |
| 2     | PROCHOT# or FORCEPR# Event      |
| 1     | Thermal Status Log              |
| 0     | Thermal Status                  |

Figure 14-12. IA32\_THERM\_STATUS Register

## IA32\_THERM\_CONTROL

| Bits | Field                                 |
|------|---------------------------------------|
| 63:5 | Reserved                              |
| 4    | On-Demand Clock Modulation Enable     |
| 3:1  | On-Demand Clock Modulation Duty Cycle |
| 0    | Reserved                              |

Table 14-1. On-Demand Clock Modulation Duty Cycle Field Encoding

| Duty Cycle Field Encoding | Duty Cycle      |
|---------------------------|-----------------|
| 000B                      | Reserved        |
| 001B                      | 12.5% (Default) |
| 010B                      | 25.0%           |
| 011B                      | 37.5%           |
| 100B                      | 50.0%           |
| 101B                      | 62.5%           |
| 110B                      | 75.0%           |
| 111B                      | 87.5%           |
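
The encodings in Table 14-1 rise in steps of 12.5%, so the duty cycle can be computed from the 3-bit field instead of being looked up. Here is a minimal sketch in C, assuming the field sits in bits 3:1 and the enable flag in bit 4, as in the register layout above. The function names are hypothetical and do not come from any kernel header.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of any kernel API. */

/* Bit 4: On-Demand Clock Modulation Enable. */
int clock_mod_enabled(uint64_t msr)
{
    return (int)((msr >> 4) & 1);
}

/*
 * Bits 3:1: duty-cycle field encoding.
 * Returns the duty cycle in tenths of a percent (125 means 12.5%),
 * or 0 for the reserved encoding 000B.
 */
unsigned duty_cycle_tenths(uint64_t msr)
{
    unsigned field = (unsigned)((msr >> 1) & 0x7);
    return field * 125;
}
```

Returning tenths of a percent keeps the arithmetic in integers, which matters in kernel code, where floating point is normally avoided.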

### IA32\_THERM\_INTERRUPT

| Bits  | Field                           |
|-------|---------------------------------|
| 63:25 | Reserved                        |
| 24    | Power Limit Notification Enable |
| 23    | Threshold #2 Interrupt Enable   |
| 22:16 | Threshold #2 Value              |
| 15    | Threshold #1 Interrupt Enable   |
| 14:8  | Threshold #1 Value              |
| 7:5   | Reserved                        |
| 4     | Overheat Interrupt Enable       |
| 3     | FORCEPR# Interrupt Enable       |
| 2     | PROCHOT# Interrupt Enable       |
| 1     | Low Temp. Interrupt Enable      |
| 0     | High Temp. Interrupt Enable     |

Figure 14-13. IA32\_THERM\_INTERRUPT Register
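
The fields of Figure 14-13 can be pulled out of a raw 64-bit value with shifts and masks. The following is only a sketch: the struct, its member names, and the helper functions are hypothetical, and the bit positions are the ones shown in the figure.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields (names are illustrative). */
struct therm_interrupt {
    unsigned high_temp_enable;    /* bit 0      */
    unsigned low_temp_enable;     /* bit 1      */
    unsigned prochot_enable;      /* bit 2      */
    unsigned forcepr_enable;      /* bit 3      */
    unsigned overheat_enable;     /* bit 4      */
    unsigned threshold1_value;    /* bits 14:8  */
    unsigned threshold1_enable;   /* bit 15     */
    unsigned threshold2_value;    /* bits 22:16 */
    unsigned threshold2_enable;   /* bit 23     */
    unsigned power_limit_enable;  /* bit 24     */
};

/* Extract 'width' bits of 'v' starting at bit position 'lo'. */
static unsigned field(uint64_t v, unsigned lo, unsigned width)
{
    return (unsigned)((v >> lo) & ((1ULL << width) - 1));
}

struct therm_interrupt decode_therm_interrupt(uint64_t msr)
{
    struct therm_interrupt t;

    t.high_temp_enable   = field(msr, 0, 1);
    t.low_temp_enable    = field(msr, 1, 1);
    t.prochot_enable     = field(msr, 2, 1);
    t.forcepr_enable     = field(msr, 3, 1);
    t.overheat_enable    = field(msr, 4, 1);
    t.threshold1_value   = field(msr, 8, 7);
    t.threshold1_enable  = field(msr, 15, 1);
    t.threshold2_value   = field(msr, 16, 7);
    t.threshold2_enable  = field(msr, 23, 1);
    t.power_limit_enable = field(msr, 24, 1);
    return t;
}
```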

#### Our 'celsius.c' example

- We wrote this kernel module to let users view the value in a cpu's THERM\_STATUS register
- Its '**Digital Readout**' field will show how far the processor's current temperature is below what Intel regards as its maximum (called 'Tj Max')
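- When this difference reaches 0 the processor is at Tj Max and throttles itself to cool down; it shuts down only if it keeps overheating beyond that point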
- These temperatures are in degrees-Celsius
- (Note: AMD's cpus employ a different scheme)

#### '/proc/celsius'

- Our 'celsius.c' module creates this pseudo-file, which a user can access with this command:

\$ cat /proc/celsius

- Here's what the screen-output would look like:

```
cpu1 thermal_status = 882E0000 54-degrees (celsius)
cpu2 thermal_status = 882C0000 56-degrees (celsius)
cpu3 thermal_status = 88280000 60-degrees (celsius)
cpu4 thermal_status = 882E0000 54-degrees (celsius)
```

4 CPUs detected

Note: This demo was run on a machine with an Intel Core 2 Quad processor.
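
The hex values in the sample output can be decoded by hand. Intel documents the Digital Readout as bits 22:16 of IA32\_THERM\_STATUS and the Reading Valid flag as bit 31. The sketch below is not the actual 'celsius.c' code: the function names are hypothetical, and the Tj Max of 100 is an assumption inferred from the output above (0x2E = 46, and 100 - 46 = 54).

```c
#include <stdint.h>

/* Assumption: Tj Max of 100 degrees, inferred from the sample output. */
#define ASSUMED_TJ_MAX 100

/* Bit 31: Reading Valid (the Digital Readout is meaningful only if set). */
int reading_valid(uint32_t status)
{
    return (int)((status >> 31) & 1);
}

/* Bits 22:16: Digital Readout, in degrees Celsius below Tj Max. */
unsigned digital_readout(uint32_t status)
{
    return (status >> 16) & 0x7F;
}

/* Estimated temperature under the assumed Tj Max. */
int celsius(uint32_t status)
{
    return ASSUMED_TJ_MAX - (int)digital_readout(status);
}
```

For the second output line, 0x882C0000 has a readout of 0x2C = 44, which gives 56 degrees under the assumed Tj Max.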

#### **Demos and Questions**

Speaker's website: <http://cs.usfca.edu/~cruse/>