How does switch-case work? A low-level look

A low-level look at how switch-case compiles to assembly, why it's faster than if-else, and the jump-table trick behind the speed.

May 12, 20266 min readSwitch Case

Ever wondered how switch-case actually works under the hood? We’ll focus on C++ and other old-school compiled languages.

There are a few questions we should answer before we get to the “How”. Let’s start.

What is a switch-case statement?#

A switch-case statement is shown in the code snippet below. It accepts a variable and compares it against all cases. The first case statement that matches the value gets executed. There are some concepts like fallthrough and break that I’ll explain later.

switch(fruit) {
  case 1:
    cout << "Apple" << endl;
    break;
  case 2:
    cout << "Orange" << endl;
    break;
  case 3:
    cout << "Banana" << endl;
    break;
  default:
    cout << "Invalid Fruit" << endl;
    break;
}

Snippet 1.C++ switch-case example

For example, suppose the fruit variable is equal to 2 so the switch statement matches the second case statement and executes cout << "Orange" << endl; break;. So once those lines run we will see Orange in stdout. To better understand the flow of code, you can think of it as a chain of if-else statements — same behavior, different mechanism.

if (fruit == 1) {
  cout << "Apple" << endl;
} else if (fruit == 2) {
  cout << "Orange" << endl;
} else if (fruit == 3) {
  cout << "Banana" << endl;
} else {
  cout << "Invalid Fruit" << endl;
}

Snippet 2.C++ if-else example

As you can see in the snippet, a switch statement always accepts a variable that is enumerable such as enum and integral values. That leads us to the second question.

Do all languages restrict switch to enumerable?#

That’s a bit complex but no, modern languages like Rust and interpreter languages like JavaScript, unlike the old-school languages, accept other types like strings. However, the architecture behind them is very different. In this blog post our focus is on old-school languages.

Why do we need switch?#

Someone might ask you why we need switch when we can mimic it with if-else. There are three reasons.

Clarity: It’s much easier to read when you have 10 different options.
Organization: It keeps all your choices in one neat block of code.
Speed: Actually, this reason is the biggest one — because of switch’s special architecture it can be much faster than if-else statements.

How does `if` work?#

It will be clearer if you know how if works first.

An if statement from the machine’s perspective is made of two instructions: compare and jump. The compare part resolves the condition of if and the jump moves the program counter to the address of the code that handles the matching branch. Sounds abstract — the example below makes it concrete.

For example Snippet 3 compiles down to Snippet 4. Assembly has multiple jump instructions each with a different meaning; we use two of them here.

JNE: Jump if the previous compare result was not equal.
JMP: It’s an unconditional jump.

Okay, let’s explain the assembly code. I suppose we want to run this code on an x86_64 Linux machine, and this code is part of a function whose first argument is the fruit variable. The calling convention of Linux says the first argument is always placed in the rdi register — keep it in mind.

if (fruit == 1) {
  cout << "Apple" << endl;
} else if (fruit == 2) {
  cout << "Orange" << endl;
} else {
  cout << "Invalid Fruit" << endl;
}

Snippet 3.Simple C++ if-else example

  ;if (fruit == 1)
  CMP rdi, 1
  JNE .L1
    ;cout << "Apple" << endl;
    LEA rdi, APPLE
    call Print
    JMP .L3
  .L1:
    CMP rdi, 2
    JNE .L2
      ;cout << "Orange" << endl;
      LEA rdi, ORANGE
      call Print
      JMP .L3
    .L2:
      ;cout << "Invalid Fruit" << endl;
      LEA rdi, INVALID_FRUIT
      call Print
  .L3:

Snippet 4.Assembly code of "Snippet 3"

CMP rdi, 1 compares rdi to 1 and sets the CPU flags so they can be used by conditional jump instructions.
JNE .L1 jumps to the .L1 label if the previous comparison result was not equal.
LEA rdi, APPLE loads the address of the APPLE string into the rdi register.
call Print calls the Print function and passes rdi as the first argument.
JMP .L3 jumps unconditionally to .L3, skipping over the code in between.

Other lines of code are similar to the previous five lines that I explained.

Finally, as a bonus, here is the flowchart for the assembly code above.

graph TD
    S([start]) --> CMP[CMP rdi, 1]
    CMP --> D1{rdi == 1?}
    D1 -->|yes| A[LEA rdi, APPLE; call Print]
    A --> E_L3([.L3: end])
    D1 -->|no| CMP2[CMP rdi, 2]
    CMP2 --> D2{rdi == 2?}
    D2 -->|yes| O[LEA rdi, ORANGE; call Print]
    O --> E_L3
    D2 -->|no| I[LEA rdi, INVALID_FRUIT; call Print]
    I --> E_L3

Figure 1.Assembly code schematic

So, how does switch-case work?#

Unlike if-else statements, switch-case doesn’t use assembly compare and jump instructions, instead it uses a jump table or in some cases a hash table. To make it easy to understand, we’ll focus on the jump table. Let’s see the switch-case statement from another angle. Hypothetically, see the statement as an array where each array item — conceptually — contains the code for each option. In that case, we can jump to the right item just by knowing its index, which is already stored in the fruit variable. Wait — but what if fruit is bigger than 3? Yeah, we’d still need a comparison to handle that edge case. The other cases, though, can be resolved just by indexing into the array.

print Apple; break

print Orange; break

print Banana; break

Figure 2.Switch-case as an array

There is still a problem — how do we store the code in an array? The answer: “We can’t”. But we can store the address of each block of code in the array, then jump to whichever address sits at the right index.

So in assembly we get something like snippet 5.

  dec     rdi
  cmp     rdi, 2
  ja      .Ld
  lea     rcx, .Lt 
  mov     rax, [rcx + rdi * 8]
  jmp     rax
  .L1:
    lea   rdi, APPLE
    call  Print
    jmp   .Le
  .L2:
    lea   rdi, ORANGE
    call  Print  
    jmp   .Le
  .L3:
    lea   rdi, BANANA
    call  Print
    jmp   .Le
  .Ld:
    lea   rdi, INVALID_FRUIT
    call  Print
 
  .Le: 
    ret
  .Lt:
    .quad   .L1
    .quad   .L2
    .quad   .L3   
  APPLE: .asciz "Apple"
  ORANGE: .asciz "Orange"
  BANANA: .asciz "Banana"
  INVALID_FRUIT: .asciz "Invalid Fruit"

Snippet 5.Assembly representation of C++ switch-case

As you can see, there is only one comparison. To find the correct location of the code we do some math. Let me walk through it:

Since arrays are zero-indexed, rdi should be adjusted so we subtract 1 from it.
Then compare it to the highest possible value; if it is bigger than that, jump to the default label (.Ld).
lea rcx, .Lt loads the array address into rcx.
mov rax, [rcx + rdi * 8] reads the address at offset rdi * 8 (in 64-bit computers takes 8 bytes) into rax.
jmp rax jumps to that address.

That’s it. If anything is still unclear, keep reading.

Why is switch-case faster than if-else?#

The conditional jump assembly instructions get expensive when the CPU mispredicts them — the CPU pipeline has to flush whatever work it had guessed and start over, so by minimizing them and by making them more predictable, the CPU can go ahead and be prepared for our code.

Do compilers always use jump tables?#

Actually no, there are many reasons why a compiler might decide to choose another approach. For example, if the options are fewer than a threshold, they choose to use the compare approach instead of a jump table, because in those cases an indirect jump is costlier than a few direct compares. So if you check the compiler-generated assembly, you may see different code.

What’s next?#

To actually practice and see jump tables in action, try the pwn.college assembly crash course — especially absolute-jump, relative-jump, jump-trampoline, conditional-jump, and indirect-jump challenges.

▸ stay subscribed

Liked this?

Drop your email and you'll get the next post when it's published. No tracking, one-click unsubscribe.