A false trick on swapping two variables

August 16, 2017

There are two typical approaches to implement an function to compute the n’th element of the Fibonacci sequence.

  • One approach:
class Solution {
public:
    int Fibonacci(int n) {
        if (n <= 0) {
            return 0;
        }
        if (n == 1) {
            return 1;
        }
        int a_i_1 = 0;
        int a_i = 1;
        int tmp;
        for (int i = 2; i <= n; i++) {
            tmp = a_i;
            a_i = a_i_1 + a_i;
            a_i_1 = tmp;
        }
        return a_i;
    }
};
  • Yet another approach:
class Solution {
public:
    int Fibonacci(int n) {
        if (n <= 0) {
            return 0;
        }
        if (n == 1) {
            return 1;
        }
        int a_i_1 = 0;
        int a_i = 1;
        //int tmp;
        for (int i = 2; i <= n; i++) {
            //tmp = a_i;
            a_i = a_i_1 + a_i;
            a_i_1 = a_i - a_i_1;
        }
        return a_i;
    }
};

It seems that the second approach will ease the number of registers used and the decrease in number of statements will result in a better performance. However, it is proven to be false.

Below are my experiments: First write a Makefile to compile them:

CC = g++
CFLAGS = -Wall -g

TARGET = driver
OBJECT = driver.o
default: driver

$(TARGET): $(OBJECT)
	$(CC) $(CFLAGS) -o $(TARGET) $(OBJECT)

OBJECT: driver.cpp
	$(CC) $(CFLAGS) -o driver.cpp

clean:
    -rm -f *.o *~

Then make to compile the source driver.cpp and generate the binary driver,

Run the binary to see if anything goes right.

./driver

Dump the binary into assembly:

objdump -D driver > driver.s

Do these steps for both the versions with and without tmp

For the version with tmp:

00000000004005dc <_ZN8Solution9FibonacciEi>:
403   4005dc: 55                    push   %rbp
404   4005dd: 48 89 e5              mov    %rsp,%rbp
405   4005e0: 48 89 7d e8           mov    %rdi,-0x18(%rbp)
406   4005e4: 89 75 e4              mov    %esi,-0x1c(%rbp)
407   4005e7: 83 7d e4 00           cmpl   $0x0,-0x1c(%rbp)
408   4005eb: 7f 07                 jg     4005f4 <_ZN8Solution9FibonacciEi+0x18>
409   4005ed: b8 00 00 00 00        mov    $0x0,%eax
410   4005f2: eb 45                 jmp    400639 <_ZN8Solution9FibonacciEi+0x5d>
411   4005f4: 83 7d e4 01           cmpl   $0x1,-0x1c(%rbp)
412   4005f8: 75 07                 jne    400601 <_ZN8Solution9FibonacciEi+0x25>
413   4005fa: b8 01 00 00 00        mov    $0x1,%eax
414   4005ff: eb 38                 jmp    400639 <_ZN8Solution9FibonacciEi+0x5d>
415   400601: c7 45 fc 00 00 00 00  movl   $0x0,-0x4(%rbp)
416   400608: c7 45 f8 01 00 00 00  movl   $0x1,-0x8(%rbp)
417   40060f: c7 45 f4 02 00 00 00  movl   $0x2,-0xc(%rbp)
418   400616: eb 16                 jmp    40062e <_ZN8Solution9FibonacciEi+0x52>
419   400618: 8b 45 f8              mov    -0x8(%rbp),%eax
420   40061b: 89 45 f0              mov    %eax,-0x10(%rbp)
421   40061e: 8b 45 fc              mov    -0x4(%rbp),%eax
422   400621: 01 45 f8              add    %eax,-0x8(%rbp)
423   400624: 8b 45 f0              mov    -0x10(%rbp),%eax
424   400627: 89 45 fc              mov    %eax,-0x4(%rbp)
425   40062a: 83 45 f4 01           addl   $0x1,-0xc(%rbp)
426   40062e: 8b 45 f4              mov    -0xc(%rbp),%eax
427   400631: 3b 45 e4              cmp    -0x1c(%rbp),%eax
428   400634: 7e e2                 jle    400618 <_ZN8Solution9FibonacciEi+0x3c>
429   400636: 8b 45 f8              mov    -0x8(%rbp),%eax
430   400639: 5d                    pop    %rbp
431   40063a: c3                    retq

For the version without tmp:

00000000004005dc <_ZN8Solution9FibonacciEi>:
401   4005dc: 55                    push   %rbp
402   4005dd: 48 89 e5              mov    %rsp,%rbp
403   4005e0: 48 89 7d e8           mov    %rdi,-0x18(%rbp)
404   4005e4: 89 75 e4              mov    %esi,-0x1c(%rbp)
405   4005e7: 83 7d e4 00           cmpl   $0x0,-0x1c(%rbp)
406   4005eb: 7f 07                 jg     4005f4 <_ZN8Solution9FibonacciEi+0x18>
407   4005ed: b8 00 00 00 00        mov    $0x0,%eax
408   4005f2: eb 46                 jmp    40063a <_ZN8Solution9FibonacciEi+0x5e>
409   4005f4: 83 7d e4 01           cmpl   $0x1,-0x1c(%rbp)
410   4005f8: 75 07                 jne    400601 <_ZN8Solution9FibonacciEi+0x25>
411   4005fa: b8 01 00 00 00        mov    $0x1,%eax
412   4005ff: eb 39                 jmp    40063a <_ZN8Solution9FibonacciEi+0x5e>
413   400601: c7 45 fc 00 00 00 00  movl   $0x0,-0x4(%rbp)
414   400608: c7 45 f8 01 00 00 00  movl   $0x1,-0x8(%rbp)
415   40060f: c7 45 f4 02 00 00 00  movl   $0x2,-0xc(%rbp)
416   400616: eb 17                 jmp    40062f <_ZN8Solution9FibonacciEi+0x53>
417   400618: 8b 45 fc              mov    -0x4(%rbp),%eax
418   40061b: 01 45 f8              add    %eax,-0x8(%rbp)
419   40061e: 8b 45 fc              mov    -0x4(%rbp),%eax
420   400621: 8b 55 f8              mov    -0x8(%rbp),%edx
421   400624: 29 c2                 sub    %eax,%edx
422   400626: 89 d0                 mov    %edx,%eax
423   400628: 89 45 fc              mov    %eax,-0x4(%rbp)
424   40062b: 83 45 f4 01           addl   $0x1,-0xc(%rbp)
425   40062f: 8b 45 f4              mov    -0xc(%rbp),%eax
426   400632: 3b 45 e4              cmp    -0x1c(%rbp),%eax
427   400635: 7e e1                 jle    400618 <_ZN8Solution9FibonacciEi+0x3c>
428   400637: 8b 45 f8              mov    -0x8(%rbp),%eax
429   40063a: 5d                    pop    %rbp

Compare these two versions of assembly code, the corresponding parts inside the loop are: line 419 to 427 in first version; line 417 to 426 in second version; They both have 6 mov, 1 add, 1 addl, but the second version has one more sub

Notes: To make it easier to read the assembly code, here I briefly give a map between C variables and the corresponding notation in assembly:

-0x1c(%rbp) is n.

-0xc(%rbp) is i inside the loop.

-0x8(%rbp) is corresponding to a_i.

-0x4(%rbp) is corresponding to a_i_1.

Conclusion: There is no actual performance benefit to use subtraction in order to avoid extra temperal variables in C code, since the actual variable swap is handled by compiler.