Tag: Engineering
Published on: 01 May 2024
When reading C++ reference material, you may have noticed that methods defined entirely inside a class are implicitly inline. Quoting cppreference.com:
A function defined entirely inside a class/struct/union definition, whether it’s a member function or a non-member friend function, is implicitly an inline function unless it is attached to a named module(since C++20).
This statement made me wonder: why is this necessary? The inline keyword is usually understood as a request to expand the function body at the call site, which can lead to code bloat, so why would the language designers make class methods implicitly inline?
The reason is simple: to avoid linker errors. In C++, the one-definition rule (ODR) states that a non-inline function or variable can have only one definition in the entire program. If you define such a function in a header file and include that header in multiple translation units, the linker will complain about multiple definitions.
For example, consider the following code, which violates the ODR:
First, we have a commonFunction defined in a header file common.h, which is later used by two functions in func1.cpp and func2.cpp.
// common.h
#pragma once
int commonFunction() {
    return 42;
}
// func1.h
#pragma once
int func1();
// func1.cpp
#include "common.h"
int func1() {
    return commonFunction();
}
// func2.h
#pragma once
int func2();
// func2.cpp
#include "common.h"
int func2() {
    return commonFunction();
}
Then, inside main.cpp, we call both func1 and func2.
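// main.cpp
#include <iostream>
#include "func1.h"
#include "func2.h"
int main() {
    int value1 = func1();
    int value2 = func2();
    // Print the results so the variables (and <iostream>) are actually used
    std::cout << value1 << " " << value2 << "\n";
    return 0;
}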
Now, each source file compiles on its own, but linking the object files together fails:
g++ -Wall -Wextra -std=c++11 -c main.cpp # OK
g++ -Wall -Wextra -std=c++11 -c func1.cpp # OK
g++ -Wall -Wextra -std=c++11 -c func2.cpp # OK
g++ -Wall -Wextra -std=c++11 -o main main.o func1.o func2.o # <- Linker error
duplicate symbol '__Z14commonFunctionv' in:
func1.o
func2.o
ld: 1 duplicate symbols
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [main] Error 1
Since two translation units (source files that are each compiled into an object file), func1.cpp and func2.cpp, include the common.h header file, commonFunction is defined twice, once in func1.o and once in func2.o, violating the ODR.
Inspecting the object files with objdump shows that commonFunction is defined in both object files:
objdump -t func1.o func2.o
func1.o: file format mach-o arm64
SYMBOL TABLE:
0000000000000000 l F __TEXT,__text ltmp0
0000000000000020 l O __LD,__compact_unwind ltmp1
0000000000000000 g F __TEXT,__text __Z14commonFunctionv # <- Note
0000000000000008 g F __TEXT,__text __Z5func1v
func2.o: file format mach-o arm64
SYMBOL TABLE:
0000000000000000 l F __TEXT,__text ltmp0
0000000000000020 l O __LD,__compact_unwind ltmp1
0000000000000000 g F __TEXT,__text __Z14commonFunctionv # <- Note
0000000000000008 g F __TEXT,__text __Z5func2v
In addition, note that commonFunction is defined as a global symbol in both object files, marked with the flag g in the symbol table (as we will see, the inline version gets the w flag instead, marking it as a weak symbol).
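The symbol names in the table are mangled C++ names. As a minimal sketch for decoding them, the helper below (demangle is a hypothetical name, not from the original code) wraps abi::__cxa_demangle from <cxxabi.h>, which is available with GCC and Clang but is not part of standard C++. It also assumes the name is passed in its Itanium-ABI form: macOS object files prepend one extra underscore, so __Z14commonFunctionv in the objdump output corresponds to _Z14commonFunctionv here.

```cpp
#include <cstdlib>   // std::free
#include <string>
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang extension, not standard C++)

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol
// name, or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* result = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || result == nullptr) {
        return mangled;
    }
    std::string text(result);
    std::free(result);  // the returned buffer is allocated with malloc
    return text;
}
```

For instance, demangle("_Z14commonFunctionv") should give back the readable signature of commonFunction, which makes it easier to match symbol-table entries with the functions in the source.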
To solve the linker error, we can make commonFunction inline. According to the ODR, inline functions, inline variables (since C++17), templates, and types can be defined in more than one translation unit, as long as the definitions are identical.
Updated common.h:
// common.h
#pragma once
inline int commonFunction() {
    return 42;
}
Now, the code compiles and links successfully. Inspecting the updated object files with objdump:
func1.o: file format mach-o arm64
SYMBOL TABLE:
0000000000000000 l F __TEXT,__text ltmp0
0000000000000020 l O __LD,__compact_unwind ltmp1
0000000000000014 w F __TEXT,__text __Z14commonFunctionv # <- Note
0000000000000000 g F __TEXT,__text __Z5func1v
func2.o: file format mach-o arm64
SYMBOL TABLE:
0000000000000000 l F __TEXT,__text ltmp0
0000000000000020 l O __LD,__compact_unwind ltmp1
0000000000000014 w F __TEXT,__text __Z14commonFunctionv # <- Note
0000000000000000 g F __TEXT,__text __Z5func2v
See? commonFunction is now defined as a weak symbol, marked with the flag w in the symbol table. Weak symbols exist to handle linking scenarios where multiple definitions of the same symbol might exist. When the linker sees several weak definitions of the same symbol, it keeps one and discards the rest instead of reporting a duplicate. A weak symbol can also be overridden by a non-weak symbol (often called “global” or “strong”): if a strong definition exists, it wins; if none is found, the weak one is used.
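The ODR allowance mentioned above is not limited to functions marked inline. As a rough sketch, the header-style definitions below could all be included from several translation units without a duplicate-symbol error. The file name shared_defs.h and the names answer, twice and Counter are made up for illustration, and a real header would start with #pragma once like the ones in this post.

```cpp
// shared_defs.h (hypothetical header, for illustration only)

// Explicitly inline: may be defined in every translation unit that
// includes this header, as long as the definitions are identical.
inline int answer() {
    return 42;
}

// Function template: its definition normally lives in the header, and the
// same instantiation may be generated in several translation units.
template <typename T>
T twice(T value) {
    return value + value;
}

// Class type: the class definition itself may be repeated in each
// translation unit, and increment() is implicitly inline because it is
// defined inside the class body.
struct Counter {
    int count = 0;
    int increment() { return ++count; }
};
```

Dropping the inline keyword from answer() would bring back the duplicate-symbol situation from the first example, while the template and the in-class member function would stay safe to include everywhere.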
Now, the answer to our original question is apparent. In C++, it’s common to see headers that also contain the implementation of class methods (I don’t want to argue whether this is a good practice or not). If these methods were not inline, multiple-definition errors would arise, as we have seen above. By making methods defined inside the class implicitly inline, the language designers avoid this issue.
To see this in action, consider the following C++ code:
// common.h
#pragma once
class Common {
public:
    int commonFunction() {
        return 42;
    }
};
// func1.h
#pragma once
int func1();
// func1.cpp
#include "common.h"
#include "func1.h"
int func1() {
    Common common;
    return common.commonFunction();
}
Compiling it, we can see that commonFunction is indeed defined as a weak symbol in the object file:
func1.o: file format mach-o arm64
SYMBOL TABLE:
0000000000000000 l F __TEXT,__text ltmp0
0000000000000038 l O __LD,__compact_unwind ltmp1
0000000000000000 g F __TEXT,__text __Z5func1v
0000000000000020 w F __TEXT,__text __ZN6Common14commonFunctionEv # <- this
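One caveat worth sketching: the implicit inline only applies to member functions defined inside the class body. A member function that is declared in the class but defined later in the same header is an ordinary function definition, so it needs an explicit inline to stay safe across translation units. The variation below assumes an extra member, otherFunction, which is hypothetical and not part of the original example.

```cpp
// common.h (variation on the header above; otherFunction is hypothetical)
class Common {
public:
    // Defined inside the class body: implicitly inline.
    int commonFunction() {
        return 42;
    }

    // Only declared here; the definition follows outside the class body.
    int otherFunction();
};

// Defined outside the class body but still in the header: not implicitly
// inline, so the keyword is needed to avoid duplicate symbols when several
// translation units include this header.
inline int Common::otherFunction() {
    return commonFunction() + 1;
}
```

Without the inline keyword on Common::otherFunction, including this header from both func1.cpp and func2.cpp would be expected to reproduce the same kind of duplicate-symbol error shown at the start of the post.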