extern template
Explicit-Instantiation Declarations
The extern template prefix can be used to suppress implicit generation of local object code for the definitions of particular specializations of class, function, or variable templates used within a translation unit, with the expectation that any suppressed object-code-level definitions will be provided elsewhere within the program by template definitions that are instantiated explicitly.
Description
Inherent in the current ecosystem for supporting template programming in C++ is the need to generate redundant definitions of fully specified function and variable templates within .o files. For common instantiations of popular templates, such as std::vector, the increased object-file size, a.k.a. code bloat, and potentially extended link times might become significant:
#include <vector>   // std::vector is a popular template.
std::vector<int> v; // std::vector<int> is a common instantiation.

#include <string>   // std::basic_string is a popular template.
std::string s;      // std::string, an alias for std::basic_string<char>, is
                    // a common instantiation.
The intent of the extern template feature is to suppress the implicit generation of duplicative object code within every translation unit in which a fully specialized class template, such as std::vector<int> in the code snippet above, is used. Instead, extern template allows developers to choose a single translation unit in which to explicitly generate object code for all the definitions pertaining to that specific template specialization as explained next.
Explicit-instantiation definition
Creating an explicit-instantiation definition was possible prior to C++11. The requisite syntax is to place the keyword template in front of the name of the fully specialized class template, function template, or, in C++14, variable template (see Section 1.2, “Variable Templates,” on page 157):
#include <vector> // std::vector (general template)

template class std::vector<int>;
    // Deposit all definitions for this specialization into the .o for this
    // translation unit.
This explicit-instantiation directive compels the compiler to instantiate all functions defined by the named std::vector class template having the specified int template argument; any collateral object code resulting from these instantiations will be deposited in the resulting .o file for the current translation unit. Importantly, even functions that are never used are still instantiated, so this solution might not be the correct one for many classes; see Potential Pitfalls — Accidentally making matters worse on page 373.
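The difference between explicit and implicit instantiation described above can be sketched in a few lines. This is a minimal illustration, not code from this section: the Wrapper class template and the implicitUse function are hypothetical names invented here, and the failing line is left commented out so that the sketch compiles.

```cpp
#include <cstddef>  // std::size_t
#include <string>   // std::string

// Hypothetical class template used only for this illustration.
template <typename T>
struct Wrapper
{
    T d_value;

    T get() const     { return d_value; }
    T negated() const { return -d_value; }  // requires unary minus on T
};

template class Wrapper<int>;
    // Explicit-instantiation definition: both get and negated are
    // instantiated for int, whether or not anything ever calls them.

// template class Wrapper<std::string>;
    // If uncommented, this line would fail to compile because negated would
    // be instantiated too, and std::string has no unary minus.

std::size_t implicitUse()
    // Implicit instantiation generates only the members actually used, so
    // Wrapper<std::string> is fine as long as negated is never called.
{
    Wrapper<std::string> w = { "abc" };
    return w.get().size();
}
```

The commented-out line is the reason explicit instantiation of a whole class is not always an option: every member must be valid for the chosen template arguments.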
Explicit-instantiation declaration
C++11 introduced the explicit-instantiation declaration, a complement to the explicit-instantiation definition. The newly provided syntax allows us to place extern template in front of the declaration of an explicit specialization of a class template, a function template, or a variable template:
#include <vector> // std::vector (general template)

extern template class std::vector<int>;
    // Suppress depositing of any object code for std::vector<int> into the
    // .o file for this translation unit.
Using the modern extern template syntax above instructs the compiler to refrain from depositing any object code for the named specialization in the current translation unit and instead to rely on some other translation unit to provide any missing object-level definitions that might be needed at link time; see Annoyances — No good place to put definitions for unrelated classes on page 373.
Note, however, that declaring an explicit instantiation to be an extern template in no way affects the ability of the compiler to instantiate and to inline visible function-definition bodies for that template specialization in the translation unit:
// client.cpp:
#include <vector> // std::vector (general template)

extern template class std::vector<int>;

void client(std::vector<int>& inOut) // fully specialized instance of a vector
{
    if (inOut.size()) // This invocation of size can inline.
    {
        int value = inOut[0]; // This invocation of operator[] can be inlined.
    }
}
In the previous example, the two tiny member functions of vector, namely, size and operator[], will typically be inlined — in precisely the same way they would have been had the extern template declaration been omitted. The only purpose of an extern template declaration is to suppress object-code generation for this particular template instantiation for the current translation unit.
Finally, note that the use of explicit-instantiation directives has absolutely no effect on the logical meaning of a well-formed program; in particular, when applied to specializations of function templates, they have no effect on overload resolution:
template <typename T> bool f(T v) {/*...*/} // general template definition

extern template bool f(char c); // specialization of f for char
extern template bool f(int v);  // specialization of f for int

bool bc = f((char) 0);    // exact match: Object code is suppressed locally.
bool bs = f((short) 0);   // not exact match: Object code is generated locally.
bool bi = f((int) 0);     // exact match: Object code is suppressed locally.
bool bu = f((unsigned)0); // not exact match: Object code is generated locally.
As the example above illustrates, overload resolution and template argument deduction occur independently of any explicit-instantiation declarations. Only after the template to be instantiated is determined does the extern template syntax take effect; see also Potential Pitfalls — Corresponding explicit-instantiation declarations and definitions on page 371.
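A small self-checking variant of the same idea may help confirm that deduction is unaffected. The sketch below is an assumption-laden illustration: deducedInt, viaInt, viaShort, and viaChar are hypothetical names, and the explicit-instantiation definitions are placed at the bottom of the same file only so that a single-file build links; in a real project they would live in exactly one .cpp file.

```cpp
#include <type_traits>  // std::is_same

// Hypothetical function template: reports whether T was deduced as int.
template <typename T>
bool deducedInt(T)
{
    return std::is_same<T, int>::value;
}

extern template bool deducedInt(int);   // explicit-instantiation declarations
extern template bool deducedInt(char);

bool viaInt()   { return deducedInt(0); }                      // T is int
bool viaShort() { return deducedInt(static_cast<short>(0)); }  // T is short
bool viaChar()  { return deducedInt('a'); }                    // T is char

// Explicit-instantiation definitions.  When a declaration and a definition
// for the same specialization share a translation unit, the definition must
// follow the declaration, as it does here.
template bool deducedInt(int);
template bool deducedInt(char);
```

A short argument still deduces T as short even though an extern template declaration exists for int, so viaShort reports false.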
A more complete illustrative example
So far, we have seen the use of explicit-instantiation declarations and explicit-instantiation definitions applied to only a standard class template, std::vector. The same syntax shown in the previous code snippet applies also to full specializations of individual function templates and variable templates.
As a more comprehensive, albeit largely pedagogical, example, consider the overly simplistic my::Vector class template along with other related templates defined within a header file, my_vector.h:
// my_vector.h:
#ifndef INCLUDED_MY_VECTOR // internal include guard
#define INCLUDED_MY_VECTOR

#include <cstddef> // std::size_t
#include <utility> // std::swap

namespace my // namespace for all entities defined within this component
{

template <typename T>
class Vector
{
    static std::size_t s_count; // track number of objects constructed

    T* d_data_p;            // pointer to dynamically allocated memory
    std::size_t d_length;   // current number of elements in the vector
    std::size_t d_capacity; // number of elements currently allocated

public:
    // ...

    std::size_t length() const { return d_length; }
        // Return the number of elements.

    // ...
};

// ... Any partial or full specialization definitions ...
// ... of the class template Vector go here. ...

template <typename T>
void swap(Vector<T> &lhs, Vector<T> &rhs) { return std::swap(lhs, rhs); }
    // free function that operates on objects of type my::Vector via ADL

// ... Any [full] specialization definitions ...
// ... of free function swap would go here. ...

template <typename T>
const std::size_t vectorSize = sizeof(Vector<T>); // C++14 variable template
    // This nonmodifiable static variable holds the size of a my::Vector<T>.

// ... Any [full] specialization definitions ...
// ... of variable vectorSize would go here. ...

template <typename T>
std::size_t Vector<T>::s_count = 0;
    // definition of static counter in general template

// ... We might opt to add explicit-instantiation declarations here.
// ...

} // Close my namespace.

#endif // Close internal include guard.
In the my_vector component in the code snippet above, we have defined the following in the my namespace:
A class template, Vector, parameterized on element type
A free-function template, swap, that operates on objects of corresponding specialized Vector type
A const C++14 variable template, vectorSize, that represents the number of bytes in the footprint of an object of the corresponding specialized Vector type
Any use of these templates by a client might and typically will trigger the depositing of equivalent definitions as object code in the client translation unit’s resulting .o file, irrespective of whether the definition being used winds up getting inlined.
To eliminate object code for specializations of entities in the my_vector component, we must first decide where the unique definitions will go; see Annoyances — No good place to put definitions for unrelated classes on page 373. In this specific case, we own the component that requires specialization, and the specialization is for a ubiquitous built-in type; hence, the natural place to generate the specialized definitions is in a .cpp file corresponding to the component’s header:
// my_vector.cpp:
#include <my_vector.h> // We always include the component's own header first.
    // By including this header file, we have introduced the general template
    // definitions for each of the explicit-instantiation definitions below.

namespace my // namespace for all entities defined within this component
{

template class Vector<int>;
    // Generate object code for all nontemplate member functions and definitions
    // of static data members of template my::Vector having int elements.

template std::size_t Vector<double>::length() const; // BAD IDEA
    // In addition, we could generate object code for just a particular member
    // function definition of my::Vector (e.g., length) for some other
    // argument type (e.g., double).

template void swap(Vector<int>& lhs, Vector<int>& rhs);
    // Generate object code for the full specialization of the swap free-
    // function template that operates on objects of type my::Vector<int>.

template const std::size_t vectorSize<int>; // C++14 variable template
    // Generate the object-code-level definition for the specialization of the
    // C++14 variable template instantiated for built-in type int.

template std::size_t Vector<int>::s_count;
    // Generate the object-code-level definition for the specialization of the
    // static member variable of Vector instantiated for built-in type int.

} // Close my namespace.
Each of the constructs introduced by the keyword template within the my namespace in the previous example represents a separate explicit-instantiation definition. These constructs instruct the compiler to generate object-level definitions for general templates declared in my_vector.h specialized on the built-in type int. Explicit instantiation of individual member functions, such as length() in the example, is, however, only rarely useful; see Annoyances — All members of an explicitly defined template class must be valid on page 374.
Having installed the necessary explicit-instantiation definitions in the component’s my_vector.cpp file, we must now go back to its my_vector.h file and, without altering any of the previously existing lines of code, add the corresponding explicit-instantiation declarations to suppress redundant local code generation:
// my_vector.h:
#ifndef INCLUDED_MY_VECTOR // internal include guard
#define INCLUDED_MY_VECTOR

namespace my // namespace for all entities defined within this component
{

// ...
// ... everything that was in the original my namespace
// ...

// -----------------------------------
// explicit-instantiation declarations
// -----------------------------------

extern template class Vector<int>;
    // Suppress object code for this class template specialized for int.

extern template std::size_t Vector<double>::length() const; // BAD IDEA
    // Suppress object code for this member, only specialized for double.

extern template void swap(Vector<int>& lhs, Vector<int>& rhs);
    // Suppress object code for this free function specialized for int.

extern template const std::size_t vectorSize<int>; // C++14
    // Suppress object code for this variable template specialized for int.
    // The type, including const, matches the variable template's declaration.

extern template std::size_t Vector<int>::s_count;
    // Suppress object code for this static member definition w.r.t. int.

} // Close my namespace.

#endif // Close internal include guard.
Each of the constructs that begins with extern template in the example above is an explicit-instantiation declaration, which serves only to suppress the generation of any object code emitted to the .o file of the current translation unit in which such specializations are used. These added extern template declarations must appear in my_vector.h after the declaration of the corresponding general template and, importantly, before whatever relevant definitions are ever used.
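The required ordering can be summarized in a single-file sketch. This is only an illustration under stated assumptions: Tally and bumpTwice are hypothetical names that are not part of the my_vector component, and step (4) would normally sit in the component's .cpp file rather than alongside the other three steps.

```cpp
#include <cstddef>  // std::size_t

// (1) General template: must be declared before any explicit-instantiation
//     declaration that names it.  (Hypothetical class for illustration.)
template <typename T>
class Tally
{
    static std::size_t s_count;  // shared by all uses of Tally<T>

  public:
    static std::size_t bump() { return ++s_count; }
};

template <typename T>
std::size_t Tally<T>::s_count = 0;

// (2) Explicit-instantiation declaration: must precede the first use that
//     would otherwise trigger implicit instantiation.
extern template class Tally<int>;

// (3) Use: no object code for Tally<int> members is deposited on account
//     of this use.
std::size_t bumpTwice()
{
    Tally<int>::bump();
    return Tally<int>::bump();
}

// (4) Explicit-instantiation definition: normally in exactly one .cpp file.
//     When it shares a translation unit with the declaration, it must
//     follow that declaration.
template class Tally<int>;
```

Placing step (2) in the header, immediately after the general template, is what guarantees that every client sees the declaration before any use.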
The effect on various .o files
To illustrate the effect of explicit-instantiation declarations and explicit-instantiation definitions on the contents of object and executable files, we’ll use a simple lib_interval library component consisting of a header file, lib_interval.h, and an implementation file, lib_interval.cpp. The latter, apart from including its corresponding header, is effectively empty:
// lib_interval.h:
#ifndef INCLUDED_LIB_INTERVAL // internal include guard
#define INCLUDED_LIB_INTERVAL

namespace lib // namespace for all entities defined within this component
{

template <typename T> // elided definition of a class template
class Interval
{
    T d_low;  // interval's low value
    T d_high; // interval's high value

public:
    explicit Interval(const T& p) : d_low(p), d_high(p) { }
        // Construct an empty interval.

    Interval(const T& low, const T& high) : d_low(low), d_high(high) { }
        // Construct an interval having the specified boundary values.

    const T& low() const { return d_low; }
        // Return this interval's low value.

    const T& high() const { return d_high; }
        // Return this interval's high value.

    int length() const { return d_high - d_low; }
        // Return this interval's length.

    // ...
};

template <typename T> // elided definition of a function template
bool intersect(const Interval<T>& i1, const Interval<T>& i2)
    // Determine whether the specified intervals intersect.
{
    bool result = false; // nonintersecting until proven otherwise
    // ...
    return result;
}

} // Close lib namespace.

#endif // INCLUDED_LIB_INTERVAL

// lib_interval.cpp:
#include <lib_interval.h>
The library component above defines, in the namespace lib, an implementation of (1) a class template, Interval, and (2) a function template, intersect.
Let’s also consider a trivial application that uses this library component:
// app.cpp:
#include <lib_interval.h> // Include the library component's header file.

int main(int argc, const char** argv)
{
    lib::Interval<double> a(0, 5); // instantiate with double type argument
    lib::Interval<double> b(3, 8); // instantiate with double type argument
    lib::Interval<int> c(4, 9);    // instantiate with int type argument

    if (lib::intersect(a, b)) // instantiate deducing double type argument
    {
        return 0; // Return "success" as (0.0, 5.0) does intersect (3.0, 8.0).
    }

    return 1; // Return "failure" status as function apparently doesn't work.
}
The purpose of this application is merely to exhibit a couple of instantiations of the library class template, lib::Interval, for type arguments int and double, and of the library function template, lib::intersect, for just double.
Next, we compile the application and library translation units, app.cpp and lib_interval.cpp, and inspect the symbols in their respective corresponding object files, app.o and lib_interval.o:
$ gcc -I. -c app.cpp lib_interval.cpp
$ nm -C app.o lib_interval.o

app.o:
0000000000000000 W lib::Interval<double>::Interval(double const&, double const&)
0000000000000000 W lib::Interval<int>::Interval(int const&, int const&)
0000000000000000 W bool lib::intersect<double>(lib::Interval<double> const&, lib::Interval<double> const&)
0000000000000000 T main

lib_interval.o:
Looking at app.o in the previous example, the class and function templates used in the main function, which is defined in the app.cpp file, were instantiated implicitly, and the relevant code was added to the resulting object file, app.o, with each instantiated function definition in its own separate section. In the Interval class template, the generated symbols correspond to the two unique instantiations of the constructor, i.e., for double and int, respectively. The intersect function template, however, was implicitly instantiated for only type double. Note that all of the implicitly instantiated functions have the W symbol type, indicating that they are weak symbols, which are permitted to be present in multiple object files. By contrast, this file also defines the strong symbol main, marked here by a T. Linking app.o with any other object file containing such a symbol would cause the linker to report a multiply-defined-symbol error. On the other hand, the lib_interval.o file corresponds to the lib_interval library component, whose .cpp file served only to include its own .h file, and is again effectively empty.
Let’s now link the two object files, app.o and lib_interval.o, and inspect the symbols in the resulting executable, app:
$ gcc -o app app.o lib_interval.o
$ nm -C app
000000000040056e W lib::Interval<double>::Interval(double const&, double const&)
00000000004005a2 W lib::Interval<int>::Interval(int const&, int const&)
00000000004005ce W bool lib::intersect<double>(lib::Interval<double> const&, lib::Interval<double> const&)
00000000004004b7 T main
As the textual output above confirms, the final program contains exactly one copy of each weak symbol. In this tiny illustrative example, these weak symbols have been defined in only a single object file, thus not requiring the linker to select one definition out of many.
More generally, if the application comprises multiple object files, each file will potentially contain its own set of weak symbols, often leading to duplicate code sections for implicitly instantiated class, function, and variable templates instantiated on the same type arguments. When the linker combines object files, it will arbitrarily choose at most one of each of these respective and ideally identical weak-symbol sections to include in the final executable.
Imagine now that our program includes a large number of object files, many of which make use of our lib_interval component, particularly to operate on double intervals.
Suppose, for now, we decide we would like to suppress the generation of object code for templates related to just double type with the intent of later putting them all in one place, i.e., the currently empty lib_interval.o. Achieving this objective is precisely what the extern template syntax is designed to accomplish.
Returning to our lib_interval.h file, we need not change one line of code; we need only to add two explicit-instantiation declarations, one for the class template, Interval<double>, and one for the function template, intersect<double>(const Interval<double>&, const Interval<double>&), to the header file anywhere after their respective corresponding general template declaration and definition:
// lib_interval.h: // No change to existing code.
#ifndef INCLUDED_LIB_INTERVAL // internal include guard
#define INCLUDED_LIB_INTERVAL

namespace lib // namespace for all entities defined within this component
{

template <typename T>
class Interval
{
    // ... (same as before)
};

template <typename T>
bool intersect(const Interval<T>& i1, const Interval<T>& i2)
{
    // ... (same as before)
}

extern template class Interval<double>; // explicit-instantiation declaration

extern template // explicit-instantiation declaration
bool intersect(const Interval<double>&, const Interval<double>&);

} // close lib namespace

#endif // INCLUDED_LIB_INTERVAL
Let’s again compile the two .cpp files and inspect the corresponding .o files:
$ gcc -I. -c app.cpp lib_interval.cpp
$ nm -C app.o lib_interval.o

app.o:
                 U lib::Interval<double>::Interval(double const&, double const&)
0000000000000000 W lib::Interval<int>::Interval(int const&, int const&)
                 U bool lib::intersect<double>(lib::Interval<double> const&, lib::Interval<double> const&)
0000000000000000 T main

lib_interval.o:
Notice that this time some of the symbols, specifically those relating to the class and function templates instantiated for type double, have changed from W, indicating a weak symbol, to U, indicating an undefined one. This symbol type change means that instead of generating a weak symbol for the explicit specializations for double, the compiler left those symbols undefined, as if only the declarations of the member and free-function templates had been available when compiling app.cpp, yet inlining of the instantiated definitions is in no way affected. Undefined symbols are expected to be made available to the linker from other object files. Attempting to link this application expectedly fails because no object files being linked contain the needed definitions for those instantiations:
$ gcc -o app app.o lib_interval.o
app.o: In function 'main':
app.cpp:(.text+0x38): undefined reference to `lib::Interval<double>::Interval(double const&, double const&)'
app.cpp:(.text+0x69): undefined reference to `lib::Interval<double>::Interval(double const&, double const&)'
app.cpp:(.text+0xa1): undefined reference to `bool lib::intersect<double>(lib::Interval<double> const&, lib::Interval<double> const&)'
collect2: error: ld returned 1 exit status
To provide the missing definitions, we will need to instantiate them explicitly. Since the type for which the class and function are being specialized is the ubiquitous built-in type, double, the ideal place to sequester those definitions would be within the object file of the lib_interval library component itself, but see Annoyances — No good place to put definitions for unrelated classes on page 373. To force the needed template definitions into the lib_interval.o file, we will need to use explicit-instantiation definition syntax, i.e., the template prefix:
// lib_interval.cpp:
#include <lib_interval.h>

template class lib::Interval<double>;
    // example of an explicit-instantiation definition for a class

template bool lib::intersect(const Interval<double>&, const Interval<double>&);
    // example of an explicit-instantiation definition for a function
We recompile once again and inspect our newly generated object files:
$ gcc -I. -c app.cpp lib_interval.cpp
$ nm -C app.o lib_interval.o

app.o:
                 U lib::Interval<double>::Interval(double const&, double const&)
0000000000000000 W lib::Interval<int>::Interval(int const&, int const&)
                 U bool lib::intersect<double>(lib::Interval<double> const&, lib::Interval<double> const&)
0000000000000000 T main

lib_interval.o:
0000000000000000 W lib::Interval<double>::Interval(double const&)
0000000000000000 W lib::Interval<double>::Interval(double const&, double const&)
0000000000000000 W lib::Interval<double>::low() const
0000000000000000 W lib::Interval<double>::high() const
0000000000000000 W lib::Interval<double>::length() const
0000000000000000 W bool lib::intersect<double>(lib::Interval<double> const&, lib::Interval<double> const&)
The application object file, app.o, naturally remained unchanged. What’s new here is that the functions that were missing from the app.o file are now available in the lib_interval.o file, again as weak (W), as opposed to strong (T), symbols. Notice, however, that explicit instantiation forces the compiler to generate code for all of the member functions of the class template for a given specialization. These symbols might all be linked into the resulting executable unless we take explicit precautions to exclude those that aren’t needed:
$ gcc -o app app.o lib_interval.o -Wl,--gc-sections
$ nm -C app
00000000004005ca W lib::Interval<double>::Interval(double const&, double const&)
000000000040056e W lib::Interval<int>::Interval(int const&, int const&)
000000000040063d W bool lib::intersect<double>(lib::Interval<double> const&, lib::Interval<double> const&)
00000000004004b7 T main
The extern template feature is provided to enable software architects to reduce code bloat in individual object files for common instantiations of class, function, and, as of C++14, variable templates in large-scale C++ software systems. The practical benefit is in reducing the physical size of libraries, which might lead to improved link times. Explicit-instantiation declarations do not (1) affect the meaning of a program, (2) suppress inline template implicit instantiation, (3) impede the compiler’s ability to inline, or (4) meaningfully improve compile time. To be clear, the only purpose of the extern template syntax is to suppress object-code generation for the current translation unit, which is then selectively overridden in the translation unit(s) of choice.
Use Cases
Reducing template code bloat in object files
The motivation for the extern template syntax is as a purely physical, not logical, optimization, i.e., to reduce the amount of redundant code within individual object files resulting from common template instantiations in client code. As an example, consider a fixed-size-array class template, FixedArray, that is used widely, i.e., by many clients from separate translation units, in a large-scale game project for both integral and floating-point calculations, primarily with type arguments int and double and array sizes of either 2 or 3:
// game_fixedarray.h:
#ifndef INCLUDED_GAME_FIXEDARRAY // internal include guard
#define INCLUDED_GAME_FIXEDARRAY

#include <cstddef> // std::size_t

namespace game // namespace for all entities defined within this component
{

template <typename T, std::size_t N> // widely used class template
class FixedArray
{
    // ... (elided private implementation details)

public:
    FixedArray() { /*...*/ }
    FixedArray(const FixedArray<T, N>& other) { /*...*/ }
    T& operator[](std::size_t index) { /*...*/ }
    const T& operator[](std::size_t index) const { /*...*/ }
};

template <typename T, std::size_t N>
T dot(const FixedArray<T, N>& a, const FixedArray<T, N>& b) { /*...*/ }
    // Return the scalar ("dot") product of the specified 'a' and 'b'.

// Explicit-instantiation declarations for full template specializations
// commonly used by the game project are provided below.

extern template class FixedArray<int, 2>;             // class template
extern template int dot(const FixedArray<int, 2>& a,  // function template
                        const FixedArray<int, 2>& b); // for int and 2

extern template class FixedArray<int, 3>;             // class template
extern template int dot(const FixedArray<int, 3>& a,  // function template
                        const FixedArray<int, 3>& b); // for int and 3

extern template class FixedArray<double, 2>;          // for double and 2
extern template double dot(const FixedArray<double, 2>& a,
                           const FixedArray<double, 2>& b);

extern template class FixedArray<double, 3>;          // for double and 3
extern template double dot(const FixedArray<double, 3>& a,
                           const FixedArray<double, 3>& b);

} // Close game namespace.

#endif // INCLUDED_GAME_FIXEDARRAY
Specializations commonly used by the game project are provided by the game library. In the component header in the example above, we have used the extern template syntax to suppress object-code generation for instantiations of both the class template FixedArray and the function template dot for element types int and double, each for array sizes 2 and 3. To ensure that these specialized definitions are available in every program that might need them, we use the template syntax counterpart to force object-code generation within just the one .o corresponding to the game_fixedarray library component:
// game_fixedarray.cpp:
#include <game_fixedarray.h> // included as first substantive line of code

// Explicit-instantiation definitions for full template specializations
// commonly used by the game project are provided below.

template class game::FixedArray<int, 2>;             // class template
template int game::dot(const FixedArray<int, 2>& a,  // function template
                       const FixedArray<int, 2>& b); // for int and 2

template class game::FixedArray<int, 3>;             // class template
template int game::dot(const FixedArray<int, 3>& a,  // function template
                       const FixedArray<int, 3>& b); // for int and 3

template class game::FixedArray<double, 2>;          // for double and 2
template double game::dot(const FixedArray<double, 2>& a,
                          const FixedArray<double, 2>& b);

template class game::FixedArray<double, 3>;          // for double and 3
template double game::dot(const FixedArray<double, 3>& a,
                          const FixedArray<double, 3>& b);
Compiling game_fixedarray.cpp and examining the resulting object file shows that the code for all explicitly instantiated classes and free functions was generated and placed into the object file, game_fixedarray.o, of which we show a subset of the relevant symbols:
$ gcc -I. -c game_fixedarray.cpp
$ nm -C game_fixedarray.o
0000000000000000 W game::FixedArray<double, 2ul>::FixedArray(
                                          game::FixedArray<double, 2ul> const&)
0000000000000000 W game::FixedArray<double, 2ul>::FixedArray()
0000000000000000 W game::FixedArray<double, 2ul>::operator[](unsigned long)
0000000000000000 W game::FixedArray<double, 3ul>::FixedArray(
                                          game::FixedArray<double, 3ul> const&)
0000000000000000 W game::FixedArray<int, 3ul>::FixedArray()
        :
0000000000000000 W double game::dot<double, 2ul>(
                                          game::FixedArray<double, 2ul> const&,
                                          game::FixedArray<double, 2ul> const&)
0000000000000000 W double game::dot<double, 3ul>(
                                          game::FixedArray<double, 3ul> const&,
                                          game::FixedArray<double, 3ul> const&)
0000000000000000 W int game::dot<int, 2ul>(
                                          game::FixedArray<int, 2ul> const&,
                                          game::FixedArray<int, 2ul> const&)
        :
0000000000000000 W game::FixedArray<int, 2ul>::operator[](unsigned long) const
0000000000000000 W game::FixedArray<int, 3ul>::operator[](unsigned long) const
This FixedArray class template is used in multiple translation units within the game project. The first one contains a set of geometry utilities:
// app_geometryutil.cpp:
#include <game_fixedarray.h> // game::FixedArray
#include <game_unit.h>       // game::Unit

using namespace game;

void translate(Unit* object, const FixedArray<double, 2>& dst)
    // Perform precise movement of the object on 2D plane.
{
    FixedArray<double, 2> objectProjection;
    // ...
}

void translate(Unit* object, const FixedArray<double, 3>& dst)
    // Perform precise movement of the object in 3D space.
{
    FixedArray<double, 3> delta;
    // ...
}

bool isOrthogonal(const FixedArray<int, 2>& a1, const FixedArray<int, 2>& a2)
    // Return true if 2d arrays are orthogonal.
{
    return dot(a1, a2) == 0;
}

bool isOrthogonal(const FixedArray<int, 3>& a1, const FixedArray<int, 3>& a2)
    // Return true if 3d arrays are orthogonal.
{
    return dot(a1, a2) == 0;
}
The second one deals with physics calculations:
// app_physics.cpp:
#include <game_fixedarray.h> // game::FixedArray
#include <game_unit.h>       // game::Unit

using namespace game;

void collide(Unit* objectA, Unit* objectB)
    // Calculate the result of object collision in 3D space.
{
    FixedArray<double, 3> centerOfMassA = objectA->centerOfMass();
    FixedArray<double, 3> centerOfMassB = objectB->centerOfMass();
    // ..
}

void accelerate(Unit* object, const FixedArray<double, 3>& force)
    // Calculate the position after applying a specified force for the
    // duration of a game tick.
{
    // ...
}
Note that the object files for the application components throughout the game project do not contain any of the implicitly instantiated definitions that we had chosen to uniquely sequester externally, i.e., within the game_fixedarray.o file:
$ nm -C app_geometryutil.o 000000000000003e T isOrthogonal(game::FixedArray<int, 2ul> const&, game::FixedArray<int, 2ul> const&) 0000000000000068 T isOrthogonal(game::FixedArray<int, 3ul> const&, game::FixedArray<int, 3ul> const&) 0000000000000000 T translate(game::Unit*, game::FixedArray<double, 2ul> const&) 000000000000001f T translate(game::Unit*, game::FixedArray<double, 3ul> const&) U game::FixedArray<double, 2ul>::FixedArray() U game::FixedArray<double, 3ul>::FixedArray() U int game::dot<int, 2ul>(game::FixedArray<int, 2ul> const&, game::FixedArray<int, 2ul> const&) U int game::dot<int, 3ul>(game::FixedArray<int, 3ul> const&, game::FixedArray<int, 3ul> const&) $ nm -C app_physics.o 0000000000000039 T accelerate(game::Unit*, game::FixedArray<double, 3ul> const&) 0000000000000000 T collide(game::Unit*, game::Unit*) U game::FixedArray<double, 3ul>::FixedArray() 0000000000000000 W game::Unit::centerOfMass()
Whether optimization involving explicit-instantiation directives reduces library sizes on disk, has no noticeable effect, or actually makes matters worse will depend on the particulars of the system at hand. Applying this optimization to frequently used templates across a large organization has been known to decrease object-file sizes, storage needs, link times, and overall build times, but see Potential Pitfalls: Accidentally making matters worse on page 373.
Insulating template definitions from clients
Even before the introduction of explicit-instantiation declarations, strategic use of explicit-instantiation definitions made it possible to insulate the definition of a template from client code, presenting instead just a limited set of instantiations against which clients may link. Such insulation enables the definition of the template to change without forcing clients to recompile. What’s more, new explicit instantiations can be added without affecting existing clients.
As an example, suppose we have a single free-function template, transform, that operates on only floating-point values:
    // transform.h:
    #ifndef INCLUDED_TRANSFORM
    #define INCLUDED_TRANSFORM

    template <typename T>  // declaration only of free-function template
    T transform(const T& value);
        // Return the transform of the specified floating-point value.

    #endif
Initially, this function template will support just two built-in types, float and double, but it is anticipated to eventually support the additional built-in type long double and perhaps even supplementary user-defined types (e.g., Float128) to be made available via separate headers (e.g., float128.h). By placing only the declaration of the transform function template in its component’s header, clients will be able to link against only the two supported explicit instantiations provided in the transform.cpp file:
    // transform.cpp:
    #include <transform.h>  // Ensure consistency with client-facing declaration.

    template <typename T>  // redeclaration/definition of free-function template
    T transform(const T& value)
    {
        // insulated implementation of transform function template
    }

    // explicit-instantiation definitions
    template float transform(const float&);    // Instantiate for type float.
    template double transform(const double&);  // Instantiate for type double.

Without the two explicit-instantiation definitions in the transform.cpp file above, its corresponding object file, transform.o, would be empty.
Note that, as of C++11, we could place the corresponding explicit-instantiation declarations in the header file for, say, documentation purposes:
    // transform.h:
    #ifndef INCLUDED_TRANSFORM
    #define INCLUDED_TRANSFORM

    template <typename T>  // declaration only of free-function template
    T transform(const T& value);
        // Return the transform of the specified floating-point value.

    // explicit-instantiation declarations, available as of C++11
    extern template float transform(const float&);    // user documentation only;
    extern template double transform(const double&);  // has no effect whatsoever

    #endif
Because no definition of the transform free-function template is visible in the header, no implicit instantiation can result from client use; hence, the two explicit-instantiation declarations above for float and double, respectively, do nothing.
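The insulation technique can be approximated in a single file for experimentation. The following is only a sketch: the three commented regions stand in for transform.h, a client source file, and transform.cpp, and the doubling body of transform as well as the clientCall function are hypothetical placeholders, not part of the original component:

```cpp
#include <cassert>

// --- stands in for transform.h: declaration only ---
template <typename T>
T transform(const T& value);

// --- stands in for a client source file: sees only the declaration ---
double clientCall(double x)  // hypothetical client function
{
    return transform(x);     // resolved against the instantiation below
}

// --- stands in for transform.cpp: insulated definition ---
template <typename T>
T transform(const T& value)
{
    return value + value;    // hypothetical body, chosen only for illustration
}

// explicit-instantiation definitions: the only specializations offered
template float transform(const float&);
template double transform(const double&);
```

When the three regions are split into separate files, a client call such as transform(1.0L) would still compile against the header but would be expected to fail at link time because no definition of transform<long double> is deposited in transform.o.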
Potential Pitfalls
Corresponding explicit-instantiation declarations and definitions
To realize a reduction in object-code size for individual translation units and yet still be able to link every valid program successfully, four moving parts have to be brought together correctly.
1. Each general template, C<T>, whose object code bloat is to be optimized must be declared within some designated component’s header file, c.h.
2. The specific definition of each C<T> relevant to an explicit specialization being optimized (including general, partial-specialization, and full-specialization definitions) must appear in the header file prior to its corresponding explicit-instantiation declaration.
3. Each explicit-instantiation declaration for each specialization of each separate top-level template (i.e., class, function, or variable template) must appear in the component’s .h file after the corresponding general template declaration and the relevant general, partial-specialization, or full-specialization definition; in practice, it always appears after all such definitions, not just the relevant one.
4. Each template specialization having an explicit-instantiation declaration in the header file must have a corresponding explicit-instantiation definition in the component’s implementation file, c.cpp.
Absent items (1) and (2), clients would have no way to safely separate out the usability and inlineability of the template definitions yet consolidate the otherwise redundantly generated object-level definitions within just a single translation unit. Moreover, failing to provide the relevant definition would mean that any clients using one of these specializations would either fail to compile or, arguably worse, pick up the general definitions when a more specialized definition was intended, likely resulting in an ill-formed program.
Failing item (3), the object code for that particular specialization of that template will be generated locally in the client’s translation unit as usual, negating any benefits with respect to local object-code size, irrespective of what is specified in the c.cpp file.
Finally, unless we provide a matching explicit-instantiation definition in the c.cpp file for each and every corresponding explicit-instantiation declaration in the c.h file as in item (4), our optimization attempts might well result in a library component that compiles, links, and even passes some unit tests but, when released to our clients, fails to link. Additionally, any explicit-instantiation definition in the c.cpp file that is not accompanied by a corresponding explicit-instantiation declaration in the c.h file will inflate the size of the c.o file with no possibility of reducing code bloat in client code:
    // c.h:
    #ifndef INCLUDED_C  // internal include guard
    #define INCLUDED_C

    template <typename T> void f(T v) {/*...*/}  // general template definition

    extern template void f<int>(int v);    // OK, matched in c.cpp
    extern template void f<char>(char c);  // Error, unmatched in .cpp file

    #endif

    // c.cpp:
    #include <c.h>  // incorporate own header first

    template void f<int>(int v);        // OK, matched in c.h
    template void f<double>(double v);  // Bug, unmatched in c.h file

    // client.cpp:
    #include <c.h>

    void client()
    {
        int    i = 1;
        char   c = 'a';
        double d = 2.0;

        f(i);  // OK, matching explicit-instantiation directives
        f(c);  // Link-Time Error, no matching explicit-instantiation definition
        f(d);  // Bug, size increased due to no matching explicit-instantiation
               // declaration.
    }

In the example above, f(i) works as expected, with the linker finding the definition of f<int> in c.o; f(c) fails to link because no definition of f<char> is guaranteed to be found anywhere; and f(d) accidentally works by silently generating a redundant local copy of f<double> in client.o, while another, identical definition is generated explicitly in c.o. These extra instantiations do not result in multiply-defined symbols because they still reside in their own sections and are marked as weak symbols. Importantly, note that extern template has absolutely no effect on overload resolution or template argument deduction: The call f(c) still selects f<char>, even though a usable explicit instantiation exists only for f<int>.
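That explicit-instantiation directives leave template argument deduction untouched can be checked with a small single-file sketch. Here the body of f is a hypothetical stand-in that reports the size of the deduced type, and callWithInt and callWithChar are illustrative helpers; the declaration and definition for f<int> are placed in the same translation unit only to keep the example self-contained:

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
std::size_t f(T)  // hypothetical body: report the size of the deduced type
{
    return sizeof(T);
}

extern template std::size_t f<int>(int);  // would live in c.h
template std::size_t f<int>(int);         // would live in c.cpp

std::size_t callWithInt()   // illustrative helper
{
    int i = 1;
    return f(i);            // deduces T = int; uses the explicit instantiation
}

std::size_t callWithChar()  // illustrative helper
{
    char c = 'a';
    return f(c);            // deduces T = char; f<int> is never considered
}
```

Because no explicit-instantiation declaration is supplied for f<char> in this sketch, that specialization is simply instantiated implicitly, just as f<double> is in client.cpp above.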
Accidentally making matters worse
When making the decision to explicitly instantiate common specializations of popular templates within some designated object file, it is important to consider that not all programs necessarily need every (or even any) such instantiation. Class templates that have many member functions, of which clients typically use only a few, require special attention.
For such classes, it might be beneficial to explicitly instantiate individual member functions instead of the entire class template. However, selecting which member functions to explicitly instantiate, and with which template arguments, without carefully measuring the effect on the overall object size might result not only in overall pessimization but also in an unnecessary maintenance burden. Finally, remember that one might need to explicitly tell the linker to strip unused sections resulting, for example, from forced instantiation of common template specializations, to avoid inadvertently bloating executables, which could adversely affect load times.
Annoyances
No good place to put definitions for unrelated classes
When we consider the implications of physical dependency,5,6 determining in which component to deposit the specialized definitions can be problematic. For example, consider a codebase implementing a core library that provides both a nontemplated String class and a Vector container class template. These fundamentally unrelated entities would ideally live in separate physical components (i.e., .h/.cpp pairs), neither of which depends physically on the other. That is, an application using just one of these components could be compiled, linked, tested, and deployed entirely independently of the other. Now, consider a large codebase that makes heavy use of Vector<String>: In what component should the object-code-level definitions for the Vector<String> specialization reside?7 There are two obvious alternatives.
vector — In this case, vector.h would hold extern template class Vector<String>; — the explicit-instantiation declaration. vector.cpp would hold template class Vector<String>; — the explicit-instantiation definition. With this approach, we would create a physical dependency of the vector component on string. Any client program wanting to use a Vector would also depend on string regardless of whether it was needed.
string — In this case, string.h and string.cpp would instead be modified so as to depend on vector. Clients wanting to use a string would also be forced to depend physically on vector at compile time.
Another possibility might be to create a third component, called stringvector, that itself depends on both vector and string. By escalating8 the mutual dependency to a higher level in the physical hierarchy, we avoid forcing any client to depend on more than what is actually needed. The practical drawback to this approach is that only those clients that proactively include the composite stringvector.h header would realize any benefit; fortunately, in this case, there is no one-definition rule (ODR) violation if they don’t.
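A single-file sketch of the escalated stringvector arrangement might look as follows. The String and Vector classes here are deliberately tiny, hypothetical stand-ins for the library's real components, and countNames is an illustrative client; the commented regions indicate the file in which each piece would live:

```cpp
#include <cassert>
#include <cstddef>

// --- string.h (hypothetical, no dependency on vector) ---
class String {
    const char *d_text;
  public:
    String(const char *text = "") : d_text(text) {}
    const char *c_str() const { return d_text; }
};

// --- vector.h (hypothetical fixed-capacity container, no dependency on string) ---
template <typename T>
class Vector {
    T           d_items[8];
    std::size_t d_size;
  public:
    Vector();
    void push_back(const T& item);
    std::size_t size() const;
};

template <typename T>
Vector<T>::Vector() : d_items(), d_size(0) {}

template <typename T>
void Vector<T>::push_back(const T& item)
{
    if (d_size < 8) { d_items[d_size++] = item; }  // silently ignore overflow
}

template <typename T>
std::size_t Vector<T>::size() const { return d_size; }

// --- stringvector.h: includes both headers, then suppresses local copies ---
extern template class Vector<String>;

// --- stringvector.cpp: the one place the object code is deposited ---
template class Vector<String>;

// --- client that opted in by including stringvector.h ---
std::size_t countNames()
{
    Vector<String> names;
    names.push_back(String("alpha"));
    names.push_back(String("beta"));
    return names.size();
}
```

A client that includes only vector.h and string.h would still build correctly under this arrangement; it would merely generate its own local instantiations of Vector<String> as usual.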
Finally, complex machinery could be added to both string.h and vector.h to conditionally include stringvector.h whenever both of the other headers are included; such heroic efforts would, nonetheless, involve a cyclic physical dependency among all three of these components. Circular intercomponent collaborations are best avoided.9
All members of an explicitly defined template class must be valid
In general, when using a class template, only those members that are actually used get implicitly instantiated. This hallmark allows class templates to provide functionality for parameter types having certain capabilities, e.g., default constructible, while also providing partial support for types lacking those same capabilities. When providing an explicit-instantiation definition, however, all members of a class template are instantiated.
Consider a simple class template having a data member that can be either default-initialized via the template’s default constructor or initialized with an instance of the member’s type supplied at construction:
    template <typename T>
    class W
    {
        T d_t;  // a data member of type T

      public:
        W() : d_t() {}
            // Create an instance of W with a default-constructed T member.

        W(const T& t) : d_t(t) {}
            // Create an instance of W with a copy of the specified t.

        void doStuff() { /* do stuff */ }
    };
This class template can be used successfully with a type, such as U in the following code snippet, that is not default constructible:
    struct U
    {
        U(int i) { /* construct from i */ }
        // ...
    };

    void useWU()
    {
        W<U> wu1(U(17));  // OK, using copy constructor for U
        wu1.doStuff();
    }
As it stands, the code above is well formed even though W<U>::W() would fail to compile if instantiated. Consequently, although providing an explicit-instantiation declaration for W<U> is valid, a corresponding explicit-instantiation definition for W<U> fails to compile, as would an implicit instantiation of W<U>::W():
    extern template class W<U>;  // Valid: Suppress implicit instantiation of W<U>.

    template class W<U>;         // Error, U::U() not available for W<U>::W()

    void useWU0()
    {
        W<U> wu0;  // Error, U::U() not available for W<U>::W()
    }
Unfortunately, the only workaround to achieve a comparable reduction in code bloat is to provide explicit-instantiation directives for each valid member function of W<U>, an approach that would likely carry a significantly greater maintenance burden:
    extern template W<U>::W(const U& u);   // suppress individual member
    extern template void W<U>::doStuff();  //    "         "        "
    // ... Repeat for all other functions in W except W<U>::W().

    template W<U>::W(const U& u);          // instantiate individual member
    template void W<U>::doStuff();         //      "          "        "
    // ... Repeat for all other functions in W except W<U>::W().
The power and flexibility to make it all work — albeit annoyingly — are there nonetheless.
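As a compilable sketch of this member-by-member workaround, the variant below adds a hypothetical value() accessor to W and a d_i data member to U purely so that the result can be observed; neither is part of the original example. The explicit-instantiation declarations and definitions are placed in one file only for brevity:

```cpp
#include <cassert>

template <typename T>
class W {
    T d_t;
  public:
    W() : d_t() {}             // not instantiable for U; never named below
    W(const T& t) : d_t(t) {}
    const T& value() const { return d_t; }  // hypothetical accessor
};

struct U {
    int d_i;                   // hypothetical member, for observation only
    U(int i) : d_i(i) {}       // no default constructor
};

// would live in the header: suppress each valid member individually
extern template W<U>::W(const U&);
extern template const U& W<U>::value() const;

// would live in the .cpp file: instantiate each valid member individually
template W<U>::W(const U&);
template const U& W<U>::value() const;

int useWU()
{
    W<U> wu(U(17));            // uses only the instantiable constructor
    return wu.value().d_i;
}
```

Note that W<U>::W() is never mentioned in any directive, so its ill-formed body is never instantiated and the translation unit compiles.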
See Also
- “Variable Templates” (§1.2, p. 157) covers an extension of the template syntax for defining a family of like-named variables or static data members that can be instantiated explicitly.
Further Reading
- For a different perspective on this feature, see lakos20, section 1.3.16, “extern Templates,” pp. 183–185.
- For a more complete discussion of how compilers and linkers work with respect to C++, see lakos20, Chapter 1, “Compilers, Linkers, and Components,” pp. 123–268.