- 1.1 Our First Program
- 1.2 Variables
- 1.3 Operators
- 1.4 Expressions and Statements
- 1.5 Functions
- 1.6 Error Handling
- 1.7 I/O
- 1.8 Arrays, Pointers, and References
- 1.9 Structuring Software Projects
- 1.10 Exercises
1.9 Structuring Software Projects
A big problem of large projects is name conflicts. For this reason, we will discuss how macros aggravate this problem. On the other hand, we will show later in Section 3.2.1 how namespaces help us master name conflicts.
In order to understand how the files in a C++ software project interact, it is necessary to understand the build process, i.e., how an executable is generated from the sources. This will be the subject of our second subsection. In this light, we will present the macro mechanism and other language features.
First, we will briefly discuss a feature that contributes to structuring a program: comments.
1.9.1 Comments
The primary purpose of a comment is evidently to describe in plain language what is not obvious to everybody from the program sources, like this:
// Transmogrification of the anti-binoxe in O(n log n)
while (cryptographic(trans_thingy) < end_of(whatever)) {
    ....
Often, the comment is a clarifying pseudo-code of an obfuscated implementation:
// A= B * C
for ( ... ) {
    int x78zy97= yo6954fq, y89haf= q6843, ...
    for ( ... ) {
        y89haf+= ab6899(fa69f) + omygosh(fdab); ...
        for ( ... ) {
            A(dyoa929, oa9978)+= ...
In such a case, we should ask ourselves whether we can restructure our software so that such obscure implementations are realized once, in a dark corner of a library, and everywhere else we write clear and simple statements such as:
A= B * C;
as program and not as pseudo-code. This is one of the main goals of this book: to show you how to write the short expression you want while the implementation under the hood squeezes out the maximal performance.
Another frequent usage of comments is to remove snippets of code temporarily to experiment with alternative implementations, e.g.:
for ( ... ) {
    // int x= a + b + c
    int x = a + d + e;
    for ( ... ) {
        ...
Like C, C++ provides a form of block comments, surrounded by /* and */. They can be used to render an arbitrary part of a code line or multiple lines into a comment. Unfortunately, they cannot be nested: no matter how many levels of comments are opened with /*, the first */ ends all block comments. Many programmers run into this trap sometimes: they want to comment out a longer fraction of code that already contains a block comment so that the comment ends earlier than intended, for instance:
for ( ... ) {
    /* int x78zy97= yo6954fq;    // start new comment
    int x78zy98= yo6953fq;
    /* int x78zy99= yo6952fq;    // start old comment
    int x78zy9a= yo6951fq; */    // end old comment
    int x78zy9b= yo6950fq; */    // end new comment (presumably)
    int x78zy9c= yo6949fq;
    for ( ... ) {
Here, the line for setting x78zy9b should have been disabled, but the preceding */ terminated the comment prematurely.
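The trap can be reproduced in a few lines. The following is a minimal sketch with a hypothetical function count_active; the final closer is kept inside a line comment so that the snippet still compiles, and some compilers will warn about the second /* within the comment.

```cpp
// Sketch: block comments do not nest, so the first */ ends the comment.
constexpr int count_active()
{
    int n = 0;
    /* n += 1;          // start of the intended outer comment
    /* n += 10;         // this second opener is only comment text
    n += 20; */         // the first closer ends the whole comment
    n += 100;           // active again, although it was meant to be disabled
    // */               // the intended outer closer, parked in a line comment
    return n;
}
```

Only the statement after the first */ contributes to the result, so count_active() yields 100 rather than the 0 we presumably intended.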
Nested comments can be realized (correctly) with the preprocessor directive #if, as we will illustrate in Section 1.9.2.4. Another way to deactivate multiple lines conveniently is to use the appropriate function of IDEs and language-aware editors. At any rate, commenting out code fragments should only be a temporary solution during development while we investigate various approaches. Once we settle on a certain option, we can delete all the unused code and rely on our version control system to keep it available for possible later modifications.
1.9.2 Preprocessor Directives
In this section, we will present the commands (directives) that can be used in preprocessing. As they are mostly language independent, we recommend limiting their usage to an absolute minimum, especially macros.
1.9.2.1 Macros
“Almost every macro demonstrates a flaw in the programming language, in the program, or the programmer.”
—Bjarne Stroustrup
This is an old technique of code reuse by expanding macro names to their text definition, potentially with arguments. The use of macros offers many possibilities to empower your program but many more to ruin it. Macros are immune to namespaces, scopes, or any other language feature because they are reckless text substitution without any notion of types. Unfortunately, some libraries define macros with common names like major. We uncompromisingly undefine such macros, e.g., #undef major, without mercy for people who might want to use those macros. With Visual Studio we have (even today!!!) min and max as macros, and we strongly advise you to disable this by compiling with /DNOMINMAX. Almost all macros can be replaced by other techniques (constants, templates, inline functions). But if you really do not find another way of implementing something:
Macros can create weird problems in almost every thinkable and unthinkable way. To give you a general idea, we look at a few examples in Appendix A.2.9 and give some tips for how to deal with them. Feel free to postpone the reading until you run into an issue.
As you will see throughout this book, C++ provides better alternatives like constants, inline functions, templates, and constexpr.
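To give a first impression of what text substitution without any notion of types means, here is a minimal sketch. The names BAD_SQUARE and square are hypothetical and chosen only for this illustration; the macro is deliberately written without parentheses.

```cpp
// Deliberately flawed macro: the argument is pasted as text, without parentheses.
#define BAD_SQUARE(x) x * x

// Type-checked alternative that evaluates its argument exactly once.
constexpr int square(int x)
{
    return x * x;
}

// BAD_SQUARE(1 + 2) expands to 1 + 2 * 1 + 2, which is 5 and not 9.
constexpr int macro_result    = BAD_SQUARE(1 + 2);
constexpr int function_result = square(1 + 2);
```

Parenthesizing the macro body would repair this particular case, but the constexpr function additionally respects scopes and namespaces and checks the argument type.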
1.9.2.2 Inclusion
To keep the C language simple, many features such as I/O were excluded from the core language and realized by the library instead. C++ follows this design and realizes new features whenever possible by the standard library, and yet nobody would call C++ a simple language.
As a consequence, almost every program needs to include one or more headers. The most frequent one is that for I/O, as seen before:
#include <iostream>
The preprocessor searches for that file in standard include directories like /usr/include and /usr/local/include on Unix-like systems. We can add more directories to this search path with a compiler flag, usually -I in the Unix/Linux/Mac OS world and /I in Windows.
When we write the filename within double quotes, e.g.:
#include "herberts_math_functions.hpp"
the compiler usually searches first in the current directory and then in the standard paths. This is equivalent to quoting with angle brackets and adding the current directory to the search path. Some people argue that angle brackets should only be used for system headers and user headers should use double quotes (but we do not agree with them).
To avoid name clashes, often the include’s parent directory is added to the search path and a relative path is used in the directive:
#include "herberts_includes/math_functions.hpp"
#include <another_project/math_functions.h>
The slashes are portable and also work under Windows (where both regular slashes and backslashes can be used for subdirectories).
Include guards: Frequently used header files may be included multiple times in one source file due to indirect inclusion. To avoid forbidden repetitions and to limit the text expansion, so-called Include Guards ensure that only the first inclusion is performed. These guards are ordinary macros that state the inclusion of a certain file. A typical include file looks like this:
// Author: me
// License: Pay me $100 every time you read this
#ifndef HERBERTS_MATH_FUNCTIONS_INCLUDE
#define HERBERTS_MATH_FUNCTIONS_INCLUDE
#include <cmath>
double sine(double x);
...
#endif // HERBERTS_MATH_FUNCTIONS_INCLUDE
Thus, the content of the file is only included when the guard is not yet defined. Within the content, we define the guard to suppress further inclusions.
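The effect of a guard can be observed without a second file. The following sketch simulates two inclusions of the same header within one translation unit; the guard DEMO_POINT_INCLUDE, the type demo_point, and the function coordinate_sum are hypothetical names used only for illustration.

```cpp
// First "inclusion": the guard is not yet defined, so the content is compiled.
#ifndef DEMO_POINT_INCLUDE
#define DEMO_POINT_INCLUDE
struct demo_point { double x, y; };
#endif // DEMO_POINT_INCLUDE

// Second "inclusion": the guard is already defined, so the content is skipped.
#ifndef DEMO_POINT_INCLUDE
#define DEMO_POINT_INCLUDE
struct demo_point { double x, y; };   // without the guard: a redefinition error
#endif // DEMO_POINT_INCLUDE

// The type is defined exactly once and can be used as usual.
constexpr double coordinate_sum(demo_point p)
{
    return p.x + p.y;
}
```

Removing the guard lines makes the compiler reject the second definition of demo_point, which is the kind of forbidden repetition that indirect inclusion would otherwise cause.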
As with all macros, we have to pay close attention that the name is unique, not only in our project but also within all other headers that we include directly or indirectly. Ideally the name should represent the project and filename. It can also contain project-relative paths or namespaces (§3.2.1). It is a common practice to terminate it with _INCLUDE or _HEADER.
Accidentally reusing a guard can produce a multitude of different error messages. In our experience it can take an unpleasantly long time to discover the root of that evil. Advanced developers generate them automatically from the aforementioned information or by using random generators.
A convenient alternative is #pragma once. The preceding example simplifies to:
// Author: me
// License: Pay me $100 every time you read this
#pragma once
#include <cmath>
double sine(double x);
...
Pragmas are compiler-specific extensions, which is why we cannot count on them being portable. #pragma once is, however, supported by all major compilers and certainly the pragma with the highest portability. In addition to the shorter notation, we can also delegate responsibility to avoid double inclusions to the compiler.
A more advanced technique for organizing code in files was introduced in C++20 with modules. We will present them in Section 7.3.
1.9.2.3 Conditional Compilation
An important and necessary usage of preprocessor directives is the control of conditional compilation. The preprocessor provides the directives #if, #else, #elif, and #endif for branching. Conditions can be comparisons, checking for definitions, or logical expressions thereof. The directives #ifdef and #ifndef are shortcuts for, respectively:
#if defined(MACRO_NAME)
#if !defined(MACRO_NAME)
The long form must be used when the definition check is combined with other conditions. Likewise, #elif is a shortcut for #else and #if.
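As a small sketch of such a combination, assume two hypothetical configuration macros, DEMO_SAFE and DEMO_LOG_LEVEL, which we define here ourselves to keep the example self-contained:

```cpp
// Hypothetical configuration macros; normally set in a header or by a compiler flag.
#define DEMO_SAFE 1
#define DEMO_LOG_LEVEL 2

enum class demo_mode { fast, safe_quiet, safe_verbose };

// The definition check is combined with a comparison, so defined() is required.
#if defined(DEMO_SAFE) && DEMO_LOG_LEVEL >= 2
constexpr demo_mode mode = demo_mode::safe_verbose;
#elif defined(DEMO_SAFE)      // the same as #else followed by a nested #if
constexpr demo_mode mode = demo_mode::safe_quiet;
#else
constexpr demo_mode mode = demo_mode::fast;
#endif
```

With both macros defined as above, the first branch is chosen; deleting the definition of DEMO_SAFE selects the #else branch instead.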
In a perfect world, we would only write portable standard-compliant C++ programs. In reality, we sometimes have to use nonportable libraries. Say we have a library that is only available on Windows, and more precisely only with Visual Studio (where the macro _MSC_VER is predefined). For all other relevant compilers, we have an alternative library. The simplest way for the platform-dependent implementation is to provide alternative code fragments for different compilers:
#ifdef _MSC_VER
    ... Windows code
#else
    ... Linux/Unix code
#endif
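A compilable variant of this pattern might look as follows. It is only a sketch: the function library_name and the two library names are made up for illustration, and real code would call into the respective library instead of returning a string.

```cpp
// _MSC_VER is predefined by Visual Studio; other compilers take the #else branch.
#ifdef _MSC_VER
constexpr bool built_with_visual_studio = true;
#else
constexpr bool built_with_visual_studio = false;
#endif

// One common interface; callers do not see which branch was compiled.
constexpr const char* library_name()
{
    return built_with_visual_studio ? "windows_only_lib"    // hypothetical name
                                    : "alternative_lib";    // hypothetical name
}
```

Keeping the platform distinction behind a single function confines the conditional compilation to one place in the sources.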
Similarly, we need conditional compilation when we want to use a new language feature that is not available on all target platforms, say, modules (§7.3):
#ifdef MY_LIBRARY_WITH_MODULES
    ... well-structured library as modules
#else
    ... portable library in old-fashioned way
#endif
Here we can use the feature when available and still keep the portability to compilers without this feature. Of course, we need reliable tools that define the macro only when the feature is really available.
Alternatively, we can rely on compiler developers’ opinions as to whether this feature is properly supported. To this end, C++20 introduced a macro for each feature introduced since C++11 (for instance, __cpp_modules for module support), so that our example now reads:
#ifdef __cpp_modules
    ... well-structured library as modules
#else
    ... portable library in old-fashioned way
#endif
The value of this macro is the year and month when the feature was added to the standard (draft). For evolving core language and library features, this allows us to find out which version is actually supported. For instance, the <chrono> library (Section 4.5) grew over time, and to check whether its C++20 functionality is available on our system, we can use the value of __cpp_lib_chrono:
#if __cpp_lib_chrono >= 201907L
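A complete use of this check could look like the following sketch. The names has_cxx20_chrono and days_per_week are hypothetical; we rely on the fact that the macro is also provided by the <chrono> header itself and that an undefined macro evaluates to 0 in an #if condition.

```cpp
#include <chrono>   // also provides __cpp_lib_chrono where it is supported

#if __cpp_lib_chrono >= 201907L
constexpr bool has_cxx20_chrono = true;

// C++20 branch: use the calendar-related duration types days and weeks.
constexpr int days_per_week()
{
    return static_cast<int>(std::chrono::days{std::chrono::weeks{1}}.count());
}
#else
constexpr bool has_cxx20_chrono = false;

// Fallback branch: only durations that exist since C++11.
constexpr int days_per_week()
{
    return static_cast<int>(std::chrono::hours{7 * 24}.count() / 24);
}
#endif
```

Both branches offer the same function, so the remaining program does not need to know which version of the library is installed.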
Conditional compilation is quite powerful but comes at a price: the maintenance of the sources and the testing are more laborious and error-prone. These disadvantages can be lessened by well-designed encapsulation so that the different implementations are used over common interfaces.
1.9.2.4 Nestable Comments
The directive #if can be used to comment out code blocks:
#if 0
    ... Here we wrote pretty evil code! One day we will fix it. Seriously.
#endif
The advantage over /* ... */ is that it can be nested:
#if 0
    ... Here the nonsense begins.
#if 0
    ... Here we have nonsense within nonsense.
#endif
    ... The finale of our nonsense. (Fortunately ignored.)
#endif
Nonetheless, this technique should be used with moderation: if three-quarters of the program are comments, we should consider a serious revision.
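For comparison with the block-comment trap from Section 1.9.1, here is a minimal compilable sketch of nested deactivation with #if 0; the function active_sum is a hypothetical name for this illustration.

```cpp
// Sketch: #if 0 blocks nest, so the inner #endif closes only the inner block.
constexpr int active_sum()
{
    int n = 0;
#if 0
    n += 1;        // disabled by the outer #if 0
#  if 0
    n += 10;       // disabled by the inner #if 0
#  endif
    n += 100;      // still disabled: the outer block is not closed yet
#endif
    n += 1000;     // the only active statement
    return n;
}
```

Here active_sum() yields 1000, whereas the analogous attempt with /* ... */ would have reactivated the statement after the inner closer.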
More Details: In Appendix A.3, we show a real-world example that recapitulates many features of this first chapter. We haven’t included it in the main reading track to keep the high pace for the impatient audience. For those not in such a rush we recommend taking the time to read it and to see how nontrivial software evolves.