Solution
Unions Redux
1. What are unions, and what purpose do they serve?
Unions allow more than one object, of either class or built-in type, to occupy the same space in memory. For example:
// Example 36-1
//
union U {
  int i;
  float f;
};

U u;

u.i = 42;                           // ok, now i is active
std::cout << u.i << std::endl;

u.f = 3.14f;                        // ok, now f is active
std::cout << 2 * u.f << std::endl;

But only one of the types can be "active" at a time; after all, the storage can hold only one value at a time. Also, unions support only some kinds of types, which leads us into the next question:
2. What kinds of types cannot be used as members of unions? Why do these limitations exist? Explain.
From the C++ standard:
An object of a class with a non-trivial constructor, a non-trivial copy constructor, a non-trivial destructor, or a non-trivial copy assignment operator cannot be a member of a union, nor can an array of such objects.
In brief, for a class type to be usable in a union, it must meet all the following criteria:
- The only constructors, destructors, and copy assignment operators are the compiler-generated ones.
- There are no virtual functions or virtual base classes.
- Ditto for all of its base classes and nonstatic members (or arrays thereof).
That's all, but that sure eliminates a lot of types.
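On a newer toolchain these criteria can be checked mechanically. What follows is a minimal sketch, assuming a C++11 or later compiler with `<type_traits>`, which postdates the rule quoted above. The helper name `usable_in_classic_union` and the sample types are hypothetical, and the check only approximates the quoted wording.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: true when T has only trivial special member
// functions, which approximates the rule quoted from the standard.
template <typename T>
struct usable_in_classic_union {
    static constexpr bool value =
        std::is_trivially_default_constructible<T>::value &&
        std::is_trivially_copy_constructible<T>::value &&
        std::is_trivially_copy_assignable<T>::value &&
        std::is_trivially_destructible<T>::value;
};

struct Point { int x; int y; };        // only compiler-generated, trivial operations
struct Shape { virtual ~Shape() {} };  // virtual function and user-written destructor
struct Named { std::string name; };    // nonstatic member with nontrivial operations

static_assert(usable_in_classic_union<int>::value, "built-in types qualify");
static_assert(usable_in_classic_union<Point>::value, "plain structs qualify");
static_assert(!usable_in_classic_union<Shape>::value, "virtual functions disqualify");
static_assert(!usable_in_classic_union<Named>::value, "a std::string member disqualifies");
static_assert(!usable_in_classic_union<std::string>::value, "std::string is disqualified");
```

If all five assertions hold, the file compiles silently; any failure names the offending type at compile time.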
Unions were inherited from C. The C language has a strong tradition of efficiency and support for low-level close-to-the-metal programming, which has been compatibly preserved in C++; that's why C++ also has unions. On the other hand, the C language does not have any tradition of language support for an object model supporting class types with constructors and destructors and user-defined copying, which C++ definitely does; that's why C++ also has to define what, if any, uses of such newfangled types make sense with the "oldfangled" unions and do not violate the C++ object model including its object lifetime guarantees.
If C++'s restrictions on unions did not exist, Bad Things could happen. For example, consider what could happen if the following code were allowed:
// Example 36-2: Not standard C++ code, but what if it were allowed?
//
void f() {
  union IllegalImmoralAndFattening {
    std::string s;
    std::auto_ptr<int> p;
  };

  IllegalImmoralAndFattening iiaf;
  iiaf.s = "Hello, world";          // has s's constructor run?
  iiaf.p = new int(4);              // has p's constructor run?
} // will s get destroyed? should it be? will p get destroyed? should it be?
As the comments indicate, serious problems would exist if this were allowed. To avoid further complicating the language by trying to craft rules that at best might only partly patch up a few of the problems, the problematic operations were simply banished.
But don't think that unions are only a holdover from earlier times. Unions are perhaps most useful for saving space by allowing data to overlap, and this is still desirable in C++ and in today's modern world. For example, some of the most advanced C++ standard library implementations in the world now use just this technique for implementing the "small string optimization," a great optimization alternative that reuses the storage inside a string object itself. Here's the idea: For large strings, space inside the string object stores the usual pointer to the dynamically allocated buffer and housekeeping information like the size of the buffer; for small strings, however, the same space is instead reused to store the string contents directly and completely avoid any dynamic memory allocation. For more about the small string optimization (and other string optimizations and pessimizations in considerable depth), see Items 13 through 16 in More Exceptional C++ [Sutter02]; see also the discussion of current commercial std::string implementations in [Meyers01].
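To make the overlap concrete, here is a minimal sketch of the idea, not the layout of any real library: `TinyString`, its member names, and the 15-character threshold are all hypothetical. Both union members are built-in types, so the union itself is legal under the rules described earlier.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical, heavily simplified string that stores short contents in place.
class TinyString {
public:
    explicit TinyString(const char* s) : size_(std::strlen(s)) { fill_from(s); }
    TinyString(const TinyString& other) : size_(other.size_) { fill_from(other.c_str()); }

    TinyString& operator=(const TinyString& other) {
        if (this != &other) {
            TinyString copy(other);              // may throw; *this is untouched if so
            std::swap(size_, copy.size_);        // take over the copy's state...
            std::swap(store_, copy.store_);      // ...and let the copy free the old one
        }
        return *this;
    }

    ~TinyString() { if (!fits_inline()) delete[] store_.heap; }

    const char* c_str() const { return fits_inline() ? store_.small : store_.heap; }
    std::size_t size() const { return size_; }
    bool uses_heap() const { return !fits_inline(); }

private:
    static const std::size_t kInlineCapacity = 15;

    // The same bytes hold either the characters themselves or a pointer
    // to a dynamically allocated buffer; size_ tells us which.
    union Storage {
        char  small[kInlineCapacity + 1];
        char* heap;
    };

    bool fits_inline() const { return size_ <= kInlineCapacity; }

    void fill_from(const char* s) {
        char* dest = store_.small;
        if (!fits_inline()) {
            store_.heap = new char[size_ + 1];
            dest = store_.heap;
        }
        std::memcpy(dest, s, size_ + 1);         // includes the terminating '\0'
    }

    std::size_t size_;
    Storage store_;
};
```

A short value such as `TinyString("hi")` never touches the free store, while a longer one falls back to the usual pointer-to-buffer representation.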
Dissecting Construction Unions
3. The article [Manley02] cites the motivating case of writing a scripting language: Say that you want your language to support a single type for variables that at various times can hold an integer, a string, or a list. Creating a union { int i; list<int> l; string s; }; doesn't work for the reasons covered in Questions 1 and 2. The following code presents a workaround that attempts to support allowing any type to participate in a union. (For a more detailed explanation, see the original article.)
On the plus side, the cited article addresses a real problem, and clearly much effort has been put into coming up with a good solution. Unfortunately, from well-intentioned beginnings, more than one programmer has gone badly astray.
The problems with the design and the code fall into three major categories: legality, safety, and morality.
Critique this code and identify:
- Mechanical errors, such as invalid syntax or nonportable conventions.
- Stylistic improvements that would improve code clarity, reusability, and maintainability.
The first overall comment that needs to be made is that the fundamental idea behind this code is not legal in standard C++. The original article summarizes the key idea:
The idea is that instead of declaring object members, you instead declare a raw buffer [non-dynamically, as a char array member inside the object pretending to act like a union] and instantiate the needed objects on the fly [by in-place construction].
[Manley02]
The idea is common, but unfortunately it is also unsound.
Allocating a buffer of one type and then using casts to poke objects of another type in and out is nonconforming and nonportable, because buffers that are not dynamically allocated (i.e., that are not allocated via malloc or new) are not guaranteed to be correctly aligned for any type other than the one with which they were originally declared. Even if this technique happens to accidentally work for some types on someone's current compiler, there's no guarantee it will continue to work for other types or for the same types in the next version of the same compiler. For more details and some directly related discussion, see Item 30 in Exceptional C++ [Sutter00], notably the sidebar titled "Reckless Fixes and Optimizations, and Why They're Evil." See also the alignment discussion in [Alexandrescu02].
For C++0x, the standards committee is considering adding alignment aids to the language specifically to enable techniques that rely on alignment like this, but that's all still in the future. For now, to make this work reasonably reliably even some of the time, you'd have to do one of the following:
- Rely on the max_align hack (see [Manley02], which footnotes the max_align hack, or do a Google search for max_align).
- Rely on nonstandard extensions like Gnu's __alignof__ to make this work reliably on a particular compiler that supports such an extension. (Even though Gnu provides an ALIGNOF macro intended to work more reliably on other compilers, it too is admitted hackery that relies on the compiler's laying out objects in certain ways and making guesses based on offsetof inquiries, which might often be a good guess but is not guaranteed by the standard.)
You could work around this by dynamically allocating the array using malloc or new, which would guarantee that the char buffer is suitably aligned for an object of any type, but that would still be a bad idea (it's still not type-safe) and it would defeat the potential efficiency gains that the original article was aiming for as part of its original motivation. An alternative and correct solution would be to use boost::any (see below), which incurs a similar allocation/indirection overhead but is at least both safe and correct; more about that later on.
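For readers on a later toolchain: alignment aids did arrive in C++11 as alignas and alignof. The following is a minimal sketch, assuming a C++11 or later compiler; `InlineSlot` and `is_aligned` are hypothetical names. It addresses only the alignment of a non-dynamic buffer, not the type-safety and copying problems discussed in the rest of this Item.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <string>

// Hypothetical holder that constructs one T inside a member buffer.
template <typename T>
class InlineSlot {
public:
    InlineSlot() : obj_(new (buf_) T()) {}     // in-place construction
    ~InlineSlot() { obj_->~T(); }              // explicit destruction to match

    InlineSlot(const InlineSlot&) = delete;    // copying is deliberately not supported
    InlineSlot& operator=(const InlineSlot&) = delete;

    T& get() { return *obj_; }
    const T& get() const { return *obj_; }

    // Reports whether the buffer's address is a multiple of T's alignment.
    bool is_aligned() const {
        return reinterpret_cast<std::uintptr_t>(buf_) % alignof(T) == 0;
    }

private:
    alignas(T) unsigned char buf_[sizeof(T)];  // alignas requests T's alignment
    T* obj_;
};
```

With this, `InlineSlot<std::string> slot; slot.get() = "Hello, world";` constructs and uses the string inside `buf_` without a separate allocation for the object itself.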
Attempts to work against the language, or to make the language work the way we want it to work instead of the way it actually does work, are often questionable and should be a big red flag. In the Exceptional C++ [Sutter00] sidebar cited earlier, while in an ornery mood, I also accused a similar technique of "just plain wrongheadedness" followed by some pretty strong language. There can still be cases where it could be reasonable to use constructs that are known to be nonportable but okay in a particular environment (in this case, perhaps using the max_align hack), but even then I would argue that that fact should be noted explicitly and, further, that it still has no place in a general piece of code recommended for wide use.
Into the Code
Let's now consider the code:
#include <list>
#include <string>
#include <iostream>
using namespace std;
Always include necessary headers. Because new is going to be used below, we need to also #include <new>. (Note: The <iostream> header is okay; later in the original code, not shown here, was a test harness that emitted output using iostreams.)
#define max(a,b) (a)>(b)?(a):(b)

typedef list<int> LIST;
typedef string STRING;

struct MYUNION {
  MYUNION() : currtype(NONE) {}
  ~MYUNION() {cleanup();}
The first classic mechanical error here is that MYUNION is unsafe to copy because the programmer forgot to provide a suitable copy constructor and copy assignment operator.
MYUNION is choosing to play games that require special work to be done in the constructor and destructor, so these functions are provided explicitly as shown; that's fine as far as it goes. But it doesn't go far enough, because the same games require special work in the copy constructor and copy assignment operator, which are not provided explicitly. That's bad because the default compiler-generated copying operations do the wrong thing; namely, they copy the contents bitwise as a char array, which is likely to have most unsatisfactory results, in most cases leading straight to memory corruption. Consider the following code:
// Example 36-3: MYUNION is unsafe for copying
//
{
  MYUNION u1, u2;
  u1.getstring() = "Hello, world";
  u2 = u1;                          // copies the bits of u1 to u2
} // oops, double delete of the string (assuming the bitwise copy even made sense)
Guideline
Observe the Law of the Big Three [Cline99]: If a class needs a custom copy constructor, copy assignment operator, or destructor, it probably needs all three.
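As a reminder of what the guideline looks like in practice, here is a minimal sketch of a class that manages a resource and therefore supplies all three operations. `Label` and its members are hypothetical; this is not a repaired MYUNION, just the smallest illustration of the pattern.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical class that owns a dynamically allocated character buffer.
class Label {
public:
    explicit Label(const char* text) : text_(duplicate(text)) {}

    // Copy constructor: make a separate buffer instead of sharing the pointer.
    Label(const Label& other) : text_(duplicate(other.text_)) {}

    // Copy assignment: copy first, then swap, so a failed copy leaves *this intact.
    Label& operator=(const Label& other) {
        Label copy(other);
        std::swap(text_, copy.text_);
        return *this;                 // copy's destructor releases the old buffer
    }

    // Destructor: release the one buffer this object owns.
    ~Label() { delete[] text_; }

    const char* c_str() const { return text_; }

private:
    static char* duplicate(const char* s) {
        const std::size_t n = std::strlen(s) + 1;
        char* p = new char[n];
        std::memcpy(p, s, n);
        return p;
    }

    char* text_;
};
```

With these in place, code shaped like Example 36-3 (`Label u1("Hello, world"), u2("x"); u2 = u1;`) leaves each object with its own buffer, so nothing is deleted twice.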
Passing on from the classic mechanical error, we next encounter a duo of classic stylistic errors:
  enum uniontype {NONE,_INT,_LIST,_STRING};
  uniontype currtype;

  inline int& getint();
  inline LIST& getlist();
  inline STRING& getstring();
There are two stylistic errors here. First, this struct is not reusable because it is hard-coded for specific types. Indeed, the original article recommended hand-coding such a struct every time it was needed. Second, even given its limited intended usefulness, it is not very extensible or maintainable. We'll return to this frailty again later, once we've covered more of the context.
Guideline
Avoid hard-wiring information that needlessly makes code more brittle and limits flexibility.
There are also two mechanical problems. The first is that currtype is public for no good reason; this violates good encapsulation and means any user can freely mess with the type flag, even by accident. The second mechanical problem concerns the names used in the enumeration; I'll cover that in its own section, "Underhanded Names," later on.
protected:
Here we encounter another mechanical error: The internals ought to be private, not protected. The only reason to make them protected would be to make the internals available to derived classes, but there had better not be any derived classes because MYUNION is unsafe to derive from for several reasons, not least because of the murky and abstruse games it plays with its internals and because it lacks a virtual destructor.
Guideline
Always make all data members private. The only exception is the case of a C-style struct which isn't intended to encapsulate anything and where all members are public.
  union {
    int i;
    unsigned char buff[max(sizeof(LIST),sizeof(STRING))];
  } U;

  void cleanup();
};
That's it for the main class definition. Moving on, consider the three parallel accessor functions:
inline int& MYUNION::getint() {
  if(currtype==_INT) {
    return U.i;
  } else {
    cleanup();
    currtype=_INT;
    return U.i;
  } // else
}

inline LIST& MYUNION::getlist() {
  if(currtype==_LIST) {
    return *(reinterpret_cast<LIST*>(U.buff));
  } else {
    cleanup();
    LIST* ptype = new(U.buff) LIST();
    currtype=_LIST;
    return *ptype;
  } // else
}

inline STRING& MYUNION::getstring() {
  if(currtype==_STRING) {
    return *(reinterpret_cast<STRING*>(U.buff));
  } else {
    cleanup();
    STRING* ptype = new(U.buff) STRING();
    currtype=_STRING;
    return *ptype;
  } // else
}
A minor nit: The // else comments add nothing. It's unfortunate that the only comments in the code are useless ones.
Guideline
Write (only) useful comments. Never write comments that repeat the code; instead, write comments that explain the code and the reasons why you wrote it that way.
More seriously, there are three major problems here. The first is that the functions are not written symmetrically, and whereas the first use of a list or a string yields a default-constructed object, the first use of int yields an uninitialized object. If that is intended, in order to mirror the ordinary semantics of uninitialized int variables, then that should be documented; because it is not, the int ought to be initialized. For example, if the caller accesses getint and tries to make a copy of the (uninitialized) value, the result is undefined behavior: not all platforms support copying arbitrary invalid int values, and some will reject the instruction at run-time.
The second major problem is that this code hinders const-correct use. If the code is really going to be written this way, then at least it would be useful to also provide const overloads for each of these functions; each would naturally return the same thing as its non-const counterpart, but by a reference to const.
Guideline
Practice const-correctness.
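The shape of such a const overload pair is sketched below. `Setting`, `value`, and `length_of` are hypothetical names chosen for illustration; the point is only that the second accessor is what lets a caller holding a reference to const read the contained object.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class with a matched pair of accessors.
class Setting {
public:
    explicit Setting(const std::string& v) : value_(v) {}

    std::string& value() { return value_; }              // read/write access
    const std::string& value() const { return value_; }  // read-only access

private:
    std::string value_;
};

// Compiles only because the const overload exists.
inline std::size_t length_of(const Setting& s) {
    return s.value().size();
}
```

A non-const `Setting` still picks the first overload, so `s.value() += "d";` keeps working exactly as before.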
The third major problem is that this approach is fragile and brittle in the face of change. It relies on type switching, and it's easy to accidentally fail to keep all the functions in sync when you add or remove types.
Stop reading here and consider: What do you have to do in the published code if you want to add a new type? Make as complete a list as you can.
• • • • • •
Are you back? All right, here's the list I came up with. To add a new type, you have to remember to:
- Add a new enum value;
- Add a new accessor member;
- Update the cleanup function to safely destroy the new type; and
- Add that type to the max calculation to ensure buff is sufficiently large to hold the new type too.
If you missed one or more of those, well, that just illustrates how difficult this code really is to maintain and extend.
Pressing onward, we come to the final function:
void MYUNION::cleanup() {
  switch(currtype) {
    case _LIST: {
      LIST& ptype = getlist();
      ptype.~LIST();
      break;
    } // case
    case _STRING: {
      STRING& ptype = getstring();
      ptype.~STRING();
      break;
    } // case
    default: break;
  } // switch
  currtype=NONE;
}
Let's reprise that small commenting nit: The // case and // switch comments add nothing; it's unfortunate that the only comments in the code are useless ones. It is better to have no comments at all than to have comments that are just distractions.
But there's a larger issue here: Rather than having simply default: break;, it would be good to make an exhaustive list (including the int type) and signal a logic error if the type is unknown, perhaps via an assert or a throw std::logic_error(...);.
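A minimal sketch of that exhaustive style follows, using a stand-alone function rather than the article's cleanup. The names `Kind`, `KIND_*`, and `holds_class_type` are hypothetical, and the fixed underlying type on the enum assumes a C++11 or later compiler.

```cpp
#include <stdexcept>

// Hypothetical type tags. The fixed underlying type means any int value
// is representable, so a corrupted tag can be detected and reported.
enum Kind : int { KIND_NONE, KIND_INT, KIND_LIST, KIND_STRING };

// Says whether the active member needs an explicit destructor call.
// Every known tag is listed; there is deliberately no default label.
inline bool holds_class_type(Kind k) {
    switch (k) {
        case KIND_NONE:
        case KIND_INT:
            return false;   // nothing to destroy
        case KIND_LIST:
        case KIND_STRING:
            return true;    // a destructor must be run
    }
    // Reached only if k matches none of the tags listed above.
    throw std::logic_error("holds_class_type: unknown type tag");
}
```

Leaving out the default label has a side benefit on compilers that warn about unhandled enumerators in a switch: adding a new tag and forgetting this function is then flagged at compile time.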
Again, type switching is purely evil. A Google search for switch C++ Dewhurst will yield all sorts of interesting references on this topic, including [Dewhurst02]. See those references for more details if you need more ammo to convince colleagues to avoid the type-switching beast.
Guideline
Avoid type switching; prefer type safety.
Underhanded Names
There's one mechanical problem I haven't yet covered. This problem first rears its ugly, unshaven, and unshampooed head in the following line:
enum uniontype {NONE,_INT,_LIST,_STRING};
Never, ever, ever create names that begin with an underscore or contain a double underscore; they're reserved for your compiler and standard library vendor's exclusive use so that they have names they can use without tromping on your code. Tromp on their names, and their names might just tromp back on you! [41]
Don't stop! Keep reading! You might have read that advice before. You might even have read it from me. You might even be tired of it, and yawning, and ready to ignore the rest of this section. If so, this one's for you, because this advice is not at all theoretical, and it bites and bites hard in this code.
The enum definition line happens to compile on most of the compilers I tried: Borland 5.5, Comeau 4.3.0.1, gcc 2.95.3 / 3.1.1 / 3.4, Intel 7.0, and Microsoft Visual C++ 6.0 through 8.0 (2005) beta. But under two of them (Metrowerks CodeWarrior 8.2, and EDG 3.0.1 used with the Dinkumware 4.0 standard library) the code breaks horribly.
Under Metrowerks CodeWarrior 8, this one line breaks noisily with the first of 52 errors (that's not a typo). The 225 lines of error messages (again, that's not a typo) begin with the following diagnostics, which point straight at one of the commas:
### mwcc Compiler:
#    File: 36.cpp
# --------------
#      17: enum uniontype {NONE,_INT,_LIST,_STRING};
#   Error:                                ^
#   identifier expected
### mwcc Compiler:
#      18: uniontype currtype;
#   Error: ^^^^^^^^^
#   declaration syntax error

followed by 50 further error messages and 215 more lines. What's pretty obvious from the second and later errors is that we should ignore them for now because they're just cascades from the first error: because uniontype was never successfully defined, the rest of the code, which uses uniontype extensively, will of course break too.
But what's up with the definition of uniontype? The indicated comma sure looks like it's in a reasonable place, doesn't it? There's an identifier happily sitting in front of it, isn't there? All becomes clear when we ask the Metrowerks compiler to spit out the preprocessed output. Omitting many, many lines, here's what the compiler finally sees:
enum uniontype {NONE,_INT,, };
Aha! That's not valid C++, and the compiler rightly complains about the third comma because there's no identifier in front of it.
But what happened to _LIST and _STRING? You guessed it: tromped on and eaten by the ravenously hungry Preprocessor Beast. It just so happens that Metrowerks' implementation has macros that happily strip away the names _LIST and _STRING, which is perfectly legal and legitimate because it (the implementation) is allowed to own those _Names (as well as other __names).
So Metrowerks' implementation happens to eat both _LIST and _STRING. That solves that part of the mystery. But what about EDG's/Dinkumware's implementations? Judge for yourself:

"1.cpp", line 17: error: trailing comma is nonstandard
  enum uniontype {NONE,_INT,_LIST,_STRING};
                                 ^

"1.cpp", line 58: error: expected an expression
  if(currtype==_STRING) {
                      ^

"1.cpp", line 63: error: expected an expression
  currtype=_STRING;
                  ^

"1.cpp", line 76: error: expected an expression
  case _STRING: {
              ^

4 errors detected in the compilation of "36.cpp".

This time, even without generating and inspecting a preprocessed version of the file, we can see what's going on: The compiler is behaving as though the word _STRING just wasn't there. That's because it was (you guessed it) tromped on, not to mention thoroughly chewed up and spat out, by the still-peckish Preprocessor Beast.
I hope that this will convince you that when some of us boring writers natter on about not using _Names like __these, the problem is far from theoretical, far more than mere academic tedium. It's a practical problem indeed, because the naming restriction directly affects your relationship with your compiler and standard library writer. Trespass on their turf, and you might get lucky and remain unscathed; on the other hand, you might not.
The C++ landscape is wide open and clear and lets you write all sorts of wonderful and flexible code and wander in pretty much whatever direction your development heart desires, including that it lets you choose pretty much whatever names you like outside of namespace std. But when it comes to names, C++ also has one big fenced-off grove, surrounded by gleaming barbed wire and signs that say things like "Employees__Only: Must Have Valid _Badge To Enter Here" and "Violators Might be Tromped and Eaten." This is a stellar example of the tromping one gets for disregarding the _Warnings.
Guideline
Never use "underhanded names": ones that begin with an underscore or that contain a double underscore. They are reserved for your compiler and standard library implementation.
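One way to stay outside the fence is sketched below: scope the enumerators inside a type and give them ordinary names. `ValueTag`, its enumerators, and `is_class_type` are hypothetical names, not a proposal from the original article.

```cpp
// Names the implementation may claim, so keep them out of your own code:
//   _Name    a leading underscore followed by an uppercase letter
//   na__me   a double underscore anywhere in the name
//   _name    any leading underscore at global namespace scope

// Hypothetical replacement for the article's enum: no leading underscores,
// and the enumerators are scoped inside a struct rather than global.
struct ValueTag {
    enum Type { None, Int, List, String };
};

// Small helper showing the tags in use.
inline bool is_class_type(ValueTag::Type t) {
    return t == ValueTag::List || t == ValueTag::String;
}
```

Because none of these names is reserved to the implementation, a conforming standard library has no license to define macros that rewrite them.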
Toward a Better Way: boost::any
4. Show a better way to achieve a generalized variant type, and comment on any tradeoffs you encounter.
The original article says:
[Y]ou might want to implement a scripting language with a single variable type that can either be an integer, a string, or a list.
[Manley02]
This is true, and there's no disagreement so far. But the article then continues:
A union is the perfect candidate for implementing such a composite type.
[Manley02]
Rather, the article has served to show in some considerable detail just why a union is not suitable at all.
But if not a union, then what? One very good candidate for implementing such a variant type is [Boost]'s any facility, along with its many and any_cast. [42] Interestingly, the complete implementation for the fully general any (covering any number/combination of types and even some platform-specific #ifdefs) is about the same amount of code as the sample MYUNION solution hardwired for just the special case of the three types int, list<int>, and string; and any is fully general, extensible, and type-safe to boot, and part of a healthy low-cholesterol diet.
There is still a tradeoff, however, and it is this: dynamic allocation. The boost::any facility does not attempt to achieve the potential efficiency gain of avoiding a dynamic memory allocation, which was part of the motivation in the original article.
Note too that the boost::any dynamic allocation overhead is more than if the original article's code were just modified to use (and reuse) a single dynamically allocated buffer that's acquired once for the lifetime of MYUNION, because boost::any also performs a dynamic allocation every time the contained type is changed.
Here's how the article's demo harness would look if it instead used boost::any. The old code that uses the original article's version of MYUNION is shown in comments for comparison:
any u; // instead of: MYUNION u;
Instead of a handwritten struct, which has to be written again for each use, just use any directly. Note that any is a plain class, not a template.
// access union as integer
u = 12345;                          // instead of: u.getint() = 12345;
The assignment shows any's more natural syntax.
cout << "int=" << any_cast<int>(u) << endl;   // or just int(u)
  // instead of: cout << "int=" << u.getint() << endl;
I like any's cast form better because it's more general (including that it is a nonmember) and more natural to C++ style; you could also use the less verbose int(u) without an any_cast if you know the type already. On the other hand, MYUNION's get[type] is more fragile, harder to write and maintain, and so forth.
// access union as std::list
u = list<int>();
list<int>& l = *any_cast<list<int> >(&u);   // instead of: LIST& list = u.getlist();
l.push_back(5);                     // same: list.push_back(5);
l.push_back(10);                    // same: list.push_back(10);
l.push_back(15);                    // same: list.push_back(15);
I think any_cast could be improved to make it easier to get references, but this isn't too bad. (Aside: I'd discourage using list as a variable name when it's also the name of a template in scope; too much room for expression ambiguity.)
So far we've achieved some typability and readability savings. The remaining differences are more minor:
list<int>::iterator it = l.begin();   // instead of: LIST::iterator it = list.begin();
while(it != l.end()) {
  cout << "list item=" << *(it) << endl;
  it++;
} // while
Pretty much unchanged. I've let the original comment stand because it's not germane to the side-by-side style comparison with any.
// access union as std::string
u = string("Hello world!");         // instead of: STRING& str = u.getstring();
                                    //             str = "Hello world!";
Again, about a wash; I'd say the any version is slightly simpler than the original, but only slightly.
cout << "string='" << any_cast<string>(u) << "'" << endl;   // or just "string(u)"
  // instead of: cout << "string='" << str.c_str() << "'" << endl;
As before.
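For readers who want to try the harness without installing Boost, here is a sketch of the same steps using std::any. It assumes a C++17 or later compiler; std::any was modeled on boost::any but is a different component from the one discussed here. The function name `run_harness` is hypothetical, and output is collected in a string rather than written to cout so that it is easy to inspect.

```cpp
#include <any>
#include <list>
#include <sstream>
#include <string>

// Hypothetical harness: runs the same steps as the fragments above and
// returns the text they would have printed.
inline std::string run_harness() {
    std::ostringstream out;
    std::any u;

    // access as integer
    u = 12345;
    out << "int=" << std::any_cast<int>(u) << '\n';

    // access as std::list; the pointer form of any_cast yields a modifiable object
    u = std::list<int>();
    std::list<int>& l = *std::any_cast<std::list<int> >(&u);
    l.push_back(5);
    l.push_back(10);
    l.push_back(15);
    for (std::list<int>::iterator it = l.begin(); it != l.end(); ++it) {
        out << "list item=" << *it << '\n';
    }

    // access as std::string
    u = std::string("Hello world!");
    out << "string='" << std::any_cast<std::string>(u) << "'\n";

    // Asking for the wrong type is reported, not silently reinterpreted.
    try {
        (void)std::any_cast<int>(u);
        out << "unexpected\n";
    } catch (const std::bad_any_cast&) {
        out << "bad cast caught\n";
    }
    return out.str();
}
```

The last step is the type-safety payoff: requesting an int while a string is stored throws instead of reinterpreting the bytes.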
Alexandrescu's Discriminated Unions
Is it possible to fully achieve both of the original goals (safety and avoiding dynamic memory) in a conforming standard C++ implementation? That sounds like a problem that someone like Andrei Alexandrescu would love to sink his teeth into, especially if it could somehow involve complicated templates. As evidenced in [Alexandrescu02], where he describes his discriminated unions (aka Variant) approach, it turns out that:
- it is (something he would love to tackle), and
- it can (involve weird templates, and just one quote from [Alexandrescu02] says it all: "Did you know that unions can be templates?"), so
- he does.
In short, by performing heroic efforts to push the boundaries of the language as far as possible, Alexandrescu's Variant comes very close to being a truly portable solution. It falls only slightly short and is probably portable enough in practice even though it too goes beyond the pale of what the Standard guarantees. Its main problem is that, even ignoring alignment-related issues, the Variant code is so complex and advanced that it actually works on very few compilers; in my testing, I only managed to get it to work with one.
A key part of Alexandrescu's Variant approach is an attempt to generalize the max_align idea to make it a reusable library facility that can itself still be written in conforming standard C++. The reason for wanting this is specifically to deal with the alignment problems in the code we've been analyzing so that a non-dynamic char buffer can continue to be used in relative safety. Alexandrescu makes heroic efforts to use template metaprogramming to calculate a safe alignment. Will it work portably? His discussion of this question follows:
Even with the best Align, the implementation above is still not 100-percent portable for all types. In theory, someone could implement a compiler that respects the Standard but still does not work properly with discriminated unions. This is because the Standard does not guarantee that all user-defined types ultimately have the alignment of some POD type. Such a compiler, however, would be more of a figment of a wicked language lawyer's imagination, rather than a realistic language implementation.
[...] Computing alignment portably is hard, but feasible. It never is 100-percent portable.
[Alexandrescu02]
There are other key features in Alexandrescu's approach, notably a union template that takes a typelist template of the types to be contained, visitation support for extensibility, and an implementation technique that will "fake a vtable" for efficiency to avoid an extra indirection when accessing a contained type. These parts are more heavyweight than boost::any but are portable in theory. That "portable in theory" part is important: as with Andrei's great work in Modern C++ Design [Alexandrescu01], the implementation is so heavy on templates that the code itself contains comments like, "Guaranteed to issue an internal compiler error on: [various popular compilers, Metrowerks, Microsoft, Gnu gcc]," and the mainline test harness contains a commented-out test helpfully labeled "The construct below didn't work on any compiler."
That is Variant's major weakness: Most real-world compilers don't even come close to being able to handle this implementation, and the code should be viewed as important but still experimental. I attempted to build Alexandrescu's Variant code using a variety of compilers: Borland 5.5; Comeau 4.3.0.1; EDG 3.0.1; gcc 2.95, 3.1.1, and 3.2; Intel 7.0; Metrowerks 8.2; and Microsoft VC++ 6.0, 7.0 (2002), and 7.1 (2003). As some readers will know, some of the products in that list are very strong and standards-conforming compilers. None of these compilers could successfully compile Alexandrescu's template-heavy source as it was provided.
I tried to massage the code by hand to get it through any of the compilers but was successful only with Microsoft VC++ 7.1 (2003). Most of the compilers didn't stand a chance, because they did not have nearly strong enough template support to deal with Alexandrescu's code. (Some emitted a truly prodigious quantity of warnings and errors: Intel 7.0's response to compiling main.cpp was to spew back an impressive 430K worth (really, nearly half a megabyte) of diagnostic messages.)
I had to make three changes to get the code to compile without errors (although still with some narrowing-conversion warnings at the highest warning level) under Microsoft VC++ 7.1 (2003):
- Added a missing typename in class AlignedPOD.
- Added a missing this-> to make a name dependent in ConverterTo<>::Unit<>::DoVisit().
- Added a final newline character at the end of several headers, as required by the C++ standard (some conforming compilers aren't strict about this and allow the absence of a final newline as a conforming extension; VC++ is stricter and requires the newline). [43]
As the author of [Manley02] commented further about tradeoffs in Alexandrescu's design:
It doesn't use dynamic memory, and it avoids alignment issues and type switching. Unfortunately I don't have access to a compiler that can compile the code, so I can't evaluate its performance vs. myunion and any. Alexandrescu's approach requires 9 supporting header files totaling ~80KB, which introduces its own set of maintenance problems.
K. Manley, private communication
Those points are all valid.
I won't try to summarize Andrei's three articles further here, but I encourage readers who are interested in this problem to look them up. They're available online as indicated in the bibliography.
Guideline
If you want to represent variant types, for now prefer to use boost::any (or something equally simple).
Once the compiler you are using catches up (in template support) and the Standard catches up (in true alignment support) and Variant libraries catch up (in mature implementations), it will be time to consider using Variant-like library tools as type-safe replacements for unions.
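As a sketch of what such a Variant-like library tool can look like once those pieces are in place, the following uses std::variant. It assumes a C++17 or later compiler, and std::variant is a later standard library facility, not Alexandrescu's Variant. The names `ScriptValue`, `Describe`, and `describe` are hypothetical.

```cpp
#include <list>
#include <string>
#include <variant>

// Hypothetical scripting-language value: an int, a list, or a string.
typedef std::variant<int, std::list<int>, std::string> ScriptValue;

// Visitor with one overload per alternative. There is no type switch to keep
// in sync: a missing overload is a compile-time error.
struct Describe {
    std::string operator()(int i) const {
        return "int " + std::to_string(i);
    }
    std::string operator()(const std::list<int>& l) const {
        return "list of " + std::to_string(l.size());
    }
    std::string operator()(const std::string& s) const {
        return "string '" + s + "'";
    }
};

// Dispatches to the overload that matches the currently held alternative.
inline std::string describe(const ScriptValue& v) {
    return std::visit(Describe(), v);
}
```

Assigning a different alternative, for example `v = std::string("Hello world!");`, destroys the old value and constructs the new one, so the caller does none of the bookkeeping that MYUNION needed.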
Summary
Even if the design and implementation of MYUNION are lacking, the motivating problem is both real and worth considering. I'd like to thank Mr. Manley for taking the time to write his article and raise awareness of the need for variant type support and Kevlin Henney and Andrei Alexandrescu for contributing their own solutions to this area. It is a hard enough problem that Manley's and Alexandrescu's approaches are not strictly portable, standards-conforming C++, although Alexandrescu's Variant makes heroic efforts to get there. Alexandrescu's design is very close to portable in theory, although the implementation is still far from portable in practice because very few compilers can handle the advanced template code it uses.
For now, an approach like boost::any is the preferred way to go. If in certain places your measurements tell you that you really need the efficiency or extra features provided by something like Alexandrescu's Variant, and you have time on your hands and some template know-how, you might experiment with writing your own scaled-back version of the full-blown Variant by applying only the ideas in [Alexandrescu02] that are applicable to your situation.