
Subject: C++ FAQ (part 05 of 14)

This article was archived around: NNTP-Posting-Mon, 17 Jun 2002 22:45:41 EDT

All FAQs posted in: comp.lang.c++, alt.comp.lang.learn.c-c++
Source: Usenet Version


Archive-name: C++-faq/part05
Posting-Frequency: monthly
Last-modified: Jun 17, 2002
URL: http://www.parashift.com/c++-faq-lite/
AUTHOR: Marshall Cline / cline@parashift.com / 972-931-9470

COPYRIGHT: This posting is part of "C++ FAQ Lite." The entire "C++ FAQ Lite" document is Copyright(C)1991-2002 Marshall Cline, Ph.D., cline@parashift.com. All rights reserved. Copying is permitted only under designated situations. For details, see section [1].

NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS. THE AUTHOR PROVIDES NO WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE.

C++-FAQ-Lite != C++-FAQ-Book: This document, C++ FAQ Lite, is not the same as the C++ FAQ Book. The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is 500% larger than this document, and is available in bookstores. For details, see section [3].

==============================================================================

SECTION [9]: Inline functions

[9.1] What's the deal with inline functions?

An inline function is a function whose code gets inserted into the caller's code stream. Like a #define macro, inline functions improve performance by avoiding the overhead of the call itself and (especially!) by the compiler being able to optimize through the call ("procedural integration").

==============================================================================

[9.2] How can inline functions help with the tradeoff of safety vs. speed?

In straight C, you can achieve "encapsulated structs" by putting a void* in a struct, in which case the void* points to the real data that is unknown to users of the struct. Therefore users of the struct don't know how to interpret the stuff pointed to by the void*, but the access functions cast the void* to the appropriate hidden type. This gives a form of encapsulation.
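To make the void* idiom concrete, here is a minimal sketch. It is written as C-style C++ rather than straight C, and the names Counter, CounterData and the counter...() access functions are made up for illustration; they don't come from any real library.

```cpp
#include <cassert>

// What users of the struct see: a handle whose contents they can't interpret.
struct Counter {
    void* hidden;   // points to the real data (hypothetical layout below)
};

// The hidden type; in a real project only the access functions would see it.
struct CounterData {
    int count;
};

// Access functions: each one casts the void* back to the hidden type.
Counter counterCreate()
{
    CounterData* data = new CounterData;
    data->count = 0;
    Counter c;
    c.hidden = data;
    return c;
}

void counterIncrement(Counter* c)
{
    static_cast<CounterData*>(c->hidden)->count++;
}

int counterValue(const Counter* c)
{
    return static_cast<const CounterData*>(c->hidden)->count;
}

void counterDestroy(Counter* c)
{
    delete static_cast<CounterData*>(c->hidden);
    c->hidden = 0;
}

// Usage sketch: every access, even a trivial read, goes through a function.
inline void demoCounter()
{
    Counter c = counterCreate();
    counterIncrement(&c);
    counterIncrement(&c);
    assert(counterValue(&c) == 2);
    counterDestroy(&c);
}
```

Nothing stops a caller from handing counterValue() a Counter whose void* points at something else entirely; the compiler can't check what a void* really points to.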
Unfortunately it forfeits type safety, and also imposes a function call to access even trivial fields of the struct (if you allowed direct access to the struct's fields, anyone and everyone would be able to get direct access since they would of necessity know how to interpret the stuff pointed to by the void*; this would make it difficult to change the underlying data structure).

Function call overhead is small, but can add up. C++ classes allow function calls to be expanded inline. This lets you have the safety of encapsulation along with the speed of direct access. Furthermore the parameter types of these inline functions are checked by the compiler, an improvement over C's #define macros.

==============================================================================

[9.3] Why should I use inline functions? Why not just use plain old #define macros?

Because #define macros are evil[6.14] in 4 different ways: evil#1[9.3], evil#2[36.2], evil#3[36.3], and evil#4[36.4].

Unlike #define macros, inline functions avoid infamous macro errors since inline functions always evaluate every argument exactly once. In other words, invoking an inline function is semantically just like invoking a regular function, only faster:

    // A macro that returns the absolute value of i
    #define unsafe(i)  \
            ( (i) >= 0 ? (i) : -(i) )

    // An inline function that returns the absolute value of i
    inline
    int safe(int i)
    {
      return i >= 0 ? i : -i;
    }

    int f();

    void userCode(int x)
    {
      int ans;

      ans = unsafe(x++);   // Error! x is incremented twice
      ans = unsafe(f());   // Danger! f() is called twice

      ans = safe(x++);     // Correct! x is incremented once
      ans = safe(f());     // Correct! f() is called once
    }

Also unlike macros, argument types are checked, and necessary conversions are performed correctly.

Macros are bad for your health; don't use them unless you have to.

==============================================================================

[9.4] How do you tell the compiler to make a non-member function inline?
When you declare an inline function, it looks just like a normal function:

    void f(int i, char c);

But when you define an inline function, you prepend the function's definition with the keyword inline, and you put the definition into a header file:

    inline
    void f(int i, char c)
    {
      // ...
    }

Note: It's imperative that the function's definition (the part between the {...}) be placed in a header file, unless the function is used only in a single .cpp file. In particular, if you put the inline function's definition into a .cpp file and you call it from some other .cpp file, you'll get an "unresolved external" error from the linker.

==============================================================================

[9.5] How do you tell the compiler to make a member function inline?

When you declare an inline member function, it looks just like a normal member function:

    class Fred {
    public:
      void f(int i, char c);
    };

But when you define an inline member function, you prepend the member function's definition with the keyword inline, and you put the definition into a header file:

    inline
    void Fred::f(int i, char c)
    {
      // ...
    }

It's usually imperative that the function's definition (the part between the {...}) be placed in a header file. If you put the inline function's definition into a .cpp file, and if it is called from some other .cpp file, you'll get an "unresolved external" error from the linker.

==============================================================================

[9.6] Is there another way to tell the compiler to make a member function inline?

Yep: define the member function in the class body itself:

    class Fred {
    public:
      void f(int i, char c)
        {
          // ...
        }
    };

Although this is easier on the person who writes the class, it's harder on all the readers since it mixes "what" a class does with "how" it does it. Because of this mixture, we normally prefer to define member functions outside the class body with the inline keyword[9.5].
The insight that makes sense of this: in a reuse-oriented world, there will usually be many people who use your class, but there is only one person who builds it (yourself); therefore you should do things that favor the many rather than the few.

==============================================================================

[9.7] Are inline functions guaranteed to make your performance better?

Nope.

Beware that overuse of inline functions can cause code bloat, which can in turn have a negative performance impact in paging environments.

The term code bloat simply means that the size of the code gets larger (bloated). In the context of inline functions, the concern is that too many inline functions might increase the size of the executable (i.e., cause code bloat), and that might cause the operating system to thrash, which simply means it spends most of its time going out to disk to pull in the next chunk of code.

Of course it's also possible that inline functions will decrease the size of the executable. This may seem backwards, but it's really true. In particular, the amount of code necessary to call a function is sometimes greater than the amount of code to expand the function inline. This can happen with very short functions, and it can also happen with long functions when the optimizer is able to remove a lot of redundant code -- that is, when the optimizer is able to make the long function short.

So the message is this: there is no simple answer. You have to play with it to see what is best. Do not settle for a simplistic answer like, "Never use inline functions" or "Always use inline functions" or "Use inline functions if and only if the function is less than N lines of code." These one-size-fits-all rules may be easy to use, but they will produce sub-optimal results.

==============================================================================

SECTION [10]: Constructors

[10.1] What's the deal with constructors?

Constructors build objects from dust.
Constructors are like "init functions". They turn a pile of arbitrary bits into a living object. Minimally they initialize internally used fields. They may also allocate resources (memory, files, semaphores, sockets, etc).

"ctor" is a typical abbreviation for constructor.

==============================================================================

[10.2] Is there any difference between List x; and List x();?

A big difference!

Suppose that List is the name of some class. Then function f() declares a local List object called x:

    void f()
    {
      List x;     // Local object named x (of class List)
      // ...
    }

But function g() declares a function called x() that returns a List:

    void g()
    {
      List x();   // Function named x (that returns a List)
      // ...
    }

==============================================================================

[10.3] How can I make a constructor call another constructor as a primitive?

No way.

Dragons be here: if you call another constructor, the compiler initializes a temporary local object; it does not initialize this object. You can combine both constructors by using a default parameter, or you can share their common code in a private init() member function.

==============================================================================

[10.4] Is the default constructor for Fred always Fred::Fred()?

No. A "default constructor" is a constructor that can be called with no arguments. Thus a constructor that takes no arguments is certainly a default constructor:

    class Fred {
    public:
      Fred();   // Default constructor: can be called with no args
      // ...
    };

However it is possible (and even likely) that a default constructor can take arguments, provided they are given default values:

    class Fred {
    public:
      Fred(int i=3, int j=5);   // Default constructor: can be called with no args
      // ...
    };

==============================================================================

[10.5] Which constructor gets called when I create an array of Fred objects?
Fred's default constructor[10.4] (except as discussed below).

There is no way to tell the compiler to call a different constructor (except as discussed below). If your class Fred doesn't have a default constructor[10.4], attempting to create an array of Fred objects is trapped as an error at compile time.

    class Fred {
    public:
      Fred(int i, int j);
      // ... assume there is no default constructor[10.4] in class Fred ...
    };

    int main()
    {
      Fred a[10];               // ERROR: Fred doesn't have a default constructor
      Fred* p = new Fred[10];   // ERROR: Fred doesn't have a default constructor
    }

However if you are constructing an object of the standard std::vector<Fred>[34.1] rather than an array of Fred (which you probably should be doing anyway since arrays are evil[33.1]), you don't have to have a default constructor in class Fred, since you can give the std::vector a Fred object to be used to initialize the elements:

    #include <vector>

    int main()
    {
      std::vector<Fred> a(10, Fred(5,7));
      // The 10 Fred objects in std::vector a will be initialized with Fred(5,7).
      // ...
    }

Even though you ought to use a std::vector rather than an array, there are times when an array might be the right thing to do, and for those, there is the "explicit initialization of arrays" syntax. Here's how it looks:

    class Fred {
    public:
      Fred(int i, int j);
      // ... assume there is no default constructor[10.4] in class Fred ...
    };

    int main()
    {
      Fred a[10] = {
        Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7),
        Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7)
      };

      // The 10 Fred objects in array a will be initialized with Fred(5,7).
      // ...
    }

Of course you don't have to do Fred(5,7) for every entry -- you can put in any numbers you want, even parameters or other variables. The point is that this syntax is (a) doable but (b) not as nice as the std::vector syntax. Remember this: arrays are evil[33.1] -- unless there is a compelling reason to use an array, use a std::vector instead.
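Here is a small runnable sketch of the remark about using parameters and other variables in the explicit initialization syntax, side by side with the std::vector form. The data members i_ and j_, the sum() method, and the functions sumOfArray() and sumOfVector() are assumptions added purely so there is something to run; the Fred in this FAQ leaves all of that unspecified.

```cpp
#include <cassert>
#include <vector>

class Fred {
public:
    Fred(int i, int j) : i_(i), j_(j) { }   // no default constructor
    int sum() const { return i_ + j_; }     // hypothetical, for illustration
private:
    int i_, j_;
};

// Explicit initialization of an array: each entry can use different
// values, including the function's parameter.
int sumOfArray(int base)
{
    Fred a[3] = { Fred(base, 1), Fred(base, 2), Fred(base + 1, 3) };
    int total = 0;
    for (int k = 0; k < 3; ++k)
        total += a[k].sum();
    return total;
}

// std::vector: every element is copied from the one Fred you hand it.
int sumOfVector(int base)
{
    std::vector<Fred> v(3, Fred(base, 1));
    int total = 0;
    for (std::vector<Fred>::size_type k = 0; k < v.size(); ++k)
        total += v[k].sum();
    return total;
}

// Usage sketch
inline void demoFredArrays()
{
    assert(sumOfArray(10)  == 37);   // (10+1) + (10+2) + (11+3)
    assert(sumOfVector(10) == 33);   // 3 copies of Fred(10,1)
}
```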
==============================================================================

[10.6] Should my constructors use "initialization lists" or "assignment"? [UPDATED!]

[Recently added symmetry argument wrt non-static const members thanks to Tanmoy Bhattacharya (in 6/02).]

Initialization lists. In fact, constructors should initialize all member objects in the initialization list.

For example, this constructor initializes member object x_ using an initialization list: Fred::Fred() : x_(whatever) { }. The most common benefit of doing this is improved performance. For example, if the expression whatever is the same type as member variable x_, the result of the whatever expression is constructed directly inside x_ -- the compiler does not make a separate copy of the object. Even if the types are not the same, the compiler is usually able to do a better job with initialization lists than with assignments.

The other (inefficient) way to build constructors is via assignment, such as: Fred::Fred() { x_ = whatever; }. In this case the expression whatever causes a separate, temporary object to be created, and this temporary object is passed into the x_ object's assignment operator. Then that temporary object is destructed at the ;. That's inefficient.

As if that wasn't bad enough, there's another source of inefficiency when using assignment in a constructor: the member object will get fully constructed by its default constructor, and this might, for example, allocate some default amount of memory or open some default file. All this work could be for naught if the whatever expression and/or assignment operator causes the object to close that file and/or release that memory (e.g., if the default constructor didn't allocate a large enough pool of memory or if it opened the wrong file).

Conclusion: All other things being equal, your code will run faster if you use initialization lists rather than assignment.
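The extra work done by the assignment style can be counted directly. The following is only a sketch: Counted, ByInitList, ByAssignment and operationsToBuild() are hypothetical names invented for this illustration, and the "whatever" expression here is an already-existing Counted object, so no temporary is involved. Even so, the assignment version pays for a default construction plus an assignment where the initialization list pays for a single copy construction.

```cpp
#include <cassert>

// Hypothetical member type that counts how it gets constructed and assigned.
struct Counted {
    static int defaultCtors;
    static int copyCtors;
    static int assignments;
    Counted()                          { ++defaultCtors; }
    Counted(const Counted&)            { ++copyCtors; }
    Counted& operator=(const Counted&) { ++assignments; return *this; }
};
int Counted::defaultCtors = 0;
int Counted::copyCtors    = 0;
int Counted::assignments  = 0;

// Initialization list: x_ is copy-constructed directly from whatever.
class ByInitList {
public:
    explicit ByInitList(const Counted& whatever) : x_(whatever) { }
private:
    Counted x_;
};

// Assignment: x_ is default-constructed first, then assigned in the body.
class ByAssignment {
public:
    explicit ByAssignment(const Counted& whatever) { x_ = whatever; }
private:
    Counted x_;
};

// Returns how many Counted operations it takes to build one T.
template <typename T>
int operationsToBuild()
{
    Counted source;   // constructed before the counters are reset
    Counted::defaultCtors = Counted::copyCtors = Counted::assignments = 0;
    T object(source);
    (void)object;     // silence "unused variable" warnings
    return Counted::defaultCtors + Counted::copyCtors + Counted::assignments;
}

// Usage sketch
inline void demoInitLists()
{
    assert(operationsToBuild<ByInitList>()   == 1);  // one copy construction
    assert(operationsToBuild<ByAssignment>() == 2);  // default ctor + assignment
}
```

For a member whose default constructor allocates memory or opens a file, that first wasted step is where the real cost shows up.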
Note: There is no performance difference if the type of x_ is some built-in/intrinsic type, such as int or char* or float. But even in these cases, my personal preference is to set those data members in the initialization list rather than via assignment for consistency. Another symmetry argument in favor of using initialization lists even for built-in/intrinsic types: non-static const data members can't be assigned a value in the constructor, so for symmetry it makes sense to initialize everything in the initialization list.

==============================================================================

[10.7] Should you use the this pointer in the constructor?

Some people feel you should not use the this pointer in a constructor because the object is not fully formed yet. However you can use this in the constructor (in the {body} and even in the initialization list[10.6]) if you are careful.

Here is something that always works: the {body} of a constructor (or a function called from the constructor) can reliably access the data members declared in a base class and/or the data members declared in the constructor's own class. This is because all those data members are guaranteed to have been fully constructed by the time the constructor's {body} starts executing.

Here is something that never works: the {body} of a constructor (or a function called from the constructor) cannot get down to a derived class by calling a virtual member function that is overridden in the derived class. If your goal was to get to the overridden function in the derived class, you won't get what you want[23.3]. Note that you won't get to the override in the derived class independent of how you call the virtual member function: explicitly using the this pointer (e.g., this->method()), implicitly using the this pointer (e.g., method()), or even calling some other function that calls the virtual member function on your this object.
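A minimal sketch of the "never works" case, using hypothetical classes Base and Derived (they are not part of this FAQ's running examples): the base class constructor calls a virtual function and records what it got back.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    Base() { seenByCtor_ = name(); }   // virtual call made during Base's constructor
    virtual ~Base() { }
    virtual std::string name() const { return "Base"; }
    std::string seenByCtor() const { return seenByCtor_; }
private:
    std::string seenByCtor_;           // what name() returned inside the constructor
};

class Derived : public Base {
public:
    virtual std::string name() const { return "Derived"; }
};

// Usage sketch
inline void demoVirtualInCtor()
{
    Derived d;
    assert(d.seenByCtor() == "Base");  // the override was not reached
    assert(d.name() == "Derived");     // once construction is finished, it is
}
```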
The bottom line is this: even if the caller is constructing an object of a derived class, during the constructor of the base class, your object is not yet of that derived class[23.3]. You have been warned.

Here is something that sometimes works: if you pass any of the data members in this object to another data member's initializer[10.6], you must make sure that the other data member has already been initialized. The good news is that you can determine whether the other data member has (or has not) been initialized using some straightforward language rules that are independent of the particular compiler you're using. The bad news is that you have to know those language rules (e.g., base class sub-objects are initialized first (look up the order if you have multiple and/or virtual inheritance!), then data members defined in the class are initialized in the order in which they appear in the class declaration). If you don't know these rules, then don't pass any data member from the this object (regardless of whether or not you explicitly use the this keyword) to any other data member's initializer[10.6]! And if you do know the rules, please be careful.

==============================================================================

[10.8] What is the "Named Constructor Idiom"?

A technique that provides more intuitive and/or safer construction operations for users of your class.

The problem is that constructors always have the same name as the class. Therefore the only way to differentiate between the various constructors of a class is by the parameter list. But if there are lots of constructors, the differences between them become somewhat subtle and error prone.

With the Named Constructor Idiom, you declare all the class's constructors in the private or protected sections, and you provide public static methods that return an object. These static methods are the so-called "Named Constructors."
In general there is one such static method for each different way to construct an object.

For example, suppose we are building a Point class that represents a position on the X-Y plane. Turns out there are two common ways to specify a 2-space coordinate: rectangular coordinates (X+Y), polar coordinates (Radius+Angle). (Don't worry if you can't remember these; the point isn't the particulars of coordinate systems; the point is that there are several ways to create a Point object.)

Unfortunately the parameters for these two coordinate systems are the same: two floats. This would create an ambiguity error in the overloaded constructors:

    class Point {
    public:
      Point(float x, float y);     // Rectangular coordinates
      Point(float r, float a);     // Polar coordinates (radius and angle)
      // ERROR: Overload is Ambiguous: Point::Point(float,float)
    };

    int main()
    {
      Point p = Point(5.7, 1.2);   // Ambiguous: Which coordinate system?
    }

One way to solve this ambiguity is to use the Named Constructor Idiom:

    #include <cmath>               // To get std::sin() and std::cos()

    class Point {
    public:
      static Point rectangular(float x, float y);      // Rectangular coord's
      static Point polar(float radius, float angle);   // Polar coordinates
      // These static methods are the so-called "named constructors"
      // ...
    private:
      Point(float x, float y);     // Rectangular coordinates
      float x_, y_;
    };

    inline Point::Point(float x, float y)
    : x_(x), y_(y) { }

    inline Point Point::rectangular(float x, float y)
    { return Point(x, y); }

    inline Point Point::polar(float radius, float angle)
    { return Point(radius*std::cos(angle), radius*std::sin(angle)); }

Now the users of Point have a clear and unambiguous syntax for creating Points in either coordinate system:

    int main()
    {
      Point p1 = Point::rectangular(5.7, 1.2);   // Obviously rectangular
      Point p2 = Point::polar(5.7, 1.2);         // Obviously polar
    }

Make sure your constructors are in the protected section if you expect Point to have derived classes.
The Named Constructor Idiom can also be used to make sure your objects are always created via new[16.20].

==============================================================================

[10.9] Why can't I initialize my static member data in my constructor's initialization list?

Because you must explicitly define your class's static data members.

Fred.h:

    class Fred {
    public:
      Fred();
      // ...
    private:
      int i_;
      static int j_;
    };

Fred.cpp (or Fred.C or whatever):

    Fred::Fred()
      : i_(10)  // OK: you can (and should) initialize member data this way
      , j_(42)  // Error: you cannot initialize static member data like this
    {
      // ...
    }

    // You must define static data members this way:
    int Fred::j_ = 42;

==============================================================================

[10.10] Why are classes with static data members getting linker errors?

Because static data members must be explicitly defined in exactly one compilation unit[10.9]. If you didn't do this, you'll probably get an "undefined external" linker error. For example:

    // Fred.h

    class Fred {
    public:
      // ...
    private:
      static int j_;   // Declares static data member Fred::j_
      // ...
    };

The linker will holler at you ("Fred::j_ is not defined") unless you define (as opposed to merely declare) Fred::j_ in (exactly) one of your source files:

    // Fred.cpp

    #include "Fred.h"

    int Fred::j_ = some_expression_evaluating_to_an_int;

    // Alternatively, if you wish to use the implicit 0 value for static ints:
    // int Fred::j_;

The usual place to define static data members of class Fred is file Fred.cpp (or Fred.C or whatever source file extension you use).

==============================================================================

[10.11] What's the "static initialization order fiasco"?

A subtle way to kill your project.

The static initialization order fiasco is a very subtle and commonly misunderstood aspect of C++. Unfortunately it's very hard to detect -- the errors occur before main() begins.
In short, suppose you have two static objects x and y which exist in separate source files, say x.cpp and y.cpp. Suppose further that the initialization for the y object (typically the y object's constructor) calls some method on the x object.

That's it. It's that simple.

The tragedy is that you have a 50%-50% chance of dying. If the compilation unit for x.cpp happens to get initialized first, all is well. But if the compilation unit for y.cpp gets initialized first, then y's initialization will get run before x's initialization, and you're toast. E.g., y's constructor could call a method on the x object, yet the x object hasn't yet been constructed.

I hear they're hiring down at McDonald's. Enjoy your new job flipping burgers.

If you think it's "exciting" to play Russian Roulette with live rounds in half the chambers, you can stop reading here. On the other hand if you like to improve your chances of survival by preventing disasters in a systematic way, you probably want to read the next FAQ[10.12].

Note: The static initialization order fiasco can also, in some cases[10.15], apply to built-in/intrinsic types.

==============================================================================

[10.12] How do I prevent the "static initialization order fiasco"?

Use the "construct on first use" idiom, which simply means to wrap your static object inside a function.

For example, suppose you have two classes, Fred and Barney. There is a global Fred object called x, and a global Barney object called y. Barney's constructor invokes the goBowling() method on the x object. The file x.cpp defines the x object:

    // File x.cpp
    #include "Fred.hpp"
    Fred x;

The file y.cpp defines the y object:

    // File y.cpp
    #include "Barney.hpp"
    Barney y;

For completeness the Barney constructor might look something like this:

    // File Barney.cpp
    #include "Barney.hpp"

    Barney::Barney()
    {
      // ...
      x.goBowling();
      // ...
    }

As described above[10.11], the disaster occurs if y is constructed before x, which happens 50% of the time since they're in different source files.

There are many solutions to this problem, but a very simple and completely portable solution is to replace the global Fred object, x, with a global function, x(), that returns the Fred object by reference.

    // File x.cpp

    #include "Fred.hpp"

    Fred& x()
    {
      static Fred* ans = new Fred();
      return *ans;
    }

Since static local objects are constructed the first time control flows over their declaration (only), the above new Fred() statement will only happen once: the first time x() is called. Every subsequent call will return the same Fred object (the one pointed to by ans). Then all you do is change your usages of x to x():

    // File Barney.cpp
    #include "Barney.hpp"

    Barney::Barney()
    {
      // ...
      x().goBowling();
      // ...
    }

This is called the Construct On First Use Idiom because it does just that: the global Fred object is constructed on its first use.

The downside of this approach is that the Fred object is never destructed. There is another technique[10.13] that answers this concern, but it needs to be used with care since it creates the possibility of another (equally nasty) problem.

Note: The static initialization order fiasco can also, in some cases[10.15], apply to built-in/intrinsic types.

==============================================================================

[10.13] Why doesn't the construct-on-first-use idiom use a static object instead of a static pointer? [UPDATED!]

[Recently made substantive changes to the second-to-last paragraph thanks to Amitha Perera and Wil Evers (in 6/02).]

Short answer: it's possible to use a static object rather than a static pointer[10.12], but doing so opens up another (equally subtle, equally nasty) problem.

Long answer: sometimes people worry about the fact that the previous solution[10.12] "leaks." In many cases, this is not a problem, but it is a problem in some cases.
Note: even though the object pointed to by ans in the previous FAQ is never deleted, the memory doesn't actually "leak" when the program exits since the operating system automatically reclaims all the memory in a program's heap when that program exits. In other words, the only time you'd need to worry about this is when the destructor for the Fred object performs some important action (such as writing something to a file) that must occur sometime while the program is exiting.

In those cases where the construct-on-first-use object (the Fred, in this case) needs to eventually get destructed, you might consider changing function x() as follows:

    // File x.cpp

    #include "Fred.hpp"

    Fred& x()
    {
      static Fred ans;  // was static Fred* ans = new Fred();
      return ans;       // was return *ans;
    }

However there is (or rather, may be) a rather subtle problem with this change. To understand this potential problem, let's remember why we're doing all this in the first place: we need to make 100% sure our static object (a) gets constructed prior to its first use and (b) doesn't get destructed until after its last use. Obviously it would be a disaster if any static object got used either before construction or after destruction. The message here is that you need to worry about two situations (static initialization and static deinitialization), not just one.

By changing the declaration from static Fred* ans = new Fred(); to static Fred ans;, we still correctly handle the initialization situation but we no longer handle the deinitialization situation. For example, if there are 3 static objects, say a, b and c, that use ans during their destructors, the only way to avoid a static deinitialization disaster is if ans is destructed after all three.

The point is simple: if there are any other static objects whose destructors might use ans after ans is destructed, bang, you're dead.
If the constructors of a, b and c use ans, you should normally be okay since the runtime system will, during static deinitialization, destruct ans after the last of those three objects is destructed. However if a and/or b and/or c fail to use ans in their constructors and/or if any code anywhere gets the address of ans and hands it to some other static object, all bets are off and you have to be very, very careful.

There is a third approach that handles both the static initialization and static deinitialization situations, but it has other non-trivial costs. I'm too lazy (and busy!) to write any more FAQs today so if you're interested in that third approach, you'll have to buy a book that describes that third approach in detail. The C++ FAQs book[3.1] is one of those books, and it also gives the cost/benefit analysis to decide if/when that third approach should be used.

==============================================================================

[10.14] How do I prevent the "static initialization order fiasco" for my static data members?

Just use the same technique just described[10.12], but this time use a static member function rather than a global function.

Suppose you have a class X that has a static Fred object:

    // File X.hpp

    class X {
    public:
      // ...

    private:
      static Fred x_;
    };

Naturally this static member is initialized separately:

    // File X.cpp

    #include "X.hpp"

    Fred X::x_;

Naturally also the Fred object will be used in one or more of X's methods:

    void X::someMethod()
    {
      x_.goBowling();
    }

But now the "disaster scenario" is if someone somewhere somehow calls this method before the Fred object gets constructed. For example, if someone else creates a static X object and invokes its someMethod() method during static initialization, then you're at the mercy of the compiler as to whether the compiler will construct X::x_ before or after the someMethod() is called.
(Note that the ANSI/ISO C++ committee is working on this problem, but compilers aren't yet generally available that handle these changes; watch this space for an update in the future.)

In any event, it's always portable and safe to change the X::x_ static data member into a static member function:

    // File X.hpp

    class X {
    public:
      // ...

    private:
      static Fred& x();
    };

Naturally this static member is initialized separately:

    // File X.cpp

    #include "X.hpp"

    Fred& X::x()
    {
      static Fred* ans = new Fred();
      return *ans;
    }

Then you simply change any usages of x_ to x():

    void X::someMethod()
    {
      x().goBowling();
    }

If you're super performance sensitive and you're concerned about the overhead of an extra function call on each invocation of X::someMethod() you can set up a static Fred& instead. As you recall, static locals are initialized only once (the first time control flows over their declaration), so this will call X::x() only once: the first time X::someMethod() is called:

    void X::someMethod()
    {
      static Fred& x = X::x();
      x.goBowling();
    }

Note: The static initialization order fiasco can also, in some cases[10.15], apply to built-in/intrinsic types.

==============================================================================

[10.15] Do I need to worry about the "static initialization order fiasco" for variables of built-in/intrinsic types?

Yes.

If you initialize your built-in/intrinsic type using a function call, the static initialization order fiasco is able to kill you just as bad as with user-defined/class types. For example, the following code shows the failure:

    #include <iostream>

    int f();  // forward declaration
    int g();  // forward declaration

    int x = f();
    int y = g();

    int f()
    {
      std::cout << "using 'y' (which is " << y << ")\n";
      return 3*y + 7;
    }

    int g()
    {
      std::cout << "initializing 'y'\n";
      return 5;
    }

The output of this little program will show that it uses y before initializing it.
The solution, as before, is the Construct On First Use Idiom:

    #include <iostream>

    int f();  // forward declaration
    int g();  // forward declaration

    int& x()
    {
      static int ans = f();
      return ans;
    }

    int& y()
    {
      static int ans = g();
      return ans;
    }

    int f()
    {
      std::cout << "using 'y' (which is " << y() << ")\n";
      return 3*y() + 7;
    }

    int g()
    {
      std::cout << "initializing 'y'\n";
      return 5;
    }

Of course you might be able to simplify this by moving the initialization code for x and y into their respective functions:

    #include <iostream>

    int& y();  // forward declaration

    int& x()
    {
      static int ans;

      static bool firstTime = true;
      if (firstTime) {
        firstTime = false;
        std::cout << "using 'y' (which is " << y() << ")\n";
        ans = 3*y() + 7;
      }

      return ans;
    }

    int& y()
    {
      static int ans;

      static bool firstTime = true;
      if (firstTime) {
        firstTime = false;
        std::cout << "initializing 'y'\n";
        ans = 5;
      }

      return ans;
    }

And, if you can get rid of the print statements you can further simplify these to something really simple:

    int& y();  // forward declaration

    int& x()
    {
      static int ans = 3*y() + 7;
      return ans;
    }

    int& y()
    {
      static int ans = 5;
      return ans;
    }

Furthermore, since y is initialized using a constant expression, it no longer needs its wrapper function -- it can be a simple variable again.

==============================================================================

[10.16] How can I handle a constructor that fails?

Throw an exception. See [17.2] for details.

==============================================================================

[10.17] What is the "Named Parameter Idiom"?

It's a fairly useful way to exploit method chaining[8.4].

The fundamental problem solved by the Named Parameter Idiom is that C++ only supports positional parameters. For example, a caller of a function isn't allowed to say, "Here's the value for formal parameter xyz, and this other thing is the value for formal parameter pqr." All you can do in C++ (and C and Java) is say, "Here's the first parameter, here's the second parameter, etc."
The alternative, called named parameters and implemented in the language Ada, is especially useful if a function takes a large number of mostly default-able parameters.

Over the years people have cooked up lots of workarounds for the lack of named parameters in C and C++. One of these involves burying the parameter values in a string parameter, then parsing this string at run-time. This is what's done in the second parameter of fopen(), for example. Another workaround is to combine all the boolean parameters into a bit-map; the caller then ORs a bunch of bit-shifted constants together to produce the actual parameter. This is what's done in the second parameter of open(), for example. These approaches work, but the following technique produces caller-code that's more obvious, easier to write, easier to read, and generally more elegant.

The idea, called the Named Parameter Idiom, is to change the function's parameters to methods of a newly created class, where all these methods return *this by reference. Then you simply rename the main function into a parameterless "do-it" method on that class.

We'll work an example to make the previous paragraph easier to understand.

The example will be for the "open a file" concept. Let's say that concept logically requires a parameter for the file's name, and optionally allows parameters for whether the file should be opened read-only vs. read-write vs. write-only, whether or not the file should be created if it doesn't already exist, whether the writing location should be at the end ("append") or the beginning ("overwrite"), the block-size if the file is to be created, whether the I/O is buffered or non-buffered, the buffer-size, whether it is to be shared vs. exclusive access, and probably a few others.
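
To make the contrast concrete, here is a rough sketch of what a purely positional interface for this concept could look like. Everything in it is an assumption made for illustration: the names AccessMode, FileSettings and openFilePositional, the order of the parameters, and every default value are hypothetical and are not part of any real API.

```cpp
#include <string>

// Hypothetical types and names, invented for this sketch only.
enum AccessMode { ReadOnly, ReadWrite, WriteOnly };

struct FileSettings {
    std::string name;
    AccessMode  access;
    bool        createIfNotExist;
    bool        append;
    unsigned    blockSize;
    bool        buffered;
    unsigned    bufferSize;
    bool        exclusive;
};

// Eight positional parameters; the defaults are made up for illustration.
FileSettings openFilePositional(const std::string& name,
                                AccessMode access = ReadWrite,
                                bool createIfNotExist = false,
                                bool append = false,
                                unsigned blockSize = 4096,
                                bool buffered = true,
                                unsigned bufferSize = 4096,
                                bool exclusive = false)
{
    FileSettings s;
    s.name = name;
    s.access = access;
    s.createIfNotExist = createIfNotExist;
    s.append = append;
    s.blockSize = blockSize;
    s.buffered = buffered;
    s.bufferSize = bufferSize;
    s.exclusive = exclusive;
    return s;
}

// To change only the last parameter, the caller has to restate the
// defaults of all six optional parameters that come before it.
FileSettings exclusiveExample()
{
    return openFilePositional("foo.txt", ReadWrite, false, false,
                              4096, true, 4096, true);
}

// A typical slip: 1024 was meant as the block size, but it lands in the
// 'append' slot, converts silently to true, and blockSize keeps its default.
FileSettings misplacedBlockSize()
{
    return openFilePositional("foo.txt", ReadWrite, false, 1024);
}
```

Nothing at the call site says which argument is which, and the misplaced 1024 is accepted because an int converts implicitly to bool.
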
If we implemented this concept using a normal function with positional parameters, the caller code would be very difficult to read: there'd be as many as 8 positional parameters, and the caller would probably make a lot of mistakes. So instead we use the Named Parameter Idiom.

Before we go through the implementation, here's what the caller code might look like, assuming you are willing to accept all the function's default parameters:

    File f = OpenFile("foo.txt");

That's the easy case. Now here's what it might look like if you want to change a bunch of the parameters.

    File f = OpenFile("foo.txt").
               readonly().
               createIfNotExist().
               appendWhenWriting().
               blockSize(1024).
               unbuffered().
               exclusiveAccess();

Notice how the "parameters", if it's fair to call them that, are in arbitrary order (they're not positional) and they all have names. So the programmer doesn't have to remember the order of the parameters, and the names are (hopefully) obvious.

So here's how to implement it: first we create a new class (OpenFile) that houses all the parameter values as private data members. Then all the methods (readonly(), blockSize(unsigned), etc.) return *this (that is, they return a reference to the OpenFile object, allowing the method calls to be chained[8.4]). Finally we make the required parameter (the file's name, in this case) into a normal, positional, parameter on OpenFile's constructor.

    #include <string>

    class File;

    class OpenFile {
    public:
        OpenFile(const std::string& filename);
        // sets all the default values for each data member
        OpenFile& readonly();  // changes readonly_ to true
        OpenFile& createIfNotExist();
        OpenFile& blockSize(unsigned nbytes);
        // ...
    private:
        friend class File;
        bool readonly_;       // defaults to false [for example]
        // ...
        unsigned blockSize_;  // defaults to 4096 [for example]
        // ...
    };

The only other thing to do is make the constructor for class File take an OpenFile object:

    class File {
    public:
        File(const OpenFile& params);
        // vacuums the actual params out of the OpenFile object
        // ...
    };

Note that OpenFile declares File as its friend[14]; that way OpenFile doesn't need a bunch of (otherwise useless) public: get methods[14.2].

Since each member function in the chain returns a reference, there is no copying of objects and the chain is highly efficient. Furthermore, if the various member functions are inline, the generated object code will probably be on par with C-style code that sets various members of a struct. Of course if the member functions are not inline, there may be a slight increase in code size and a slight decrease in performance (but only if the construction occurs on the critical path of a CPU-bound program; this is a can of worms I'll try to avoid opening; read the C++ FAQs book[3.1] for a rather thorough discussion of the issues), so it may, in this case, be a trade-off for making the code more reliable.

==============================================================================
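
Postscript to [10.17]: for readers who want something they can compile, the pieces of the Named Parameter Idiom can be assembled roughly as follows. This is a minimal sketch, not a definitive implementation. It covers only three of the optional parameters, it opens no real file, and the accessors on File (name(), isReadonly(), createsIfMissing(), blockSize()) are hypothetical additions that exist only so the result can be inspected. The default values follow the "for example" defaults used in the class outline of [10.17].

```cpp
#include <string>

class File;

// Parameter object: one chainable member function per optional parameter.
class OpenFile {
public:
    // The required parameter stays positional; everything else gets a default.
    explicit OpenFile(const std::string& filename)
        : filename_(filename),
          readonly_(false),
          createIfNotExist_(false),
          blockSize_(4096u)
    { }

    // Each setter returns *this by reference so the calls can be chained.
    OpenFile& readonly()                  { readonly_ = true;         return *this; }
    OpenFile& createIfNotExist()          { createIfNotExist_ = true; return *this; }
    OpenFile& blockSize(unsigned nbytes)  { blockSize_ = nbytes;      return *this; }

private:
    friend class File;  // lets File read the values without public getters
    std::string filename_;
    bool        readonly_;
    bool        createIfNotExist_;
    unsigned    blockSize_;
};

class File {
public:
    // Copies the chosen values out of the OpenFile object.
    File(const OpenFile& params)
        : name_(params.filename_),
          readonly_(params.readonly_),
          createIfNotExist_(params.createIfNotExist_),
          blockSize_(params.blockSize_)
    { }

    // Hypothetical accessors, added only for this demonstration.
    const std::string& name() const             { return name_; }
    bool               isReadonly() const       { return readonly_; }
    bool               createsIfMissing() const { return createIfNotExist_; }
    unsigned           blockSize() const        { return blockSize_; }

private:
    std::string name_;
    bool        readonly_;
    bool        createIfNotExist_;
    unsigned    blockSize_;
};
```

With these definitions a caller can write File f = OpenFile("foo.txt").blockSize(1024).readonly(); and the setters may appear in any order. The temporary OpenFile lives until the end of the full expression, which is long enough for File's constructor to copy the values out of it.
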