
Subject: C++ FAQ (part 09 of 14)

This article was archived around: NNTP-Posting-Mon, 17 Jun 2002 22:46:35 EDT

All FAQs posted in: comp.lang.c++, alt.comp.lang.learn.c-c++
Source: Usenet Version


Archive-name: C++-faq/part09
Posting-Frequency: monthly
Last-modified: Jun 17, 2002
URL: http://www.parashift.com/c++-faq-lite/
AUTHOR: Marshall Cline / cline@parashift.com / 972-931-9470

COPYRIGHT: This posting is part of "C++ FAQ Lite." The entire "C++ FAQ Lite" document is Copyright(C)1991-2002 Marshall Cline, Ph.D., cline@parashift.com. All rights reserved. Copying is permitted only under designated situations. For details, see section [1].

NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS. THE AUTHOR PROVIDES NO WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE.

C++-FAQ-Lite != C++-FAQ-Book: This document, C++ FAQ Lite, is not the same as the C++ FAQ Book. The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is 500% larger than this document, and is available in bookstores. For details, see section [3].

==============================================================================

SECTION [17]: Exceptions and error handling

[17.1] What are some ways try / catch / throw can improve software quality?

By eliminating one of the reasons for if statements.

The commonly used alternative to try / catch / throw is to return a return code (sometimes called an error code) that the caller explicitly tests via some conditional statement such as if. For example, printf(), scanf() and malloc() work this way: the caller is supposed to test the return value to see if the function succeeded.

Although the return code technique is sometimes the most appropriate error handling technique, there are some nasty side effects to adding unnecessary if statements:

 * Degrade quality: It is well known that conditional statements are approximately ten times more likely to contain errors than any other kind of statement. So all other things being equal, if you can eliminate conditionals / conditional statements from your code, you will likely have more robust code.
 * Slow down time-to-market: Since conditional statements are branch points which are related to the number of test cases that are needed for white-box testing, unnecessary conditional statements increase the amount of time that needs to be devoted to testing. Basically if you don't exercise every branch point, there will be instructions in your code that will never have been executed under test conditions until they are seen by your users/customers. That's bad.
 * Increase development cost: Bug finding, bug fixing, and testing are all increased by unnecessary control flow complexity.

So compared to error reporting via return-codes and if, using try / catch / throw is likely to result in code that has fewer bugs, is less expensive to develop, and has faster time-to-market. Of course if your organization doesn't have any experiential knowledge of try / catch / throw, you might want to use it on a toy project first just to make sure you know what you're doing -- you should always get used to a weapon on the firing range before you bring it to the front lines of a shooting war.

==============================================================================

[17.2] How can I handle a constructor that fails?

Throw an exception.

Constructors don't have a return type, so it's not possible to use return codes. The best way to signal constructor failure is therefore to throw an exception.

If you don't have or won't use exceptions, here's a work-around. If a constructor fails, the constructor can put the object into a "zombie" state. Do this by setting an internal status bit so the object acts sort of like it's dead even though it is technically still alive. Then add a query ("inspector") member function to check this "zombie" bit so users of your class can find out if their object is truly alive, or if it's a zombie (i.e., a "living dead" object).
Also you'll probably want to have your other member functions check this zombie bit, and, if the object isn't really alive, do a no-op (or perhaps something more obnoxious such as abort()). This is really ugly, but it's the best you can do if you can't (or don't want to) use exceptions.

==============================================================================

[17.3] How can I handle a destructor that fails?

Write a message to a log-file. Or call Aunt Tilda. But do not throw an exception!

Here's why (buckle your seat-belts):

The C++ rule is that you must never throw an exception from a destructor that is being called during the "stack unwinding" process of another exception. For example, if someone says throw Foo(), the stack will be unwound so all the stack frames between the throw Foo() and the } catch (Foo e) { will get popped. This is called stack unwinding.

During stack unwinding, all the local objects in all those stack frames are destructed. If one of those destructors throws an exception (say it throws a Bar object), the C++ runtime system is in a no-win situation: should it ignore the Bar and end up in the } catch (Foo e) { where it was originally headed? Should it ignore the Foo and look for a } catch (Bar e) { handler? There is no good answer -- either choice loses information.

So the C++ language guarantees that it will call terminate() at this point, and terminate() kills the process. Bang you're dead.

The easy way to prevent this is never throw an exception from a destructor. But if you really want to be clever, you can say never throw an exception from a destructor while processing another exception. But in this second case, you're in a difficult situation: the destructor itself needs code to handle both throwing an exception and doing "something else", and the caller has no guarantees as to what might happen when the destructor detects an error (it might throw an exception, it might do "something else"). So the whole solution is harder to write.
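One way a destructor can do "something else" is sketched below. This is only an illustration under stated assumptions: the Connection class, its close() member function, and the errorLog() helper are hypothetical names invented for this example (a real program would more likely write to a log file). The point is that the failure is caught and recorded inside the destructor, so no exception escapes it, even while another exception is unwinding the stack.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a log file: collects error messages in memory.
std::vector<std::string>& errorLog()
{
    static std::vector<std::string> log;
    return log;
}

// Hypothetical resource whose cleanup step can fail.
class Connection {
public:
    explicit Connection(bool closeFails) : closeFails_(closeFails) { }

    ~Connection()
    {
        try {
            close();
        }
        catch (const std::exception& e) {
            errorLog().push_back(e.what());        // Record the failure; do NOT rethrow
        }
        catch (...) {
            errorLog().push_back("unknown error"); // Same policy for anything else
        }
    }

    void close()
    {
        if (closeFails_)
            throw std::runtime_error("close failed");
    }

private:
    bool closeFails_;
};

// Destroys a Connection whose cleanup fails while another exception is in flight.
bool demoDestructorDuringUnwinding()
{
    try {
        Connection c(true);
        throw std::logic_error("original problem");  // Starts stack unwinding
    }
    catch (const std::logic_error&) {
        // The original exception arrived intact, and the cleanup failure was logged.
        return errorLog().size() == 1 && errorLog()[0] == "close failed";
    }
    return false;
}
```

Calling demoDestructorDuringUnwinding() returns true: the logic_error reaches its handler, and the failed close() shows up only in the log.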

So the easy thing to do is always do "something else". That is, never throw an exception from a destructor.

Of course the word never should be "in quotes" since there is always some situation somewhere where the rule won't hold. But certainly at least 99% of the time this is a good rule of thumb.

==============================================================================

[17.4] How should I handle resources if my constructors may throw exceptions?

Every data member inside your object should clean up its own mess.

If a constructor throws an exception, the object's destructor is not run. If your object has already done something that needs to be undone (such as allocating some memory, opening a file, or locking a semaphore), this "stuff that needs to be undone" must be remembered by a data member inside the object.

For example, rather than allocating memory into a raw Fred* data member, put the allocated memory into a "smart pointer" member object, and the destructor of this smart pointer will delete the Fred object when the smart pointer dies. The standard class auto_ptr is an example of such a "smart pointer" class (in C++11 and later, std::unique_ptr fills this role; auto_ptr itself was removed in C++17). You can also write your own reference counting smart pointer[16.21]. You can also use smart pointers to "point" to disk records or objects on other machines[13.3].

==============================================================================

[17.5] How do I change the string-length of an array of char to prevent memory leaks even if/when someone throws an exception?

If what you really want to do is work with strings, don't use an array of char in the first place, since arrays are evil[33.1]. Instead use an object of some string-like class.

For example, suppose you want to get a copy of a string, fiddle with the copy, then append another string to the end of the fiddled copy.
The array-of-char approach would look something like this:

    #include <cstring>   // Let the compiler see std::strlen() and std::strcpy()

    void userCode(const char* s1, const char* s2)
    {
      // Make a copy of s1:
      char* copy = new char[std::strlen(s1) + 1];
      std::strcpy(copy, s1);

      // Now that we have a local pointer to freestore-allocated memory,
      // we need to use a try block to prevent memory leaks:
      try {

        // ... now we fiddle with copy for a while...

        // Append s2 onto the end of copy:
        // ... [Here's where people want to reallocate copy] ...
        char* copy2 = new char[std::strlen(copy) + std::strlen(s2) + 1];
        std::strcpy(copy2, copy);
        std::strcpy(copy2 + std::strlen(copy), s2);
        delete[] copy;
        copy = copy2;

        // ... finally we fiddle with copy again...

      } catch (...) {
        delete[] copy;   // Prevent memory leaks if we got an exception
        throw;           // Re-throw the current exception
      }

      delete[] copy;     // Prevent memory leaks if we did NOT get an exception
    }

Using char*s like this is tedious and error prone. Why not just use an object of some string class? Your compiler probably supplies a string-like class, and it's probably just as fast and certainly it's a lot simpler and safer than the char* code that you would have to write yourself. For example, if you're using the std::string class from the standardization committee[6.12], your code might look something like this:

    #include <string>    // Let the compiler see class std::string

    void userCode(const std::string& s1, const std::string& s2)
    {
      std::string copy = s1;   // Make a copy of s1

      // ... now we fiddle with copy for a while...

      copy += s2;              // Append s2 onto the end of copy

      // ... finally we fiddle with copy again...
    }

That's a total of two (2) lines of code within the body of the function, as compared with twelve (12) lines of code in the previous example. Most of the savings came from memory management, but some also came because we didn't have to explicitly call strxxx() routines.

Here are some high points:

 * We do not need to explicitly write any code that reallocates memory when we grow the string, since std::string handles memory management automatically.
 * We do not need to delete[] anything at the end, since std::string handles memory management automatically.
 * We do not need a try block in this second example, since std::string handles memory management automatically, even if someone somewhere throws an exception.

==============================================================================

SECTION [18]: Const correctness

[18.1] What is "const correctness"?

A good thing. It means using the keyword const to prevent const objects from getting mutated.

For example, if you wanted to create a function f() that accepted a std::string, plus you want to promise callers not to change the caller's std::string that gets passed to f(), you can have f() receive its std::string parameter...

 * void f1(const std::string& s);      // Pass by reference-to-const
 * void f2(const std::string* sptr);   // Pass by pointer-to-const
 * void f3(std::string s);             // Pass by value

In the pass by reference-to-const and pass by pointer-to-const cases, any attempt to change the caller's std::string within the f() functions would be flagged by the compiler as an error at compile-time. This check is done entirely at compile-time: there is no run-time space or speed cost for the const. In the pass by value case (f3()), the called function gets a copy of the caller's std::string. This means that f3() can change its local copy, but the copy is destroyed when f3() returns. In particular f3() cannot change the caller's std::string object.

As an opposite example, suppose you wanted to create a function g() that accepted a std::string, but you want to let callers know that g() might change the caller's std::string object. In this case you can have g() receive its std::string parameter...

 * void g1(std::string& s);      // Pass by reference-to-non-const
 * void g2(std::string* sptr);   // Pass by pointer-to-non-const

The lack of const in these functions tells the compiler that they are allowed to (but are not required to) change the caller's std::string object. Thus they can pass their std::string to any of the f() functions, but only f3() (the one that receives its parameter "by value") can pass its std::string to g1() or g2(). If f1() or f2() need to call either g() function, a local copy of the std::string object must be passed to the g() function; the parameter to f1() or f2() cannot be directly passed to either g() function. E.g.,

    void g1(std::string& s);

    void f1(const std::string& s)
    {
      g1(s);          // Compile-time Error since s is const

      std::string localCopy = s;
      g1(localCopy);  // OK since localCopy is not const
    }

Naturally in the above case, any changes that g1() makes are made to the localCopy object that is local to f1(). In particular, no changes will be made to the const parameter that was passed by reference to f1().

==============================================================================

[18.2] How is "const correctness" related to ordinary type safety?

Declaring the const-ness of a parameter is just another form of type safety. It is almost as if a const std::string, for example, is a different class than an ordinary std::string, since the const variant is missing the various mutative operations in the non-const variant (e.g., you can imagine that a const std::string simply doesn't have an assignment operator).

If you find ordinary type safety helps you get systems correct (it does; especially in large systems), you'll find const correctness helps also.

==============================================================================

[18.3] Should I try to get things const correct "sooner" or "later"?

At the very, very, very beginning.

Back-patching const correctness results in a snowball effect: every const you add "over here" requires four more to be added "over there."

==============================================================================

[18.4] What does "const Fred* p" mean?

It means p points to an object of class Fred, but p can't be used to change that Fred object (naturally p could also be NULL).

For example, if class Fred has a const member function[18.9] called inspect(), saying p->inspect() is OK. But if class Fred has a non-const member function[18.9] called mutate(), saying p->mutate() is an error (the error is caught by the compiler; no run-time tests are done, which means const doesn't slow your program down).

==============================================================================

[18.5] What's the difference between "const Fred* p", "Fred* const p" and "const Fred* const p"?

You have to read pointer declarations right-to-left.

 * const Fred* p means "p points to a Fred that is const" -- that is, the Fred object can't be changed via p[18.13].
 * Fred* const p means "p is a const pointer to a Fred" -- that is, you can change the Fred object via p[18.13], but you can't change the pointer p itself.
 * const Fred* const p means "p is a const pointer to a const Fred" -- that is, you can't change the pointer p itself, nor can you change the Fred object via p[18.13].

==============================================================================

[18.6] What does "const Fred& x" mean?

It means x aliases a Fred object, but x can't be used to change that Fred object.

For example, if class Fred has a const member function[18.9] called inspect(), saying x.inspect() is OK. But if class Fred has a non-const member function[18.9] called mutate(), saying x.mutate() is an error (the error is caught by the compiler; no run-time tests are done, which means const doesn't slow your program down).
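
The three pointer declarations from [18.5] and the reference from [18.6] can be tried side by side in a short sketch. The Fred class here is a hypothetical stand-in: the FAQ names inspect() and mutate() but not their bodies, so the value_ member and what the two functions do with it are assumptions made for illustration. Lines that the compiler would reject are left as comments.

```cpp
#include <cassert>

// Hypothetical Fred: inspect() reports a counter, mutate() increments it.
class Fred {
public:
    Fred() : value_(0) { }
    int  inspect() const { return value_; }  // const member function
    void mutate()        { ++value_; }       // non-const member function
private:
    int value_;
};

int demoConstDeclarations()
{
    Fred a;
    Fred b;

    const Fred* p1 = &a;         // p1 points to a Fred that is const
    p1->inspect();               // OK: inspecting is allowed
    // p1->mutate();             // error: can't change the Fred via p1
    p1 = &b;                     // OK: the pointer itself is not const

    Fred* const p2 = &a;         // p2 is a const pointer to a Fred
    p2->mutate();                // OK: the Fred can be changed via p2
    // p2 = &b;                  // error: can't change the pointer p2

    const Fred* const p3 = &a;   // p3 is a const pointer to a const Fred
    // p3->mutate();             // error: can't change the Fred via p3
    // p3 = &b;                  // error: can't change the pointer p3

    const Fred& x = a;           // x aliases a, but can't be used to change it
    // x.mutate();               // error: can't change the Fred via x

    // a was mutated once (via p2); b was never mutated.
    return p3->inspect() + x.inspect() + p1->inspect();
}
```

Note that p3 and x both observe the change made through p2, since all three refer to the same Fred object.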

==============================================================================

[18.7] Does "Fred& const x" make any sense?

No, it is nonsense.

To find out what the above declaration means, you have to read it right-to-left[18.5]. Thus "Fred& const x" means "x is a const reference to a Fred". But that is redundant, since references are always const. You can't reseat a reference[8.5]. Never. With or without the const.

In other words, "Fred& const x" is functionally equivalent to "Fred& x". Since you're gaining nothing by adding the const after the &, you shouldn't add it since it will confuse people. I.e., the const will make some people think that the Fred is const, as if you had said "const Fred& x".

==============================================================================

[18.8] What does "Fred const& x" mean?

"Fred const& x" is functionally equivalent to "const Fred& x"[18.6].

The problem with using "Fred const& x" (with the const before the &) is that it could easily be mis-typed as the nonsensical "Fred &const x"[18.7] (with the const after the &).

Better to simply use const Fred& x.

==============================================================================

[18.9] What is a "const member function"?

A member function that inspects (rather than mutates) its object.

A const member function is indicated by a const suffix just after the member function's parameter list. Member functions with a const suffix are called "const member functions" or "inspectors." Member functions without a const suffix are called "non-const member functions" or "mutators."

    class Fred {
    public:
      void inspect() const;   // This member promises NOT to change *this
      void mutate();          // This member function might change *this
    };

    void userCode(Fred& changeable, const Fred& unchangeable)
    {
      changeable.inspect();   // OK: doesn't change a changeable object
      changeable.mutate();    // OK: changes a changeable object

      unchangeable.inspect(); // OK: doesn't change an unchangeable object
      unchangeable.mutate();  // ERROR: attempt to change unchangeable object
    }

The error in unchangeable.mutate() is caught at compile time. There is no runtime space or speed penalty for const.

The trailing const on the inspect() member function means that the abstract (client-visible) state of the object isn't going to change. This is slightly different from promising that the "raw bits" of the object's struct aren't going to change. C++ compilers aren't allowed to take the "bitwise" interpretation unless they can solve the aliasing problem, which normally can't be solved (i.e., a non-const alias could exist which could modify the state of the object). Another (important) insight from this aliasing issue: pointing at an object with a pointer-to-const doesn't guarantee that the object won't change; it promises only that the object won't change via that pointer.

==============================================================================

[18.10] What do I do if I want a const member function to make an "invisible" change to a data member? [UPDATED!]

[Recently reworded the question to distinguish "invisible change" from "invisible data member" thanks to Thomas Hansen (in 6/02).]

Use mutable (or, as a last resort, use const_cast).

A small percentage of inspectors need to make innocuous changes to data members (e.g., a Set object might want to cache its last lookup in hopes of improving the performance of its next lookup).
By saying the changes are "innocuous," I mean that the changes wouldn't be visible from outside the object's interface (otherwise the member function would be a mutator rather than an inspector).

When this happens, the data member which will be modified should be marked as mutable (put the mutable keyword just before the data member's declaration; i.e., in the same place where you could put const). This tells the compiler that the data member is allowed to change during a const member function. If your compiler doesn't support the mutable keyword, you can cast away the const'ness of this via the const_cast keyword (but see the NOTE below before doing this). E.g., in Set::lookup() const, you might say,

    Set* self = const_cast<Set*>(this);
      // See the NOTE below before doing this!

After this line, self will have the same bits as this (e.g., self == this), but self is a Set* rather than a const Set* (technically a const Set* const, but the right-most const is irrelevant to this discussion). Therefore you can use self to modify the object pointed to by this.

NOTE: there is an extremely unlikely error that can occur with const_cast. It only happens when three very rare things are combined at the same time: a data member that ought to be mutable (such as is discussed above), a compiler that doesn't support the mutable keyword, and an object that was originally defined to be const (as opposed to a normal, non-const object that is pointed to by a pointer-to-const). Although this combination is so rare that it may never happen to you, if it ever did happen the code may not work (the Standard says the behavior is undefined).

If you ever want to use const_cast, use mutable instead. In other words, if you ever need to change a member of an object, and that object is pointed to by a pointer-to-const, the safest and simplest thing to do is add mutable to the member's declaration.
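
As a minimal sketch of the mutable approach, consider a Set that caches its most recent lookup. Everything about this particular Set is an assumption made for illustration: the std::vector storage, the insert() member function, and the lastKey_ / lastResult_ / lastValid_ cache members are hypothetical, not something the FAQ specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical Set of ints that remembers the result of its last lookup.
class Set {
public:
    Set() : lastKey_(0), lastResult_(false), lastValid_(false) { }

    void insert(int key)              // Mutator: changes the visible state
    {
        if (!lookup(key))
            items_.push_back(key);
        lastValid_ = false;           // Contents may have changed: drop the cache
    }

    bool lookup(int key) const        // Inspector: visible state is unchanged
    {
        if (lastValid_ && lastKey_ == key)
            return lastResult_;       // Answer straight from the cache

        bool found = std::find(items_.begin(), items_.end(), key) != items_.end();

        // These assignments are legal in a const member function only
        // because the three cache members are declared mutable.
        lastKey_    = key;
        lastResult_ = found;
        lastValid_  = true;
        return found;
    }

private:
    std::vector<int> items_;
    mutable int  lastKey_;
    mutable bool lastResult_;
    mutable bool lastValid_;
};

bool demoMutableCache()
{
    Set s;
    s.insert(3);
    s.insert(7);

    const Set& cs = s;                // Only inspectors may be called via cs
    return cs.lookup(7)               // Found by searching; result is cached
        && cs.lookup(7)               // Found again, this time from the cache
        && !cs.lookup(4);             // Not present
}
```

Because the cache members are mutable, lookup() remains well-defined even on a Set that was itself defined const, which is exactly the case where the const_cast approach is not.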
You can use const_cast if you are sure that the actual object isn't const (e.g., if you are sure the object is declared something like this: Set s;), but if the object itself might be const (e.g., if it might be declared like: const Set s;), use mutable rather than const_cast.

Please don't write and tell me that version X of compiler Y on machine Z allows you to change a non-mutable member of a const object. I don't care -- it is illegal according to the language and your code will probably fail on a different compiler or even a different version (an upgrade) of the same compiler. Just say no. Use mutable instead.

==============================================================================

[18.11] Does const_cast mean lost optimization opportunities?

In theory, yes; in practice, no.

Even if the language outlawed const_cast, the only way to avoid flushing the register cache across a const member function call would be to solve the aliasing problem (i.e., to prove that there are no non-const pointers that point to the object). This can happen only in rare cases (when the object is constructed in the scope of the const member function invocation, and when all the non-const member function invocations between the object's construction and the const member function invocation are statically bound, and when every one of these invocations is also inlined, and when the constructor itself is inlined, and when any member functions the constructor calls are inline).

==============================================================================

[18.12] Why does the compiler allow me to change an int after I've pointed at it with a const int*?

Because "const int* p" means "p promises not to change the *p," not "*p promises not to change."

Causing a const int* to point to an int doesn't const-ify the int.
The int can't be changed via the const int*, but if someone else has an int* (note: no const) that points to ("aliases") the same int, then that int* can be used to change the int. For example:

    #include <iostream>
    #include <cassert>

    void f(const int* p1, int* p2)
    {
      int i = *p1;         // Get the (original) value of *p1
      *p2 = 7;             // If p1 == p2, this will also change *p1
      int j = *p1;         // Get the (possibly new) value of *p1
      if (i != j) {
        std::cout << "*p1 changed, but it didn't change via pointer p1!\n";
        assert(p1 == p2);  // This is the only way *p1 could be different
      }
    }

    int main()
    {
      int x = 5;           // Initialized so that f() reads a well-defined value
      f(&x, &x);           // This is perfectly legal (and even moral!)
    }

Note that main() and f(const int*,int*) could be in different compilation units that are compiled on different days of the week. In that case there is no way the compiler can possibly detect the aliasing at compile time. Therefore there is no way we could make a language rule that prohibits this sort of thing. In fact, we wouldn't even want to make such a rule, since in general it's considered a feature that you can have many pointers pointing to the same thing. The fact that one of those pointers promises not to change the underlying "thing" is just a promise made by the pointer; it's not a promise made by the "thing".

==============================================================================

[18.13] Does "const Fred* p" mean that *p can't change?

No! (This is related to the FAQ about aliasing of int pointers[18.12].)

"const Fred* p" means that the Fred can't be changed via pointer p, but there might be other ways to get at the object without going through a const (such as an aliased non-const pointer such as a Fred*). For example, if you have two pointers "const Fred* p" and "Fred* q" that point to the same Fred object (aliasing), pointer q can be used to change the Fred object but pointer p cannot.

    class Fred {
    public:
      void inspect() const;   // A const member function[18.9]
      void mutate();          // A non-const member function[18.9]
    };

    int main()
    {
      Fred f;
      const Fred* p = &f;
            Fred* q = &f;

      p->inspect();    // OK: No change to *p
      p->mutate();     // Error: Can't change *p via p

      q->inspect();    // OK: q is allowed to inspect the object
      q->mutate();     // OK: q is allowed to mutate the object

      f.inspect();     // OK: f is allowed to inspect the object
      f.mutate();      // OK: f is allowed to mutate the object
    }

==============================================================================

SECTION [19]: Inheritance -- basics

[19.1] Is inheritance important to C++?

Yep.

Inheritance is what separates abstract data type (ADT) programming from OO programming.

==============================================================================

[19.2] When would I use inheritance?

As a specification device.

Human beings abstract things on two dimensions: part-of and kind-of. A Ford Taurus is-a-kind-of-a Car, and a Ford Taurus has-a Engine, Tires, etc. The part-of hierarchy has been a part of software since the ADT style became relevant; inheritance adds "the other" major dimension of decomposition.

==============================================================================

[19.3] How do you express inheritance in C++?

By the : public syntax:

    class Car : public Vehicle {
    public:
      // ...
    };

We state the above relationship in several ways:

 * Car is "a kind of a" Vehicle
 * Car is "derived from" Vehicle
 * Car is "a specialized" Vehicle
 * Car is a "subclass" of Vehicle
 * Car is a "derived class" of Vehicle
 * Vehicle is the "base class" of Car
 * Vehicle is the "superclass" of Car (this is not as common in the C++ community)

(Note: this FAQ has to do with public inheritance; private and protected inheritance[24] are different.)

==============================================================================

[19.4] Is it OK to convert a pointer from a derived class to its base class?

Yes.

An object of a derived class is a kind of the base class. Therefore the conversion from a derived class pointer to a base class pointer is perfectly safe, and happens all the time. For example, if I am pointing at a car, I am in fact pointing at a vehicle, so converting a Car* to a Vehicle* is perfectly safe and normal:

    void f(Vehicle* v);
    void g(Car* c) { f(c); }  // Perfectly safe; no cast

(Note: this FAQ has to do with public inheritance; private and protected inheritance[24] are different.)

==============================================================================

[19.5] What's the difference between public, private, and protected?

 * A member (either data member or member function) declared in a private section of a class can only be accessed by member functions and friends[14] of that class
 * A member (either data member or member function) declared in a protected section of a class can only be accessed by member functions and friends[14] of that class, and by member functions and friends[14] of derived classes
 * A member (either data member or member function) declared in a public section of a class can be accessed by anyone

==============================================================================

[19.6] Why can't my derived class access private things from my base class?

To protect you from future changes to the base class.

Derived classes do not get access to private members of a base class. This effectively "seals off" the derived class from any changes made to the private members of the base class.

==============================================================================

[19.7] How can I protect derived classes from breaking when I change the internal parts of the base class?

A class has two distinct interfaces for two distinct sets of clients:

 * It has a public interface that serves unrelated classes
 * It has a protected interface that serves derived classes

Unless you expect all your derived classes to be built by your own team, you should declare your base class's data members as private and use protected inline access functions by which derived classes will access the private data in the base class. This way the private data declarations can change, but the derived class's code won't break (unless you change the protected access functions).

==============================================================================

SECTION [20]: Inheritance -- virtual functions

[20.1] What is a "virtual member function"?

From an OO perspective, it is the single most important feature of C++: [6.8], [6.9].

A virtual function allows derived classes to replace the implementation provided by the base class. The compiler makes sure the replacement is always called whenever the object in question is actually of the derived class, even if the object is accessed by a base pointer rather than a derived pointer. This allows algorithms in the base class to be replaced in the derived class, even if users don't know about the derived class.

The derived class can either fully replace ("override") the base class member function, or the derived class can partially replace ("augment") the base class member function. The latter is accomplished by having the derived class member function call the base class member function, if desired.

==============================================================================

[20.2] How can C++ achieve dynamic binding yet also static typing?

When you have a pointer to an object, the object may actually be of a class that is derived from the class of the pointer (e.g., a Vehicle* that is actually pointing to a Car object; this is called "polymorphism").
Thus there are two types: the (static) type of the pointer (Vehicle, in this case), and the (dynamic) type of the pointed-to object (Car, in this case).

Static typing means that the legality of a member function invocation is checked at the earliest possible moment: by the compiler at compile time. The compiler uses the static type of the pointer to determine whether the member function invocation is legal. If the type of the pointer can handle the member function, certainly the pointed-to object can handle it as well. E.g., if Vehicle has a certain member function, certainly Car also has that member function since Car is a kind-of Vehicle.

Dynamic binding means that the address of the code in a member function invocation is determined at the last possible moment: based on the dynamic type of the object at run time. It is called "dynamic binding" because the binding to the code that actually gets called is accomplished dynamically (at run time). Dynamic binding is a result of virtual functions.

==============================================================================

[20.3] What's the difference between how virtual and non-virtual member functions are called?

Non-virtual member functions are resolved statically. That is, the member function is selected statically (at compile-time) based on the type of the pointer (or reference) to the object.

In contrast, virtual member functions are resolved dynamically (at run-time). That is, the member function is selected dynamically (at run-time) based on the type of the object, not the type of the pointer/reference to that object. This is called "dynamic binding."

Most compilers use some variant of the following technique: if the object has one or more virtual functions, the compiler puts a hidden pointer in the object called a "virtual-pointer" or "v-pointer." This v-pointer points to a global table called the "virtual-table" or "v-table."
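
The difference between static and dynamic resolution described above can be seen in a short sketch. The member functions name() and kind() are hypothetical, chosen only so that one non-virtual and one virtual function can be called through the same base-class pointer; the Vehicle and Car bodies are likewise assumptions for illustration.

```cpp
#include <cassert>
#include <string>

class Vehicle {
public:
    virtual ~Vehicle() { }
    std::string name() const { return "Vehicle"; }          // Non-virtual
    virtual std::string kind() const { return "Vehicle"; }  // Virtual
};

class Car : public Vehicle {
public:
    std::string name() const { return "Car"; }          // Hides Vehicle::name()
    virtual std::string kind() const { return "Car"; }  // Overrides Vehicle::kind()
};

// Resolved at compile time from the pointer's static type (Vehicle).
std::string staticallyResolved(const Vehicle* v)
{
    return v->name();
}

// Resolved at run time from the pointed-to object's dynamic type.
std::string dynamicallyResolved(const Vehicle* v)
{
    return v->kind();
}
```

Passing the address of a Car object to both functions gives "Vehicle" from staticallyResolved() and "Car" from dynamicallyResolved(), even though both calls go through the same Vehicle pointer.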

The compiler creates a v-table for each class that has at least one virtual
function. For example, if class Circle has virtual functions for draw() and
move() and resize(), there would be exactly one v-table associated with class
Circle, even if there were a gazillion Circle objects, and the v-pointer of
each of those Circle objects would point to the Circle v-table. The v-table
itself has pointers to each of the virtual functions in the class. For
example, the Circle v-table would have three pointers: a pointer to
Circle::draw(), a pointer to Circle::move(), and a pointer to
Circle::resize().

During a dispatch of a virtual function, the run-time system follows the
object's v-pointer to the class's v-table, then follows the appropriate slot
in the v-table to the method code.

The space-cost overhead of the above technique is nominal: an extra pointer
per object (but only for objects that will need to do dynamic binding), plus
an extra pointer per method (but only for virtual methods). The time-cost
overhead is also fairly nominal: compared to a normal function call, a virtual
function call requires two extra fetches (one to get the value of the
v-pointer, a second to get the address of the method). None of this runtime
activity happens with non-virtual functions, since the compiler resolves
non-virtual functions exclusively at compile-time based on the type of the
pointer.

Note: the above discussion is simplified considerably, since it doesn't
account for extra structural things like multiple inheritance, virtual
inheritance, RTTI, etc., nor does it account for space/speed issues such as
page faults, calling a function via a pointer-to-function, etc. If you want
to know about those other things, please ask comp.lang.c++; PLEASE DO NOT SEND
E-MAIL TO ME!

==============================================================================

[20.4] I have a heterogeneous list of objects, and my code needs to do
       class-specific things to the objects.
       Seems like this ought to use dynamic binding but can't figure it out.
       What should I do?

It's surprisingly easy.

Suppose there is a base class Vehicle with derived classes Car and Truck. The
code traverses a list of Vehicle objects and does different things depending
on the type of Vehicle. For example it might weigh the Truck objects (to make
sure they're not carrying too heavy a load) but it might do something
different with a Car object -- check the registration, for example.

The initial solution for this, at least with most people, is to use an if
statement. E.g., "if the object is a Car, do this, else if it is a Truck, do
that, else do a third thing":

    typedef std::vector<Vehicle*>  VehicleList;

    void myCode(VehicleList& vehicles)
    {
      for (VehicleList::iterator p = vehicles.begin();
           p != vehicles.end(); ++p) {
        Vehicle& v = **p;  // just for shorthand

        // generic code that works for any vehicle...
        ...

        // perform the "foo-bar" operation.
        // note: the details of the "foo-bar" operation depend
        // on whether we're working with a car or a truck.
        // ("v is a Car" is pseudocode for some run-time type test.)
        if (v is a Car) {
          // car-specific code that does "foo-bar" on car v
          ...
        } else if (v is a Truck) {
          // truck-specific code that does "foo-bar" on truck v
          ...
        } else {
          // semi-generic code that does "foo-bar" on something else
          ...
        }

        // generic code that works for any vehicle...
        ...
      }
    }

The problem with this is what I call "else-if-heimer's disease" (say it fast
and you'll understand). The above code gives you else-if-heimer's disease
because eventually you'll forget to add an else if when you add a new derived
class, and you'll probably have a bug that won't be detected until run-time,
or worse, when the product is in the field.

The solution is to use dynamic binding rather than dynamic typing. Instead of
having (what I call) the live-code dead-data metaphor (where the code is alive
and the car/truck objects are relatively dead), we move the code into the
data. This is a slight variation of Bertrand Meyer's Inversion Principle.

The idea is simple: use the description of the code within the {...} blocks of
each if (in this case it is "the foo-bar operation"; obviously your name will
be different). Just pick up this descriptive name and use it as the name of a
new virtual member function in the base class (in this case we'll add a
fooBar() member function to class Vehicle).

    class Vehicle {
    public:
      // performs the "foo-bar" operation
      virtual void fooBar() = 0;
    };

Then you remove the whole if...else if... block and replace it with a simple
call to this virtual function:

    typedef std::vector<Vehicle*>  VehicleList;

    void myCode(VehicleList& vehicles)
    {
      for (VehicleList::iterator p = vehicles.begin();
           p != vehicles.end(); ++p) {
        Vehicle& v = **p;  // just for shorthand

        // generic code that works for any vehicle...
        ...

        // perform the "foo-bar" operation.
        v.fooBar();

        // generic code that works for any vehicle...
        ...
      }
    }

Finally you move the code that used to be in the {...} block of each if into
the fooBar() member function of the appropriate derived class:

    class Car : public Vehicle {
    public:
      virtual void fooBar();
    };

    void Car::fooBar()
    {
      // car-specific code that does "foo-bar" on 'this'
      ...  // this code was in {...} of if (v is a Car)
    }

    class Truck : public Vehicle {
    public:
      virtual void fooBar();
    };

    void Truck::fooBar()
    {
      // truck-specific code that does "foo-bar" on 'this'
      ...  // this code was in {...} of if (v is a Truck)
    }

If you actually have an else block in the original myCode() function (see
above for the "semi-generic code that does the 'foo-bar' operation on
something other than a Car or Truck"), change Vehicle's fooBar() from pure
virtual to plain virtual and move the code into that member function:

    class Vehicle {
    public:
      // performs the "foo-bar" operation
      virtual void fooBar();
    };

    void Vehicle::fooBar()
    {
      // semi-generic code that does "foo-bar" on something else
      ...  // this code was in {...} of the else case
      // you can think of this as "default" code...
    }

That's it!
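
For readers who want something that compiles as-is, here is a minimal sketch
of the finished design. Some things are assumptions made only so the result
can be observed: fooBar() returns a std::string instead of void, myCode()
collects those strings, the strings themselves are invented, and Bus is a
hypothetical class standing in for "something else" that uses the default.

```cpp
#include <string>
#include <vector>

class Vehicle {
public:
  virtual ~Vehicle() { }
  // Default ("semi-generic") behavior for classes that don't override it.
  virtual std::string fooBar() const { return "generic foo-bar"; }
};

class Car : public Vehicle {
public:
  virtual std::string fooBar() const { return "check registration"; }
};

class Truck : public Vehicle {
public:
  virtual std::string fooBar() const { return "weigh load"; }
};

// Hypothetical class added later; it simply inherits the default fooBar().
class Bus : public Vehicle { };

typedef std::vector<Vehicle*> VehicleList;

// Performs the "foo-bar" operation on every vehicle and records the results.
std::vector<std::string> myCode(const VehicleList& vehicles)
{
  std::vector<std::string> log;
  for (VehicleList::const_iterator p = vehicles.begin();
       p != vehicles.end(); ++p) {
    const Vehicle& v = **p;     // just for shorthand
    log.push_back(v.fooBar());  // no if / else-if on the kind of vehicle
  }
  return log;
}
```

Note that adding Bus required no change to myCode(); that is the whole point
of moving the decision into the virtual function.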

The point, of course, is that we try to avoid decision logic with decisions
based on the kind-of derived class you're dealing with. In other words,
you're trying to avoid if the object is a car do xyz, else if it's a truck do
pqr, etc., because that leads to else-if-heimer's disease.

==============================================================================

[20.5] When should my destructor be virtual?

When you may delete a derived object via a base pointer.

virtual functions bind to the code associated with the class of the object,
rather than with the class of the pointer/reference. When you say
delete basePtr, and the base class has a virtual destructor, the destructor
that gets invoked is the one associated with the type of the object *basePtr,
rather than the one associated with the type of the pointer. This is
generally A Good Thing.

TECHNO-GEEK WARNING; PUT YOUR PROPELLER HAT ON.

Technically speaking, you need a base class's destructor to be virtual if and
only if you intend to allow someone to invoke an object's destructor via a
base class pointer (this is normally done implicitly via delete), and the
object being destructed is of a derived class that has a non-trivial
destructor. A class has a non-trivial destructor if it either has an
explicitly defined destructor, or if it has a member object or a base class
that has a non-trivial destructor (note that this is a recursive definition
(e.g., a class has a non-trivial destructor if it has a member object (which
has a base class (which has a member object (which has a base class (which has
an explicitly defined destructor)))))).

(Strictly speaking, the C++ Standard is harsher than this: deleting a derived
object through a base pointer is undefined behavior whenever the base class's
destructor is non-virtual, even if the derived destructor is trivial. The
rule above is best read as describing when the problem typically bites.)

END TECHNO-GEEK WARNING; REMOVE YOUR PROPELLER HAT

If you had a hard time grokking the previous rule, try this (over)simplified
one on for size: A class should have a virtual destructor unless that class
has no virtual functions.
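
To make the rule concrete, here is a minimal sketch of a delete through a
base pointer. All names (Base, Derived, work(), destroyViaBase(), and the
derivedDestroyed counter) are hypothetical; the counter exists only so the
effect can be observed.

```cpp
// Counts how many times Derived's destructor has run (illustration only).
static int derivedDestroyed = 0;

class Base {
public:
  virtual ~Base() { }        // virtual, so delete via a Base* is safe
  virtual void work() { }    // the class already has a virtual function
};

class Derived : public Base {
public:
  Derived() : buffer_(new int[16]) { }

  // Non-trivial destructor: releases a resource owned by Derived.
  ~Derived() { delete[] buffer_; ++derivedDestroyed; }

private:
  int* buffer_;

  // Copying is disabled to keep the sketch short (declared, never defined).
  Derived(const Derived&);
  Derived& operator=(const Derived&);
};

// Destroys the object knowing only its base class.
void destroyViaBase(Base* p)
{
  delete p;  // runs ~Derived() then ~Base(), because ~Base() is virtual
}
```

Calling destroyViaBase(new Derived) bumps the counter by one, which shows that
Derived's destructor ran even though the delete only saw a Base*.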

Rationale: if you have any virtual functions at all, you're probably going to
be doing "stuff" to derived objects via a base pointer, and some of the
"stuff" you may do may include invoking a destructor (normally done implicitly
via delete). Plus once you've put the first virtual function into a class,
you've already paid all the per-object space cost that you'll ever pay (one
pointer per object; note that this is theoretically compiler-specific; in
practice everyone does it pretty much the same way), so making the destructor
virtual won't generally cost you anything extra.

==============================================================================

[20.6] What is a "virtual constructor"?

An idiom that allows you to do something that C++ doesn't directly support.

You can get the effect of a virtual constructor by a virtual clone() member
function (for copy constructing), or a virtual create() member function (for
the default constructor[10.4]).

    class Shape {
    public:
      virtual ~Shape() { }                 // A virtual destructor[20.5]
      virtual void draw() = 0;             // A pure virtual function[22.4]
      virtual void move() = 0;
      // ...
      virtual Shape* clone()  const = 0;   // Uses the copy constructor
      virtual Shape* create() const = 0;   // Uses the default constructor[10.4]
    };

    class Circle : public Shape {
    public:
      Circle* clone()  const;   // Covariant Return Types; see below
      Circle* create() const;   // Covariant Return Types; see below
      // ...
    };

    Circle* Circle::clone()  const { return new Circle(*this); }
    Circle* Circle::create() const { return new Circle();      }

In the clone() member function, the new Circle(*this) code calls Circle's copy
constructor to copy the state of this into the newly created Circle object.
(Note: unless Circle is known to be final (AKA a leaf)[23.7], you can reduce
the chance of slicing[30.8] by making its copy constructor protected.) In the
create() member function, the new Circle() code calls Circle's default
constructor[10.4].
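
The classes above leave out details (the "// ..." parts), so here is a
stripped-down sketch of the same idiom that compiles on its own. It assumes a
compiler that supports covariant return types. The size() member function,
the radius_ data member, and the default radius of 1.0 are hypothetical; they
are there only so a copy can be told apart from a freshly created object.

```cpp
class Shape {
public:
  virtual ~Shape() { }
  virtual Shape* clone()  const = 0;   // uses the copy constructor
  virtual Shape* create() const = 0;   // uses the default constructor
  virtual double size()   const = 0;   // hypothetical, for observation only
};

class Circle : public Shape {
public:
  // Doubles as the default constructor (radius defaults to 1.0).
  explicit Circle(double radius = 1.0) : radius_(radius) { }

  // Covariant return types: Circle* instead of Shape*.
  Circle* clone()  const { return new Circle(*this); }
  Circle* create() const { return new Circle(); }

  double size() const { return radius_; }

private:
  double radius_;
};
```

Given a Shape& that refers to a Circle of radius 5.0, clone() hands back a new
Circle whose size() is 5.0, while create() hands back a new Circle whose
size() is the default 1.0. The caller owns both objects and must delete them.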

Users use these as if they were "virtual constructors":

    void userCode(Shape& s)
    {
      Shape* s2 = s.clone();
      Shape* s3 = s.create();
      // ...
      delete s2;    // You probably need a virtual destructor[20.5] here
      delete s3;
    }

This function will work correctly regardless of whether the Shape is a Circle,
Square, or some other kind-of Shape that doesn't even exist yet.

Note: The return type of Circle's clone() member function is intentionally
different from the return type of Shape's clone() member function. This is
called Covariant Return Types, a feature that was not originally part of the
language. If your compiler complains at the declaration of
Circle* clone() const within class Circle (e.g., saying "The return type is
different" or "The member function's type differs from the base class virtual
function by return type alone"), you have an old compiler and you'll have to
change the return type to Shape*.

Amazingly Microsoft Visual C++ is one of those compilers that does not, as of
version 6.0, handle Covariant Return Types. This means:
 * MS VC++ 6.0 will give you an error message on the overrides of clone() and
   create().
 * Do not write me about this. The above code is correct with respect to the
   C++ Standard (see section 10.3p5); the problem is with MS VC++ 6.0, not
   with the above code. Simply put, MS VC++ 6.0 doesn't support Covariant
   Return Types.

==============================================================================