Wednesday, April 06, 2005
The Four Non-negotiables of C++
The best way to understand C++'s strengths and weaknesses is to look at four major design principles, which I've called the Four Non-Negotiables because they are essential to what C++ has become and will not be compromised.
Non-Negotiable #1: C Compatibility
Bjarne Stroustrup is quite happy with C programmers using C++ as 'a better C', and in fact a lot of modern C comes from C++, like the insistence on function prototypes and the use of const. (The C99 standard even allows declarations to appear inside for-loops!) He seems more keen than ever to preserve compatibility in areas such as mathematical libraries. The most important part of this is not the syntax (after all, C# and Java share most of the syntax, but are very different beasts) but certain old, slack C conversions. For example, the condition of an if-statement can be a pointer or an integer as well as a bool. This makes it easy to write nasty logic bugs. C++ continues to allow raw access to memory, and people continue to run over the ends of arrays and cause obscure crashes (as opposed to the definite crash you would get in other languages).
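To make the point about slack conditions concrete, here is a minimal sketch. The function names are made up for illustration; only std::strcmp is a real library call.

```cpp
#include <cstring>

// Hypothetical helper: counts how many of the three conditions hold.
// All three compile without a cast, because C++ keeps C's implicit
// conversions to bool in conditions.
int count_truthy(const int* p, int n, bool b) {
    int hits = 0;
    if (p) ++hits;   // pointer: true when not null
    if (n) ++hits;   // integer: true when non-zero
    if (b) ++hits;   // bool: the only form Java or C# would accept
    return hits;
}

// A typical logic bug this allows: strcmp returns 0 when the strings
// match, so the condition below is true when they DIFFER.
bool looks_equal_buggy(const char* a, const char* b) {
    if (std::strcmp(a, b)) return true;   // reads like "if equal", is not
    return false;
}

// What was meant.
bool looks_equal(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}
```

The buggy version compiles cleanly, because an int is a perfectly legal condition.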
Perhaps the most important thing is the way C++ programs are built. They compile to standard object files and are linked almost exactly like C programs. C++ mangles symbols so that an essentially dumb linker can tell the difference between overloaded functions, and when an attempt is being made to link to a function with a wrong signature. (You may be forgiven for thinking that it was to make life difficult for debuggers.) It remains trivial to link C code with C++ programs.
The upshot of this is that C++ still uses a method of creating executables that grew up with Unix in the Seventies.
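As a rough illustration of the mangling story, consider the sketch below. The function names are invented for the example; extern "C" is the standard way to ask for an unmangled, C-style symbol.

```cpp
// Two overloads: the compiler emits two differently mangled symbols,
// so a plain linker just sees two unrelated names.
int area(int side) { return side * side; }
int area(int width, int height) { return width * height; }

// extern "C" switches mangling off for this function, so its symbol is
// essentially the bare function name, as a C compiler would emit it.
// A C object file can call it; the price is that it cannot be overloaded.
extern "C" int c_area(int width, int height) {
    return area(width, height);
}
```

On compilers that follow the Itanium C++ ABI (GCC and Clang on most Unix-like systems), running nm on the object file would typically show the two overloads as _Z4areai and _Z4areaii, with c_area left unmangled. Other compilers use different schemes.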
Non-Negotiable #2: Minimal Core
This may come across as a joke, but C++ remains a small language in the sense that C is: a C++ program requires very little from the runtime library. The minimal library support is for the new and delete operators and for exception handling, and even that is optional. Contrast this with older languages like FORTRAN, which had built-in statements for doing IO. So it remains possible to make very small executables for embedded environments. Everything else is libraries, and there sure are a lot of those.
This has consequences. For example, it's unlikely that a C++ equivalent of C#'s foreach statement would be accepted, because foreach must make assumptions about libraries. In the case of C# it must assume that there is a pair of interfaces called IEnumerable and IEnumerator; the container object must implement IEnumerable, which involves generating an iterator object which implements IEnumerator. In the case of C++, such a statement would probably have to assume that the container had STL-style iterators (begin(), end(), etc.). But that would introduce a detailed dependence of the language on particular libraries, and that would probably be unacceptable.
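To see what the dependence would look like, here is a sketch of the loop a foreach statement would presumably have to expand into. The helper name sum is hypothetical; the only assumptions it makes about the container are exactly the STL conventions mentioned above.

```cpp
#include <vector>

// Hypothetical helper: adds up the elements of any container that
// follows the STL iterator conventions. Everything the loop needs
// (const_iterator, begin(), end(), ++, *) comes from the library,
// not from the language.
template <typename Container>
int sum(const Container& values) {
    int total = 0;
    for (typename Container::const_iterator it = values.begin();
         it != values.end(); ++it) {
        total += *it;
    }
    return total;
}
```

Called as sum(v) on a std::vector<int> holding 1, 2 and 3, this returns 6. A container without those member names simply would not compile here.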
Non-Negotiable #3: Abstraction and Strong Typing
C++ can sing in the hands of a master because code can be written at the level of the problem. The art is generating a new idiom that can precisely and succinctly express the solution. So C++ allows the programmer almost complete power to define a type's behaviour, its conversions to and from other types, what operators will mean, etc. Most of the language's complexity comes from this machinery, and so it is much more difficult to write libraries than to use them. (Which is how it should be.) From a Java or C# perspective, C++ seems more weakly typed: it will implicitly convert from a double to an int and only issue a warning. But in important matters there are few allowed conversions.
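A small sketch of that machinery follows. Fraction is a made-up type, not from any library, and it leaves out things a real one would need (reducing to lowest terms, rejecting a zero denominator).

```cpp
// Hypothetical value type showing how much a class author controls.
class Fraction {
public:
    // Non-explicit constructor: permits implicit conversion FROM int.
    Fraction(int numerator, int denominator = 1)
        : num_(numerator), den_(denominator) {}

    int num() const { return num_; }
    int den() const { return den_; }

    // Conversion TO double is a named function, so it never happens
    // silently. That is the author's choice, not the language's.
    double to_double() const { return static_cast<double>(num_) / den_; }

private:
    int num_;
    int den_;
};

// The author decides what '+' and '==' mean for the type.
Fraction operator+(const Fraction& a, const Fraction& b) {
    return Fraction(a.num() * b.den() + b.num() * a.den(),
                    a.den() * b.den());
}

bool operator==(const Fraction& a, const Fraction& b) {
    // Cross-multiply so that 2/4 compares equal to 1/2.
    return a.num() * b.den() == b.num() * a.den();
}
```

With this in place, given Fraction half(1, 2), the expression half + half == 1 compiles and is true: the 1 is converted through the constructor. Writing the class took some care; using it takes none.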
Non-Negotiable #4: Performance
One of Stroustrup's mottos is "you don't pay for what you don't use".
The C++ ideal is to not have to pay for abstraction. For instance, OOP doctrine tells us that we should access class data using accessor functions, to allow that class freedom to change its implementation later. In C++, such simple functions will be inlined and so there's no penalty for their use. (This isn't a unique virtue, since Java does this also.) So a programmer can make decisions based on a higher-level design and not be constrained by performance issues. An example of C++'s performance mindset is that function calls are not virtual unless specified otherwise, because of the slight extra cost and (most importantly) the difficulty of inlining virtual functions. Likewise, casts in C++ are all resolved at compile time except for dynamic_cast, which is checked at run time.
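The sketch below puts those three points side by side. The class names are invented for the example, and whether a given call is actually inlined is up to the compiler.

```cpp
class Shape {
public:
    explicit Shape(int id) : id_(id) {}
    virtual ~Shape() {}

    // Non-virtual accessor defined in the class body: implicitly inline,
    // so a compiler can usually reduce the call to a plain member read.
    int id() const { return id_; }

    // Virtual only because we asked for it. Through a base pointer the
    // call is dispatched at run time, which makes inlining hard.
    virtual int sides() const { return 0; }

private:
    int id_;
};

class Square : public Shape {
public:
    explicit Square(int id) : Shape(id) {}
    int sides() const override { return 4; }
};

class Triangle : public Shape {
public:
    explicit Triangle(int id) : Shape(id) {}
    int sides() const override { return 3; }
};

// dynamic_cast is the one cast with a run-time check: it yields a null
// pointer when the object is not really a Square.
bool is_square(const Shape* s) {
    return dynamic_cast<const Square*>(s) != nullptr;
}
```

You pay for the virtual call and the run-time check only where you wrote virtual or dynamic_cast; id() costs nothing extra.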
Very powerful template techniques have been developed (e.g. Blitz++) that allow vector and matrix operations which are essentially inlineable.
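The general idea behind such libraries is usually called expression templates. What follows is a deliberately tiny sketch of the technique with made-up names (Vec, Sum); it is not Blitz++ code and assumes all operands have the same length.

```cpp
#include <cstddef>
#include <vector>

// Node that represents "lhs + rhs" without computing anything yet.
// It holds references, so it must not outlive the full expression.
template <typename L, typename R>
class Sum {
public:
    Sum(const L& lhs, const R& rhs) : lhs_(lhs), rhs_(rhs) {}
    double operator[](std::size_t i) const { return lhs_[i] + rhs_[i]; }
private:
    const L& lhs_;
    const R& rhs_;
};

class Vec {
public:
    explicit Vec(std::size_t n, double value = 0.0) : data_(n, value) {}

    // Assigning from an expression runs ONE loop. Each element comes
    // from a chain of small operator[] calls that the compiler can
    // inline, and no temporary vectors are created.
    template <typename Expr>
    Vec& operator=(const Expr& e) {
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = e[i];
        return *this;
    }

    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }

private:
    std::vector<double> data_;
};

// a + b builds a Sum node instead of a new Vec.
Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) {
    return Sum<Vec, Vec>(a, b);
}

// (a + b) + c nests the nodes, so the whole expression becomes one type.
template <typename L, typename R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R>& s, const Vec& c) {
    return Sum<Sum<L, R>, Vec>(s, c);
}
```

With three Vec objects a, b and c of equal length, r = a + b + c fills r in a single pass, much like a hand-written loop over a[i] + b[i] + c[i].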
All obsessions have a downside. An over-emphasis on speed and strong typing results in bloated executables. Most of the standard library is templatized and much of it gets inlined by default. Such all-over optimizations probably even affect ultimate performance, because the larger code puts more pressure on the cache.
