C is well-designed. Its keywords are few yet comprehensive, and its syntax permits succinct yet clear code. However, C is far from flawless.
Chief among C’s faults is the preprocessor, a second language introduced to rectify weaknesses in the original language. Preprocessor directives are a necessary evil; the best we can do is avoid them as much as possible.
The loss of the # character to the preprocessor is a minor blow, requiring C to define a new comment syntax.
Avoid #ifdef and its ilk.
Conditional compilation is the most egregious preprocessor
evil, as it obfuscates code for humans and automated tools
alike. Tweaking compilation is a task best left for the
build system. For example, code specific to particular
architectures should be placed in separate files; the build
system should choose which is appropriate.
A close second is misuse of the #define directive. Never use macros to
define functions. They can cause mysterious bugs due to
side effects from insufficient insulation, and make the
code harder to analyze by human or machine. Moreover, they
are unnecessary now that C99 supports the inline keyword
(in header files, simply declare the function to be
static inline).
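The side-effect hazard is easy to reproduce. The following is only a sketch: SQUARE_MACRO, square, next_value and the two counting helpers are hypothetical names invented for illustration. The macro pastes its argument in twice, so a call with a side effect runs twice, whereas the static inline function evaluates its argument once.

```c
/* Hypothetical function-like macro: the argument text appears twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* The C99 alternative: the argument is evaluated exactly once. */
static inline int square(int x) {
    return x * x;
}

/* Hypothetical helper with a side effect: counts its own calls. */
static int calls = 0;
static int next_value(void) {
    calls++;
    return 3;
}

/* Number of times next_value() runs inside a single macro use. */
static int calls_via_macro(void) {
    calls = 0;
    (void)SQUARE_MACRO(next_value());
    return calls; /* 2: textual substitution duplicated the call */
}

/* Number of times next_value() runs inside a single function call. */
static int calls_via_inline(void) {
    calls = 0;
    (void)square(next_value());
    return calls; /* 1 */
}
```

Both versions compute the same product here, which is exactly why this kind of bug can go unnoticed until the argument does something observable.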
There are times when textual substitution makes sense,
and #define is appropriate.
Kernighan and Pike’s example is:
#define NELEM(a) (sizeof(a)/sizeof(a[0]))
which returns the number of elements in a static array, a macro so useful that there should have been a keyword or operator for this to begin with.
Before using a macro, be sure it is truly warranted; in the above example, observe that the quantity is computed at compile time, and no function could take its place.
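A small sketch of why no function could take the macro's place: the result is a constant expression, so it can size another array at file scope. The arrays primes and squares and the helper fill_squares are hypothetical names used only for this illustration.

```c
#include <stddef.h>

#define NELEM(a) (sizeof(a)/sizeof(a[0]))

static const int primes[] = { 2, 3, 5, 7, 11 };

/* NELEM(primes) is known at compile time, so it may size a static array. */
static int squares[NELEM(primes)];

/* Fills squares[] from primes[] and returns the element count of squares. */
static size_t fill_squares(void) {
    for (size_t i = 0; i < NELEM(primes); i++) {
        squares[i] = primes[i] * primes[i];
    }
    return NELEM(squares);
}
```

One caveat worth remembering: the macro only works on true arrays. Applied to a pointer, including an array passed as a function parameter, it silently divides the size of the pointer by the size of one element.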
For programming contests, save typing by defining macros for common tasks such as looping in the interval [0..N-1], and reading and writing integers.
Another application is debugging. We can access the source file name and line number via macros.
#define REP(x, n) for (int x = 0; x < n; x++)
#define REP1(x, n) for (int x = 1; x <= n; x++)
#define EXPECT(condition) \
if (!(condition)) fprintf(stderr, "%s:%d: FAIL\n", __FILE__, __LINE__)
// Prints the quadratic residues (squares) for a given modulus.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
  enum { MIN = 2, MAX = 1 << 15 };
  if (2 != argc) return printf("Usage: %s NUMBER\n", *argv), 0;
  int m = atoi(argv[1]);
  // This next line is buggy. The && should be ||.
  if (m < MIN && m > MAX) return printf("Modulus out of range\n"), 0;
  // These lines will catch the bug when the argument is out of range.
  EXPECT(m >= MIN);
  EXPECT(m <= MAX);
  REP(i, m) printf(" %d", i * i % m);
  puts("");
  return 0;
}
Rather than #define integer
constants, use enum, as
const int is insufficient.
Declaring a variable i as const int merely tells the compiler to report
errors if code attempts to write to i directly. We can break the rules by
casting away the const, but the results are undefined. For
example, on Linux, gcc places a static const int in a
read-only data segment: attempts to write to it cause
segmentation faults.
If our code never touches i, can it ever change? A classic C riddle
along the same lines asks if const
volatile ever makes sense. Wikipedia has the answer. Unlike an enum
constant, a const variable is not a constant expression in standard C, so
portable code cannot use it where the language demands a true compile-time
constant.
Even so, for other types, a const variable may be
preferable to #define
constants.
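To make the difference concrete, here is a sketch using hypothetical names (BUFFER_LEN, MODE_FAST, MODE_SAFE, buffer, mode_name). Enum constants can appear anywhere standard C demands an integer constant expression, such as a file-scope array size or a case label; a const int in the same positions is not portable C.

```c
#include <stddef.h>

/* Hypothetical constants for illustration. */
enum { BUFFER_LEN = 8, MODE_FAST = 1, MODE_SAFE = 2 };

/* A file-scope array size must be an integer constant expression. */
static char buffer[BUFFER_LEN];

static size_t buffer_size(void) {
    return sizeof buffer;
}

/* Case labels must be integer constant expressions too. */
static const char *mode_name(int mode) {
    switch (mode) {
    case MODE_FAST: return "fast";
    case MODE_SAFE: return "safe";
    default:        return "unknown";
    }
}
```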
Unfortunately C lacks Java-style packages, so C
programmers must use include files. We should follow
Rob Pike’s advice: include files should not
include files. If there are dependencies, they should be
mentioned in comments and it is the .c file’s duty to include them. Also, if
we must have header guards, they should be the reverse of
common practice: check if you can avoid including a file
before including it, not while it’s being included. This
saves the preprocessor from churning through thousands of
lines.
The choice to make = the
assignment operator was short-sighted. Beginners and experts
alike confuse it with the equality test, so much so that some
veterans write:
if (1 == i) {
instead of:
if (i == 1) {
so that if one = is
accidentally omitted, compilation fails.
I would have chosen := as the
assignment operator.
If statements should require braces. I have been bitten by bugs of the form:
if (foo) bar(); baz();
Namely, poor indentation and lack of braces meant that I
thought baz() would only be
called if the condition were true.
A similar problem arises with simple-minded macros: if
bar() expanded to a series of
statements, only the first executes conditionally.
For consistency we should insist on braces for loops too.
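The macro variant of this bug can be sketched as follows. REPORT_NAIVE, REPORT_SAFE and the two counters are hypothetical names invented for illustration. Wrapping the body in do { ... } while (0) is a common idiom, not something the language provides, that makes a multi-statement macro behave as a single statement.

```c
static int log_count = 0;
static int flush_count = 0;

/* Simple-minded: expands to two separate statements. */
#define REPORT_NAIVE() log_count++; flush_count++

/* Wrapped so that the whole expansion is one statement. */
#define REPORT_SAFE() do { log_count++; flush_count++; } while (0)

/* The condition is false, yet the second statement still runs. */
static int naive_flushes_when_false(void) {
    int failed = 0;
    log_count = flush_count = 0;
    if (failed) REPORT_NAIVE();
    return flush_count; /* 1: flush_count++ escaped the if */
}

/* With the wrapper, nothing runs when the condition is false. */
static int safe_flushes_when_false(void) {
    int failed = 0;
    log_count = flush_count = 0;
    if (failed) REPORT_SAFE();
    return flush_count; /* 0 */
}
```

Mandatory braces on the if statement would have protected even the naive macro.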
The binary operations &,
|, &&, and || all have fairly weak precedence. This is
expected for the logical variants, but unintuitive for the
bitwise variants.
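A typical casualty is a bit test. In this sketch (is_even_buggy and is_even are hypothetical names), the comparison == binds tighter than &, so the unparenthesized version tests something quite different from what was meant.

```c
/* Intended: "is the low bit of x clear?"
 * Actually parses as x & (1 == 0), that is x & 0, which is always 0. */
static int is_even_buggy(int x) {
    return x & 1 == 0;
}

/* Explicit parentheses give the intended meaning. */
static int is_even(int x) {
    return (x & 1) == 0;
}
```

Compilers such as GCC can warn about the first form, but only when the relevant warnings are enabled.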
Missing break statements in
switch statements can cause
hard-to-find bugs. The default behaviour should have been to
break before the next case, and require the programmer to
write a special keyword if fallthrough is desired.
As the language stands, I recommend writing something like
// FALLTHROUGH when it is
intended.
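Both situations can be sketched side by side. The functions count_word_buggy and stages_run are hypothetical examples: the first forgets a break, the second falls through on purpose and says so in a comment.

```c
/* Buggy: the break after case 1 was forgotten, so 1 is reported as "many". */
static const char *count_word_buggy(int n) {
    const char *word = "none";
    switch (n) {
    case 1:
        word = "one";
    case 2:
        word = "many";
        break;
    }
    return word;
}

/* Deliberate: starting at an earlier stage also runs every later stage. */
static int stages_run(int start) {
    int ran = 0;
    switch (start) {
    case 0:
        ran++;
        // FALLTHROUGH
    case 1:
        ran++;
        // FALLTHROUGH
    case 2:
        ran++;
        break;
    default:
        break;
    }
    return ran;
}
```

Without the comments, a reader cannot tell the two cases apart, which is the whole problem.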
Implicit conversion is often more trouble than it is worth. While type promotion is convenient when printing integers as floats or vice versa, it means I cannot easily determine the types of the terms in an arithmetic expression. Usually I throw in explicit casts to ensure the code is doing what I want.
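Two well-known surprises illustrate the point. The function names below are hypothetical. In the first pair, comparing an int with an unsigned int silently converts the int to unsigned; in the second, integer division happens before the result is converted to double.

```c
/* -1 is converted to unsigned int (a huge value), so the test is false. */
static int minus_one_less_than_one(void) {
    int a = -1;
    unsigned int b = 1;
    return a < b; /* 0 */
}

/* An explicit cast keeps the comparison signed. */
static int minus_one_less_than_one_cast(void) {
    int a = -1;
    unsigned int b = 1;
    return a < (int)b; /* 1 */
}

/* n / 2 is computed in int, then converted: 7 gives 3.0. */
static double half_buggy(int n) {
    return n / 2;
}

/* Casting first forces floating-point division: 7 gives 3.5. */
static double half(int n) {
    return (double)n / 2;
}
```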
Once you get the hang of it, it is fun to translate
declarations like void *(*fun[])(void (*)(void))
(an array of pointers to functions
returning a pointer to void, where each takes one argument
that is a pointer to a function that takes no arguments and
returns void), but it would have been better to have simpler
notation. For starters, prefix should not be mixed with
postfix.
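Written so that it compiles, the type described above is void *(*fun[])(void (*)(void)). The sketch below declares it once in raw form and once through typedefs; callback, handler, bump, run and call_both are hypothetical names introduced only to show that the two spellings denote the same type.

```c
/* Pointer to a function that takes no arguments and returns void. */
typedef void (*callback)(void);

/* Pointer to a function that takes a callback and returns void *. */
typedef void *(*handler)(callback);

static int hits = 0;

static void bump(void) {
    hits++;
}

/* Calls its argument, then returns a pointer to the counter. */
static void *run(callback cb) {
    cb();
    return &hits;
}

/* The raw declaration and the typedef version name the same type. */
void *(*fun[])(void (*)(void)) = { run };
handler fun_alias[] = { run };

/* Calls through both arrays and returns the final counter value. */
static int call_both(void) {
    hits = 0;
    fun[0](bump);
    int *p = fun_alias[0](bump);
    return *p; /* 2 */
}
```

Typedefs do not simplify the notation itself, but they let the declaration be read one layer at a time.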
The upwards funarg problem is nontrivial, but the downwards variant can be elegantly implemented in a compiler via trampolining. So why not make nested functions and anonymous functions part of standard C? Even standard C++ is getting lambda expressions!
Also, standard C has a subtle issue with function pointers: it’s up to the implementation to decide what happens when casting function pointers to or from void pointers. In particular, dynamically linked functions should not return function pointers. Luckily, on my computers, casting function pointers to and from void pointers behaves as one would expect, and elicits no warnings from GCC.
Functions are visible to other files by default, though they can be restricted to file scope with the static keyword. It should be the other way around, namely opt-in, not opt-out, to encourage the programmer to minimize what is shown to the outside world.
Similarly, when linking, symbols have full visibility by default, and compiler-specific directives are needed to hide them. The default should be to hide symbols and have the programmer explicitly designate those that are suitable for public viewing via a keyword.
Lack of namespaces hurts larger projects and libraries. To avoid collisions, one is forced to pick unwieldy names. For example, every function in the SDL library is prefixed with "SDL_". Approximating namespaces by defining static inline functions to abbreviate such names is a chore.