The "basic types" are augmented with "derived types", and C has three of them:
- * pointer to...
- This is denoted by the familiar * character, and it should be self-evident that a pointer always has to point to something.
- [] array of...
- "Array of" can be undimensioned -- [] -- or dimensioned -- [10] -- but the sizes don't really play significantly into reading a declaration. We typically include the size in the description. It should be clear that arrays have to be "arrays of" something.
- () function returning...
- This is usually denoted by a pair of parentheses together - () - though it's also possible to find a prototype parameter list inside. Parameter lists (if present) don't really play into reading a declaration, and we typically ignore them. We'll note that parens used to represent "function returning" are different from those used for grouping: grouping parens surround the variable name, while "function returning" parens are always on the right.
- Functions are meaningless unless they return something (and we accommodate the void type by waving the hand and pretending that it's "returning" void).
A derived type always modifies something that follows, whether it be the basic type or another derived type, and to make a declaration read properly one must always include the preposition ("to", "of", "returning"). Saying "pointer" instead of "pointer to" will make your declarations fall apart.
It's possible that a type expression may have no derived types (e.g., "int i" describes "i is an int"), or it can have many. Interpreting the derived types is usually the sticking point when reading a complex declaration, but this is resolved with operator precedence in the next section.
Operator Precedence
Almost every C programmer is familiar with the operator precedence tables, which give rules that say (for instance) multiply and divide have higher precedence than ("are performed before") addition or subtraction, and parentheses can be used to alter the grouping. This seems natural for "normal" expressions, but the same rules do indeed apply to declarations - they are type expressions rather than computational ones.
The "array of" [] and "function returning" () type operators have higher precedence than "pointer to" *, and this leads to some fairly straightforward rules for decoding.
Always start with the variable name:
foo is ...
and always end with the basic type:
foo is ... int
The "filling in the middle" part is usually the trickier part, but it can be summarized with this rule:
"go right when you can, go left when you must"
Working your way out from the variable name, honor the precedence rules and consume derived-type tokens to the right as far as possible without bumping into a grouping parenthesis. Then go left to the matching paren.
A simple example
We'll start with a simple example:
long **foo[7];
We'll approach this systematically, focusing on just one or two small parts as we develop the description in English. As we do it, we'll show the focus of our attention in red, and strike out the parts we've finished with.
- long **foo [7];
- Start with the variable name and end with the basic type:
- foo is ... long
- long ** foo[7];
- At this point, the variable name is touching two derived types, "array of 7" and "pointer to". The rule is to go right when you can, so in this case we consume the "array of 7".
- foo is array of 7 ... long
- long ** foo[7];
- Now we've gone as far right as possible, so the innermost part is only touching the "pointer to" - consume it.
- foo is array of 7 pointer to ... long
- long * *foo[7];
- The innermost part is now only touching a "pointer to", so consume it also.
- foo is array of 7 pointer to pointer to long
This completes the declaration!