Programmers often use the terms local and global to describe variables in their programs. These terms may be adequate in general discussion, but they are not recognised by the C Standard and do not fully describe how objects are defined and can be used. This article looks at the different concepts used to describe objects in the C programming language and how you can change an object’s declaration to suit your program’s requirements.
The C language uses three concepts to describe how an object and its identifier (the symbol or name that designates the object) can be used in a program. These are an identifier’s scope and linkage, and the object’s storage duration.
Scope
The scope of an identifier describes the region in the source code where that identifier can be referenced. Attempts to use an identifier in a part of the program where that identifier is out of scope will result in an error. The C language has four terms to describe an identifier’s scope, but this article will only deal with the two most common: block and file scope. (You can read about function and function prototype scope in a good C text.)
If an identifier's declaration is placed inside a block (i.e., code delimited by open { and close } braces) or if the declaration is within a list of function parameters, then that identifier has block scope and can be used anywhere following the declaration to the end of the block in which the declaration was placed.
If an identifier's declaration appears outside of any block and is not within a list of function parameters, then that identifier has file scope and can be used anywhere following the declaration to the end of the translation unit in which the declaration was placed. A translation unit (or module) is the term used to describe a source file after it has been preprocessed and which might then include the content of header files.
In the following trivial example of code in a single source file, the identifier aaa can be used anywhere in the file and will refer to the int defined in the first line. The identifier bbb, defined within the list of parameters for the function plus(), can be used anywhere in the function plus(). The identifier ccc, defined inside the block associated with main() can be used anywhere inside that block (anywhere inside main()); and ddd, defined in the block associated with the while() statement can be used anywhere inside the while() block.
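A minimal sketch consistent with this description might look like the following. Only the identifiers aaa, bbb, ccc, ddd and plus() come from the text; the values and function bodies are assumptions, and demo_main() is a hypothetical stand-in for main() so that the fragment can be compiled alongside other code.

```c
int aaa = 10;                   /* file scope: usable from here to the end of the file */

int plus(int bbb)               /* bbb: block scope, usable anywhere in plus() */
{
    return aaa + bbb;
}

/* Hypothetical stand-in for main(); the scoping rules are the same. */
int demo_main(void)
{
    int ccc = 0;                /* block scope: usable anywhere in this function body */

    while (ccc < 3) {
        int ddd = plus(ccc);    /* block scope: usable only inside the while block */
        aaa = ddd;
        ccc++;
    }
    /* ddd is out of scope here; using it would be a compile-time error. */
    return aaa;
}
```

Moving a use of ddd to the line after the closing brace of the while block is a quick way to see the compiler reject an out-of-scope identifier.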
Interestingly, it is not illegal to have several declarations for an identifier with overlapping scopes and have these identifiers refer to different objects. If, in the above example, the parameter to plus() was also called aaa, as shown below, then the scope of that parameter declaration would overlap with that of the file scope identifier on the first line.
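A sketch of that variation, under the same assumptions as before (the initial value and the body of plus() are invented for illustration), could be:

```c
int aaa = 10;           /* file scope object */

int plus(int aaa)       /* parameter with the same identifier: a second, separate object */
{
    return aaa + 1;     /* refers to the parameter; the file scope aaa is hidden here */
}
```

Calling plus(5) in this sketch works on the value 5 passed by the caller and leaves the file scope aaa untouched.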
When the identifier aaa is accessed in plus(), it is the object with the inner scope (the parameter) that is accessed, and during this function, the outer, file scope object is hidden and is not accessible. This example highlights the difference between an object and its identifier. Here, there are two different objects, but they share the same identifier. Although legal, such a practice is extremely risky and is not recommended, since you and others can be easily confused as to which object is being accessed.
It is good practice to reduce the scope of an object’s identifier to the smallest part of the program that needs to access the object. This reduces the chance of you accidentally reading or modifying the object in parts of the program where it shouldn’t be accessed. This also makes it easier to analyse code and verify its operation.
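As a small illustration of this advice, the hypothetical function below (sum_of_squares() and its parameters are not from the original article) declares its temporary inside the loop body, the smallest block that needs it:

```c
int sum_of_squares(const int *values, int count)
{
    int total = 0;      /* needed throughout the function */
    int i;

    for (i = 0; i < count; i++) {
        int square = values[i] * values[i];     /* needed only in the loop body */
        total += square;
    }
    /* square cannot be read or modified by mistake out here. */
    return total;
}
```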
Storage Duration
The next concept we will look at is an object’s storage duration. This is the portion of the program’s execution during which an object has memory allocated to it and the object's value is valid. Of the three storage durations considered here (static, automatic, and allocated), we will only look at the first two and not consider objects created via memory allocation functions like malloc(). (C11 adds a fourth, thread storage duration, which is also outside the scope of this article.)
An object has automatic storage duration if it is defined in a block or within a function’s parameter list and it does not use the static or extern specifier. These objects can explicitly use the auto specifier, but that is optional, and they are often called auto variables. Such objects only have storage allocated to them for the duration of the block in which they are defined. All other objects have static storage duration, and such objects have memory allocated to them for the duration of the entire program.
The storage duration of an object differs from the scope of its identifier. Storage duration relates to time: the period of program execution during which the object exists. Scope relates to place: the region of source code in which the object's identifier can be used. Take the following example:
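One possible shape for such an example is sketched here. The function names plus() and offset() and the parameter aaa are taken from the discussion; the second parameter name, the constant and the arithmetic are assumptions.

```c
int offset(int bbb)
{
    /* aaa cannot be named here: it is out of scope in this function,
       even though plus()'s parameter still has storage allocated. */
    return bbb + 100;
}

int plus(int aaa)
{
    int result = offset(aaa);   /* aaa keeps its storage while offset() runs */

    return result + aaa;        /* aaa still holds the value the caller passed */
}
```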
In the function plus(), the aaa parameter can be accessed (is in scope) only within that function. The storage for that parameter, however, remains valid for the entire time that plus() is executing. So when offset() is called from plus() and is executing, the parameter still has memory allocated to it, even though the aaa identifier cannot be used within offset(), where it is out of scope.
In the following example, all the objects have static storage duration, which is to say that they all retain their memory for the entire time the program is executing.
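A sketch of such code is given below. The identifier ccc and the function bias() come from the surrounding text; aaa, bbb, the initial values and the body of bias() are assumptions made for illustration.

```c
int aaa = 1;                /* file scope, static storage duration */
static int bbb = 2;         /* file scope, static storage duration */

int bias(void)
{
    static int ccc = 0;     /* block scope, but static storage duration */

    ccc += aaa + bbb;       /* ccc carries its value over from the previous call */
    return ccc;
}
```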
Even though, in the above example, ccc is defined in a function and has limited scope, the static specifier indicates that it will be assigned memory for the entire program, and hence it can hold its value even after bias() has returned. If bias() is called again later in the program, the value of ccc will be exactly as it was when bias() last terminated (assuming it has not been modified via other means). Note that the use of the static specifier for an object defined outside of a block has a different meaning to when it is used with an object defined inside a block (a meaning which we shall cover in the next concept) but all object definitions that use static will have static storage duration.
So what storage duration should objects have? Block scope objects that exist only until the smallest surrounding block is exited are preferable to those that permanently consume memory. Defining auto objects, whose memory can be reused by other objects at other times in the program’s execution, is therefore the typical choice. Auto objects (along with function parameters) are often stored on a stack so that they can be easily allocated memory when required at runtime, and they are sometimes referred to as stack-based variables. However, there are times when you want an object whose scope is limited to a function but which must not lose its value once the function exits, and for these objects you will use the static specifier. The following summarizes your choice of block-scope declarations, which are discussed in the following paragraphs.
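The two choices can be sketched as follows. The function name shift(), the identifiers aaa and bbb, and the initial value 0x12 are taken from the discussion; the parameter, the initial value of bbb and the function body are assumptions.

```c
int shift(int amount)
{
    int aaa = 0x12;             /* automatic: created and initialised on every call */
    static int bbb = 0x34;      /* static: initialised once, keeps its value between calls */

    aaa <<= amount;             /* this change is discarded when shift() returns */
    bbb += aaa;                 /* this change is still there on the next call */
    return bbb;
}
```

In this sketch two identical calls return different results, because bbb accumulates across calls while aaa starts again from 0x12 each time.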
The object aaa is an auto variable, which might reside on a stack, and which will have memory allocated to it when shift() is called and lose that memory when shift() terminates. Its initial value, 0x12, will be assigned to the object every time that shift() is called and a new instance of aaa is created.
The object bbb, on the other hand, is not an auto, will never be placed on a stack, and it will have memory allocated to it for the entire time the program is running. Its initial value is assigned to it just once, before your program begins execution.
Linkage
The last concept is an identifier’s linkage. Linkage allows more than one declaration of an identifier to refer to the same object. There are three types of linkage: external, internal, and none.
Identifiers with no linkage are those that are declared as a function parameter or those declared inside a block without the use of the extern specifier (discussed later). Each declaration of an identifier with no linkage denotes an object that is unique and not associated with any other object, even if they use the same identifier. So you can, for example, define a loop counter with the same name in two functions and each counter will have its own storage and not interfere with the other. This is because there is no linkage between the two.
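The loop-counter case can be illustrated with a short sketch. The functions triangle() and sum_of_triangles() are hypothetical; both declare a counter named i, and one calls the other from inside its own loop.

```c
int triangle(int n)
{
    int total = 0;
    int i;                      /* no linkage: belongs to triangle() alone */

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}

int sum_of_triangles(int n)
{
    int total = 0;
    int i;                      /* no linkage: a different object from triangle()'s i */

    for (i = 1; i <= n; i++)
        total += triangle(i);   /* the callee's loop does not disturb this i */
    return total;
}
```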
If the declaration of a file scope identifier (one declared outside of any block) contains the storage-class specifier static, the identifier has internal linkage. Within one translation unit, each declaration of an identifier with internal linkage denotes the same object. That is to say, you could define a static object at the top of one file and use it anywhere inside that file, while objects defined with the same name in other files remain independent, each with its own storage.
All other objects (i.e., file scope objects declared without the static specifier) have external linkage. All the declarations for the same identifier that have external linkage represent the same object. This allows you to define an object in one file, and have that same object accessed by the entire program, regardless of how many source files the code is spread over.
The linkage of each object (and for good measure, its storage duration and identifier scope) is marked in the comments in the following example.
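An annotated sketch along these lines is shown here. All identifiers, values and the function total() are assumptions chosen for illustration; the comments state the scope, storage duration and linkage that the rules described above give each declaration.

```c
int aaa = 1;                /* file scope,  static storage duration,    external linkage */
static int bbb = 2;         /* file scope,  static storage duration,    internal linkage */

int total(int ccc)          /* block scope, automatic storage duration, no linkage */
{
    int ddd = 3;            /* block scope, automatic storage duration, no linkage */
    static int eee = 0;     /* block scope, static storage duration,    no linkage */

    eee += aaa + bbb + ccc + ddd;
    return eee;
}
```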
Linking one declaration of an identifier to the object defined by another is performed by using the extern storage-class specifier. This specifier indicates that the identifier in the declaration is not defining a new object (and hence not allocating new storage), but that it is linking the identifier with an object defined elsewhere. The following example shows two sections of code, each in a different source file.
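A sketch of the two files follows. The identifiers aaa and bbb are those used in the discussion; the file names file1.c and file2.c, the functions upper() and lower(), and the values are assumptions. The two sections are meant to be saved as separate source files and linked together, in which case each file has its own bbb.

```c
/* ---- file1.c ---- */
int aaa = 5;            /* external linkage: the single definition of aaa */
static int bbb = 1;     /* internal linkage: private to file1.c */

int upper(void)
{
    return aaa + bbb;
}

/* ---- file2.c ---- */
extern int aaa;         /* no new object: refers to the aaa defined in file1.c */
static int bbb;         /* internal linkage: file2.c's own bbb, zero-initialised */

int lower(void)
{
    bbb = aaa * 2;      /* modifies the bbb belonging to file2.c */
    return bbb;
}
```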
In this example, there is only one object called aaa. It is defined and can be accessed in file 1, but it can also be accessed in file 2 because of the extern declaration in that file. There are two separate objects called bbb, one in each file, and it would be an error to instead use the extern specifier with one declaration to link it to the definition in the other, because that identifier does not have external linkage.
The placement of the extern declaration is also important and will dictate the scope of the linked identifier in the file. In this next example, the position of the declaration has been moved to inside a block.
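A sketch of that arrangement is given below. The function lower() and the identifier aaa come from the discussion; the file names, the value and the function other() are assumptions, and the two sections are again meant to be separate source files.

```c
/* ---- file1.c ---- */
int aaa = 5;            /* external linkage: the single definition of aaa */

/* ---- file2.c ---- */
int lower(void)
{
    extern int aaa;     /* block scope declaration, linked to file1.c's aaa */

    return aaa - 1;
}

int other(void)
{
    /* In file2.c the identifier aaa is not in scope here, so a
       statement such as "return aaa;" would not compile. */
    return 0;
}
```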
Now, there is still only one object called aaa, and both files can access this object, but the scope of aaa in file 2 has been limited to only that code inside lower(). (Remember that the usual scoping rules still apply: an identifier declared inside a block has scope only in that block.)
The terms scope, linkage and storage duration might not roll off the tongue quite as well as local and global, but they are critical in the understanding of how objects are defined and can be used in a program. As we have seen, the C language allows degrees of ‘localness’, not just local and global. An object can be defined to be visible throughout the entire program, throughout just one file, within a single function, or even just within one block inside a function. The terms local and global also do not describe how an object is stored. As we saw, it is possible to define a ‘local’ variable, which comes and goes as the function is executed, or a ‘local’ that retains its value for the duration of the entire program. Understanding all the possible ways to declare identifiers allows you to use variables safely and effectively.