Rules
- Must be surrounded by double quotes (")
- May include any printable or non-printable (using escape sequences) characters
- Always terminated by a NUL character: '\0' (added automatically by the compiler in most cases)
Examples of Valid String Literals
"Microchip" | "Hi!\n" | "PIC" | "2500" | "someone@example.com" | "He said, \"Hi!\"" |
Examples of Invalid String Literals
"He said, "Hi!""
Double quotes cannot be used directly in a string. They must be included using the escape sequence \".
Although you never explicitly type a \0 NUL character and never see it in your strings, it really is there. Whenever the compiler sees a string enclosed in double quotes ("), it automatically adds a NUL character to terminate it.
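One way to convince yourself that the hidden NUL exists is to compare what sizeof and strlen() report for the same literal. The sketch below is only an illustration; the function names are made up for this example and are not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Bytes the compiler reserves for "PIC": 3 letters plus the hidden '\0'. */
size_t pic_storage_size(void)
{
    return sizeof("PIC");
}

/* Characters strlen() counts for the same literal: the '\0' is excluded. */
size_t pic_visible_length(void)
{
    return strlen("PIC");
}

/* The element just past the last visible character. */
char pic_terminator(void)
{
    return "PIC"[3];
}
```

The storage size is one larger than the visible length, and the extra element is the NUL character.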
The NUL character is important because C compilers all provide libraries to work with strings. The library code is designed to loop through a string one character at a time. The NUL character makes this easier because the library code doesn't need to know the length of the string ahead of time. A function simply processes characters until it encounters the \0, which tells it that it has reached the end of the string and can stop. This will prove useful in your own code too if you deal with strings.
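The loop described above can be sketched in a few lines. count_chars is a hypothetical helper written for this illustration; it behaves much like the standard strlen(), but it is not how any particular library implements it.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters until the terminating '\0'. */
size_t count_chars(const char *s)
{
    size_t n = 0;

    /* No length is passed in; the NUL character tells us where to stop. */
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

Calling count_chars("Microchip") walks nine characters before reaching the NUL. Note that a loop like this only stops correctly if the terminator is actually present.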
Although an escape sequence looks like two characters, it represents a single character. The backslash (\) is treated specially in character and string literals: it signals that the character that follows is special. The sequence either represents a non-printable character, as in the case of '\0' for NUL, or a character that cannot be written directly in the literal, such as the double quote or the backslash itself, which are represented in strings as "\"" and "\\" respectively.
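A quick way to check that an escape sequence occupies a single character is to measure strings that contain one. The function names below are invented for this sketch.

```c
#include <string.h>

/* Each escape sequence is two characters in source but one in memory. */
size_t quote_length(void)
{
    return strlen("\"");              /* a single double-quote character */
}

size_t backslash_length(void)
{
    return strlen("\\");              /* a single backslash character */
}

size_t quoted_hi_length(void)
{
    return strlen("He said, \"Hi!\""); /* the two \" each count once */
}
```

The last string is 16 characters long in the source between the outer quotes, but only 14 characters are stored (plus the terminating NUL).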
Strings: Behind the Scenes
Strings are a special case in C. There is no string data type, so how can we have string literals? Strings are actually just a group of characters collected together in an array. We will see much more of strings and arrays later in the class, but let's take a look at a bit of what is to come.
If a character array is declared without a dimension (the number inside the square brackets that determines the number of elements) and initialized with a string literal, it will automatically be sized to hold all the characters in the string, plus the required '\0' NUL character, which the compiler adds automatically.
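The automatic sizing can be observed with sizeof. In this sketch the array names part and vendor, and the accessor functions, are assumptions made up for illustration.

```c
#include <stddef.h>

/* No dimension given: the compiler sizes each array from its initializer. */
static char part[]   = "PIC";        /* 3 characters + '\0' = 4 elements  */
static char vendor[] = "Microchip";  /* 9 characters + '\0' = 10 elements */

size_t part_size(void)
{
    return sizeof part;
}

size_t vendor_size(void)
{
    return sizeof vendor;
}

/* The final element of an array sized this way is always the NUL. */
char part_last(void)
{
    return part[sizeof part - 1];
}
```

In both cases the array is one element larger than the number of visible characters.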
Example 1: The Wrong Way
char color[3] = "RED";
This will be stored as:
color[0] = 'R'
color[1] = 'E'
color[2] = 'D'
This is NOT a string because there is no \0 at the end. The array has room for only three characters; it needs four to hold the terminating NUL as well.
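The missing terminator can be confirmed in code. This sketch assumes the wrong-way declaration is a three-element array initialized from "RED"; has_terminator and color_is_string are hypothetical helpers. C accepts this initializer and silently drops the NUL, while C++ rejects it, and some compilers may warn about it.

```c
#include <stddef.h>
#include <string.h>

/* Assumed "wrong way" declaration: room for exactly three characters. */
static const char color[3] = "RED";   /* valid C, but no '\0' is stored */

/* Returns 1 if a NUL terminator exists within the first size bytes of buf. */
int has_terminator(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Checks only the bytes that belong to color, never reading past the array. */
int color_is_string(void)
{
    return has_terminator(color, sizeof color);
}
```

Because no NUL is stored inside the array, passing color to a function such as strlen() would read past the end of the array, which is undefined behavior.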
Example 2: The Right Way
char color[] = "RED";
This will be stored as:
color[0] = 'R'
color[1] = 'E'
color[2] = 'D'
color[3] = '\0'
This IS a string because it is terminated by a NUL character.