Strings
The C-style character string originated in the C language and continues to be supported in C++. In both languages, the char type is used to store characters. The char type is an integer type: each char value is mapped to a corresponding character by a numerical code, the most common of which is ASCII.
To declare a variable of character type, use the char keyword followed by the variable name:
char ch;
A char variable can be initialised with a character literal or an integer value. A character literal contains one character surrounded by single quotation marks ('').
The following example declares the char variable ch and initialises it with the character literal 'a':
char ch='a';
Because a char is an integer type, it can also be initialised using an integer. With ASCII, the value 65 corresponds to the character 'A':
char ch=65;
Char Arrays
C-strings are arrays of type char terminated with the null character '\0'.
Character arrays can be declared and initialised character by character using an array-style initialiser. However, it is much easier to initialise a character array with a string literal:
char greeting[] = "hello" ;
Initialising a char array with only the null character '\0' creates an empty string. The array must have room for at least that one element:
char greeting[1] = {'\0'};
Inserting '\0' in the middle of the array does not change the size of the array, but string processing stops at that point. Sending the following char array to the screen produces only the characters 'hel':
char greeting[] = "hel\0lo";
Single Quotes vs Double Quotes
In C++, single quotes identify a single character and double quotes create a string literal. 'a' is a single character literal, while "a" is a string literal containing an 'a' and a null terminator (that is, a 2-char array).
Assign New Values to a Char Array
To change the contents of the string after it has been initialised, the array elements must be changed individually. Assigning a new value to an existing char array, e.g. greetings = "HELLO", won't compile because an array cannot be assigned to with the = operator; a string literal can only be used to initialise a char array in its declaration. Instead, each element is set in turn:
greetings[0] = 'H';
greetings[1] = 'E';
greetings[2] = 'L';
greetings[3] = 'L';
greetings[4] = 'O';
greetings[5] = '\0';
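Since assigning new values character by character is not very practical, C++ provides the functions strcpy and strncpy (declared in the cstring header, or string.h in C) to copy a string into an existing array outside of a declaration. The destination array must be large enough to hold the copied characters and the terminating null character. The syntax for strcpy is: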
strcpy(greetings,"hello");
Pointers and Arrays
Array elements can also be accessed through pointers. The pointer is declared and set to the address of the first element of the array. The individual elements can then be accessed by increasing or decreasing the pointer value.
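The code section below outputs each letter of a string twice, once through a pointer and once through array indexing, to show that both forms reach the same element. The loop stops at the terminating null character so that it never reads past the end of the array.
#include <iostream>
using namespace std;

int main()
{
    char str[31] = "this is a string to array test"; // declares char array (30 characters plus '\0')
    char *pChar = str;                               // declares char pointer and sets it to the start of the array

    for (int i = 0; str[i] != '\0'; i++)
    {
        cout << *(pChar + i); // outputs the character through the pointer
        cout << str[i];       // outputs the same character through array indexing
    }
    cout << endl;
    return 0;
}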
Pointers and String Literals
A string literal in C++ is a sequence of characters enclosed in double quotation marks. The compiler stores the literal as a null-terminated char array in static memory, and a pointer can hold the address of its first character. The code sample below creates a pointer to a string literal and prints the string:
#include <iostream>
using namespace std;

int main()
{
    const char *ptrsl = "this is a string literal.\n"; // ptrsl points to the start of the string literal
    cout << ptrsl;
    return 0;
}
Attempting to modify a string literal through a pointer results in undefined behaviour, so any pointer to a string literal should be declared as const char *.
Array of Pointers
An array of pointers to strings is an array of character pointers in which each pointer holds the address of the first character (the base address) of a string. Because the strings here are string literals, the pointers are declared as const char *. To declare and initialise an array of pointers to strings:
const char *day[] = { "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday" };
Since the array is initialised at the time of declaration, its size can be omitted.
To access the values pointed to by an array of pointers it is necessary to iterate and dereference each pointer individually.
In the code section below, each element of the array day is a pointer to the first character of an individual string literal, which is then output by means of a for loop.
#include <cstdio>

const int MAX = 7;

int main()
{
    const char *day[] = {"Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"};

    for (int i = 0; i < MAX; i++)
    {
        std::printf("day[%d] = %s\n", i, day[i]); // day[i] is the address of the i-th string
    }
    return 0;
}
wchar_t, char16_t, char32_t
A wchar_t, or wide character, is similar to the char data type but is larger, so it can represent characters that do not fit in a single char. Its size is implementation-defined: it is typically 2 bytes on Windows, where it holds UTF-16 code units, and 4 bytes on most Linux and macOS compilers, where it holds UTF-32 code points.
Since the size of wchar_t is compiler-dependent, it is better to use the dedicated data types char16_t and char32_t to ensure cross-compiler compatibility.
The char16_t and char32_t types represent 16-bit and 32-bit wide characters, respectively. Unicode encoded as UTF-16 can be stored in the char16_t type, and Unicode encoded as UTF-32 can be stored in the char32_t type.
The wcout object in C++ is an instance of the class wostream and is used to send wide-character strings to the screen. To declare a wide-character string literal, put L before the literal.
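The following code demonstrates char and wide-char arrays with their sizes in bytes. The size of str is 7; the size of wstr is 7 times the size of wchar_t on the compiler in use.
#include <iostream>
using namespace std;

int main()
{
    char str[] = "string";      // narrow char array
    wchar_t wstr[] = L"string"; // wide-char array

    // wcout is used for both lines because mixing cout and wcout on the
    // same output is not portable; wcout widens the narrow string itself.
    wcout << "The size of '" << str << "' is " << sizeof(str) << endl;
    wcout << "The size of '" << wstr << "' is " << sizeof(wstr) << endl;
    return 0;
}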