Concatenating Strings
Last week we reviewed wide character strings. Today we'll begin looking
at common string operations using C strings and NSStrings, starting with
string concatenation.
Many languages have built-in support for string concatenation, but C
and Objective-C aren't among them. Instead, joining strings is
accomplished using library functions in C and instance methods of the
NSString class in Objective-C.
C strings
Concatenating two C strings is particularly error-prone, since it
typically requires manually calculating the required buffer size and
allocating the buffer yourself.
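// concatenating two C strings
// (requires <stdlib.h> for malloc() and <string.h> for the str*() functions)
char const *s1 = "foo";
char const *s2 = "bar";
size_t size = (strlen(s1) * sizeof(char)) + (strlen(s2) * sizeof(char)) + sizeof('\0');
char *s3 = malloc(size);
if (s3) {
    strcpy(s3, s1);
    strcat(s3, s2);
    // s3 now holds "foobar"; call free(s3) when you're done with it
} else {
    // handle memory allocation failure
}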
Exploring this code in logical chunks, the first two lines are specific
to this example: they define the two strings we're going to join, s1
and s2. The next line calculates the number of bytes required to hold
the new string.
// calculate size of new string
size_t size = (strlen(s1) * sizeof(char)) + (strlen(s2) * sizeof(char)) + sizeof('\0');
The strlen() function counts the number of chars in a string, not
including the null terminator. To be pedantically correct, we multiply
the length of each string by the size of a char, but since the char
integer type is one byte in size, we can write the size calculation
this way instead:
size_t size = strlen(s1) + strlen(s2) + sizeof('\0');
If we were concatenating two wide character strings instead, we
wouldn't be able to take that shortcut:
size_t size = (wcslen(ws1) * sizeof(wchar_t)) + (wcslen(ws2) * sizeof(wchar_t)) + sizeof(L'\0');
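For completeness, here is a sketch of how that wide character size calculation might fit into a full concatenation. The function name concat_wide_strings() is hypothetical (it is not a library function); wcslen(), wcscpy() and wcscat() are the standard <wchar.h> counterparts of strlen(), strcpy() and strcat().

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

// Hypothetical helper: returns a newly allocated wide string holding
// ws1 followed by ws2, or NULL if malloc() fails.
// The caller owns the result and must free() it.
wchar_t *concat_wide_strings(wchar_t const *ws1, wchar_t const *ws2)
{
    assert(ws1 != NULL && ws2 != NULL);

    // size in bytes: every character, including the terminator,
    // occupies sizeof(wchar_t) bytes
    size_t size = (wcslen(ws1) * sizeof(wchar_t))
                + (wcslen(ws2) * sizeof(wchar_t))
                + sizeof(L'\0');

    wchar_t *ws3 = malloc(size);
    if (ws3) {
        wcscpy(ws3, ws1); // copy the first string, terminator included
        wcscat(ws3, ws2); // append the second string
    }
    return ws3;
}
```

A caller would check the result for NULL before using it and pass it to free() afterwards.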
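As a matter of style, I like to use the expression sizeof('\0') to
account for the null terminator, but be aware that in C a character
constant has type int, so sizeof('\0') is really sizeof(int) (typically
four bytes) and the buffer ends up a few bytes larger than necessary.
(The wide version is fine: L'\0' has type wchar_t.) The extra bytes are
harmless, but it's more common, and more precise, to simply add one: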
size_t size = strlen(s1) + strlen(s2) + 1;
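To see how the two spellings differ on a given compiler, a small check like the following can help. It is only a sketch: it assumes a C11 compiler (for static_assert), and terminator_overallocation() is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stddef.h>

// In C (unlike C++), a character constant such as '\0' has type int.
static_assert(sizeof('\0') == sizeof(int),
              "character constants have type int in C");

// sizeof(char) is 1 by definition.
static_assert(sizeof(char) == 1, "a char is always one byte");

// Hypothetical helper: the number of extra bytes allocated when
// sizeof('\0') is used for the terminator instead of adding 1.
size_t terminator_overallocation(void)
{
    return sizeof('\0') - sizeof(char);
}
```

On platforms with a four-byte int, the helper reports three unused bytes per allocation.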
The malloc() function allocates a block of memory. If malloc()
succeeds, it returns a non-NULL pointer to the memory you requested.
char *s3 = malloc(size);
After checking that the pointer s3 is not NULL, we first call strcpy()
("string copy") to copy the string pointed to by s1 into the memory
pointed to by s3.
if (s3) {
    strcpy(s3, s1);
The strcpy() function always places a null terminator at the end of the
destination string. When strcpy() returns, s1 and s3 point to identical
C strings at different locations in memory.
Finally, we call strcat() ("string catenate") to append s2 to the end
of s3. ("Catenate" is a synonym for "concatenate". Isn't English
strange?)
strcat(s3, s2);
The strcat() function first walks down the destination string until it
finds the null terminator, then it copies the source string there,
overwriting the original null terminator and putting a new null
terminator at the end of the appended string. When using strcat() you
need to be sure that the destination memory block contains enough space
to hold the concatenated strings. If it's too small, you will write
past the end of the block into some other memory, causing data
corruption or a program crash.
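One way to guard against that is to check the remaining space before calling strcat(). The sketch below assumes the caller knows the total size of the destination block; append_if_it_fits() is a hypothetical helper, not part of the C library.

```c
#include <assert.h>
#include <string.h>

// Hypothetical helper: appends src to dest only if the result,
// including its null terminator, fits in dest_size bytes.
// Returns 0 on success, or -1 (leaving dest unchanged) if it won't fit.
int append_if_it_fits(char *dest, size_t dest_size, char const *src)
{
    assert(dest != NULL && src != NULL);

    size_t used = strlen(dest);      // chars already in dest, no terminator
    size_t needed = strlen(src) + 1; // src plus the new terminator

    if (used >= dest_size || needed > dest_size - used) {
        return -1; // not enough room; appending would overflow the block
    }

    strcat(dest, src);
    return 0;
}
```

The check costs an extra pass over both strings, which is usually a fair price for not overflowing the buffer.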
If there isn't enough memory available, malloc() returns NULL.
if (s3) {
    // ...
} else {
    // handle memory allocation failure
}
Checking this return value is important; dereferencing a NULL pointer
is undefined behavior, and in practice it will usually cause your
program to be killed by the system. Unfortunately, handling errors like
this deep in your code is generally a pain in the butt; frequently
there's no good option except to abort the current operation.
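Putting the steps above together, the whole procedure can be wrapped in a single function so the size calculation and the NULL check live in one place. This is a minimal sketch; concat_strings() is a hypothetical name, and it simply reports failure to its caller by returning NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns a newly allocated string holding
// s1 followed by s2, or NULL if malloc() fails.
// The caller owns the result and must free() it.
char *concat_strings(char const *s1, char const *s2)
{
    assert(s1 != NULL && s2 != NULL);

    // one extra byte for the null terminator
    size_t size = strlen(s1) + strlen(s2) + 1;

    char *s3 = malloc(size);
    if (s3 == NULL) {
        return NULL; // let the caller decide how to handle the failure
    }

    strcpy(s3, s1); // copy s1, terminator included
    strcat(s3, s2); // append s2 over that terminator
    return s3;
}
```

A caller would write something like char *s3 = concat_strings("foo", "bar"); then check s3 against NULL, use it, and finally call free(s3).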
using a fixed buffer
If you know the maximum size of the strings beforehand and the
concatenated string is an intermediate value, you can often use a fixed
buffer instead of a call to malloc():
// concatenating two C strings
// using a fixed buffer
char const *s1 = "foo";
char const *s2 = "bar";
char buffer[80];
strcpy(buffer, s1);
strcat(buffer, s2);
// buffer now holds concatenated strings
This greatly simplifies C string concatenation, but if your input
strings are too big, you'll overflow your buffer, corrupting adjacent
memory and quite possibly crashing the program.
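If the input sizes can't be guaranteed, one alternative is to let snprintf() do the joining, since it never writes more than the buffer size it is given and reports how long the full result would have been. The following is a sketch only; concat_into_buffer() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: writes s1 followed by s2 into buffer.
// Returns 0 if the whole result fit, or -1 if it was truncated
// (or snprintf() reported an error).
int concat_into_buffer(char *buffer, size_t buffer_size,
                       char const *s1, char const *s2)
{
    assert(buffer != NULL && s1 != NULL && s2 != NULL);

    // snprintf() returns the length the full string would have had,
    // not counting the null terminator
    int written = snprintf(buffer, buffer_size, "%s%s", s1, s2);

    if (written < 0 || (size_t)written >= buffer_size) {
        return -1; // result did not fit in the buffer
    }
    return 0;
}
```

With the 80-byte buffer from the example above, the call would be concat_into_buffer(buffer, sizeof buffer, s1, s2), and a return value of -1 signals that the strings were too long rather than silently overflowing.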
NSStrings
Appending one NSString to another is pretty straightforward. The
-stringByAppendingString: instance method performs string
concatenation, returning a new NSString instance:
// concatenating two NSStrings
NSString *s1 = @"foo";
NSString *s2 = @"bar";
NSString *s3 = [s1 stringByAppendingString:s2];
The resulting NSString (s3) is autoreleased and contains "foobar".
appending a formatted string
There's an alternate way to do NSString concatenation by using
-stringByAppendingFormat:
// concatenating two NSStrings using a format
NSString *s1 = @"foo";
NSString *s2 = @"bar";
NSString *s3 = [s1 stringByAppendingFormat:@"%@", s2];
Here, we specify a format string that contains an object replacement
(%@) only. Additional arguments after the format string must match the
replacement specifiers in the format string. This method first
generates the formatted string, then appends it to the receiver (s1).
It's not as efficient as using -stringByAppendingString: directly, but
it's more flexible. You can just as easily append an integer or a C
string:
NSString *s1 = @"foo";
// appending a number
NSString *s2 = [s1 stringByAppendingFormat:@"%i", 1234];
// s2 is "foo1234"
// appending a C string
char const *s3 = "bar";
NSString *s4 = [s1 stringByAppendingFormat:@"%s", s3];
// s4 is "foobar"
NSString preferred
It should be apparent that NSString concatenation is much easier to
deal with than the multi-step procedure required for C strings. In iOS
programs, you should generally use NSString whenever possible.
Next time, we'll look at comparison operations and equality of C
strings and NSStrings.