Replacing In Strings
Welcome back to Objective-C Tuesdays! Today we follow closely on last
week's topic of searching in strings with its sibling, replacing in
strings.
It's a nightmare in C
In our series on strings in Objective-C, we've usually started by
looking at C strings then moved on to NSStrings. Today is no
different. In most cases, using NSString is easier than doing the
equivalent operation on C strings. When it comes to replacing
characters in a string, using NSString is significantly easier and
safer. The standard C library doesn't provide much support for common
string replacement operations, so you have to implement them yourself.
Because of all the manual memory management required when working with
C strings, this code is very error prone: writing off the end of a
buffer and forgetting to add the null terminator are two very common
errors you have to watch out for.
Replacing a character
The only replacement operation that's fairly straightforward on C
strings is replacing a single character with another character. Since
C strings are just pointers to arrays of chars, you simply calculate
the pointer to the char you want to change, dereference the pointer
and assign the new char value.
There are two variations of this. The first one uses array notation
and the second pointer operations. In both examples below, we use the
strdup() function to make a writable copy of our original C string,
because modifying a string literal directly is undefined behavior.
The strdup() function wasn't part of the C standard library until C23,
but it comes from POSIX, so most systems have one available (possibly
named _strdup()) and it's easy to write one if it's missing on your
system (it's available on iOS). You own the string returned by
strdup() and are responsible for calling free() when you're done with
it.
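If your system really is missing strdup(), a stand-in might look
something like the sketch below. The name duplicate_string is made up
for this example so it can't clash with a real strdup(); the ownership
rule is the same, so the caller still has to free() the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a heap-allocated copy of
   source, or NULL if allocation fails. The caller must free() the result. */
char *duplicate_string(char const *source) {
    assert(source != NULL);
    size_t size = strlen(source) + 1;   /* +1 for the null terminator */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, source, size);     /* copies the terminator too */
    }
    return copy;
}
```

Calling duplicate_string("foobar") gives you a writable "foobar" that
you can change in the same way as the strdup() examples do.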
Here's how you change a character in a C string by treating it as an
array of chars:
char const *source = "foobar";
char *copy = strdup(source); // make a non-const copy of source
copy[3] = 'B'; // change char at index 3
NSLog(@"copy = %s", copy); // %s for a C string; %@ expects an object
// prints "copy = fooBar"
free(copy); // free copy when done
The alternative way uses pointer arithmetic:
char const *source = "foobar";
char *copy = strdup(source); // make a non-const copy of source
char *c3 = copy + 3; // get pointer to char at index 3
*c3 = 'B'; // change char at address of c3
NSLog(@"copy = %s", copy); // %s for a C string; %@ expects an object
// prints "copy = fooBar"
free(copy); // free copy when done
As far as the compiler is concerned, this is basically the same code,
so use whichever method makes the most sense. If you know the index of
the char you want to change, use array notation. If you already have a
pointer to the char, perhaps from calling strchr(), use the pointer
directly.
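To make the strchr() case concrete, here's a rough sketch that replaces
every occurrence of one char with another by assigning through the
pointer strchr() returns. The helper name replace_char is invented for
this example, and it assumes the string you pass in is writable (a
strdup() copy or a char array, not a string literal).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Replaces every occurrence of oldChar in the writable C string s with
   newChar and returns how many chars were changed.
   replace_char is a hypothetical helper name, not a library function. */
size_t replace_char(char *s, char oldChar, char newChar) {
    assert(s != NULL);
    assert(oldChar != '\0' && newChar != '\0'); /* keep the terminator intact */
    size_t count = 0;
    char *c = strchr(s, oldChar);       /* pointer to first match, or NULL */
    while (c != NULL) {
        *c = newChar;                   /* assign through the pointer */
        ++count;
        c = strchr(c + 1, oldChar);     /* continue after this match */
    }
    return count;
}
```

Given a writable "foobar", replace_char(text, 'o', '0') changes it to
"f00bar" and returns 2.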
Replacing a substring
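Replacing a substring of a C string is harder. In the case where the
original and the replacement have the same number of chars, you can
call strncpy() to copy over the characters. Because the count passed
to strncpy() here equals the length of the replacement, no null
terminator is written and the rest of the string is left intact.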
// replacing a substring of equal length
char const *source = "foobar";
char *copy = strdup(source); // make a non-const copy of source
char *c2 = copy + 2; // get pointer to char at index 2
strncpy(c2, "OBA", 3); // copy 3 chars
NSLog(@"copy = %s", copy);
// prints "copy = foOBAr"
free(copy); // free copy when done
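If you don't know the index up front, you can let strstr() find the
substring first. The sketch below is one way to do that; the function
name replace_same_length is made up for this example, and it uses
memcpy() rather than strncpy() since we already know exactly how many
chars to copy. It assumes the string is writable and that the
replacement doesn't overlap it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Overwrites the first occurrence of original in the writable C string s
   with replacement, which must be the same length. Returns true if a
   replacement was made. The function name is hypothetical. */
bool replace_same_length(char *s, char const *original,
                         char const *replacement) {
    assert(s != NULL && original != NULL && replacement != NULL);
    size_t length = strlen(original);
    if (length == 0 || strlen(replacement) != length) {
        return false;                   /* lengths differ: not handled here */
    }
    char *match = strstr(s, original);
    if (match == NULL) {
        return false;                   /* nothing to replace */
    }
    memcpy(match, replacement, length); /* no terminator is written */
    return true;
}
```

With a writable "foobar", replace_same_length(text, "oba", "OBA")
leaves "foOBAr" in the buffer, the same result as the strncpy() example.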
Replacing a substring with a different sized one is even more complex.
There are three special cases that need to be handled: the substring to
replace is at the start of the original, in the middle, or at the end.
When the replacement substring is smaller than the original, there are
some short cuts you can take to make the code a little simpler, but
we'll only show the general case.
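For the curious, one such shortcut might look like the sketch below.
When the replacement is no longer than the original, the result always
fits in the existing buffer, so you can overwrite in place and slide
the rest of the string left with memmove(). The function name
replace_with_shorter is invented for this example, and it assumes the
string is writable.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Replaces the first occurrence of original in the writable C string s with
   a replacement that is no longer than original, so the result always fits
   in the existing buffer. Returns true if a replacement was made.
   The function name is hypothetical. */
bool replace_with_shorter(char *s, char const *original,
                          char const *replacement) {
    assert(s != NULL && original != NULL && replacement != NULL);
    size_t originalLength = strlen(original);
    size_t replacementLength = strlen(replacement);
    if (originalLength == 0 || replacementLength > originalLength) {
        return false;                   /* would need a bigger buffer */
    }
    char *match = strstr(s, original);
    if (match == NULL) {
        return false;                   /* nothing to replace */
    }
    memcpy(match, replacement, replacementLength);
    /* slide the suffix (and its null terminator) left to close the gap;
       memmove is required because the two regions overlap */
    char *suffix = match + originalLength;
    memmove(match + replacementLength, suffix, strlen(suffix) + 1);
    return true;
}
```

The general case, where the result may be longer than the original,
can't reuse the buffer this way and needs a freshly allocated one.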
We'll look at the second case, replacing a substring in the middle of
the original. With a little extra logic, this code can be adapted to
handle all three of our cases.
char const *source = "The rain in Spain";
char const *original = "rain"; // substring to find
char const *replacement = "plane"; // substring to put in its place
// calculate the required buffer size
// including space for the null terminator
size_t size = strlen(source) - strlen(original)
            + strlen(replacement) + sizeof(char);
// allocate buffer
char *buffer = calloc(size, sizeof(char));
if ( ! buffer) {
    // handle allocation failure and bail out; don't use buffer below
}
// find original substring in source and
// calculate the length of the unchanged prefix
char const *originalInSource = strstr(source, original);
if ( ! originalInSource) {
    // handle substring not found and bail out
}
size_t prefixLength = originalInSource - source;
// copy prefix "The " into buffer
strncpy(buffer, source, prefixLength);
// calculate where the replacement substring goes in the buffer
char *replacementInBuffer = buffer + prefixLength;
// copy replacement "plane" into buffer
strcpy(replacementInBuffer, replacement);
// find position of unchanged suffix in source and
// calculate where it goes in the buffer
char const *suffixInSource = originalInSource + strlen(original);
char *suffixInBuffer = replacementInBuffer + strlen(replacement);
// copy suffix " in Spain" into buffer
strcpy(suffixInBuffer, suffixInSource);
NSLog(@"buffer = %s", buffer);
// prints "buffer = The plane in Spain"
free(buffer); // free buffer when done
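If you do have to write this yourself, it helps to wrap it in a
function. Here's a rough sketch of what the "little extra logic" could
look like, handling a match at the start, middle or end of the source
as well as no match at all. The name replace_first and its behavior of
returning a plain copy when nothing matches are choices made for this
example, not a standard API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of source with the first occurrence of
   original replaced by replacement. If original isn't found, the result is
   a plain copy of source. Returns NULL if allocation fails. The caller must
   free() the result. replace_first is a hypothetical helper name. */
char *replace_first(char const *source, char const *original,
                    char const *replacement) {
    assert(source != NULL && original != NULL && replacement != NULL);
    size_t sourceLength = strlen(source);
    size_t originalLength = strlen(original);
    size_t replacementLength = strlen(replacement);

    char const *match = strstr(source, original);
    if (match == NULL) {
        /* nothing to replace: treat the whole source as the prefix */
        match = source + sourceLength;
        originalLength = 0;
        replacementLength = 0;
    }
    size_t prefixLength = (size_t)(match - source);
    char const *suffix = match + originalLength;
    size_t suffixLength = sourceLength - prefixLength - originalLength;

    char *buffer = malloc(prefixLength + replacementLength + suffixLength + 1);
    if (buffer == NULL) {
        return NULL;
    }
    memcpy(buffer, source, prefixLength);
    memcpy(buffer + prefixLength, replacement, replacementLength);
    /* the +1 copies the suffix's null terminator as well */
    memcpy(buffer + prefixLength + replacementLength, suffix, suffixLength + 1);
    return buffer;
}
```

A match at the start simply produces an empty prefix and a match at the
end an empty suffix, so all three cases go through the same three
copies.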
I won't even waste your time explaining this in detail. No one
programming in a modern computer language should have to write this
code! Hand-written buffer manipulation like this is extremely error
prone, and mistakes in it are one of the main causes of security
vulnerabilities. If you find yourself doing this, stop immediately and
seek out one of the many managed string libraries for C that are
available. If you're writing code for iOS, you should be using
NSString to do this.
Replacing using NSString
The NSString class has a number of useful methods for replacing
characters and substrings in an NSString. Because NSString is
immutable, these methods all return a new NSString instance containing
the replacements, leaving the source NSString unchanged.
When you know the exact area of the string you want to replace, you can
use the -stringByReplacingCharactersInRange:withString: method with an
NSRange structure, which has fields for location (the zero-based index
to start at) and length (the number of characters in the source string
to replace). Because NSString does all the memory management for you
and returns a new autoreleased NSString, it's child's play compared to
doing this with C strings.
// replace a range in an NSString
NSString *source = @"The rain in Spain";
NSRange range;
range.location = 4; // starting index in source
range.length = 3; // number of characters to replace in source
NSString *copy = [source stringByReplacingCharactersInRange:range
                                                 withString:@"trai"];
NSLog(@"copy = %@", copy);
// prints "copy = The train in Spain"
// no need to release anything
// copy is autoreleased
This is a definite improvement over working with C strings. You might
actually do this in real code without tearing your hair out or causing
a buffer overrun bug. We can make this code even more compact by using
the NSMakeRange() function to create the NSRange structure.
// replace a range in an NSString
NSString *source = @"The rain in Spain";
// create range in line
NSString *copy = [source stringByReplacingCharactersInRange:NSMakeRange(4, 3)
                                                 withString:@"trai"];
NSLog(@"copy = %@", copy);
// prints "copy = The train in Spain"
// no need to release anything
// copy is autoreleased
If you don't know ahead of time what part of the string you want to
replace, you can do a find and replace in one method. The
-stringByReplacingOccurrencesOfString:withString: method will find all
occurrences of one NSString in another and replace them, returning a
new autoreleased NSString.
// find and replace one substring with another
NSString *source = @"The rain in Spain";
NSString *copy = [source stringByReplacingOccurrencesOfString:@"ain"
                                                   withString:@"oof"];
NSLog(@"copy = %@", copy);
// prints "copy = The roof in Spoof"
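For a sense of how much work that one method call saves you, here's a
rough C sketch of the same find-and-replace-all operation. The name
replace_all is made up for this example; it isn't what NSString does
internally, just one plausible way to get the same result with C
strings. It makes two passes: one to count matches so the buffer can be
sized exactly, and one to do the copying.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of source with every non-overlapping
   occurrence of original replaced by replacement, or NULL if allocation
   fails. The caller must free() the result.
   replace_all is a hypothetical helper name. */
char *replace_all(char const *source, char const *original,
                  char const *replacement) {
    assert(source != NULL && original != NULL && replacement != NULL);
    size_t originalLength = strlen(original);
    size_t replacementLength = strlen(replacement);
    assert(originalLength > 0);         /* an empty pattern would never advance */

    /* first pass: count matches so the buffer can be sized exactly */
    size_t count = 0;
    for (char const *p = strstr(source, original); p != NULL;
         p = strstr(p + originalLength, original)) {
        ++count;
    }
    size_t size = strlen(source) - count * originalLength
                + count * replacementLength + 1;
    char *buffer = malloc(size);
    if (buffer == NULL) {
        return NULL;
    }

    /* second pass: copy unchanged text and replacements alternately */
    char *out = buffer;
    char const *in = source;
    char const *match;
    while ((match = strstr(in, original)) != NULL) {
        size_t unchanged = (size_t)(match - in);
        memcpy(out, in, unchanged);
        out += unchanged;
        memcpy(out, replacement, replacementLength);
        out += replacementLength;
        in = match + originalLength;
    }
    strcpy(out, in);                    /* remaining tail plus terminator */
    return buffer;
}
```

Calling replace_all("The rain in Spain", "ain", "oof") returns
"The roof in Spoof", which you then have to remember to free().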
There is another variation of this method that gives you more control
over how substrings are found and replaced. The
-stringByReplacingOccurrencesOfString:withString:options:range: method
allows you to specify a mask containing one or more options and an
NSRange structure allowing you to restrict the operation to a section
of the string. The most common option is NSCaseInsensitiveSearch, which
matches the substring without regard to case.
// case insensitive replace
NSString *source = @"<BR>The rain<BR>in Spain";
NSString *copy = [source stringByReplacingOccurrencesOfString:@"<br>"
                                                   withString:@"<p>"
                                                      options:NSCaseInsensitiveSearch
                                                        range:NSMakeRange(0, [source length])];
NSLog(@"copy = %@", copy);
// prints "copy = <p>The rain<p>in Spain"
Another handy search option is NSAnchoredSearch, which searches only at
the start of the source string. Notice that you use the bitwise or (|)
operator to combine multiple options together.
// anchored, case insensitive replace
NSString *source = @"<BR>The rain<BR>in Spain";
NSString *copy = [source stringByReplacingOccurrencesOfString:@"<br>"
                                                   withString:@"<p>"
                                                      options:NSAnchoredSearch | NSCaseInsensitiveSearch
                                                        range:NSMakeRange(0, [source length])];
NSLog(@"copy = %@", copy);
// prints "copy = <p>The rain<BR>in Spain"
You can combine NSBackwardsSearch with NSAnchoredSearch to replace the
substring only if it occurs at the end of the source instead of at the
beginning.
Replacing in NSMutableString
If you're working with an NSMutableString, you can still call any of
the -stringByReplacing... methods to produce a new NSString, but you
have the option of making the replacements in the NSMutableString
directly. The method -replaceCharactersInRange:withString: is very
similar to the -stringByReplacingCharactersInRange:withString: method:
// replace a range in an NSMutableString
NSMutableString *source = [NSMutableString stringWithString:@"The rain in Spain"];
[source replaceCharactersInRange:NSMakeRange(4, 3)
                      withString:@"trai"];
NSLog(@"source = %@", source);
// prints "source = The train in Spain"
The method -replaceOccurrencesOfString:withString:options:range: works
similarly.
In most cases, there's not much of an advantage to replacing in place
in an NSMutableString versus creating a new NSString containing the
replacement. Use whichever operation is most convenient. If you need to
make many replacements on a very long string, there may be an advantage
to replacing in place rather than creating many large temporary
NSString instances that live in the autorelease pool.
So far, the searching and replacing methods we've seen have done only
simple string matching. Next week, we'll look at more powerful string
matching using regular expressions.