Searching In Strings
    
        Last week we looked at creating substrings of C strings and 
        NSStrings.  Today we look at another common string operation: 
        searching within a string.
    
Find a character in a C string
    
        As with all operations on C strings, searching requires you to deal 
        with pointers.  To find the first occurrence of a character in a C 
        string, use the strchr() function.  If the character is 
        found, a pointer to that character is returned.  If the character isn't 
        present in the string, NULL is returned.
    
// find a character in a C string
char const *s = "foobar";
char const *character = strchr(s, 'b');
if (character) {
  NSLog(@"Found b");
} else {
  NSLog(@"Didn't find b");
}
// prints "Found b"
    
        As we saw last week when we looked at substrings, the pointer 
        returned by strchr() is effectively a substring of the source 
        string starting at the first occurrence of the character you were 
        searching for:
    
char const *s = "foobar";
char const *substring = strchr(s, 'b');
if (substring) {
  NSLog(@"The substring is %s", substring);
}
// prints "The substring is bar"
    
        Once you find the character you're looking for, it's common to want to 
        create a substring containing everything up to that position 
        in the string:
    
char const *filename = "myfile.txt";
char const *dot = strchr(filename, '.');
if (dot) {
  size_t length = dot - filename;
  char *baseFilename = calloc(length + 1, sizeof(char));
  if (baseFilename) {
    strncpy(baseFilename, filename, length);
    NSLog(@"The base filename is %s", baseFilename);
    free(baseFilename);
  }
}
// prints "The base filename is myfile"
    
        You use the difference between the two string pointers to calculate the 
        number of chars up to (but not including) the character 
        you searched for.  After allocating a buffer to hold the new substring 
        (and the null terminator), you use the 
        strncpy() function to copy the first part of 
        the source string.  Because we called calloc(), the last 
        char in our buffer is already set to zero; if you use 
        malloc() or a fixed buffer instead, you need to remember 
        to set the null terminator since strncpy() 
        isn't guaranteed to do it for you. 
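    
        If you'd rather copy into a fixed buffer, a minimal sketch might look 
        like the following.  It's plain C, and copy_until() is a hypothetical 
        helper name rather than a standard library function; it assumes the 
        destination buffer holds at least one char and truncates the result 
        if the buffer is too small.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copies everything before the first occurrence of
// `stop` in `src` into `dst`, truncating if necessary.  `dst_size` is the
// capacity of `dst` and must be at least 1.  Returns the number of chars
// copied, not counting the null terminator.
size_t copy_until(char *dst, size_t dst_size, char const *src, char stop) {
  char const *end = strchr(src, stop);
  // If the character isn't found, copy the whole string.
  size_t length = end ? (size_t)(end - src) : strlen(src);
  if (length > dst_size - 1) {
    length = dst_size - 1;  // leave room for the null terminator
  }
  strncpy(dst, src, length);
  dst[length] = '\0';  // strncpy() did not write a terminator here
  return length;
}
```

        Calling copy_until(base, sizeof base, "myfile.txt", '.') with a 
        32 char buffer named base leaves "myfile" in the buffer.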
    
        Very often, you want to find the last occurrence of a 
        character; you can use the strrchr() function 
        to search in reverse:
    
// find a character in reverse
char const *filename = "myfile.txt";
char const *extension = strrchr(filename, '.');
if (extension) {
  NSLog(@"The extension is %s", extension);
}
// prints "The extension is .txt"
    
Find one C string in another
    
        To find the first occurrence of one C string in another, use the 
        strstr() function.  Like strchr(), it returns 
        a pointer to the first occurrence of the string, or NULL 
        if it wasn't found.
    
// find one C string in another
char const *s1 = "The quick brown fox";
char const *s2 = strstr(s1, "ick");
if (s2) {
  NSLog(@"Found ick");
} else {
  NSLog(@"Didn't find ick");
}
// prints "Found ick"
    
        Unfortunately the C standard library doesn't have a 
        strrstr() function to search for the last 
        occurrence of one string in another.  You'll need to roll your own by 
        calling strstr() in a loop until you reach the end of the 
        string.  (The implementation of this is left as an exercise for the 
        reader, or better yet, convert your C string to an NSString 
        and keep reading. :-)
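    
        For readers who would rather skip the exercise, here is one possible 
        sketch of the loop in plain C.  The name find_last() is made up for 
        this example; it isn't part of the standard library.  Overlapping 
        matches are counted, and an empty search string is treated as 
        matching at the end of the source string.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not part of the C standard library): returns a
// pointer to the last occurrence of `needle` in `haystack`, or NULL if
// `needle` doesn't occur at all.
char const *find_last(char const *haystack, char const *needle) {
  char const *last = NULL;
  char const *match = strstr(haystack, needle);
  while (match) {
    last = match;
    if (*match == '\0') {
      break;  // an empty needle matched at the end of the string
    }
    // Resume the search one char past the start of the previous match.
    match = strstr(match + 1, needle);
  }
  return last;
}
```

        As with strstr(), subtracting the source pointer from the result 
        gives the offset of the match.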
    
C string encoding issues
    
        The standard library functions for searching C strings work great with 
        ASCII and similar single byte encodings.  If you need to search inside 
        UTF-8 encoded C strings, you'll quickly realize that 
        strchr() and strrchr() are only useful for 
        finding the basic ASCII characters (which are also valid UTF-8 
        characters).  If you need to find non-ASCII characters like 'é', you'll 
        need to use strstr() to search for the byte sequence that 
        UTF-8 uses to represent it ("\xc3\xa9" for 'é').  Even then, Unicode 
        characters like 'é' can be represented two ways: as the single Unicode 
        character 'é' or as the base character 'e' followed by the combining 
        character '´'.  In general, it's better to use a C library designed to 
        deal with the encoding, such as the International Components for 
        Unicode (ICU), for handling UTF-8 encoded strings.  Or if you're 
        developing for iOS or Mac OS X, use NSString instead.
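    
        To make the byte sequence approach concrete, here is a small sketch in 
        plain C.  The helper name contains_precomposed_e_acute() is invented 
        for this example.  It assumes the input is valid UTF-8 and compares 
        raw bytes only, so it shows the limitation described above: the 
        decomposed form of 'é' goes unnoticed.

```c
#include <stdbool.h>
#include <string.h>

// UTF-8 bytes for the precomposed character U+00E9 (é).
static char const E_ACUTE_PRECOMPOSED[] = "\xc3\xa9";

// Hypothetical helper: true if `utf8` contains the precomposed é.
// It compares raw bytes, so it does not match the decomposed form
// 'e' + U+0301, which UTF-8 encodes as the bytes 0x65 0xCC 0x81.
bool contains_precomposed_e_acute(char const *utf8) {
  return strstr(utf8, E_ACUTE_PRECOMPOSED) != NULL;
}
```

        Passing "caf\xc3\xa9" returns true, while "cafe\xcc\x81" returns 
        false even though both display as the same word.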
    
Find one NSString in another
    
        The NSString class doesn't have separate methods to search 
        for a single character or a string; you use -rangeOfString: 
        to do either:
    
// find a character in an NSString
NSString *s = @"foobar";
NSRange range = [s rangeOfString:@"b"];
if (range.location != NSNotFound) {
  NSLog(@"Found b at %lu", (unsigned long)range.location);
}
// prints "Found b at 3"
    
        Searching for the last occurrence of a string is done using the related 
        method -rangeOfString:options: with the 
        NSBackwardsSearch option.
    
// find last occurrence in an NSString
NSString *s = @"The rain in Spain falls mainly on the plain";
NSRange range = [s rangeOfString:@"ain" options:NSBackwardsSearch];
if (range.location != NSNotFound) {
  NSLog(@"Found ain at %lu", (unsigned long)range.location);
}
// prints "Found ain at 40"
    
        The options are a combination of the following bit flags: 
        NSCaseInsensitiveSearch, NSLiteralSearch, 
        NSBackwardsSearch and NSAnchoredSearch.  You 
        use the bitwise or (|) operator to combine them together, 
        or pass in zero for no options.
    
        Use the NSCaseInsensitiveSearch option to find the first 
        match, ignoring the case of both strings.  The 
        NSLiteralSearch option is used when you want to match a 
        specific Unicode string form, such as the single character 'é' (Unicode 
        character U+00E9) and not match equivalent character sequences like 'e' 
        + '´' (Unicode characters U+0065 and U+0301).  Most applications won't 
        care about this option, but it's really handy when you need it.
    
        NSAnchoredSearch checks for a match only at the start of 
        the string (or the end if combined with 
        NSBackwardsSearch).  This option is occasionally handy, 
        but the methods -hasPrefix: and -hasSuffix: 
        are easier to read equivalents.
    
// anchored search
NSString *s = @"The rain in Spain falls mainly on the plain";
NSRange range = [s rangeOfString:@"ain" 
                         options:NSAnchoredSearch];
if (range.location == NSNotFound) {
  NSLog(@"Doesn't start with ain");
}
// prints "Doesn't start with ain"
// same thing using -hasPrefix:
if ( ! [s hasPrefix:@"ain"]) {
  NSLog(@"Doesn't have prefix ain");
}
// prints "Doesn't have prefix ain"
// now from the end
range = [s rangeOfString:@"ain"
                 options:NSAnchoredSearch | NSBackwardsSearch];
if (range.location != NSNotFound) {
  NSLog(@"Ends with ain");
}
// prints "Ends with ain"
// same thing using -hasSuffix:
if ([s hasSuffix:@"ain"]) {
  NSLog(@"Has suffix ain");
}
// prints "Has suffix ain"
    
        There are two other variations of -rangeOfString:.  The 
        first, -rangeOfString:options:range:, allows you to search 
        within a section of a larger string without having to create a 
        substring.
    
        The second, -rangeOfString:options:range:locale:, allows 
        you to specify a locale as well as a range.  In most cases you want to 
        use the current locale, which is taken from the language setting on the 
        user's device; pass [NSLocale currentLocale] to request it.  Note that 
        Apple's documentation describes a nil locale as meaning the system 
        locale, not the current one.  Sometimes you know that the string 
        contains text in a particular language, in an app that teaches German 
        for instance.  In this case you should specify that language's locale 
        when searching the string; the locale can affect how text is matched, 
        especially when using the NSCaseInsensitiveSearch option.
    
        Next week, we'll look at replacing characters in C strings and 
        NSStrings.