Sweetened Cocoa: The String Class

We've spent the last month or so considering all the interesting ways that one can use Apple's Objective-C++ compiler to improve Cocoa code. But so far, we've ignored perhaps the most interesting way: using C++ wrappers to improve the standard Cocoa classes.

Let's start with a review of the problem. The Cocoa framework is a collection of classes and global functions which, as a whole, are really well designed. They do most of the heavy lifting of most apps, ensure that your app has the native look & feel, and provide compatibility with future versions of the OS.

The problem is the language which, for historical reasons, has been dragged along with it. Objective-C is, to be blunt, a primitive language. Compare it to any modern language — Java, C#, REALbasic, etc. — and you quickly find it lacking. No operator overloading, crippled type checking, no generics, no true constructor or destructor, no objects on the stack, and manual futzing about with memory management. As a result, Cocoa code tends to be longer and more complex than it needs to be, obscuring the real intent of the code. And, it's far too easy to screw up.

But the engineers at Apple must know this, because they quietly gave us a secret weapon to modernize Cocoa: the Objective-C++ compiler. It's still Objective-C, but it's also C++. So now we have nearly everything that was lacking before: operator overloading, generics, strong type checking, stack objects, destructors, even much simpler memory management.

This week we're going to look at how, by making a fairly simple C++ class to wrap a Cocoa (or CoreFoundation) type, you can get all the benefits of a modern language while still using the native Cocoa framework.

Let's start with one of the most common data types missing from C: the string. In standard Cocoa, we use a pointer to an NSString, and write things like this:

 

    NSString *a = @"app";
    NSString *b = [a stringByAppendingString:@"le"];
    assert([b isEqualToString:@"apple"]);
    assert([b compare:@"banana"] == NSOrderedAscending);
    b = [b stringByAppendingString:@" pie"];
    assert(![b isEqualToString:@"cherry pie"]);

 

But you can declare a C++ class that wraps an NSString*, and provides a cleaner syntax on top of the same functionality. Suppose you call this wrapper simply "String". Then the code above would become:

 

    String a = "app";
    String b = a + "le";
    assert(b == "apple");
    assert(b < "banana");
    b += " pie";
    assert(b != "cherry pie");

 

Moreover, memory management is automatic. Normally, if you had an NSString* "foo" pointing to an object you own (one that is not in the autorelease pool), then doing

 

    foo = @"some other string";

 

would cause the first string to leak. Not so with String; if foo is a String, then you can do the above assignment (with or without the @ symbol!) in confidence that the previous string object will be released.

So how does it work? We simply declare a C++ class that has, as its only data member, an ordinary NSString*:

 

    class String {
    protected:
        NSString *obj;
    };

 

Then we add a bunch of assignment operators that do all the gruntwork of retaining and releasing NSString objects. Our version has four of them (one each for String, NSString*, CFStringRef, and char* values), but they all look more or less like this one:

 

    String& String::operator= (NSString *o) {
        [o retain];
        [obj release];
        obj = o;
        return *this;
    }

 

As you can see, this is just standard Cocoa memory management — retain the new value, release the old value, copy new pointer over old. But once you've written this once, you never need to worry about it again; you just do the assignment.
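
The order of the first two calls matters more than it might seem. As a rough sketch of why, here is the same pattern in plain C++ (which the Objective-C++ compiler also accepts), with a hypothetical Counted struct and retain/release functions standing in for a Cocoa object so it can be built without the framework. None of these names come from String.h; they are assumptions made purely for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for a Cocoa object with manual reference counting.
struct Counted {
    int refs;
    static int live;                 // how many Counted objects currently exist
    Counted() : refs(1) { ++live; }  // like alloc/init: the creator owns one reference
    ~Counted() { --live; }
};
int Counted::live = 0;

// Like messaging nil in Objective-C, both functions ignore a null pointer.
inline void retain(Counted *o)  { if (o) ++o->refs; }
inline void release(Counted *o) { if (o && --o->refs == 0) delete o; }

// Minimal wrapper using the same retain, release, copy sequence as String.
class Holder {
    Counted *obj;
public:
    Holder() : obj(nullptr) {}
    Holder(const Holder& h) : obj(h.obj) { retain(obj); }
    ~Holder() { release(obj); }

    Holder& operator=(Counted *o) {
        retain(o);       // retain first: if o == obj, the count never reaches zero
        release(obj);    // now it is safe to give up the old reference
        obj = o;
        return *this;
    }
    Holder& operator=(const Holder& h) { return *this = h.obj; }

    Counted *get() const { return obj; }
};
```

If the two calls were swapped, assigning an object to a wrapper that already holds the only reference to it would free the object before retaining it. Retaining first makes self-assignment harmless.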

We'll also need a destructor that releases the Cocoa object when the C++ object goes out of scope, as well as constructors for each type we want to automatically convert. The latter are trivial though, since they can just call through to the assignment operator.

 

    String(void) { obj = nil; }
    String(const String& s) { obj = nil; *this = s; }
    String(NSString *o) { obj = nil; *this = o; }
    String(CFStringRef o) { obj = nil; *this = (NSString *)o; }
    String(const char *c) { obj = nil; *this = c; }
    ~String(void) { [obj release]; }

 

To round out the core functionality, we'll add conversion operators to common equivalents. These let you use a String in any context where an NSString* or CFStringRef is expected.

 

    operator NSString *(void) const { 
        return [[obj retain] autorelease];
    }
    
    operator CFStringRef (void) const { 
        return (CFStringRef)[[obj retain] autorelease];
    }

 

The net effect is that, given a CFStringRef or an NSString*, you can trivially (and implicitly) convert it to a String, and vice versa. Doing so incurs no extra overhead in terms of memory or instructions, apart from what you would normally need to retain and release objects anyway.

Now we get to the cool parts. Strings are a common, basic data type, not so different from numbers, and you ought to be able to compare them as easily as numbers. With Objective-C++, you can do that. We need a batch of comparison operators for each type we want to support on the right-hand side, but they all look pretty much like this:

 

    BOOL operator== (const String& s) const { return strcomp(s) == 0; }    
    BOOL operator!= (const String& s) const { return strcomp(s) != 0; }
    BOOL operator> (const String& s) const { return strcomp(s) > 0; }
    BOOL operator< (const String& s) const { return strcomp(s) < 0; }
    BOOL operator>= (const String& s) const { return strcomp(s) >= 0; }
    BOOL operator<= (const String& s) const { return strcomp(s) <= 0; }

 

As you can see, these are just calling through to the strcomp method, which we implement via Cocoa's localizedCompare method:

 

    int strcomp(const String& s) const {
        if (obj) {
            if (s.obj) return [obj localizedCompare:s.obj];
            return 1;
        } else {
            if (s.obj) return -1;
            return 0;
        }
    }

 

(Unlike Cocoa's version, ours sensibly handles nil strings.) So this is the bit of magic that lets you write clean, readable code like "if (a > b)" rather than "if ([a localizedCompare:b] > 0)".
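
To make the nil ordering concrete, here is a small stand-alone sketch in plain C++ (which also compiles as Objective-C++). It is only an analogy: a nullable const char* plays the role of the NSString*, std::strcmp stands in for localizedCompare, and the names compareNullable and CStr are made up for this example.

```cpp
#include <cassert>
#include <cstring>

// Three-way comparison where a null string sorts before every non-null one.
// Assumption: std::strcmp replaces localizedCompare, so ordering is by raw bytes.
inline int compareNullable(const char *a, const char *b) {
    if (a) {
        if (b) {
            int r = std::strcmp(a, b);
            return (r > 0) - (r < 0);   // normalize to -1, 0, or 1
        }
        return 1;                        // non-null sorts after null
    }
    return b ? -1 : 0;                   // null vs. non-null, or two nulls
}

// Hypothetical wrapper: every operator is a one-liner over the same function.
struct CStr {
    const char *p;
    CStr(const char *s = nullptr) : p(s) {}
    int strcomp(const CStr& s) const { return compareNullable(p, s.p); }
    bool operator==(const CStr& s) const { return strcomp(s) == 0; }
    bool operator!=(const CStr& s) const { return strcomp(s) != 0; }
    bool operator<(const CStr& s) const  { return strcomp(s) < 0; }
    bool operator>(const CStr& s) const  { return strcomp(s) > 0; }
};
```

One consequence worth knowing: under this rule a nil string and an empty string are not equal, because nil sorts strictly before any non-nil string, including the empty one.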

Of course sometimes you will want to call some other comparison function, to sort in a different order or without case sensitivity or some such. But you can still do that, since a String converts implicitly to an NSString* wherever one is expected.

Our String implementation also implements concatenation via the + and += operators, along with a host of other functions we've found handy for finding, extracting, and replacing substrings, and so on. (And remind me sometime to present our SplitJoin module, which makes it trivial to split a string into an array or combine an array back into a string.) But I'm sure you have the idea by now; go grab String.h and String.mm for the full code, including unit tests.
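
The + and += operators aren't shown in this post, so what follows is only a guess at their general shape, not the code in String.mm. It is plain C++ (which the Objective-C++ compiler also accepts) with a hypothetical FakeNSString standing in for an immutable, reference-counted NSString. The names FakeNSString and Str, and the decision to treat a nil operand like an empty string, are all assumptions of this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an immutable, reference-counted NSString.
struct FakeNSString {
    const std::string text;
    int refs;
    explicit FakeNSString(const std::string& t) : text(t), refs(1) {}
};

inline void retain(FakeNSString *o)  { if (o) ++o->refs; }
inline void release(FakeNSString *o) { if (o && --o->refs == 0) delete o; }

class Str {
    FakeNSString *obj;
public:
    Str() : obj(nullptr) {}
    Str(const char *c) : obj(c ? new FakeNSString(c) : nullptr) {}  // we own the new object
    Str(const Str& s) : obj(s.obj) { retain(obj); }
    ~Str() { release(obj); }

    Str& operator=(const Str& s) {
        retain(s.obj);   // retain first, so self-assignment is safe
        release(obj);
        obj = s.obj;
        return *this;
    }

    // Concatenation builds a new object and leaves both operands untouched.
    // Assumption: a nil operand behaves like an empty string.
    Str operator+(const Str& s) const {
        if (!obj) return s;
        if (!s.obj) return *this;
        Str result;
        result.obj = new FakeNSString(obj->text + s.obj->text);
        return result;
    }

    // += is just + followed by assignment, which releases the old object.
    Str& operator+=(const Str& s) { return *this = *this + s; }

    bool isNil() const { return obj == nullptr; }
    std::string text() const { return obj ? obj->text : std::string(); }
};
```

Defining += in terms of + and the assignment operator means the retain/release bookkeeping lives in exactly one place. In real Cocoa code the append itself would presumably go through something like stringByAppendingString:, with the nil cases handled before the call.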

You might be wondering by now whether it's practical to use these goodies in a real application. Well, we've done it, and I'll give you an example. Here's some code from a viewWillAppear method, whose job it is to get some HTML data from a cache, and if it's different from the last time this view was loaded, load that data into a web view.

 

    String newData = [[CacheMaster instance] fileWithPath:filePath];
    if (newData.empty()) {
        newData = [NSString stringWithFormat:kFileNotFoundMsg, filePath];
    }

    if (newData != pageData) {
        self.pageData = newData;
        [self loadHTML];
    }

 

Now, filePath here is a regular Cocoa property of type NSString*, as is the result of the CacheMaster's fileWithPath method. But I stuff the result into a String local variable. Then I can use its empty() method to see if there is any data at all; and I can compare it to pageData (again, an ordinary NSString property) to see if it's changed, using the very natural != operator. If it has changed, we assign it to self.pageData, and call the loadHTML method, which reloads the UIWebView.

As you can see, the String's ability to convert seamlessly to and from an NSString means that using it is simple, even in a community of ordinary NSString properties. In fact the code is considerably simpler with it than it would have been without it, even in this small example.

We've been using the String class (and some other similar wrappers, which I'll present in upcoming blog posts) for a couple months now, and we love them. The 1980s have stopped calling to ask for their code back. Why not give it a try?