3D Clipping in Homogeneous Coordinates.

Recently I got an Arduboy, and I thought it’d be an interesting exercise to blow the dust off my computer graphics knowledge and put a simple 3D vector drawing system together. After all, while by today’s standards a 16 MHz doesn’t sound like a lot, back when I was learning Computer Graphics at Caltech in the 1980’s in Jim Blinn’s computer graphics class, that was quite a bit of computational power.

A limiting factor on the Arduboy is available RAM; after you factor in the off-screen bitmap for drawing, you only have perhaps 1250 bytes left–not a lot for a polygonal renderer. But when a 4×4 transformation matrix fits into 64 bytes, that’s plenty of space for a vector graphics pipeline.

First step in building a vector graphics pipeline, of course, is building a 3D line clipping system; you want to be able to clip lines not just to the left and right, top and bottom, but also to the near and far clipping planes. And you do want to clip before you draw the lines; on the Arduboy, line drawing coordinates are represented as 16-bit signed integers, and with a 3D rendering system it would be very easy to blow those limits.

Sadly, however, I was unable to find a concise description of the homogeneous coordinate system clipping algorithm that I used when I was at Caltech. It’d been 30 years, and I had a vague memory of how we did it–how we combined the Cohen-Sutherland algorithm with Clipping using homogeneous coordinates and some general observations made during class to build an efficient 3D line clipper. And an Internet search yielded very little.

So I thought I’d write an article here, in case in another 20 years I need to dip back into the archives to build yet another line clipping system.

If you want to skip the math and get to the algorithm, scroll down to the section marked “Algorithm.”


Some preliminaries, before we get started.

In Computer Graphics, it is advantageous to represent 3D points using homogeneous coordinates. That is, for any point in 3D space (x,y,z), you add an additional term w, giving (x,y,z,w).

To convert from a homogeneous coordinate (x,y,z,w), you use the following relationship:

NewImage

Typically when we convert a point from 3D space to a homogeneous coordinates we simply tack a 1 at the end. There are special cases where you may want to do something simple (such as representing stars plotted infinitely far away, but typically you simply tack a 1 on:

NewImage

The purpose of this extra parameter is to simplify coordinate transformation through transformation matrices. This extra 1 term, for example, allows us to represent translation (that is, moving a point from one location to another) as a simple matrix multiplication. So if we are moving the points in our system over by (5,3,2), we can create a 4×4 translation matrix:

NewImage

and a point (x,y,z) would be translated through multiplication:

NewImage


One nice thing about using homogeneous coordinates is that we can represent perspective through a matrix multiplication. I’ll save you the diagrams you see on other countless web sites by noting that to represent point perspective, you divide the location (x,y) by the distance z; objects far away appear smaller and closer objects appear larger.

If we use a perspective matrix which adjusts the size (x,y) to fit our screen (so objects to the far left are at x = -1, objects to the far top are at y = -1, right and bottom are at +1 respectively), then a point at (x,y,z) would map in our perspective transform:

NewImage

(Side note: we use a right hand rule for coordinates. This means if up is +y and right is +x, then distance is -z. Which brings me to a rule of thumb with computer graphics: if the image looks wrong, look for an incorrect minus sign. While writing this article and testing the code below, I encountered at least 4 places where I had the wrong minus sign somewhere.)

Now if we convert our coordinates using the homogeneous coordinate to 3D point transformation above:

NewImage

That is, our 3D point (in what we call “screen space”) has the point perspective transformation baked in, and points in 3D space at (x,y,z) show up with perspective on our 2D screen: we just drop the third term and plot points on our screen at the location (ax/-nz,by/-nz).


Notice that we’ve set our boundaries for screen space with the screen space rectangle (-1,-1)-(1,1). It is advantageous for us to clip our drawings at those boundaries, and (for reasons I’ll skip here), it helps to also clip Z coordinates at a near clipping plane (generally z=-1), and a far clipping plane (z=-∞).

The rest of this article will describe in concrete steps how to do this clipping in homogeneous coordinates, using the two articles cited at the start of this article.


First, let us note that to clip in the screen space x/w >= -1, in homogeneous coordinates we really are clipping at x >= -w.

Second, let us define what we mean by clipping. When we say we are clipping a line, what we’re doing is finding the point where a line intersects our clipping plane x >= -w. We then shorten the line at that new point, drawing only the segment that is visible.

So, given two points A and B, we want to find the point C where C intersects the plane x = -w, or rather, where x + w = 0:

Canvas1

We can do this by observing that the dot product of a point and a vector is the distance that point is along the vector, and that our clipping plane x + w = 0 can also be defined as the set of points P where

NewImage

So to find our point C, we calculate the distance along the vector (1,0,0,1) (that is, x = -w) for the points A and B, and linearly interpolate to find the location C.

NewImage

There are some other useful observations we can make about the set of relationships above.

First, if x is greater than -w — that is, if we are on the inside of the clipping plane x/w >= -1, then the dot product A(1,0,0,1) will be greater than or equal to 0.

Second, if both A(1,0,0,1) and B(1,0,0,1) are both negative, then trivially our line is not visible, because both endpoints of the line are not visible.

And we can repeat this operation for all the relationships: for x >= -w, x <= w, y >= -w, y <= w, and z >= -w.

The far clipping plane can also be taken care of, by observing that our far clipping plane at infinity is reached as z/w approaches infinity; that is, as w approaches zero. So our far clipping plane would be the relationship z <= 0.

This gives us the following normal vectors representing our 6 clipping planes:

NewImage

Note that the vectors are set up so that when we are outside of the clipping plane, the value of the dot product is negative: that is, if the relationship x <= w holds, then 0 < w – x, and our clipping plane for that relationship is (-1 0 0 1).


The algorithm

This allows us to start sketching our solution.

Note our definition of a 3D vector is a homogeneous vector:

struct Vector3D
{
    double x,y,z,w;
}

First, we need a method to calculate all our clipping dot products. Note that because we’ve transformed our process into clipping against a set of vectors, the routine below could be generalized to add additional clipping planes. (For example, you may want to create a renderer that only renders the stuff in front of a wall in your game.)

static double Dot(int clipPlane, const Vector3D &v)
{
    switch (clipPlane) {
        case 0:
            return v.x + v.w;        /* v * (1 0 0 1) */
        case 1:
            return - v.x + v.w;      /* v * (-1 0 0 1) */
        case 2:
            return v.y + v.w;        /* v * (0 1 0 1) */
        case 3:
            return - v.y + v.w;      /* v * (0 -1 0 1) */
        case 4:
            return v.z + v.w;        /* v * (0 0 1 1) */
        case 5:
            return - v.z;            /* v * (0 0 -1 0) */
    }
    return 0; /* Huh? */
}

Now we borrow from the Cohen-Sutherland algorithm, and create a method which generates an “outcode”; this is a bitmap with 6 bits, each bit is set if a point is “outside” our clipping plane–that is, if the dot product for the given coordinate is negative.

static uint8_t OutCode(const Vector3D &v)
{
    uint8_t outcode = 0;
    for (int i = 0; i < 6; ++i) {
        if (Dot(i,v) < 0) outcode |= (1 << i);
    }
    return outcode;
}

(Note that we could easily optimize this method by using direct compares rather than by calling into our Dot method. We can also generalize it if the dot product method above is generalized.)

A nice feature about our OutCodes is that if we have the outcodes for points A and B, the following relationships are true:

  • if ((a_outcode & b_outcode) != 0) then both points are on the wrong side of at least one edge, and we can trivially reject this line.
  • if ((a_outcode | b_outcode) == 0) then both points are inside our viewing area, and we can trivially accept this line.

Now if it turns out both relationships are true, then we must clip our line A – B. And we can do that by calculating the alpha coordinates along A and B–if A is outside, we track the alpha on the A side of the line, and if B is outside, we track an alpha along B. And if the two flip–then we have a case like the one shown below:

Canvas1

This allows us to set up our final clipping algorithm.

First, we track the last point that was added, and the out code for that point:

static Vector3D oldPos;      // The last point we moved or draw to
static uint8_t oldOutCode;   // The last outcode for the old point.

We also rely on a drawing routine which actually draws in screen space. This should take the 3D homogeneous point provided, transform (using the transformation to 3D space noted above) to a 3D point, drop the Z component, convert the (x,y) components from the rectangle (-1,-1) – (1,1) to screen pixels, and draw the line as appropriate:

void MoveDrawScreen(bool draw, const Vector3D &v);

One last method we need is a utility method that, given a start point and an end point, and an alpha, calculates the linear interpolation between the start and end points.

static void Lerp(const Vector3D &a, const Vector3D &b, float alpha, Vector3D &out)
{
    float a1 = 1.0f - alpha;
    out.x = a1 * a.x + alpha * b.x;
    out.y = a1 * a.y + alpha * b.y;
    out.z = a1 * a.z + alpha * b.z;
    out.w = a1 * a.w + alpha * b.w;
}

Now our method declaration takes a new point and a flag that is true if we’re drawing a line, or false if we are moving to the point:

void MoveDrawClipping(bool draw, const Vector3D &v)
{
    Vector3D lerp;
    float aold;
    float anew;
    float alpha;
    float olda,newa;
    uint8_t m;
    uint8_t i;
    uint8_t newOutCode;
    uint8_t mask;

    /*
     *    Calculate the new outcode
     */

    newOutCode = OutCode(v);

In the above we calculate the outcode for our new point v. This allows us to quickly determine if the point is inside or outside our visible area. This is useful when we want to just move to a point rather than draw to a line:

    /*
     *  If we are not drawing, and the point we're moving to is visible,
     *  pass the location upwards.
     */

    if (!draw) {
        if (newOutCode == 0) {
            MoveDrawScreen(draw,v);
        }
    } else {

If draw is true, then we’re drawing a line, and so we get to the meat of our clipping algorithm. Note that our old out code and old points (stored in the globals above) have already been initialized. (Or rather, we assume they have been with an initial move call.)

So we can borrow from the Cohen-Sutherland algorithm and quickly determine if our line is entirely clipped or entirely visible:

        /*
         *  Fast accept/reject, from the Cohen-Sutherland algorithm.
         */

        if (0 == (newOutCode & oldOutCode)) {
            /*
             *  The AND product is zero, which means the line may be
             *  visible. Calculate the OR product for fast accept.
             *  We also use the mask to quickly ignore those clipping
             *  planes which do not intersect our line
             */

            mask = newOutCode | oldOutCode;
            if (0 == mask) {
                /*
                 *  Fast accept. Just draw the line. Note that our
                 *  code is set up so that the previous visible line has
                 *  already been moved to
                 */

                MoveDrawScreen(true,v);
            } else {

If we get to the code below, this means we have a value for mask which indicates the clipping planes that we intersect with. So now we want to interate through all the clipping planes, calculating the dot product for only those planes we intersect with:

                /*
                 *  At this point mask contains a 1 bit for every clipping
                 *  plane we're clipping against. Calculate C alpha to find
                 *  C, and adjust the endpoints A alpha (aold) and B alpha
                 *  (anew)
                 */

                aold = 0;            /* (1 - alpha) * old + alpha * new = pt */
                anew = 0;            /* so alpha = 0 == old, alpha = 1 == new */
                m = 1;
                for (i = 0; i < 6; ++i) {
                    if (mask & m) {

                        /* Calculate alpha; the intersection along the line */
                        /* vector intersecting the specified edge */

                        olda = Dot(i,oldPos);
                        newa = Dot(i,v);
                        alpha = olda/(olda-newa);

And next we need to bring in the side of the line which we know has clipped against the clipping plane. Meaning if A is outside our visible area and B is inside, we want to adjust Aα to the new value calculated in alpha; but if A is inside and B outside, we want to adjust Bα instead.

                        /* Now adjust the aold/anew endpoints according to */
                        /* which side (old or v) is outside. */
                        if (oldOutCode & m) {
                            if (aold < alpha) aold = alpha;
                        } else {
                            if (anew > alpha) anew = alpha;
                        }

And finally if somehow the edges cross, we have the circumstance above where the line is not visible but crosses two clipping planes. So we quickly reject that case:

                        if (aold > anew) {
                            /*
                             *  Our line is not visible, so reject.
                             */
                            break;
                        }

The rest simply closes our loop:

                    }
                    m <<= 1;
                }

At this point if we have iterated through all six planes, we have a line which is visible. If our starting point in oldPos was not visible, then we need to move to the place where our starting point intersects with the clipping boundaries of our view space:

                if (i >= 6) {
                    /* Ran all clipping edges. */
                    if (oldOutCode) {
                        /* Old line coming into view */
                        Lerp(oldPos,v,aold,lerp);
                        MoveDrawScreen(false,lerp);
                    }

And if our destination point is visible, we draw to it; otherwise, we draw to the point on the line we’re drawing to which intersects our clipping boundaries:

                    /* Draw to the new point */
                    if (newOutCode) {
                        Lerp(oldPos,v,anew,lerp);
                        mdl2(true,lerp);
                    } else {
                        mdl2(true,v);
                    }

Our clipping is done; all that is left is to close up the loops and conditionals, and store away in our globals the outcode for the point we just moved to, as well as the location of that point:

                }
            }
        }
    }

    /*
     *    Now update the old point to what we just moved to
     */

    oldOutCode = newOutCode;
    oldPos = v;
}

There are a number of optimizations that can take place with the clipping code above. For example, we can also track an array of dot products so we’re not repeatedly calculating the same dot product over and over again. We could replace our outcode routine with simple compare operations. We could also extend the above routine to handle additional clipping planes.

But this gives us a vector line drawing clipping system that clips in homogeneous coordinates, complete with source code that can be compiled and executed.


What is old is new again; what started as a powerful desktop computer has now appeared as an embedded processor. So it’s good to remember the old techniques as we see more and more microcontrollers show up in the products we use.

Things to remember: broken singletons and XCTests

Ran into a bizarre problem this morning where a singleton (yeah, I know…) was being created twice during the execution of an XCTestCase.

That is, with code similar to:

+ (MyClass *)shared
{
	static MyClass *instance;
	static dispatch_once_t onceToken;
	dispatch_once(&onceToken, ^{
		instance = [[MyClass alloc] init];
	});
	return instance;
}

During testing, if you set a breakpoint at the alloc/init line inside the dispatch_once block, you would see instance being created twice.

Which caused me all sorts of hair pulling this morning.


The solution? Well, the unit test code was including the main application during linking.

And the MyClass class was also explicitly included (through Target Membership, on the right hand side when selecting the MyClass.m file) in our unit tests as well.

What this means is that two instances of the MyClass class is being included. That means two sets of global variables, two sets of ‘onceToken’, two sets of ‘instance’. And two separate calls to initialize two separate instances, causing all sorts of confusion.

The answer?

Remove the MyClass.m class from Target Membership.


Well, I guess the real solution is to design the application without singletons, but that’s an exercise for another day.

Besides, there are times when you really want a singleton: you really want only one instance of a particular class to exist because it represents a common object shared across the entire application–and the semantics creates confusion if multiple objects are created. (This is also the case with NSNotificationCenter, for example.)

Things to remember: opening an SSL socket using CFStream

This was a pain in the neck, but I finally figured out the answer.

So I was attempting to open a connection to a server from iOS which may have a self-signed certificate installed.

The specific steps I used to open the connection (from my test harness) was:

1. Set up the variables

	NSString *host = @"127.0.0.1";
	NSInteger port = 12345;
	CFReadStreamRef readStream;
	CFWriteStreamRef writeStream;

2. Set debugging diagnostics. Noted here so I can remember this trick.

	setenv("CFNETWORK_DIAGNOSTICS","3",1);

3. Open the sockets.

	CFStreamCreatePairWithSocketToHost(kCFAllocatorDefault, (__bridge CFStringRef)host, (uint32_t)port, &readStream, &writeStream);

	NSInputStream *inStream = (__bridge_transfer NSInputStream *)readStream;
	NSOutputStream *outStream = (__bridge_transfer NSOutputStream *)writeStream;

4. Set up SSL. The part that tripped me up: you must set the properties for kCFStreamPropertySSLSettings after setting NSStreamSocketSecurityLevelKey. It appears the NSStreamSocketSecurityLevelKey setting overwrites the kCFStreampropertySSLSettings parameter.

	// Note: inStream and outStream are linked by an underlying object, so
	// parameters only need to be set on one of the two streams.

	NSDictionary *d = @{ (NSString *)kCFStreamSSLValidatesCertificateChain: (id)kCFBooleanFalse };
	[inStream setProperty:NSStreamSocketSecurityLevelNegotiatedSSL forKey:NSStreamSocketSecurityLevelKey];
	[inStream setProperty:d forKey:(id)kCFStreamPropertySSLSettings];

5. Open and initalize as needed. Here I’m just opening (because I don’t care if this blocks; this is test code). If using blocking API, use threads. Otherwise use the runloop and delegate APIs.

	[inStream open];
	[outStream open];

On the Java (server) side, the way I set up my server socket for listening to incoming connections was:

1. Set up the variables. Note that my Config class is an internal class that reads properties, and is beyond the scope of this exercise.

	Properties p = Config.get();
	String keystore = p.getProperty("keystorefile");
	String password = p.getProperty("keystorepassword");

	int port = 12345;

2. Load the keystore and key manager. This can be a signed or (in my case) self-signed certificate.

	FileInputStream keyFile = new FileInputStream(keystore); 
	KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
	keyStore.load(keyFile, password.toCharArray());

	KeyManagerFactory keyManagerFactory = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
	keyManagerFactory.init(keyStore, password.toCharArray());

	KeyManager keyManagers[] = keyManagerFactory.getKeyManagers();

3. Create an SSLContext. Note that you cannot use “Default” for getInstance below, because that returns an already initialized context, and we want to initialize it with our parameters above. Also note that iOS 9 prefers TLS 1.2.

	SSLContext sslContext = SSLContext.getInstance("TLSv1.2");
	sslContext.init(keyManagers, null, new SecureRandom());

4. Open a ServerSocket class to listen for incoming connections. Note the constant 50 below is arbitrary.

	SSLServerSocketFactory socketFactory = sslContext.getServerSocketFactory();

	socket = socketFactory.createServerSocket(port, 50);

Note that the way I loaded the keystore is sort of the “hard way” to do this; my eventual goal is to have the Java startup code generate a self-signed certificate internally if a keystore is not provided, but I haven’t figured out how to do that yet. (There are plenty of pages out there that show how, but most of them rely on internal Java APIs, and I’m sort of allergic to using undocumented stuff.)

Creating a custom Key in Objective C

When creating a custom key in Objective C for NSDictionary or NSCache or the like, you need to create an object which does the following:

Implements <NSCopying> protocol.

If your key is invariant, you can implement the method copyWithZone: as follows:

- (id)copyWithZone:(NSZone *)zone
{
    return self;
}

Of course if your key is invariant, ideally you would create the key entirely using a custom init function, and mark all the properties (readonly).

Implement the isEqual: method.

This is part of the NSObject protocol. Note that any class (or nil) could be passed in as the argument to isEqual: method, so you may want to use the method isKindOfClass: to verify that you got what you expected as the parameter.

Implement the hash method.

This is also part of the hash function of the data you passed in.

The hash function doesn’t need to be complicated. For example, if your key is three integers, your hash function could be as simple as:

- (NSUInteger)hash
{
    return (self.a << 8) ^ (self.b << 4) ^ self.c;
}

What is important is that two keys passed into your system are unlikely to have a similar value.

Also note that many of the classes that you see used routinely as keys (such as NSString or NSNumber) also follow this protocol. Meaning if your custom key has a string in it, you can use the NSString’s hash function as one of the inputs to your own hashing function:

- (NSUInteger)hash
{
    return (self.intVal << 16) ^ [self.stringVal hash];
}

Objective C subscripting operators

I keep forgetting where this link is, so I’m putting it here: CLang Reference: Objective C, Object Subscripting.

This documents the methods we need to define if we are declaring our own Objective C class and we want to implement array-style subscripting or dictionary-style subscripting.

Extracting from the text:

… Moreover, because the method names are selected by the type of the subscript, an object can be subscripted using both array and dictionary styles.

Meaning if the compiler detects the subscript is an integral type, it uses the array-style subscript method calls, and when the subscript is an Objective C pointer type, it uses a dictionary-style subscript method call.

For array subscripting, use either or both of the methods

- (id<NSObject>)objectAtIndexedSubscript:(NSInteger)index;
- (void)setObject:(id<NSObject>)value atIndexedSubscript:(NSInteger)index;

For dictionary-style subscripting, use either or both of the methods

- (id<NSObject>)objectForKeyedSubscript:(id<NSObject>)key;
- (void)setObject:(id<NSObject>)value forKeyedSubscript:(id<NSObject>)key;

This goes hand-in-hand with my earlier post Objective C declaration shortcuts, and has been a public service announcement.

Moving views around when the keyboard shows in iOS

When the keyboard shows or hides in iOS, we receive an event to notify us that the keyboard is being shown and being hidden.

Ideally we want to get the animation parameters and the size of that keyboard so we can rearrange the views inside of our application to fit the keyboard. I’m only covering the case of the keyboard on the iPhone; on the iPad you also have the problem of the split keyboard, but the same ideas should hold there as well.

Step 1:

When the view controller that may show a keyboard appears, register for notifications for the keyboard being shown and hidden:

- (void)viewDidLoad
{
    [super viewDidLoad];

	// Do any additional setup after loading the view.
	[[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(keyboardShowHide:) name:UIKeyboardWillShowNotification object:nil];
	[[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(keyboardShowHide:) name:UIKeyboardWillHideNotification object:nil];

	// other stuff here
}

Step 2:

Remember to unregister notifications when this goes away, to prevent problems.

- (void)dealloc
{
	[[NSNotificationCenter defaultCenter] removeObserver:self];
}

Step 3:

Receive the event in our new method, and extract the keyboard parameters and animation parameters.

- (void)keyboardShowHide:(NSNotification *)n
{
	CGRect krect;

	/* Extract the size of the keyboard when the animation stops */
	krect = [n.userInfo[UIKeyboardFrameEndUserInfoKey] CGRectValue];

	/* Convert that to the rectangle in our primary view. Note the raw
	 * keyboard size from above is in the window's frame, which could be
	 * turned on its side.
	 */
	krect = [self.view convertRect:krect fromView:nil];

	/* Get the animation duration, and animation curve */
	NSTimeInterval duration = [[n.userInfo objectForKey:UIKeyboardAnimationDurationUserInfoKey] doubleValue];
	UIViewAnimationCurve curve = [[n.userInfo objectForKey:UIKeyboardAnimationCurveUserInfoKey] intValue];

	/* Kick off the animation. What you do with the keyboard size is up to you */
	[UIView animateWithDuration:0 delay:duration options:UIViewAnimationOptionBeginFromCurrentState | curve animations:^{
			/* Set up the destination rectangle sizes given the keyboard size */
			Do something interesting here
		} completion:^(BOOL finished) {
			/* Finish up here */
			Do something interesting here
		}];
}

Basic Lessons: Object Oriented Programming with Objects

The really stupid thing, by the way, about most code that I review is how few people know about object oriented development. Yes, yes, yes; they say they know all about object oriented development–but when you then review their code (say, in an iOS application with a table) do they practice proper encapsulation? Nooooooooooo…

All too often I see something like this:

MyTableViewCell.h

@interface MyTableViewCell: UITableViewCell
@property (strong) IBOutlet UILabel *leftLabel;
@property (strong) IBOutlet UILabel *rightLabel;
@end

MyTableViewController.m

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    MyTableViewCell *c = (MyTableViewCell *)[tableView dequeueReusableCellWithIdentifier:@"MyTableViewCell" forIndexPath:indexPath];

    c.leftLabel.text = [NSString stringWithFormat:@"Left %d",indexPath.row];
    c.rightLabel.text = @"Right Label";

    return c;
}

The text in red above: no, no, no, no, no.

Haven’t you even heard of encapsulation? No? Well, that’s probably because (a) most developers have no idea what they’re doing beyond copying someone else’s models, and (b) they’ve been taught some bad habits by other developers who also have no idea what they’re doing.

Of course this isn’t helped by Apple, whose own UITableViewCell by default exposes the fields contained within.


What is object encapsulation?

The idea of object encapsulation is central to the idea of object oriented programming. Essentially it refers to the idea of creating “objects”–chunks of data associated with functions designed to work on that data.

To understand how useful this is we need to dive into a pre-OOP language, such as C.

Back in the “bad old days” of C, you had functions, and you had structures. And that was it:

struct MyRecord
{
    int a;
    int b;
};

int AddFields(struct MyRecord x)
{
    return x.a + x.b;
}

While this sort of procedural programming has it’s place–and languages such as C are extremely good at embedded development or in Kernel programming (where execution efficiency is important), it falls short with developing user interfaces, simply because with user interfaces we manipulate things like buttons and text fields and table view cells.

In fact, it turns out that object-oriented programming is tied to user interface development, by abstracting the idea of user interface elements into a new concept of an “object” as a self-contained unit that combines the idea of a structure or record with the idea of functions or procedures that operate on that record.

In C++, we can express this idea as a class:

class MyRecord
{
	int a;
	int b;

	int AddFields();
};


int MyRecord::AddFields()
{
	return a + b;
}

Notice that this expresses the same idea as our C snippet above, which adds the contents of the two fields in MyRecord. Except now, AddFields is associated with a record. This means if we have a record and we want the sum of the fields, instead of writing

MyRecord a;
int sum = AddFields(a);

we write

MyRecord a;
int sum = a.AddFields();

That is, we apply the message against the object.

Now we haven’t really done anything new yet. In fact, if you were to write in C++ the method ‘AddFields()’ from our C example, it would still work with our C++ declaration of MyRecord.

But C++ gives us a new tool: a way of marking fields “private”–that is, only accessible from the methods that are associated with the class.

Thus:

class MyRecord
{
	private:
		int a;
		int b;

	public:
		int AddFields();
};

We’ve hidden a and b from view. Now the only way you can change a and b or get their values is through methods which are then made public with MyRecord.

Encapsulation is the process of creating self-contained objects: objects which provide a clear interface for manipulating the object, but which hide the details as to how the object does it’s work.

Now there was no need to actually use the new features of C++ to provide this sort of data hiding. In C, we can take advantage of the fact that things declared within a single C file stay within that file: we could declare a pointer to a structure in our header file, but hide the details by declaring them in the C file that contains the implementation. C++ makes this easier for us by giving us better tools to manipulate access to the contents of the object.


Why is encapsulation important?

Simply put, encapsulation allows us to separate the “what” from the “how”; separate what we want an object to do from how the object actually does the work.

This becomes important for two reasons.

First, it means that we have an object which has a clearly defined “purpose.” For example we can define an object which represents a button on the screen: a rectangular region the user can tap on or click with their mouse, which then responds to that tap by visually changing appearance and by firing an event which represents the response to that tap or button press.

And second, tied to the first, we can isolate all of the code which handles the button’s behavior within the button itself. A user of the button doesn’t need to know the details of how a button works to put one on the screen, nor does the user need to know how a button receives click or tap events, or how it processes those events. A user doesn’t even need to know the details of how a button draws itself: they don’t need to know how the button handles details such as switching text alignment for languages which read right to left instead of left to right.

And because the details are isolated away, it means those details can change: instead of firing an event on the down click of the mouse the event can be fired when the mouse click is released. The button’s appearance can change–or even be changeable depending on the skin the user selects. The button can even be handled as a spectrum of button-like objects. None of this matter to the user of that button: drop one on the screen, set the text, and wire up the event for the response, and you’re done.


How we can change our object above to respect proper encapsulation

This idea of encapsulation is one that we can–and should follow in our own code. That way if we later have to change the details of how we implement a thing, it doesn’t require us to hunt through all of our code and change the details everywhere else. Change the object, not all the callers manipulating the object.

So, for our UITableViewCell example above, the change is simple. First, hide the details how our table view cell is implemented:

MyTableViewCell.h

@interface MyTableViewCell: UITableViewCell
- (void)setLeftText:(NSString *)left rightText:(NSString *)right;
@end

MyTableViewCell.m

@implementation MyTableViewCell
@property (strong) IBOutlet UILabel *leftLabel;
@property (strong) IBOutlet UILabel *rightLabel;

- (void)setLeftText:(NSString *)left rightText:(NSString *)right
{
    self.leftLabel.text = left;
    self.rightLabel.text = right;
}
@end

And in our caller code:

MyTableViewController.m

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    MyTableViewCell *c = (MyTableViewCell *)[tableView dequeueReusableCellWithIdentifier:@"MyTableViewCell" forIndexPath:indexPath];

    c.leftLabel.text = [NSString stringWithFormat:@"Left %d",indexPath.row];
    c.rightLabel.text = @"Right Label";
    [c setLeftText:[NSString stringWithFormat:@"Left %d",indexPath.row]
         rightText:@"Right Label"];
 

    return c;
}

The original offending code was wrong because it confused the “how” to do something (setting the internal structure of the table cell) with the “what”: set the left and right text.

And notice that now we’ve hidden the details inside the table view. So later, if for some reason we change our implementation of the UITableViewCell:

MyTableViewCell.m

@implementation MyTableViewCell
@property (copy) NSString *leftLabel;
@property (copy) NSString UILabel *rightLabel;

- (void)setLeftText:(NSString *)left rightText:(NSString *)right
{
    self.leftLabel = left;
    self.rightLabel = right;
	[self setNeedsDisplay];
}

- (void)drawRect:(CGRect)r
{
    CGRect textRect = CGRectMake(10, 0, 146, 44);
    {
        NSString* textContent = self.leftLabel;
        UIFont* textFont = [UIFont fontWithName: @"HelveticaNeue-Light" size: UIFont.labelFontSize];
        [UIColor.blackColor setFill];
        [textContent drawInRect: CGRectOffset(textRect, 0, (CGRectGetHeight(textRect) - [textContent sizeWithFont: textFont constrainedToSize: textRect.size lineBreakMode: UILineBreakModeWordWrap].height) / 2) withFont: textFont lineBreakMode: UILineBreakModeWordWrap alignment: UITextAlignmentLeft];
    }

    CGRect text2Rect = CGRectMake(164, 0, 146, 44);
    {
        NSString* textContent = self.rightLabel;
        UIFont* text2Font = [UIFont fontWithName: @"HelveticaNeue-Light" size: UIFont.labelFontSize];
        [UIColor.blueColor setFill];
        [textContent drawInRect: CGRectOffset(text2Rect, 0, (CGRectGetHeight(text2Rect) - [textContent sizeWithFont: text2Font constrainedToSize: text2Rect.size lineBreakMode: UILineBreakModeWordWrap].height) / 2) withFont: text2Font lineBreakMode: UILineBreakModeWordWrap alignment: UITextAlignmentRight];
    }
}
@end

Notice that we don’t have to change a single thing in MyTableViewController.m, simply because it never knew how the table view drew itself; it only knew how to ask.

Wrapping text with ellipsis on iOS

This is an extremely common pattern I encounter quite a few times, and worthy of it’s own “things to remember” post.

So you want to have some text which wraps, but when it hits the bottom of the window, you get an ellipsis, right? But [NSString drawInRect…] doesn’t do the trick, right?

Well, it turns out the answer is using an NSAttributedString.

The first step is to turn your label string into an attributed string. At this point you need to know the font and color of the text that will be rendered, so you need to bake these attributes into the string. So, for example:

- (NSAttributedString *)attributedStringWithString:(NSString *)str
{
	UIFont *font = [UIFont boldSystemFontWithSize:19]; /* Some font */
	UIColor *color = [UIColor blackColor]; /* Some color */
	NSDictionary *dict = [NSDictionary dictionaryWithObjectsAndKeys:
		font, NSFontAttributeName,
		color, NSForegroundColorAttributeName,
		nil];
	return [[NSAttributedString alloc] initWithString:str attributes:d];
}

The next step is to draw the string in the rectangle area where it belongs:

	NSAttributedString *astr = [self attributedStringWithString:self.myLabel];
	[astr drawWithRect:r
		options:NSStringDrawingUsesLineFragmentOrigin | NSStringDrawingTruncatesLastVisibleLine
		context:nil];

And this will wrap the text, truncating the last line with an ellipsis.


The advantage of using NSAttributedString is that you can also easily insert segments of text which are of a different color or style. I find it easiest to build up the attributed string through concatenation of attributed segments into an NSMutableAttributedString object. So, for example, if you’re displaying a tweet from Twitter and you wish to bold all the hashtags, you simply parse the string (scanning for sequences staring with whitespace and a ‘#’, and including all the alphanumeric characters past the ‘#’), and create an attributed string of that segment as bold, concatenating all the strings into a single mutable string.

And for optimization purposes you can then hang onto the attributed string rather than the original string.