Finding the boundary of a one-bit-per-pixel monochrome blob

Recently I needed to develop an algorithm that finds the boundary of a 1-bit-per-pixel monochrome blob. (In my case, each pixel had a color quality which could be converted to a single-bit ‘true’/‘false’ test, rather than a literal monochrome image, but essentially the problem is the same.)

In this post I intend to describe the algorithm I used.


Given a 2D monochrome image with pixels aligned in an X/Y grid:

Arbitrary 2D Blob

We’d like to find the “boundary” of this blob. For now I’ll ignore the problem of what the “boundary” is; suffice it to say that with the boundary we can discover which pixels inside the blob border the exterior, which pixels outside the blob border the interior, or find other attributes of the blob such as its bounding rectangle.

If we treat each pixel as a discrete 1×1 square rather than a dimensionless point, then I intend to define the boundary as the set of (x,y) coordinates which define the borders around each of those squares, with the upper-left corner of the square representing the coordinate of the point:

Pixel Definition

So, in this case, I intend to define the boundary as the set of (x,y) coordinates which define the border between the black and the white pixels:

outlined blob

We do this by walking the edge of the blob through a set of very simple rules which effectively walk the boundary of the blob in a clock-wise fashion. This ‘blob walking’ only requires local knowledge at the point we are currently examining.

Because we are walking the blob in a clockwise fashion, it is easy to find the first point in a blob we are interested in walking: iteratively search all of the pixels from upper left to lower right, and stop at the first one that is set:

(Algorithm 1: Find first point)

-- Given: maxx the maximum width of the pixel image
          maxy the maximum height of the pixel image

   Returns the found pixel for an eligible blob or 'PixelNotFound'.

for (int x = 0; x < maxx; ++x) {
    for (int y = 0; y < maxy; ++y) {
        if (IsPixelSet(x,y)) {
            return Pixel(x,y);
        }
    }
}
return PixelNotFound;

Once we have found our eligible pixel, we can start walking clockwise, tracking each of the coordinates we find as we walk around the perimeter of the blob, traversing either horizontally or vertically.


Given the way we’ve encountered our first pixel in the algorithm above, the pixels around the immediate location at (x,y) look like the following:

Point In Center

That’s because the way we iterated through, the first pixel we encountered at (x,y) implies that (x-1,y), (x,y-1) and (x-1,y-1) must be clear. Also, if we are to progress in a clock-wise fashion, clearly we should move our current location from (x,y) to (x+1,y):

Blob5


The design of our algorithm proceeds in a similar fashion: by examining each of the 16 possible pixel configurations we can find in the surrounding 4 pixels, and tracking which of the 4 possible incoming paths we took, we can construct a matrix of the directions we should take to continue progressing in a clockwise fashion around our eligible blob. In all but two configurations, there is only one possible incoming path we could have taken to get to the center point: since we presume we are following the edge of the blob, we could not have entered between two black pixels or between two white pixels. Some combinations are also illegal since we presume we are walking around the blob in a clockwise fashion rather than a counter-clockwise fashion. (This means that as we stand at the point and face the direction of travel, set pixels should be on our right and never on our left. There is a proof for this which I will not sketch here.)

The 16 possible configurations and the outgoing paths we can take are illustrated below:

All directions

The top of this table shows the four possible incoming directions: from the left, from the top, from the right and from the bottom. Each of the 16 possible pixel combinations is shown from top to bottom, and blanks indicate where an incoming path was illegal, either because it comes between two blacks or two whites, or because the path would have placed the black pixel on the left of the incoming line rather than on the right.

Note that with only two exceptions each possible combination of pixels produces only one valid outgoing path. For those two exceptions (the two diagonal configurations) we arbitrarily pick the one of the two possible outgoing paths which excludes the diagonal pixel; we could have easily gone the other way and included the diagonal, but this may have had the property of including blobs with holes. (Whether this is acceptable depends on how you are using the algorithm.)

This indicates that we could easily construct a switch statement, converting each possible row into an integer from 0 to 15:

Algorithm 2: converting to a pixel value.

-- Given: IsPixel(x,y) returns true if the pixel is set and false 
          if it is not set or if the pixel is out of the range from
          (0,maxx), (0,maxy)
   Return an integer value from 0 to 15 indicating the pixel combination

int GetPixelState(int x, int y)
{
    int ret = 0;
    if (IsPixel(x-1,y-1)) ret |= 1;    /* upper-left */
    if (IsPixel(x,y-1))   ret |= 2;    /* upper-right */
    if (IsPixel(x-1,y))   ret |= 4;    /* lower-left */
    if (IsPixel(x,y))     ret |= 8;    /* lower-right */
    return ret;
}

We now can build our switch statement:

Algorithm 3: Getting the next pixel location

-- Given: the algorithm above to get the current pixel state,
          the current (x,y) location,
          the incoming direction dir, one of LEFT, UP, RIGHT, DOWN

   Returns the outgoing direction LEFT, UP, RIGHT, DOWN or ILLEGAL if the
   state was illegal.

Note: we don't test the incoming path when there was only one choice. We
could, by adding some complexity to this algorithm, for testing purposes.
The values below are obtained from examining the table above.

int state = GetPixelState(x,y);
switch (state) {
    case 0:    return ILLEGAL;
    case 1:    return LEFT;
    case 2:    return UP;
    case 3:    return LEFT;
    case 4:    return DOWN;
    case 5:    return DOWN;
    case 6:    {
                   if (dir == RIGHT) return DOWN;
                   if (dir == LEFT) return UP;
                   return ILLEGAL;
               }
    case 7:    return DOWN;
    case 8:    return RIGHT;
    case 9:    {
                   if (dir == DOWN) return LEFT;
                   if (dir == UP) return RIGHT;
                   return ILLEGAL;
               }
    case 10:   return UP;
    case 11:   return LEFT;
    case 12:   return RIGHT;
    case 13:   return RIGHT;
    case 14:   return UP;
    case 15:   return ILLEGAL;
}

From all of this we can now easily construct an algorithm which traces the outline of a blob. First, we use Algorithm 1 to find an eligible point. We then set ‘dir’ to UP, since the pixel state we discovered was pixel state 8, and the only legal direction had we been tracing around the blob was UP.

We store away the starting position (x,y), and then we iterate through Algorithm 3, getting new directions and updating (x’,y’): (x+1,y) for RIGHT, (x,y-1) for UP, (x-1,y) for LEFT and (x,y+1) for DOWN (remember that y grows downward, since we scan from upper left to lower right), until we arrive back at our starting point.

As we traverse the perimeter, we could either build an array of found (x,y) values, or we could do something else, such as maintaining a bounding rectangle, bumping the boundaries as we find (x,y) values that are outside the rectangle’s boundaries.
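To tie the three algorithms together, here is a minimal sketch in C that traces one blob and keeps a bounding rectangle and a step count. The test image, the Outline struct and the function names FindFirstPixel, NextDirection and TraceBlob are my own for illustration, and the sketch assumes y grows downward, so UP means (x,y-1) and DOWN means (x,y+1).

```c
#include <stdbool.h>

#define MAXX 6
#define MAXY 5

typedef enum { LEFT, UP, RIGHT, DOWN, ILLEGAL } Direction;
typedef struct { int minx, miny, maxx, maxy, steps; } Outline;

/* Hypothetical test image: '#' is a set pixel; row index is y, column is x. */
static const char *image[MAXY] = {
    "......",
    ".##...",
    ".###..",
    "..#...",
    "......"
};

static bool IsPixel(int x, int y)
{
    if (x < 0 || x >= MAXX || y < 0 || y >= MAXY) return false;
    return image[y][x] == '#';
}

/* Algorithm 1: scan for the first set pixel. */
static bool FindFirstPixel(int *px, int *py)
{
    for (int x = 0; x < MAXX; ++x)
        for (int y = 0; y < MAXY; ++y)
            if (IsPixel(x, y)) { *px = x; *py = y; return true; }
    return false;
}

/* Algorithm 2: pack the four pixels around corner (x,y) into 0..15. */
static int GetPixelState(int x, int y)
{
    int ret = 0;
    if (IsPixel(x - 1, y - 1)) ret |= 1;
    if (IsPixel(x, y - 1))     ret |= 2;
    if (IsPixel(x - 1, y))     ret |= 4;
    if (IsPixel(x, y))         ret |= 8;
    return ret;
}

/* Algorithm 3: the same table as the switch statement above, grouped by result. */
static Direction NextDirection(int x, int y, Direction dir)
{
    switch (GetPixelState(x, y)) {
        case 1: case 3: case 11:  return LEFT;
        case 2: case 10: case 14: return UP;
        case 4: case 5: case 7:   return DOWN;
        case 8: case 12: case 13: return RIGHT;
        case 6:  return dir == RIGHT ? DOWN : dir == LEFT ? UP : ILLEGAL;
        case 9:  return dir == DOWN ? LEFT : dir == UP ? RIGHT : ILLEGAL;
        default: return ILLEGAL;   /* 0 and 15: not on an edge */
    }
}

/* Walk the outline from the corner found by Algorithm 1, tracking the
 * bounding rectangle of the visited corners and the number of unit steps. */
static bool TraceBlob(int startx, int starty, Outline *out)
{
    int x = startx, y = starty;
    Direction dir = UP;   /* the start is always state 8, entered going UP */

    out->minx = out->maxx = x;
    out->miny = out->maxy = y;
    out->steps = 0;
    do {
        dir = NextDirection(x, y, dir);
        switch (dir) {
            case RIGHT: ++x; break;
            case LEFT:  --x; break;
            case UP:    --y; break;   /* y grows downward */
            case DOWN:  ++y; break;
            default:    return false; /* illegal state: give up */
        }
        if (x < out->minx) out->minx = x;
        if (x > out->maxx) out->maxx = x;
        if (y < out->miny) out->miny = y;
        if (y > out->maxy) out->maxy = y;
        ++out->steps;
    } while (x != startx || y != starty);
    return true;
}
```

For the six-pixel blob in the test image the walk starts at (1,1), takes 12 unit steps, and visits corners from (1,1) to (4,4). Because the starting corner is always state 8, it is only ever visited once per lap, so arriving back at it is a safe stopping test.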

First pass at a more formal language for JSON.

So the single most common thing I run into, which is a source of all sorts of headaches when writing custom software for clients, is hooking into their back end system.

A very common pattern for me is to create a single interface which can perform an HTTP ‘get’ or ‘post’ call in order to obtain the contents, run everything through a JSON parser, and then hand the resulting NSDictionary or NSArray to an object which converts the results into a set of Objective C classes.

Up until now I’ve been using JSON Accelerator, which is a really nice little tool for converting JSON into a set of classes. But this runs into a couple of problems.

(1) A number of sites I integrate with have multiple JSON endpoints, each of which returns subtly different JSON results. Using JSON Accelerator, I wind up generating a lot of duplicate classes which represent more or less the same thing.

(2) Often those sites will change; after all, the back end is under development as well as the front end. I often have a hard time seeing the structure from the JSON; sometimes buried in a few hundred lines is a field that contains a null pointer or which was changed from a string to a JSON field–and tracking those bugs down can be a pain in the ass.

It seemed to me the best way to handle this is to have an intermediate representational language which allows me to see what it is that I’m working with, and to allow me to ‘tweak’ the results, so I can point out that the ‘Person’ record in call A is the exact same thing as the ‘Person’ record in call B, except for one of the fields being omitted.

So I built a simple analysis app and a simple compiler app to resolve this problem.

You can download the compiled tools and read the documentation (such as it is) from here.

The representational language is fairly simple: a set of objects, which can be compiled into Objective C and (when I have time) into Java. Each field in an object can be a primitive, an object or an array of objects. So, for example:

/*  Feed 
 *
 *      Top level of the feed
 */

Feed {
    id: integer,
    name: string,
    date: string,
    active: boolean,
    addressList: arrayof Address,
    phoneList: arrayof Phone,
}

/*  Address
 *
 *      The user's address
 */

Address {
    id: integer,
    name: string,
    address: string,
    address2: (optional) string, // optional in the data stream
    city: string,
    state: string,
    zip: string
}

/*  Phone
 *
 *      The user's phone
 */

Phone {
    id: integer,
    name: string,
    phone: string
}

Note that fields can also be marked as ‘nullable’:

Feed {
    id: integer,
    name: string,
    value: (nullable) real
}

This will translate into an NSNumber * field rather than into a double.

There are also a couple of tools: one that generates the Objective C code, and one which reads in a bunch of JSON (in fact, it will read multiple JSON objects all in a row), and makes a best guess at the underlying structure, collapsing common objects as needed, and even noting when the same field appears to contain ambiguous content.


At some point I will need to clean this up, add Java support, and push this out to Git. But for now, there you go.

Let me know if this seems useful.

Handy trick: trap back button in a UINavigationController stack

- (void)didMoveToParentViewController:(UIViewController *)parent
{
	[super didMoveToParentViewController:parent];

	if (![parent isEqual:self.parentViewController]) {
		NSLog(@"Back");
	}
}

Inserted into a view controller pushed into a UINavigationController stack, this will fire the ‘back’ message when the user presses ‘Back’ to back up the view controller stack.

As seen on Stack Overflow

Moving views around when the keyboard shows in iOS

When the keyboard shows or hides in iOS, we can receive a notification telling us that the keyboard is about to be shown or hidden.

Ideally we want to get the animation parameters and the size of that keyboard so we can rearrange the views inside of our application to fit the keyboard. I’m only covering the case of the keyboard on the iPhone; on the iPad you also have the problem of the split keyboard, but the same ideas should hold there as well.

Step 1:

When the view controller that may show a keyboard loads its view, register for notifications for the keyboard being shown and hidden:

- (void)viewDidLoad
{
    [super viewDidLoad];

	// Do any additional setup after loading the view.
	[[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(keyboardShowHide:) name:UIKeyboardWillShowNotification object:nil];
	[[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(keyboardShowHide:) name:UIKeyboardWillHideNotification object:nil];

	// other stuff here
}

Step 2:

Remember to unregister notifications when this goes away, to prevent problems.

- (void)dealloc
{
	[[NSNotificationCenter defaultCenter] removeObserver:self];
}

Step 3:

Receive the event in our new method, and extract the keyboard parameters and animation parameters.

- (void)keyboardShowHide:(NSNotification *)n
{
	CGRect krect;

	/* Extract the size of the keyboard when the animation stops */
	krect = [n.userInfo[UIKeyboardFrameEndUserInfoKey] CGRectValue];

	/* Convert that to the rectangle in our primary view. Note the raw
	 * keyboard size from above is in the window's frame, which could be
	 * turned on its side.
	 */
	krect = [self.view convertRect:krect fromView:nil];

	/* Get the animation duration, and animation curve */
	NSTimeInterval duration = [[n.userInfo objectForKey:UIKeyboardAnimationDurationUserInfoKey] doubleValue];
	UIViewAnimationCurve curve = [[n.userInfo objectForKey:UIKeyboardAnimationCurveUserInfoKey] integerValue];

	/* Kick off the animation. What you do with the keyboard size is up to you.
	 * The curve is shifted into the curve bits of UIViewAnimationOptions.
	 */
	UIViewAnimationOptions options = UIViewAnimationOptionBeginFromCurrentState | ((UIViewAnimationOptions)curve << 16);
	[UIView animateWithDuration:duration delay:0 options:options animations:^{
			/* Set up the destination rectangle sizes given the keyboard size */
			/* Do something interesting here */
		} completion:^(BOOL finished) {
			/* Finish up here */
			/* Do something interesting here */
		}];
}
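As one example of “something interesting” to do with the keyboard rectangle, the sketch below works out how far the keyboard reaches up into the view, which is the amount you might shrink a scroll view or shift a text field by. It is plain C so it can sit in an Objective C file; ViewRect and KeyboardOverlap are made-up names standing in for CGRect and your own helper, and it assumes the keyboard rectangle has already been converted into the view’s coordinate space as in step 3.

```c
/* Hypothetical stand-in for CGRect so the sketch compiles as plain C. */
typedef struct { double x, y, width, height; } ViewRect;

/* Height of the part of the view that the keyboard covers. Both rectangles
 * are assumed to be in the view's own coordinate space, with y growing
 * downward. Returns 0 when the keyboard sits entirely below the view. */
static double KeyboardOverlap(ViewRect view, ViewRect keyboard)
{
    double viewBottom = view.y + view.height;
    double overlap = viewBottom - keyboard.y;

    if (overlap < 0) overlap = 0;                      /* keyboard is off screen */
    if (overlap > view.height) overlap = view.height;  /* never more than the view */
    return overlap;
}
```

With a 320×568 view and a 216-point keyboard whose top edge is at y = 352 this gives 216; when the keyboard’s end frame starts at y = 568 (hidden) it gives 0, so the same code path handles both show and hide.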

Snippet: code to convert RGB to HSV and back again

I sketched this code together needing a quick way to convert from RGB to HSV and back again, then realized I didn’t need it. So I’m putting it here in the theory that this may be useful someday…

#include <math.h>	/* floor(), used by HSVToRGB */

/*	RGBToHSV
 *
 *		Convert to HSV
 */

static void RGBToHSV(float r, float g, float b, float *h, float *s, float *v)
{
	float max = r;
	if (max < g) max = g;
	if (max < b) max = b;
	float min = r;
	if (min > g) min = g;
	if (min > b) min = b;
	
	/*
	 *	Calculate h
	 */
	
	*h = 0;
	if (max == min) *h = 0;
	else if (max == r) {
		*h = 60 * (g - b)/(max - min);
		if (*h < 0) *h += 360;
		if (*h >= 360) *h -= 360;
	} else if (max == g) {
		*h = 60 * (b - r) / (max - min) + 120;
	} else if (max == b) {
		*h = 60 * (r - g) / (max - min) + 240;
	}

	if (max == 0) *s = 0;
	else *s = 1 - (min / max);

	*v = max;
}

/*	HSVToRGB
 *
 *		Convert to RGB
 */

static void HSVToRGB(float h, float s, float v, float *r, float *g, float *b)
{
	if (h < 0) h = 0;
	if (h > 359) h = 359;
	if (s < 0) s = 0;
	if (s > 1) s = 1;
	if (v < 0) v = 0;
	if (v > 1) v = 1;

	float tmp = h/60.0;
	int hi = floor(tmp);
	float f = tmp - hi;
	float p = v * (1 - s);
	float q = v * (1 - f * s);
	float t = v * (1 - (1 - f) * s);
	
	switch (hi) {
		case 0:
			*r = v;
			*g = t;
			*b = p;
			break;
		case 1:
			*r = q;
			*g = v;
			*b = p;
			break;
		case 2:
			*r = p;
			*g = v;
			*b = t;
			break;
		case 3:
			*r = p;
			*g = q;
			*b = v;
			break;
		case 4:
			*r = t;
			*g = p;
			*b = v;
			break;
		case 5:
			*r = v;
			*g = p;
			*b = q;
			break;
	}
}
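To make the hue branches in RGBToHSV a little more concrete, here is a small stand-alone sketch of just that part, with a few values worked by hand. HueOf is a hypothetical helper name, not part of the snippet above, and it assumes r, g and b are in the range 0 to 1.

```c
/* Hypothetical helper: only the hue calculation, in degrees. */
static float HueOf(float r, float g, float b)
{
    float max = r, min = r;
    if (max < g) max = g;
    if (max < b) max = b;
    if (min > g) min = g;
    if (min > b) min = b;

    if (max == min) return 0;   /* grey: hue is undefined, report 0 */

    float h;
    if (max == r)      h = 60 * (g - b) / (max - min);        /* -60 .. 60  */
    else if (max == g) h = 60 * (b - r) / (max - min) + 120;  /*  60 .. 180 */
    else               h = 60 * (r - g) / (max - min) + 240;  /* 180 .. 300 */

    if (h < 0) h += 360;        /* only the red branch can go negative */
    return h;
}
```

Pure red, green and blue give 0, 120 and 240. A pinkish red such as (1, 0, 0.5) lands in the red branch at 60 × (0 − 0.5) / 1 = −30, which wraps around to 330; that wrap is why the red branch needs the extra adjustment and the other two do not.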

Basic Lessons: Object Oriented Programming with Objects

The really stupid thing, by the way, about most code that I review is how few people know about object oriented development. Yes, yes, yes; they say they know all about object oriented development–but when you then review their code (say, in an iOS application with a table) do they practice proper encapsulation? Nooooooooooo…

All too often I see something like this:

MyTableViewCell.h

@interface MyTableViewCell: UITableViewCell
@property (strong) IBOutlet UILabel *leftLabel;
@property (strong) IBOutlet UILabel *rightLabel;
@end

MyTableViewController.m

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    MyTableViewCell *c = (MyTableViewCell *)[tableView dequeueReusableCellWithIdentifier:@"MyTableViewCell" forIndexPath:indexPath];

    c.leftLabel.text = [NSString stringWithFormat:@"Left %ld",(long)indexPath.row];
    c.rightLabel.text = @"Right Label";

    return c;
}

The two lines above that reach into the cell and set leftLabel.text and rightLabel.text directly: no, no, no, no, no.

Haven’t you even heard of encapsulation? No? Well, that’s probably because (a) most developers have no idea what they’re doing beyond copying someone else’s models, and (b) they’ve been taught some bad habits by other developers who also have no idea what they’re doing.

Of course this isn’t helped by Apple, whose own UITableViewCell by default exposes the fields contained within.


What is object encapsulation?

The idea of object encapsulation is central to the idea of object oriented programming. Essentially it refers to the idea of creating “objects”–chunks of data associated with functions designed to work on that data.

To understand how useful this is we need to dive into a pre-OOP language, such as C.

Back in the “bad old days” of C, you had functions, and you had structures. And that was it:

struct MyRecord
{
    int a;
    int b;
};

int AddFields(struct MyRecord x)
{
    return x.a + x.b;
}

While this sort of procedural programming has its place (languages such as C are extremely good at embedded development or kernel programming, where execution efficiency is important), it falls short when developing user interfaces, simply because with user interfaces we manipulate things like buttons and text fields and table view cells.

In fact, it turns out that object-oriented programming is tied to user interface development, by abstracting the idea of user interface elements into a new concept of an “object” as a self-contained unit that combines the idea of a structure or record with the idea of functions or procedures that operate on that record.

In C++, we can express this idea as a class:

class MyRecord
{
	public:		// class members are private by default
		int a;
		int b;

		int AddFields();
};


int MyRecord::AddFields()
{
	return a + b;
}

Notice that this expresses the same idea as our C snippet above, which adds the contents of the two fields in MyRecord. Except now, AddFields is associated with a record. This means if we have a record and we want the sum of the fields, instead of writing

MyRecord a;
int sum = AddFields(a);

we write

MyRecord a;
int sum = a.AddFields();

That is, we apply the message against the object.

Now we haven’t really done anything new yet. In fact, if you were to write in C++ the method ‘AddFields()’ from our C example, it would still work with our C++ declaration of MyRecord.

But C++ gives us a new tool: a way of marking fields “private”–that is, only accessible from the methods that are associated with the class.

Thus:

class MyRecord
{
	private:
		int a;
		int b;

	public:
		int AddFields();
};

We’ve hidden a and b from view. Now the only way you can change a and b or get their values is through methods which are then made public with MyRecord.

Encapsulation is the process of creating self-contained objects: objects which provide a clear interface for manipulating the object, but which hide the details as to how the object does its work.

Now there was no need to actually use the new features of C++ to provide this sort of data hiding. In C, we can take advantage of the fact that things declared within a single C file stay within that file: we could declare a pointer to a structure in our header file, but hide the details by declaring them in the C file that contains the implementation. C++ makes this easier for us by giving us better tools to manipulate access to the contents of the object.
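As a rough sketch of that C technique (often called an opaque pointer), the header declares the structure’s name but not its fields, and only the implementation file knows the layout. Everything is shown in one listing here with comments marking where the file boundary would fall; the function names MyRecordCreate, MyRecordAddFields and MyRecordDestroy are invented for the example.

```c
#include <stdlib.h>

/* ---- What would go in MyRecord.h: all that callers ever see ---- */

typedef struct MyRecord MyRecord;      /* incomplete type: fields are hidden */

MyRecord *MyRecordCreate(int a, int b);
int       MyRecordAddFields(const MyRecord *r);
void      MyRecordDestroy(MyRecord *r);

/* ---- What would go in MyRecord.c: the only code that knows the layout ---- */

struct MyRecord
{
    int a;
    int b;
};

MyRecord *MyRecordCreate(int a, int b)
{
    MyRecord *r = malloc(sizeof *r);
    if (r != NULL) {
        r->a = a;
        r->b = b;
    }
    return r;                          /* NULL if the allocation failed */
}

int MyRecordAddFields(const MyRecord *r)
{
    return r->a + r->b;
}

void MyRecordDestroy(MyRecord *r)
{
    free(r);
}
```

A caller that includes only the header can create a record, ask for the sum and destroy it, but writing r->a there would not compile, because the compiler has never seen the fields. The cost is that records have to live on the heap, since callers do not know the structure’s size.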


Why is encapsulation important?

Simply put, encapsulation allows us to separate the “what” from the “how”; separate what we want an object to do from how the object actually does the work.

This becomes important for two reasons.

First, it means that we have an object which has a clearly defined “purpose.” For example we can define an object which represents a button on the screen: a rectangular region the user can tap on or click with their mouse, which then responds to that tap by visually changing appearance and by firing an event which represents the response to that tap or button press.

And second, tied to the first, we can isolate all of the code which handles the button’s behavior within the button itself. A user of the button doesn’t need to know the details of how a button works to put one on the screen, nor does the user need to know how a button receives click or tap events, or how it processes those events. A user doesn’t even need to know the details of how a button draws itself: they don’t need to know how the button handles details such as switching text alignment for languages which read right to left instead of left to right.

And because the details are isolated away, it means those details can change: instead of firing an event on the down click of the mouse the event can be fired when the mouse click is released. The button’s appearance can change, or even be changeable depending on the skin the user selects. The button can even be handled as a spectrum of button-like objects. None of this matters to the user of that button: drop one on the screen, set the text, and wire up the event for the response, and you’re done.


How we can change our object above to respect proper encapsulation

This idea of encapsulation is one that we can, and should, follow in our own code. That way if we later have to change the details of how we implement a thing, it doesn’t require us to hunt through all of our code and change the details everywhere else. Change the object, not all the callers manipulating the object.

So, for our UITableViewCell example above, the change is simple. First, hide the details of how our table view cell is implemented:

MyTableViewCell.h

@interface MyTableViewCell: UITableViewCell
- (void)setLeftText:(NSString *)left rightText:(NSString *)right;
@end

MyTableViewCell.m

@interface MyTableViewCell ()
@property (strong) IBOutlet UILabel *leftLabel;
@property (strong) IBOutlet UILabel *rightLabel;
@end

@implementation MyTableViewCell

- (void)setLeftText:(NSString *)left rightText:(NSString *)right
{
    self.leftLabel.text = left;
    self.rightLabel.text = right;
}
@end

And in our caller code:

MyTableViewController.m

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    MyTableViewCell *c = (MyTableViewCell *)[tableView dequeueReusableCellWithIdentifier:@"MyTableViewCell" forIndexPath:indexPath];

    [c setLeftText:[NSString stringWithFormat:@"Left %ld",(long)indexPath.row]
         rightText:@"Right Label"];

    return c;
}

The original offending code was wrong because it confused the “how” to do something (setting the internal structure of the table cell) with the “what”: set the left and right text.

And notice that now we’ve hidden the details inside the table view cell. So later, if for some reason we change our implementation of the UITableViewCell:

MyTableViewCell.m

@interface MyTableViewCell ()
@property (copy) NSString *leftLabel;
@property (copy) NSString *rightLabel;
@end

@implementation MyTableViewCell

- (void)setLeftText:(NSString *)left rightText:(NSString *)right
{
    self.leftLabel = left;
    self.rightLabel = right;
	[self setNeedsDisplay];
}

- (void)drawRect:(CGRect)r
{
    CGRect textRect = CGRectMake(10, 0, 146, 44);
    {
        NSString* textContent = self.leftLabel;
        UIFont* textFont = [UIFont fontWithName: @"HelveticaNeue-Light" size: UIFont.labelFontSize];
        [UIColor.blackColor setFill];
        [textContent drawInRect: CGRectOffset(textRect, 0, (CGRectGetHeight(textRect) - [textContent sizeWithFont: textFont constrainedToSize: textRect.size lineBreakMode: UILineBreakModeWordWrap].height) / 2) withFont: textFont lineBreakMode: UILineBreakModeWordWrap alignment: UITextAlignmentLeft];
    }

    CGRect text2Rect = CGRectMake(164, 0, 146, 44);
    {
        NSString* textContent = self.rightLabel;
        UIFont* text2Font = [UIFont fontWithName: @"HelveticaNeue-Light" size: UIFont.labelFontSize];
        [UIColor.blueColor setFill];
        [textContent drawInRect: CGRectOffset(text2Rect, 0, (CGRectGetHeight(text2Rect) - [textContent sizeWithFont: text2Font constrainedToSize: text2Rect.size lineBreakMode: UILineBreakModeWordWrap].height) / 2) withFont: text2Font lineBreakMode: UILineBreakModeWordWrap alignment: UITextAlignmentRight];
    }
}
@end

Notice that we don’t have to change a single thing in MyTableViewController.m, simply because it never knew how the table view cell drew itself; it only knew how to ask.

Objective C declaration shortcuts

With a recent update (okay, not quite so recent update) to Objective C, the following lexical shortcuts appear to have been added:

Create a dictionary using @{…}

The @{ … } shortcut has been added to create a new NSDictionary with a list of keys and values. The format appears to be:

@{ key : value , ... }

So, to create a new dictionary with three keys:

NSDictionary *d = @{ @"a": @"First",
                     @"b": @"Second",
                     @"c": @"Third" };

This creates a dictionary with three keys: @"a", @"b" and @"c", with the values @"First", @"Second" and @"Third", respectively.

Create an array using @[ … ]

The @[ … ] shortcut creates a new NSArray with a list of values. The format appears to be:

@[ value, value, ... ]

So, to create a new array with three items:

NSArray *a = @[ @"a", @"b", @"c" ];

This creates an array with three items, @"a", @"b" and @"c".

Create an NSNumber wrapper around a numeric constant using @(…)

The @(…) appears to take a scalar and wrap it in an NSNumber object. Thus:

NSNumber *n = @( 5 );

creates a new NSNumber with the value 5.


You can combine each of these together in any way you wish. For example:

NSDictionary *d = @{ @(1): @[ @"A" ],
                     @(2): @[ @"Another", @"Thing" ],
                     @(3): @[ @"Tom", @"Dick", @"Harry" ] };

Far more convenient than the old way.


I’ve seen the first two documented elsewhere, but I don’t think I’ve seen the third one (boxing an NSNumber) mentioned anywhere, and I don’t quite remember where I saw it. Probably in the compiler release notes.

Also note that these can be used in-line with code and not just with constant objects. So, for example, you can use the NSNumber boxing mechanism to box an integer return value from a function:

NSNumber *count = @( [array count] );

This has been a public service announcement.

Dear WordPress

Thanks (NOT!!!) for letting me know of all the messages in the queue waiting for approval–including comments that I made months ago.

(Shakes head)

There are days when I want to build my own blogging system, just so I can bypass all this weirdness.

My love/hate relationship with Design Patterns.

I have a love/hate relationship with Design Patterns, which is actually captured in the first two sentences of the Wikipedia article I just linked to.

I love them because they provide a general reusable solution–a rule of thumb, so to speak–on how to solve some fairly common problems.

For example, if you are developing a user interface application, then the Model-View-Controller pattern or the Model-View-Presenter pattern is definitely a lifesaver in helping you design a well-organized application.

(The difference between the two has to do with the ability of view components to have direct access to the model data: MVC allows the connection as a rule of thumb, while MVP denies it. For most applications, such as simple iOS applications which access remote data from a database and present the results on a screen, strictly separating views from the model object makes a tremendous amount of sense. However, there are certain types of applications where the user interaction of the view with the data is so involved, you need to allow the view insight into the model data directly. Usually this is reserved for complex editing applications, like a text editor or a CAD system, where the logic for handling key or mouse events makes more sense to contain in a text editor view or a CAD drawing view than allow the raw events to percolate to an already over-loaded controller module.)

And having a good working knowledge of various design patterns helps you have a standard bag of tricks to apply to problems you may see in the field: singletons, proxies and modules are excellent ways to handle organization of conceptual functions. Flyweights are helpful when creating a complex page of preferences in a dialog. Object pools help with reducing the overhead of allocation. And so forth.


I hate them, though.

And I hate them for the very simple reason that to an inexperienced developer equipped with a hammer, the entire world becomes full of nails.

Design patterns are, as defined in Wikipedia, “general reusable solutions to commonly occurring problems.”

And there are three problems with that.

First, it requires some degree of understanding of the solution and how to apply it. If you don’t understand the ‘chain of responsibility’ pattern, for example, then you can really shoot yourself in the foot if you implement it poorly–as I’ve seen done on a number of occasions, with events in a custom event chain that just “disappeared”.

Second, it requires some degree of understanding of the problem the solution is attempting to solve. One developer I had a conversation with a long time ago insisted on using a decorator pattern to allow a view to both have presentation and editing functionality, when either a flyweight or simple inheritance could have solved the problem far more gracefully.

And third, many developers labor under the presumption that these are the only solutions–that there are no other solutions out there. The whole world becomes full of nothing but nails, and by God if my hammer doesn’t work the problem is not solvable by us “mere mortals.”


To me, design patterns are just solutions to common problems. There are other potential solutions to other problems not quite as common: for example, I once wrote a networking stack for a small microprocessor by creating a state machine: a large switch statement where each ‘case’ ended by changing the variable controlling the state machine to the next ‘case’ that was to be executed.

Emulating multi-threading was achieved by having each network connection hold its own state–and on exit of each case, a pointer pointed to the next active network object and picked up with whatever state that network object was in.
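The shape of that idea can be sketched in a few lines of C. This is not the original networking stack: the states, the Connection struct and the functions Step and RunAll are made up for illustration, and the “work” in each case is just a counter. What it does show is each connection carrying its own state, each case ending by naming the next case, and a loop that moves on to the next connection after every step.

```c
#include <stddef.h>

typedef enum { STATE_CONNECT, STATE_SEND, STATE_RECEIVE, STATE_CLOSED } State;

typedef struct {
    State state;      /* where this connection picks up next time */
    int   sent;       /* stand-ins for real protocol work */
    int   received;
} Connection;

/* Run exactly one case for one connection, then record the next state. */
static void Step(Connection *c)
{
    switch (c->state) {
        case STATE_CONNECT:
            c->state = STATE_SEND;
            break;
        case STATE_SEND:
            c->sent++;
            c->state = STATE_RECEIVE;
            break;
        case STATE_RECEIVE:
            c->received++;
            /* Hypothetical rule: two round trips, then close. */
            c->state = (c->received < 2) ? STATE_SEND : STATE_CLOSED;
            break;
        case STATE_CLOSED:
            break;
    }
}

/* Round-robin over the connections, one step each, until all have closed.
 * Returns the total number of steps taken. */
static int RunAll(Connection *conns, size_t n)
{
    int steps = 0;
    size_t open;

    do {
        open = 0;
        for (size_t i = 0; i < n; ++i) {
            if (conns[i].state == STATE_CLOSED) continue;
            Step(&conns[i]);
            ++steps;
            if (conns[i].state != STATE_CLOSED) ++open;
        }
    } while (open > 0);
    return steps;
}
```

No case ever blocks or loops waiting on its own connection, so the connections interleave as if they were threads, with no scheduler and no per-thread stack. With the rules above each connection takes five steps to go from connect to closed, so two connections take ten steps between them.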

(I’ve successfully used state machines to solve a number of esoteric problems, including building my own LISP interpreter, and my own virtual machine interpreter. And state machines are built by YACC for building a parser; they’re far more flexible than recursive-descent parsers.)

And there are many solutions I’ve come across which solve common problems related to multi-threading that I’ve yet to see written down as an “approved” Design Pattern, such as the solution I provide here, to solve the problem of anonymous inner blocks which may be deallocated by a user action prior to the completion of an asynchronous task.


Principally I hate design patterns for the same reason I hate orthodoxy: it’s just an excuse to stop thinking now that all of the approved answers have been derived for us.
