Archive of ‘Commentary’ category

An interesting exercise.

So I’m developing this game in my spare time, and I needed a list of commonly used words. Downloaded the Project Gutenberg DVD of popular texts, an open source dictionary of words, and built a simple program which scanned the DVD, parsed out all the words, compared against the dictionary, and updated the count of those words to arrive at a list of commonly used words.
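A minimal sketch of that kind of program, assuming the texts are already loaded as strings and the dictionary is a set of lowercase words. The class and method names here are hypothetical; the original program is not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class WordCountSketch {
    // Count only the tokens that appear in the dictionary; everything else is ignored.
    static Map<String, Integer> countWords(List<String> texts, Set<String> dictionary) {
        Map<String, Integer> counts = new HashMap<>();
        for (String text : texts) {
            // Split on any run of characters that is not a letter or an apostrophe.
            for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
                if (dictionary.contains(token)) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Return up to n words, most frequent first; ties are broken alphabetically.
    static List<String> topWords(Map<String, Integer> counts, int n) {
        List<String> words = new ArrayList<>(counts.keySet());
        words.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return words.subList(0, Math.min(n, words.size()));
    }

    public static void main(String[] args) {
        // Stand-in data; the real input would be the text files on the DVD.
        List<String> texts = List.of("The cat sat on the mat.", "The dog and the cat.");
        Set<String> dictionary = Set.of("the", "cat", "dog", "and", "sat", "on", "mat");
        System.out.println(topWords(countWords(texts, dictionary), 3));
    }
}
```

Filtering against the dictionary is what keeps OCR noise, names and markup out of the counts.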

And here are the top 50 words I found, in order:

the
of
and
to
in
that
he
was
it
his
is
with
for
as
you
on
had
not
be
at
but
by
this
her
or
which
from
have
they
she
all
him
we
are
were
my
me
so
one
an
no
their
if
there
who
said
them
when
would
been

I don’t know if this is cool or… mundane…

And then it turns upside-down in an instant.

Cartifact is in the business of making print maps for high end real-estate companies, and I was brought in to help build a technical team, to make Cartifact a tech-oriented company in order to position it for an acquisition.

But sometimes things don’t work out the way you plan.

At the same time we brought in a new company President who spent the last several months going around and talking to different companies with whom Cartifact had a pre-existing relationship. And after doing this for several months, the conclusion was that Cartifact made beautiful maps, and so Cartifact should continue making beautiful maps, increasing their exposure via better marketing and trying to capture more of the market for making hand-drawn beautiful maps.

This means the idea of building a technology team and positioning Cartifact–well, that went away. I’m not sure exactly what happened behind the scenes, but the feeling I got was that they didn’t want to continue bleeding red for something they had no intention of building. And so there was no more money for me.

It’s troubling when you find out the opportunity you left a rather secure job for no longer exists. And it’s troubling to discover this opportunity no longer exists with perhaps a half-hour’s notice. But that’s the way of the world sometimes.

And the important thing is to re-orient yourself, shake out your contacts, and see where you want to go.

Ten years ago, nearly to the month, I wound down In Phase Consulting, a consulting company I had run with my wife for 9 years, and took a job at Symantec for two reasons. One, right after 9/11, the market for freelancing was drying up thanks to the Dot Com crash and the worries over 9/11. And two, I found that I was bumping up against my limits: I had never managed a team, I had never hired people, I had never really learned how to delegate tasks or break down larger projects into smaller components.

And I wanted to get that experience.

Now I’m back to where I was 10 years ago, but this time, the market has changed. No more picking up the phone and making cold calls; we now have Facebook and LinkedIn and blogs and mailing lists and MeetUps. It’s easier now than ever to circulate out there and see what projects are there. And while this does not alleviate the marketing process, it does provide better avenues for searching.

And I’ve changed: I’ve successfully built a team from scratch. I’ve successfully hired and brought people up to speed. I’ve successfully delivered large scale projects, and for the most part when I’ve had enough control to keep things from going haywire, I’ve managed to consistently bring projects in on time and under budget.

So a few hours ago I filed an S Corp app with LegalZoom and created a very basic web site, Glenview Software Corporation, along with ordering business cards. I figure I’ll give the freelance thing a try again, but this time not be afraid of going after the larger projects.

We’ll see where this lands.

If you need a highly qualified software developer with 25 years of professional experience in mobile, embedded, web, desktop and client/server architecture software, please send me an e-mail.

Meanwhile I’m going to go back to what I do best: create damned good software through solving very hard problems. Like that LISP interpreter I now have running embedded on an iPhone and an iPad, which I plan to turn into a symbolic programmable calculator for those architectures.

On Requirements Documentation.

After reading this article I just wanted to add my two cents.

  • Documentation takes time to write, read and communicate. Meaning if you write a 100 page specification for a software product, the developers are going to have to read the 100 page specification for your software product, and you and your software developers are going to have to hold several meetings in order to make sure the intent of that 100 page document was properly communicated.
  • Documentation also takes time to maintain. If you write a 100 page document you are going to have to go through and revise a 100 page document as new facts on the ground develop: as you learn more about your customers and as you learn more about the product itself as it is being built. Each revision to the document must also be read by your software developers, and you’re going to have to communicate the effective impact of those changes in a series of meetings as well.
  • This implies a very simple fact: if it takes you a day to write a product specification, allocate two days for the specification to be read, and three days for meetings to effectively communicate the information in your specification to your developers. If it takes a month, allocate two and three months, respectively. Or resign yourself to the fact that your product specification document won’t be read–and the entire exercise in writing that document was at best a masturbatory waste of time.

I note these three facts because they are often forgotten by product managers. (And about that 100 page specification: I’ve seen them. Worse, I’ve seen them delivered three quarters through the development cycle, by proud product managers who, after spending months writing them, believed we could then execute on these massive tomes with perhaps a day or two of reading, rewriting the parts of the software we had already developed as needed.)

Clearly the theory behind the Laffer curve also applies to specification documents. No documentation at all is bad; it means we don’t have any consensus as to what we’re building. Too much documentation is also bad: it means we can never find the time to develop a consensus–and that assumes that the documentation is not internally inconsistent. (Sorry, Product Managers–but in general you’re not trained Software Architects, so please don’t play them. I’ve seen product specifications which specified the algorithm to use, by Product Managers who flunked out of college math. Me, my degree from Caltech was in math, so just tell me what you want me to build and let me figure out how to build it, or tell you why it can’t be built as specified with the budget allocated.)

So there is clearly a sweet spot in specification documentation.

And the keyword (which I slipped by above, in case you didn’t see it) is consensus.

The software specification document is used to help build, communicate and maintain a consensus as to what we are going to build, with the Product Manager providing input from his interactions with the customer as to what the customer wants. (As a Product Manager you’ve identified and talked to the customer, right? Right?) The best way to build the consensus is to effectively communicate the needs clearly, while getting feedback from the developers as to what they believe they can and cannot build. (And if a developer tells you they can’t build it, listen to them–because it may be that while it can be done, they don’t know how. And remember: we don’t fight the war we want, we fight the war we have–and we fight it with the people we have. Which also implies that you should listen to the developers because they may know how to do something you thought was impossible which makes all the difference in the world.)

So in my opinion, a well built product specification is:

  • As short as possible to communicate what is needed. (That way everyone can understand it quickly, and so internal inconsistencies don’t creep in. Further, short is easier to maintain as the facts on the ground change.)
  • Communicates clearly what is needed, not how it should be built. (A product manager who specifies how something should be built–what components, what algorithms, etc.–is either playing the job of software architect he is not qualified to play, or doesn’t trust his developers. Either case spells serious trouble for the team.)
  • Is as much a product of consensus building as it is top-down management. (Otherwise the product manager is assuming capabilities and limitations that may not actually be true, and is demonstrating distrust for the development team.)

But ultimately this is about building a consensus: a consensus as to what the customer wants and needs, with the Product Manager as the go-between, communicating with both the customer of the product and with the development team building the product. Sometimes the product manager needs to push back on the customer or convince the customer that there is an alternate, better solution; sometimes the Product Manager needs to accept that the developers cannot build his vision and needs to accept a modified vision. But this also means the Product Manager has to accept his role as a member of a team communicating ideas and facilitating consensus building, rather than believing, as many product managers I’ve known seem to believe, that without any training whatsoever in software development, architecture or design, they are better architects than their software architects, better developers than their software developers, and better visionaries than Steve Jobs.

There was only one Steve Jobs. And even he listened to his developers–after all, according to reports he opposed an Apple App Store.

Go build a consensus instead.

Musing on Advertising

So I’ve been giving some thought to advertising in general. After all, for the past three years I’ve been peripherally associated with advertising of one form or another, all of it that particular breed of advertising called local advertising. (Anything that tells you what’s going on around you is a type of advertising, so I count Geodelic as a form of advertising, though it was only handling the presentation part, not the monetization of the presentation part.)

And it’s interesting. Interesting in the sense that there is a real psychology to advertising particular to certain categories of advertisers — and that psychology is, in my opinion, not being captured in the “one size fits all” form of advertising being offered through the Internet. I contend that, in the Internet’s rush to better slice and dice the customers, to create better focused advertising for those customers based on previous purchase patterns and demographics through automated matching algorithms, the quality of the advertising offered on the Internet has become more generic and less focused on the advertiser.

In other words, in concentrating on the customer the Internet is forgetting the message.

Just sitting here at my desk, I can think of three different types of advertisers.

The first, such as restaurants, toothpaste manufacturers and providers of services we use every day, would be more interested in getting people to remember who they are, so that when the time comes (once a week at the grocery store, or every day as we walk out of the office building to get lunch) we think of them as we make our decision. Advertisers like that are probably less concerned with things like individual conversions based on click through rates through Google, because the profit for a conversion is practically non-existent. (Do you think an ice cream store will willingly pay $20 for a single conversion, that is, a single customer purchase based on a click through from Google, for a $5 large scoop, which nets perhaps $4 gross profit and perhaps 50 cents net?)

That type of advertiser wants repeat customers. They can’t justify a $20 conversion if that conversion only results in a single sale — they need an ongoing and constant repeat customer base.
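To put a number on it, here is a small sketch of the break-even arithmetic using the figures from the ice cream example. The class and method names are made up for illustration, and amounts are kept in cents to avoid floating-point rounding.

```java
final class ConversionBreakEven {
    // Purchases needed before net profit covers the conversion cost (amounts in cents).
    static long purchasesToBreakEven(long costPerConversionCents, long netProfitPerSaleCents) {
        // Integer ceiling division: round up, since there is no such thing as a partial purchase.
        return (costPerConversionCents + netProfitPerSaleCents - 1) / netProfitPerSaleCents;
    }

    public static void main(String[] args) {
        // $20 paid per conversion, about 50 cents net profit per scoop sold.
        System.out.println(purchasesToBreakEven(2000, 50)); // prints 40
    }
}
```

At those figures the customer has to come back for roughly forty scoops before the click pays for itself.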

A second type of advertiser is an advertiser who provides a service that you’re probably not going to constantly use or repeatedly call. Unless you’re a rich asshole, you’re probably not going to be a repeat customer to a divorce lawyer. Unless you’re a building contractor, you’re probably not going to need a plumber on call. Many services like this are services you don’t even think about needing until you need them: you probably aren’t going to be on a first-name basis with your auto mechanic, unless you either have an old car and he’s a friend, or you’re into cars and need someone constantly on call to help with your new finds.

These people are ideal targets for something like a Google SEM campaign or a Yellow Pages ad: you don’t know you need them until you need them — and when you do, you’ll explicitly search for one. Car broke down? Google “auto mechanic los angeles” and start calling.

A third type of advertiser is an advertiser who is selling an aspirational good. By “aspirational good” I mean a product which is a major purchase, but which you’re not going to immediately impulse buy. Refrigerators for a new house, a toilet for a remodeled bathroom, a car, a new computer, a new expensive watch: all these are goods you’ll probably research at length before deciding whether to buy. Factors such as “am I a Toyota driver or a Honda driver, or maybe I can squeeze into a BMW” (that is, does the good match who you see yourself being) come into play. You’ll probably ask friends or muse over the purchase. Me: I tend to seem impulsive when I buy something expensive, such as an iPad or a new computer, but trust me: I only seem impulsive because I don’t share my internal thought processes. The iPad purchase only seemed impulsive because I didn’t share the fact that I was reading every rumor, every scrap of information, and scouring the iPhone OS documentation and header files for every scrap of information for a year prior to making a purchase.

When I bought my current car, I mused over that purchase for two years. Remodeling the deck? About 18 months. And so forth.

There is a real psychology to aspirational goods, by the way. Entire magazines (such as GQ or Esquire for men, or Cosmopolitan for women) exist which are essentially curated ads for various products. (GQ, for example, runs entire articles on the best watch for men to wear.) These curated magazines essentially tell people what sort of people wear what sort of products — essentially helping to define the aspiration that may cause someone to spend $2,000 on a watch or $350 on shoes. It used to be that other types of products were also advertised as aspirational products — such as air travel, with their beautiful air stewardesses in short skirts, or (in an earlier age) train travel being enjoyed by the rich and famous of their era. All BMW advertising is aspirational advertising, for example: “The Ultimate Driving Machine” is well positioned as something a wealthy but secret sports car driver wants to drive.

And aspirational advertising also serves to remind the people who already purchased their product that the purchase was a smart one by reinforcing the aspirational stereotypes associated with the product. They help confirm the purchase, and make you feel good so that, perhaps 3, 5 or 10 years from now, when the lease is up or the old car is done, you buy a newer model of the same car.

Aspirational advertising is not restricted to just a $2,000 watch or a $60,000 car, however: cooking magazines apply the same logic (aspiring to be a better cook) towards $30 cooking pans and $5 egg timers. There is a reason why a store like Sur La Table will carry things like $12 mango slicers that almost no-one will buy: it reminds us that there is a gadget for aspirational cooks that perhaps costs only a couple of dollars but which handles the single dedicated task of slicing the pit out of a mango.

Aspirational advertising can apply to some services as well: high end restaurants can also use aspirational advertising to advertise their services. A restaurant which attempts to position itself as an expensive restaurant perhaps does not make much more money per seat/hour (dinner may cost more, but you spend more time eating it, taking up the seat during that time), but it’s simply a positioning strategy: fewer but higher-end customers. The profits are the same, but the message becomes aspirational rather than just utilitarian: you eat there because it’s a fine restaurant, not just because you’re hungry.

I don’t think Internet advertising captures any of this.

Personally I blame Google. I honestly do: Google made the ability to find something the thing that is “top of mind” for most advertising executives on the Internet. We don’t care what the message is that we send; we only care that we shove it in front of your eyeballs. The landing page that we construct for you is exactly the same if you’re a restaurant (who is struggling with your positioning: are you an aspirational restaurant or a quick eatery restaurant?), or a plumber or lawyer or a jewelry store.

If you’re a sophisticated advertiser you probably already have your own web site which reflects what you want to advertise — but, um, if you’re a sophisticated advertiser you probably don’t need help advertising.

But I blame Google because Google has a habit of reducing everything to statistics and numbers. Their designs are all data driven: is the coupon on the landing page large enough to capture the eyes and sufficiently related to the keywords to represent a sufficient “call to action”? Is the phone number easy to find, and routed through a call tracking service so we can capture the efficiency of the call combination? Do we have a properly located blurb, and is the uploaded video presentation well positioned for playback? Is it the right shade of blue? Let’s sample 8,000 shades to statistically determine the right shade of blue.

Google’s UIs are crap. Landing pages built based on a statistical model attempting to represent a singular call to action associated with the keywords that maximizes the ROI by maximizing the conversion rate tend to be ugly little critters that do no-one any service, except perhaps for someone who needs a plumber in a hurry because there is crap backflowing onto their pretty rugs, and they will overlook a bad color choice or a blink tag on a coupon presentation.

Of course to some extent Google search works for helping customers find the things they want. For needs-based services — for finding a plumber in a hurry — the current Internet model of impressions converting to clicks converting to conversions via calls to action works: if I need to unclog a toilet now and the guy is nearby, the model works very well, and who gives a damn that the shade of blue in the border was #0000CE instead of #020FE0.

And Google search, so long as it isn’t completely trashed by SEO campaigners who are looking to spam the Internet with useless “reviews” in order to backload the search results and skew the information we’re searching for, works for aspirational purchases: it helps us define what it is we’re buying and narrow down our product choices by understanding the range of features provided on that next car, refrigerator, or laptop computer.

As a mechanism for aspirational advertising, though, the current Internet model sucks hard. At best it’s informational, not aspirational: where is the store located? What are their hours? No sense of “is this a place for a good lunch” or “will I be catered to and pampered as I drop $10k on jewelry” or “do they have affordable shoes.” A map, some text, a description about how they’ve been around since 1960 and a blinking coupon promising me 5% off on my next purchase if I mention some magic code does not work to define aspiration.

Google is not an aspirational company. Sure, for a software developer, developers aspire to work there — but that’s because Google, ironically enough, has done a fantastic job understanding the mind of developers and presenting themselves as the ideal place to work. But the developer-engineer model of looking at the world, while it works for developers and engineers, doesn’t work for everyone else. And Google is only in first place because we think “I’ll google this” instead of “I’ll search for this on the Internet” — and because, up until now, the Google search bar is the default search bar in the toolbar of most browsers.

I think if the Internet is going to improve its ROI for Internet advertisers, the model has to change.

Specifically we have to start understanding the advertiser message more, rather than focus on the computational processes surrounding matching consumers to advertisers — a process which is intrinsically flawed to begin with. (Even Amazon, who is supposed to be king of this process, screws up royally when it offers to sell me a second safe after buying a first one: how many blasted safes do I need in my house? Can’t someone at Amazon just add a single bit to their taxonomy to indicate categories of goods which have a low probability of repeat purchases?)
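The “single bit” could look something like the sketch below. This is purely illustrative: `Category`, `lowRepeatPurchase` and `recommend` are hypothetical names, and nothing here reflects how any real retailer’s taxonomy or recommender works.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class RepeatPurchaseFilter {
    // A taxonomy entry carrying the extra bit: are repeat purchases unlikely?
    static final class Category {
        final String name;
        final boolean lowRepeatPurchase;

        Category(String name, boolean lowRepeatPurchase) {
            this.name = name;
            this.lowRepeatPurchase = lowRepeatPurchase;
        }
    }

    // Keep a candidate unless it is a low-repeat category the customer already bought from.
    static List<String> recommend(List<Category> candidates, Set<String> alreadyBought) {
        List<String> result = new ArrayList<>();
        for (Category c : candidates) {
            boolean redundant = c.lowRepeatPurchase && alreadyBought.contains(c.name);
            if (!redundant) {
                result.add(c.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Category> candidates = List.of(
                new Category("safes", true),     // one safe is usually enough
                new Category("coffee", false));  // bought again and again
        System.out.println(recommend(candidates, Set.of("safes", "coffee"))); // prints [coffee]
    }
}
```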

Ironically by focusing on the advertisers more and on the type of advertiser (volume business, aspirational business, or service provider), we could conceivably also better target the customers they’re trying to reach.

I think part of the problem with CS educations today is the overuse of design patterns.

Okay, I know that probably irked a few people. But hear me out.

A “design pattern” is really just a common solution to a problem, right? I mean, if you have a problem like “how should I handle percolating events up a hierarchy of potential listeners”, then Chain of responsibility is not a bad way to go.
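For readers who have not met the pattern, here is a minimal sketch of that problem and solution. The `Listener` class, its fields and the event strings are all hypothetical, chosen only to show an event percolating up a parent chain until someone handles it.

```java
import java.util.function.Predicate;

final class EventChainSketch {
    // A node in a hierarchy of potential listeners.
    static final class Listener {
        final String name;
        final Listener parent;              // null at the top of the hierarchy
        final Predicate<String> canHandle;  // decides whether this node consumes the event

        Listener(String name, Listener parent, Predicate<String> canHandle) {
            this.name = name;
            this.parent = parent;
            this.canHandle = canHandle;
        }

        // Try this node first, then each ancestor in turn.
        // Returns the name of the listener that handled the event, or null if nobody did.
        String dispatch(String event) {
            for (Listener node = this; node != null; node = node.parent) {
                if (node.canHandle.test(event)) {
                    return node.name;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Listener window = new Listener("window", null, e -> e.equals("close"));
        Listener panel = new Listener("panel", window, e -> e.equals("scroll"));
        Listener button = new Listener("button", panel, e -> e.equals("click"));

        System.out.println(button.dispatch("click"));  // handled by the button itself
        System.out.println(button.dispatch("close"));  // percolates up to the window
        System.out.println(button.dispatch("paste"));  // nobody handles it
    }
}
```

The point is that the pattern follows from the problem: the sender does not need to know which listener, if any, will end up handling the event.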

And I think it’s reasonable to study different design patterns as solutions to different commonly encountered problems.

But when people start seeking a design pattern without understanding what the problem is, it’s putting the cart before the horse. It leads to what I’ve been calling “voodoo programming”, where you shake a stick hoping that the magic stick will magically resolve the problem. That sort of magic thinking inevitably leads to the lavaflow anti-pattern, where layers of misunderstood code are piled on top of older, misunderstood code–and ultimately leads to a bloated, slow system which is hard to maintain.

Couple that with the other things that programmers do which are counterproductive, and the real question is not “why is it that only 32% of projects are completed on time and in budget, while 24% of all IT projects are canceled,” but “how the hell is that 32% number so high?”

Makes me wonder if, at the end of the day, a number of projects in the “completed on-time and in budget” bucket factored in the additional costs of “voodoo programming”, and the actual failure rate (that is, where the time to deliver is longer than optimal) is closer to 90%.

Things I think about when starting a new project that, surprisingly, many people appear not to.

I switched projects at work and now I’m going through the struggle of trying to figure out how to build other people’s projects. It’s always a struggle, of course: a lack of familiarity always makes things harder than they should be.

But there are some common things that I realized that don’t seem to be as universal as they should be.

So here are some things I think about when I start work on a brand new project, that I wish other people would also prioritize.

How will you debug the project?

Seems weird to put this at the top of the list. But there are quite a few web applications being built where, in the list of things people worry about, “debuggability” isn’t even in the top 5. Oh, sure; they think about configurability (how will I configure the tool dynamically?), and logging (how will I get information about problems off a server that is locked down from me?), and management (how will I remotely manage something that has elements of it locked down from me?). But for whatever reason, the simple ability to reach up in the toolbar, and press the ‘debug’ button and have their product (in one automated process) build, locally deploy, launch, and stop on a locally set breakpoint doesn’t even appear in the top 10 of things people seem to worry about.

All too often debugging a web application seems to be “oh, sure; just kick off Maven with some magic combination of things, then copy this file there, and run that script–oh, and connect your debugger to the running process.” Some of those steps, of course, are more lore than they are documented processes. And it’s a terrible substitute for just checking out the project, and hitting the “debug” button.

I’m a huge fan of spending a few days and figuring out which integrated tools you need to allow for one-button debugging within Eclipse, my personal favorite IDE. Remote debugging should only be used as a last resort, in order to attach to a running remote process, in order to diagnose a problem with a deployed application. But ideally all debugging should be done by just hitting the little bug icon.

In fact, I’d suggest this is the first thing that should be done when setting up a project, before figuring out how you’re going to deploy the project, before figuring out how you’re going to package the project, or before you figure out which libraries you intend to use. And this should be the first thing on the top of your mind as you incorporate new technologies.

Because, in my opinion, if you check out a project onto a brand new computer, and it doesn’t just get sucked into the IDE, ready to be debugged in one step by pressing the debug button, your build is broken.

How will you bring new team members up to speed or replace them?

In part this is a technical question, and it has to do with documenting your processes. How does a new team member check out the sources from the source repository? Which tools does he need to install in his IDE in order to run your project? Where can he get the latest documentation?

If you’ve made sure that, once the IDE is installed, checking out the source from the source repository simply “works” (see the step above), then you’ve saved yourself a lot of documentation time.

But any document is also a product that must be debugged and maintained. So who will maintain that document? Who will debug the document (run through the steps to make sure they’re clear and correct)?

And if you’re in a shop which supports multiple IDEs, then you have an extra step you must perform: first, figure out how you will allow multiple IDEs to co-exist and operate on the same source base. And second, maintain the projects across different IDEs. Now that could be as simple as having a culture of constant communications when new files are incorporated into a project. But it does mean the project files will need to be maintained as well as the sources–and it does mean the documentation will also need to be maintained as the project evolves.

How will you distribute the resulting product?

For whatever reason, in the three web shops I’ve worked for, web distribution was always treated a bit cavalierly: since we maintain the servers, naturally we don’t need to give as much thought towards distributing the resulting product as we do if we’re distributing a CD-based product or an App Store distributed product.

And while this is true on the surface, every web shop I worked for also has an operations team and a support team who are expecting a “run book” or other documentation on how to install the product and how to maintain the product.

In other words, they’re asking for a software user’s manual.

Just like the ones we used to print for a CD distributed product.

So you can’t escape the following questions, even if you work for a web shop.

How will you distribute your product? What installer will you use to install the product? What are the distribution products, and who will maintain those distribution products? What about the documentation; who will maintain it? Who will test the installation process? Who will make sure that on first boot it’s obvious what to do?

And how will you alert the user when things go wrong, and how will you tell the user what to do when things go wrong–and when to call you for support, and when to handle things themselves?

How will you get diagnostic information back when things go wrong? And what form will that diagnostic information take? Is it sufficient to help you pinpoint the failure?

It’s surprising to me how little thought seems to be given to these basic questions. What’s even more interesting to me is the number of developers who treat these questions with disdain: I know a few who are proud of the fact that their “IDE” is vi and their debug process is printf(). Okay, if that is what chimes your bells–but most of us are more interested in tightening the edit/compile/debug cycle and in making the project more manageable for real human beings.

*sigh*

How long before Facebook is back? I use Facebook for more personal posts and politically oriented stuff, and I use this blog for more technical-oriented topics. So clearly, Facebook is now down for my account, especially when I have a long winded politically oriented post I want to write. (*sigh*)

Little killers: channel abuse.

Channel abuse.

When you use a magic value in a particular field (such as a text field) in order to signify something else. For example, if you use a magic text pattern (like ‘XXX’) to signify the last record.

This can kill you because (a) what if the user types in “XXX” into that field? And (b) if you’re storing the record as an explicit marker, what happens if your magic record goes away?

Better off either creating some sort of conversion protocol to allow out-of-band information to reside within the text (like the & escape character in HTML), or creating a separate control channel to contain control data. And if you’re using the symbol as an end of array marker, you’re much better off using the built-in OS routines (end of file marker, end of database marker, etc) to determine you’re at the end of the list.
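Both alternatives can be sketched in a few lines. This is a hedged illustration rather than a recommended format: the `Record` class, the backslash escape convention and the method names are assumptions made up for the example.

```java
final class ChannelSketch {
    // Option 1: a separate control channel. The "last record" flag lives in its own
    // field, so any text, including "XXX", is valid data.
    static final class Record {
        final String text;
        final boolean last;

        Record(String text, boolean last) {
            this.text = text;
            this.last = last;
        }
    }

    // Option 2: an escape convention. A bare "XXX" is the control marker; user text
    // is escaped before storage so it can never be mistaken for the marker.
    static final String END_MARKER = "XXX";

    static String escape(String userText) {
        // Escape the escape character first, then every X.
        return userText.replace("\\", "\\\\").replace("X", "\\X");
    }

    static String unescape(String stored) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < stored.length(); i++) {
            char c = stored.charAt(i);
            if (c == '\\' && i + 1 < stored.length()) {
                c = stored.charAt(++i); // take the escaped character literally
            }
            out.append(c);
        }
        return out.toString();
    }

    static boolean isEndMarker(String stored) {
        return stored.equals(END_MARKER);
    }

    public static void main(String[] args) {
        String stored = escape("XXX");               // what the user typed
        System.out.println(isEndMarker(stored));     // false: user data is not the marker
        System.out.println(unescape(stored));        // XXX: the round trip is lossless
        System.out.println(isEndMarker(END_MARKER)); // true: only the real marker matches
    }
}
```

Of the two, the separate field is usually the simpler choice when you control the record format; escaping earns its keep when everything has to travel through a single text channel.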

A surprising observation, or finding your own voice.

One thing that surprised me was the percentage of people who, on reading my code from my last post, took umbrage at the following line of code:

if (null != (ret = cache.get(n))) return ret;

(Okay, in all fairness, only two people out of 20 or so commenters made a comment–about one in 10, but on a trivially small data set.)

It fascinates me.

First, a disclaimer. I don’t mean to use my comments to disparage anyone. I don’t know the people who made their comments, and I’m sure if they’re here they’re clearly very intelligent people of the highest caliber whose code quality is undoubtedly impeccable. If it wasn’t, they wouldn’t be here, right?

My comments go to current development trends, however, which motivate people to be far more interested in form over function.

There is nothing new here, by the way, as I go into in my comments below. But I’m fascinated by the devotion to form over function that is being taught to our developers, which sometimes makes people blind to the reasons why the form exists.

I started tinkering with computers in 1977 when I played with the BASIC interpreter on a TRS-80 in the 8th grade. I don’t observe this to suggest that my experience trumps other people’s experience or to present myself as some sort of code guru or expert that people should not argue with. I only note this to note that I’ve been around a while and have seen different trends ebb and flow over the past 34 years of hacking, and over the past 23 years or so of doing this professionally since getting my B.S. in Math from Caltech.

It’s context, nothing more.

But in that time, I’ve noticed a few shifting trends: things that at one time were considered “best practice” are now considered poor practice, or vice versa.

Take, for example, the statement above that started this whole article. One suggestion I saw was to rewrite the code:

MyClass ret = cache.get(n);
if (null != ret) return ret;

We could even go so far as to rewrite this statement with the allocator statement reserving the variable on a separate line:

MyClass ret;
ret = cache.get(n);
if (null != ret) return ret;

When I started writing software, we edited our code on an 80 x 24 character display. This means you could only see 24 lines of code at any one time. Back then, the two rewrites above would have consumed two or three of those 24 lines of code, and so would be considered inferior to the one line statement:

if (null != (ret = cache.get(n))) return ret;

Back then, the limit on the number of characters in a line also favored shorter variable names. Setting aside, of course, that earlier C compilers could only match variable names of 6 characters or less (so that, for example, ‘myVariable’ would match ‘myVari’ or ‘myVariFoo’), which was imposed partially for memory reasons, but partially because of a lack of need–variable names were kept short because:

if (null != (foundFactorialReturnValue = factorialReturnStorage.get(n))) return foundFactorialReturnValue;

This could get pretty unwieldy.

It gets worse when discussing formulas, such as the distance between two points:

double dx = p1.x - p2.x;
double dy = p1.y - p2.y;
double dist = Math.sqrt(dx * dx + dy * dy);

is easier to follow than:

double deltaXCoordinate = point1.xCoordinate - point2.xCoordinate;
double deltaYCoordinate = point1.yCoordinate - point2.yCoordinate;
double distanceBetweenPoints = Math.sqrt(deltaXCoordinate * deltaXCoordinate + deltaYCoordinate * deltaYCoordinate);

Of course programming styles change over the years. We’re no longer constrained by the 80×24 character limits of a green ADM-2A terminal. My computer display at home is 30 inches in diagonal, and capable of displaying several hundred lines of code with documentation in a separate window. Even the smallest MacBook Air has a pixel resolution of 1366 x 768 pixels; at 12 pixels per line, this means you can easily display 55 lines of code with room left over for the window title and decorators and the menu bar.

And of course in the desire to cram more and more code into an 80×24 character display, C programmers took some “liberties” that took the whole drive towards putting as much information within a single line of code waaaay too far, writing such abominations as:

for (ct=0,p=list;p;++ct,p=p->next) ;

which counts the number of items in a linked list. (The count is in ct, the list in list.)
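For comparison, here is roughly what that loop says when each step gets its own line. This is only a sketch in Java rather than C; the Node class and the ListCount name are made up for illustration, and the variable names ct, p and list are kept from the one-liner.

```java
class ListCount {
    // Hypothetical singly linked node, standing in for the C struct.
    static final class Node {
        final Node next;
        Node(Node next) { this.next = next; }
    }

    // Same traversal as the one-line for loop, one step per line.
    static int count(Node list) {
        int ct = 0;          // ct=0
        Node p = list;       // p=list
        while (p != null) {  // loop condition: p
            ++ct;            // ++ct
            p = p.next;      // p=p->next
        }
        return ct;
    }

    public static void main(String[] args) {
        Node list = new Node(new Node(new Node(null)));
        System.out.println(count(list)); // prints 3
    }
}
```

Seven lines instead of one: easier on a newcomer, but a real cost on a 24-line screen.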

(In fact, this drive for “clarity” through compactness was one of the inspirations that led to the creation of the International Obfuscated C Code Contest.)

Today, I believe the pendulum has swung too far in the other direction. We’re so hell bent on the proper form (out of concern that, by putting a compound statement in a single line of code it will make it ‘harder’ to understand) that we even have tools (such as Checkstyle) which will enforce syntactic styles–throwing an error during the build process if someone writes an early return or a compound statement.

And while I’m not arguing anarchy, I do believe going so far as to break the build because someone dared to write a compound statement with an early return rather than writing:

MyClass ret;
ret = cache.get(n);
if (null == ret) {
    // some additional logic with a single exit point 
    // at the end of the if statement, using no returns, 
    // breaks or continues or (God help us!) gotos
}
return ret;

is going too far. (Setting ‘ReturnCount’ = 1 in Checkstyle.)

Imagine a world with only declarative sentences. There are no conjunctions. All sentences follow a proper format. All sentences start with a noun. The noun is followed by a proper verb phrase. The verb phrase is followed by a well structured object. The object of the sentence is a proper noun phrase.

Imagine this world with well written sentences. Sentences that follow the format taught in the third grade.

“I want you to be an uncle to me,” said Mr. George Wright. He leaned forward towards the old sailor.

“Yes,” said Mr. Kemp. Mr. Kemp was mystified. Mr. Kemp paused with a mug of beer midway to his lips.

Or:

The question is ‘to be or not to be.’
Is it nobler to suffer outrageous fortune?
Or should we take arms against a sea of rising troubles, and end them?
To die? To sleep no more?

Sorry, but I don’t think Shakespeare or Hemingway are helped by our rules.

Ultimately writing code has two goals.

The first is to accomplish a task, to create a software package that can be deployed which accomplishes the specified task with as few bugs (or no bugs) as possible.

The second is to produce maintainable code: code that, years from now, you can figure out. And code that, more likely than not, will be handed off to a maintenance developer–possibly overseas in India or China–who will be asked to understand and maintain the software that you wrote.

Now both tasks can be helped by writing simple code: that was yesterday’s post.

But code legibility can also be helped by thinking about the code you write.

To me, writing code is like writing an essay. Like an essay, which is full of sentences and paragraphs and sections, code is full of statements, and code groupings (like paragraphs) and classes or modules.

And like sentences, which have rules that we were all taught in the third grade (but then we later ignore as we learn the rules of the road and find our own voice), code too has rules of legibility that we should be able to break judiciously as we gain experience.

Each statement of code is, in a way, a sentence: it has a noun (the object being operated on), a verb (the operator or function call), subjects (parameters), and so forth. While we’re taught that a sentence must have a subject, a verb and an object, we learn later on that perhaps to express ourselves we can bend the rules.

So, for example:

if (null != (ret = cache.get(n))) return ret;

This may be a perfectly acceptable statement under the right circumstances: the idea is clearly expressed (get the cached value and return it if it was found), the logic is easy to follow.

And by putting it on one line, our focus is carried away from the logic of checking the cache and onto the multiple lines that calculate the cached value. We can concentrate, in other words, on the meat of the algorithm (the computational code), and the cache check that bypasses it becomes a functional footnote.
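A minimal sketch of that layout, assuming a memoized factorial. This is not the code from the earlier post: the CachedFactorial class, the BigInteger return type and the negative-argument guard are all assumptions made for illustration.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

class CachedFactorial {
    private final Map<Integer, BigInteger> cache = new HashMap<>();

    BigInteger factorial(int n) {
        // Assumed guard: without it a negative n never reaches the base case.
        if (n < 0) throw new IllegalArgumentException("n must be >= 0");

        BigInteger ret;
        if (null != (ret = cache.get(n))) return ret;   // the footnote

        // The meat of the routine: compute the value, then remember it.
        ret = (n <= 1)
                ? BigInteger.ONE
                : BigInteger.valueOf(n).multiply(factorial(n - 1));
        cache.put(n, ret);
        return ret;
    }
}
```

The cache check takes one line and then gets out of the way; everything below it is the computation a reader actually came to see.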

Of course there are places where this can be the wrong thing to write as well: if the emphasis in the routine, for example, is on checking the cache–well, then perhaps this deserves a multi-line statement.

Or perhaps, when we find a value in our cache, we trip off some other logic; then the logic deserves a line of its own. Perhaps the checking is sufficiently important that it needs to be called out separately, like:

ret = cache.get(n);
if (null != ret) {
	return doThing(ret);
}

It’s all a matter of communicating, with your own voice, what is and is not important, so future generations of code maintainers can understand what is and is not important, what goes together and what is separate.

Ultimately it’s about striving for a balance: creating working code that can be understood, by using idioms of programming which convey the subtext of the meaning of that code.

Sure, when you’re inexperienced and you haven’t found your voice, it’s appropriate to follow a strict “noun/verb/object” structure. It’s appropriate, in other words, to use simple declarative statements while you gain experience writing code, and to observe other common “best practices” such as using descriptive variable names.

But at some point you need to find your own voice. When you do, it’s also appropriate to break the rules.

And if you concentrate too much on the rules rather than on what’s being said, then perhaps you’ll also make the mistake both commenters did when commenting on my code style, when they failed to note the code itself was actually broken, with a dangerous infinite loop triggered by certain parameter values.

Complexity–or how hard is it to display a list of 3,000 items in a table on MacOS X anyway?

The laptop on my desk can do around 2,403 MWIPS, around 100 times faster than the Cray X-MP/24 of yesteryear, and has easily more than 100 times the memory. (For reference, NASA clocked the Cray X-MP/24 at 25 MWIPS, and the system came with either 16 or 32 megabytes of main memory, in 1 megabyte banks.)

So how friggin’ hard is it for Outlook Express to display the summary lines of 3,000 e-mails on my MacBook Pro?


Here’s the thing. I suspect (though I don’t know because I don’t work for Microsoft) that the system is not sucking the headers into memory from an e-mail file, storing those headers in a linked list and then, as the user scrolls up or down in the scroll bar, running a pointer to the first visible row of the linked list and simply doing a “drawRow(…); ptr = ptr->next;” operation. Instead, the system is using something like SQLite for storage and doing a database query to the SQL engine, like “SELECT … ORDER BY … LIMIT (n-rows) OFFSET (scroll-amount)”.

Then, behind the scenes, the SQL statement is compiled into byte code for a SQL interpreter engine, and the engine interprets that byte code, with each instruction pulling in elements of the table (stored in a way which is optimized for reducing disk space while arbitrary records are added and removed) into an on-board cache, which sometimes is missed because–well, we’re only interested in a dozen or so rows displayed in the summary table, right?
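For contrast, the in-memory approach described above can be sketched in a few lines. This is an illustration under assumptions, not how any real mail client works: Header, HeaderList and visibleRows are hypothetical names, and building a row string stands in for a real drawRow call.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class HeaderList {
    // Hypothetical summary line for one e-mail.
    static final class Header {
        final String from;
        final String subject;
        Header(String from, String subject) {
            this.from = from;
            this.subject = subject;
        }
    }

    // All headers parsed once and kept in memory.
    private final List<Header> headers = new LinkedList<>();

    void add(Header h) { headers.add(h); }

    // Returns the rows visible at the given scroll position.
    List<String> visibleRows(int firstRow, int rowCount) {
        List<String> rows = new ArrayList<>();
        if (firstRow < 0 || firstRow >= headers.size()) return rows;

        // Walk to the first visible row, then step through the list.
        ListIterator<Header> it = headers.listIterator(firstRow);
        while (it.hasNext() && rows.size() < rowCount) {
            Header h = it.next();                  // ptr = ptr->next
            rows.add(h.from + " | " + h.subject);  // stands in for drawRow(...)
        }
        return rows;
    }
}
```

Walking to firstRow is a linear scan, which is exactly the O(n) cost under discussion; for a few thousand headers it is a few thousand pointer hops per redraw.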

The end result is that a computer 100 times faster than the fastest computer in the world from 30 years ago (which, honestly, was pretty blindingly fast) and which has more than enough memory to store four encoded movies in RAM, hiccups and stutters when I scroll through my e-mails.

Really?


I’d like to call this lazy, but I really can’t: wiring up the UI of Outlook to a back-end SQL engine, then maintaining the link between those systems in a way that allows you to page through a few thousand e-mails is not a trivial task. Never mind that SQL was designed in part to reduce its memory footprint for large datasets that cannot be stored in memory, and we’re using it on an e-mail database which could be stored in 1/100th of the available RAM on my system.

Instead, I think it’s because software development today tends to be more about marrying off-the-shelf components without thinking about what those off-the-shelf components actually do–or even if (as in the case of tracking 3,000 e-mails on a system that would have been considered 20 years ago a super-computer whose computational power made it a Federal crime to export overseas) the off-the-shelf components are even warranted.

Now if my e-mail box had, say, 1 million e-mail messages, and a total disk footprint of, say, 2 gigabytes–then I could see using a complicated relational database engine to manage those e-mails. But instead, in an effort to reduce the magnitude of the exponent in an O(n**e) operation, we forget that there is a constant C which can often dominate.

Which is why an e-mail system which simply parses all 3,000 e-mails into a linked list (and which would handle operations in O(n) time) would be far faster than a complicated system using a relational database that perhaps may run at a theoretical O(n**0.5) time, but whose constant C for each operation is measured in milliseconds versus nanoseconds for the former.
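To put rough numbers on that, here is a back-of-the-envelope cost model. The per-operation costs are assumptions picked only to match “nanoseconds” and “milliseconds” (10 ns per list step, 1 ms per database operation); they are not measurements of any real system, and the CostModel name is made up.

```java
class CostModel {
    // Assumed per-operation costs in nanoseconds; illustrative, not measured.
    static final double LIST_STEP_NS = 10.0;       // one pointer hop in memory
    static final double DB_STEP_NS = 1_000_000.0;  // one 1 ms database operation

    // O(n) cost of walking the linked list.
    static double linearCostNs(long n) {
        return LIST_STEP_NS * n;
    }

    // O(n**0.5) cost of the hypothetical database path.
    static double sqrtCostNs(long n) {
        return DB_STEP_NS * Math.sqrt(n);
    }

    // Where the two curves cross: c1*n = c2*sqrt(n), so n = (c2/c1)^2.
    static double breakEvenN() {
        double ratio = DB_STEP_NS / LIST_STEP_NS;
        return ratio * ratio;
    }

    public static void main(String[] args) {
        long n = 3000;
        System.out.printf("linked list: %.0f ns%n", linearCostNs(n));
        System.out.printf("database:    %.0f ns%n", sqrtCostNs(n));
        System.out.printf("break-even:  %.0f items%n", breakEvenN());
    }
}
```

Under those assumed constants, 3,000 e-mails cost about 30 microseconds on the list against roughly 55 milliseconds through the database, and the better exponent only wins past about ten billion items. The model ignores memory, sorting and searching, so treat it as a picture of the constant C and nothing more.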


I wish we would as an industry spend less time thinking about the tools and toys which we use to solve a problem, and more time thinking about the domain the problem resides in–and selecting the appropriate tools given the domain, rather than simply asserting the latest and greatest gee-whiz gadget is the right answer regardless of the question.

I’ve seen tons of examples of this sort of thing. Adobe Air used to build a user interface because, well, it’s pretty–without thinking for a second if it can even be deployed on the target devices used by people in the field. Javascript/AJAX web sites used for people who travel and whose devices do not have a consistent or reliable Internet connection. Sending and receiving full queries across a slow EDGE network connection when on-board caching would reduce the queries to status updates (or even allow the system to run without any queries whatsoever).

We don’t think of the domain of the problem or figure out the priorities of the components of that problem or pick the appropriate tools and technologies.

And so I’m stuck with a computer that is a marvel of advanced technology, a super-computer in a briefcase capable of running a simulation that can model the solution to Schrödinger’s Equation fast enough to allow a physicist to fiddle with the inputs and see the outputs in real time, but which runs like a fucking dog when scrolling through my morning’s e-mails.
