A surprising observation, or finding your own voice.

One thing that surprised me was the percentage of people who, on reading my code from my last post, took umbrage at the following line of code:

if (null != (ret = cache.get(n))) return ret;

(Okay, in all fairness, only two people out of 20 or so commenters objected–about one in 10, but on a trivially small data set.)

It fascinates me.

First, a disclaimer. I don’t mean to use my comments to disparage anyone. I don’t know the people who made their comments, and I’m sure if they’re here they’re clearly very intelligent people of the highest caliber whose code quality is undoubtedly impeccable. If it wasn’t, they wouldn’t be here, right?

My comments go to current development trends, however, which motivate people to be far more interested in form than in function.

There is nothing new here, by the way, as I go into in my comments below. But I’m fascinated by the devotion to form over function that is being taught to our developers, which sometimes makes people blind to the reasons why the form exists.

I started tinkering with computers in 1977, when I played with the BASIC interpreter on a TRS-80 in the 8th grade. I don’t observe this to suggest that my experience trumps other people’s, or to present myself as some sort of code guru or expert that people should not argue with. I note it only to say that I’ve been around a while and have seen different trends ebb and flow over the past 34 years of hacking, and over the past 23 years or so of doing this professionally since getting my B.S. in Math from Caltech.

It’s context, nothing more.

But in that time, I’ve noticed a few shifting trends: things that at one time were considered “best practice” are now considered poor practice, or vice versa.

Take, for example, the statement above that started this whole article. One suggestion I saw was to rewrite the code:

MyClass ret = cache.get(n);
if (null != ret) return ret;

We could even go so far as to rewrite this statement with the declaration of the variable on a separate line:

MyClass ret;
ret = cache.get(n);
if (null != ret) return ret;

When I started writing software, we edited our code on an 80 x 24 character display, which meant you could only see 24 lines of code at any one time. Back then, the two rewrites above would have consumed two or three of those 24 lines, and so would have been considered inferior to the one-line statement:

if (null != (ret = cache.get(n))) return ret;

Back then, the limit on the number of characters in a line also favored shorter variable names. Setting aside, of course, that early C compilers only treated the first 6 characters of a variable name as significant (so that, for example, ‘myVariable’ and ‘myVariFoo’ would be treated as the same name)–a limit imposed partially for memory reasons, but partially because of a lack of need–variable names were kept short because:

if (null != (foundFactorialReturnValue = factorialReturnStorage.get(n))) return foundFactorialReturnValue;

This could get pretty unwieldy.

It gets worse when discussing formulas, such as the distance between two points:

double dx = p1.x - p2.x;
double dy = p1.y - p2.y;
double dist = Math.sqrt(dx * dx + dy * dy);

is easier to follow than:

double deltaXCoordinate = point1.xCoordinate - point2.xCoordinate;
double deltaYCoordinate = point1.yCoordinate - point2.yCoordinate;
double distanceBetweenPoints = Math.sqrt(deltaXCoordinate * deltaXCoordinate + deltaYCoordinate * deltaYCoordinate);

Of course programming styles change over the years. We’re no longer constrained by the 80×24 character limits of a green ADM-2A terminal. My computer display at home is 30 inches in diagonal, and capable of displaying several hundred lines of code with documentation in a separate window. Even the smallest MacBook Air has a resolution of 1366 x 768 pixels; at 12 pixels per line, that means you can easily display 55 lines of code with room left over for the window title, decorations, and the menu bar.

And of course, in the desire to cram more and more code into an 80×24 character display, C programmers took some “liberties” that took the whole drive toward putting as much information as possible into a single line of code waaaay too far, writing such abominations as:

for (ct=0,p=list;p;++ct,p=p->next) ;

which counts the number of items in a linked list. (The count is in ct, the list in list.)

(In fact, this drive for “clarity” through compactness was one of the inspirations that led to the creation of the International Obfuscated C Code Contest.)

Today, I believe the pendulum has swung too far in the other direction. We’re so hell-bent on proper form (out of concern that putting a compound statement on a single line of code will make it ‘harder’ to understand) that we even have tools (such as Checkstyle) which enforce syntactic styles–throwing an error during the build process if someone writes an early return or a compound statement.

And while I’m not arguing anarchy, I do believe going so far as to break the build because someone dared to write a compound statement with an early return rather than writing:

MyClass ret;
ret = cache.get(n);
if (null == ret) {
    // some additional logic with a single exit point 
    // at the end of the if statement, using no returns, 
    // breaks or continues or (God help us!) gotos
}
return ret;

is going too far. (Setting ‘ReturnCount’ = 1 in Checkstyle.)

Imagine a world with only declarative sentences. There are no conjunctions. All sentences follow a proper format. All sentences start with a noun. The noun is followed by a proper verb phrase. The verb phrase is followed by a well structured object. The object of the sentence is a proper noun phrase.

Imagine this world with well written sentences. Sentences that follow the format taught in the third grade.

“I want you to be an uncle to me,” said Mr. George Wright. He leaned forward towards the old sailor.

“Yes,” said Mr. Kemp. Mr. Kemp was mystified. Mr. Kemp paused with a mug of beer midway to his lips.

Or:

The question is ‘to be or not to be.’

Is it nobler to suffer outrageous fortune?

Or should we take arms against a sea of rising troubles, and end them?

To die? To sleep no more?

Sorry, but I don’t think Shakespeare or Hemingway are helped by our rules.

Ultimately writing code has two goals.

The first is to accomplish a task, to create a software package that can be deployed which accomplishes the specified task with as few bugs (or no bugs) as possible.

The second is to produce maintainable code: code that, years from now, you can figure out. And code that, more likely than not, will be handed off to a maintenance developer–possibly overseas in India or China–who will be asked to understand and maintain the software that you wrote.

Now both tasks can be helped by writing simple code: that was yesterday’s post.

But code legibility can also be helped by thinking about the code you write.

To me, writing code is like writing an essay. Like an essay, which is full of sentences and paragraphs and sections, code is full of statements, and code groupings (like paragraphs) and classes or modules.

And like sentences, which have rules that we were all taught in the third grade (but which we later ignore as we learn the rules of the road and find our own voice), code too has rules of legibility that we should be able to break judiciously as we gain experience.

Each statement of code is, in a way, a sentence: it has a noun (the object being operated on), a verb (the operator or function call), subjects (parameters), and so forth. While we’re taught that a sentence must have a subject, a verb and an object, we learn later on that perhaps to express ourselves we can bend the rules.

So, for example:

if (null != (ret = cache.get(n))) return ret;

This may be a perfectly acceptable statement under the right circumstances: the idea is clearly expressed (get the cached value and return it if it was found), and the logic is easy to follow.

And by putting it on one line, our focus is carried away from the logic of checking the cache and toward the multiple lines that calculate the cached value. We can concentrate, in other words, on the meat of the algorithm (the computational code), and the code which bypasses the check becomes a functional footnote.
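To make that concrete, here is a minimal sketch–a memoized factorial standing in for the routine from my last post–with the cache check as a one-line footnote at the top and the computation below it:

import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

public class Factorial
{
    // The cache, keyed by n.
    private final Map<Integer, BigInteger> cache = new HashMap<Integer, BigInteger>();

    public BigInteger factorial(int n)
    {
        if (n < 0) throw new IllegalArgumentException("n must be non-negative");

        BigInteger ret;

        // The "functional footnote": get the cached value and return it if found.
        if (null != (ret = cache.get(n))) return ret;

        // The meat of the algorithm: the multi-line computation we actually care about.
        ret = BigInteger.ONE;
        for (int i = 2; i <= n; ++i) {
            ret = ret.multiply(BigInteger.valueOf(i));
        }

        cache.put(n, ret);
        return ret;
    }
}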

Of course there are places where this can be the wrong thing to write as well: if the emphasis in the routine, for example, is on checking the cache–well, then perhaps this deserves a multi-line statement.

Or perhaps, when we find a value in our cache, we trip off some other logic–then that logic deserves a line of its own. Perhaps the check is important enough that it needs to be called out separately, like:

ret = cache.get(n);
if (null != ret) {
	return doThing(ret);
}

It’s all a matter of communicating, with your own voice, what is and is not important, so future generations of code maintainers can understand what is and is not important, what goes together and what is separate.

Ultimately it’s about striving for a balance: creating working code that can be understood, by using idioms of programming which convey the subtext of the meaning of that code.

Sure, when you’re inexperienced and you haven’t found your voice, it’s appropriate to follow a strict “noun/verb/object” structure. It’s appropriate, in other words, to use simple declarative statements while you gain experience writing code, and to observe other common “best practices” such as using descriptive variable names.

But at some point you need to find your own voice. When you do, it’s also appropriate to break the rules.

And if you concentrate too much on the rules rather than on what’s being said, then perhaps you’ll also make the mistake both commenters made when commenting on my code style: they failed to note that the code itself was actually broken, with a dangerous infinite loop triggered by certain parameter values.

Complexity–or how hard is it to display a list of 3,000 items in a table on MacOS X anyway?

The laptop on my desk can do around 2,403 MWIPS, around 100 times faster than the Cray X-MP/24 of yesteryears, and has easily more than 100 times the memory. (For reference, NASA clocked the Cray X-MP/24 at 25 MWIPS, and the system came with either 16 or 32 megabytes of main memory, in 1 megabyte banks.)

So how friggin’ hard is it for Outlook Express to display the summary lines of 3,000 e-mails on my MacBook Pro?

* * *

Here’s the thing. I suspect (though I don’t know, because I don’t work for Microsoft) that rather than sucking the headers into memory from an e-mail file, storing them in a linked list, and then, as the user scrolls, running a pointer down the list and simply doing a “drawRow(…); ptr = ptr->next;” operation, the system is instead using something like SQLite for storage and handing a query to the SQL engine, something like “SELECT … ORDER BY … LIMIT (n-rows) OFFSET (scroll-amount)”.

Then, behind the scenes, the SQL statement is compiled into byte code for a SQL interpreter engine; the engine interprets that byte code, with each instruction pulling elements of the table (stored in a way optimized to reduce disk space while arbitrary records are added and removed) into an on-board cache–a cache which is sometimes missed because, well, we’re only interested in the dozen or so rows displayed in the summary table, right?
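By contrast, the whole of the in-memory approach fits in a sketch like this (MessageHeader and RowPainter are hypothetical stand-ins for whatever the real client uses, and I’ve used an ArrayList rather than a hand-rolled linked list, but the point is the same):

import java.util.ArrayList;
import java.util.List;

public class MessageList
{
    // Hypothetical stand-ins for the real types.
    public interface MessageHeader { }
    public interface RowPainter { void drawRow(int row, MessageHeader header); }

    private final List<MessageHeader> headers = new ArrayList<MessageHeader>();

    // Parse every header once up front; 3,000 headers is nothing in RAM.
    public void add(MessageHeader header)
    {
        headers.add(header);
    }

    // On scroll, walk just the visible rows and draw them.
    public void paint(int firstVisibleRow, int visibleRowCount, RowPainter painter)
    {
        int last = Math.min(headers.size(), firstVisibleRow + visibleRowCount);
        for (int row = firstVisibleRow; row < last; ++row) {
            painter.drawRow(row, headers.get(row));
        }
    }
}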

The end result is that a computer 100 times faster than the fastest computer in the world from 30 years ago (which, honestly, was pretty blindingly fast) and which has more than enough memory to store four encoded movies in RAM, hiccups and stutters when I scroll through my e-mails.

Really?

* * *

I’d like to call this lazy, but I really can’t: wiring up the UI of Outlook to a back-end SQL engine, then maintaining the link between those systems in such a way that you can page through a few thousand e-mails, is not a trivial task. Never mind that SQL was designed in part to reduce its memory footprint for large datasets that cannot be stored in memory, and we’re using it on an e-mail database which could be stored in 1/100th of the available RAM on my system.

Instead, I think it’s because software development today tends to be more about marrying off-the-shelf components without thinking about what those off-the-shelf components actually do–or even if (as in the case of tracking 3,000 e-mails on a system that would have been considered 20 years ago a super-computer whose computational power made it a Federal crime to export overseas) the off-the-shelf components are even warranted.

Now if my e-mail box had, say, 1 million e-mail messages, and a total disk footprint of, say, 2 gigabytes–then I could see using a complicated relational database engine to manage those e-mails. But instead, in an effort to reduce the magnitude of the exponent in an O(n**e) operation, we forget that there is a constant C which can often dominate.

Which is why an e-mail system which simply parses all 3,000 e-mails into a linked list (and which would handle operations in O(n) time) would be far faster than a complicated system using a relational database that may run in a theoretical O(n**0.5) time, but whose constant C for each operation is measured in milliseconds versus nanoseconds for the former.
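A back-of-the-envelope sketch makes the point. The constants here are assumptions (roughly 100 nanoseconds to touch a row in memory, roughly a millisecond per database operation), but the shape of the result isn’t:

public class BackOfEnvelope
{
    public static void main(String[] args)
    {
        // Assumed constants, for illustration only.
        final double C_MEMORY = 100e-9;      // ~100 ns to touch a row in RAM
        final double C_DATABASE = 1e-3;      // ~1 ms per database operation
        final int n = 3000;

        double linearScan = n * C_MEMORY;              // O(n) with a tiny constant
        double dbLookups = Math.sqrt(n) * C_DATABASE;  // O(n**0.5) with a big constant

        System.out.println("in-memory scan: " + linearScan + " seconds");  // ~0.0003
        System.out.println("database query: " + dbLookups + " seconds");   // ~0.055
    }
}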

* * *

I wish we would, as an industry, spend less time thinking about the tools and toys which we use to solve a problem, and more time thinking about the domain the problem resides in–and selecting the appropriate tools given the domain, rather than simply asserting that the latest and greatest gee-whiz gadget is the right answer regardless of the question.

I’ve seen tons of examples of this sort of thing. Adobe Air used to build a user interface because, well, it’s pretty–without a second thought as to whether it can even be deployed on the target devices used by people in the field. Javascript/AJAX web sites built for people who travel and whose devices do not have a consistent or reliable Internet connection. Sending and receiving full queries across a slow EDGE network connection when on-board caching would reduce the queries to status updates (or even allow the system to run without any queries whatsoever).

We don’t think of the domain of the problem or figure out the priorities of the components of that problem or pick the appropriate tools and technologies.

And so I’m stuck with a computer that is a marvel of advanced technology–a super-computer in a briefcase, capable of running a simulation of Schrodinger’s Equation fast enough to allow a physicist to fiddle with the inputs and see the outputs in real time–running like a fucking dog when I scroll through my morning’s e-mails.

Insanity.

Insanity is having a two week scrum cycle and expecting to push working code to production at the conclusion of every single blasted cycle.

Insanity is expecting engineering to read, digest, and understand every single last bullet point on a 50 page product specification document in a matter of weeks, written by product managers who think they’re architects designing a product.

Insanity is doubling-down on the previous statement by creating 13 or 14 such 50 page documents, because engineering didn’t understand the first 50 page document.

Insanity is getting pissed off at said engineering team because they notice conflicts among the bullet points in said 13-volume requirement and wonder how they should implement something that resolves the conflicting points.

Insanity is not shielding your direct reports from the vagaries and insanities of upper management, who believes their lack of understanding is your emergency.

Insanity is not pushing back at said upper management who thinks if they can buy something as complex as Microsoft Office at Amazon for a few hundred bucks and have it delivered in two days, then clearly you should be able to deliver something similarly complex in the same time frame and on the same budget.

Insanity is thinking that an admittedly poor manager managing an admittedly poor employee will somehow result in excellence.

Insanity is thinking making people look good both up and down the management chain is your first priority, or even your second or third. (Not to say cooperation isn’t vital–but cooperation is not the same as making people look good: the first requires getting along, the second requires pushing out misinformation.)

Insanity is stripping head count from a well functioning group and handing them to a poorly functioning group, on the theory that this will balance the workload.

Insanity is forgetting that everyone you manage has the potential of becoming an extremely talented person–some simply need to be taught, some you need patience with, and some who are naturally talented, if not recognized, may find greener pastures elsewhere. And even if at their maximum potential some people are far less productive than others, they’ll still be far more productive if encouraged.

Insanity is not being forward thinking.

Summarizing yesterday’s post on user interfaces.

If you have a system design that requires the operator to make 20 decisions in the back end in order to accomplish a task, don’t complain to the guys building the front-end user interface that the user interface requires too many decisions.

The only things the UI guys can do are either (1) hide the decisions that don’t need to be made (though if you’re asking for decisions that don’t need to be made, WTF?), or (2) present the decisions in a way which is easier to understand.

But if you have 20 decisions that have to be made, no amount of UI goodness is going to reduce the effort of making 20 decisions.

So if you want a simpler UI, reduce the decisions that are required by the system.

User Interface Truisms.

Fundamentally how hard it is to use a user interface depends on the cognitive load of that interface–that is, how much thinking you have to do in order to use the interface.

Now “thinking” is one of those fuzzy things that really needs to be quantified.

As a proxy for user interface complexity (or rather, how hard it is to use an interface) some people use the number of buttons that one has to press in order to accomplish a given task. But that is only a proxy: even though it takes 19 button presses (including the shift key) to type “Hello World!”, one would never argue that those 19 button presses are as hard to accomplish as navigating through 19 unknown interface menus.

That’s because the button press is a proxy for a decision point. The real complexity, in other words, comes from making a decision–which goes right back to cognitive load, the amount of thinking you have to do to accomplish a task.

So decision points are clearly a sign of cognitive load.

Now think back when you first saw a computer keyboard, and how mystified you were just to type a single letter: search, search, search, ah! the ‘h’ key! Success! Now, where is that stupid ‘e’ key? Oh, there it is, next to the ‘w’ key–why is it next to the ‘w’ key? Weird. Now, if I were the ‘l’ key, where would… oh, there it is, all the way on the other side of the keyboard. Weird. At least I know to press it twice. And now the ‘o’–ah, right there, next to the ‘l’ key.

Ooops. I have a lower case ‘h’. How do I back up? …

Clearly, then, along with decision points we have familiarity with the interface as a factor in how hard a user interface is to use: if we know how to touch type, typing “Hello World!” doesn’t even require thought beyond thinking the words and typing them. But for the uninitiated to the mysteries of the computer keyboard, hunting and pecking each of those keys is quite difficult.

Complexity, then, is cognitive load. And cognitive load comes from the difficulty of making the decisions needed to accomplish a task, combined with the unfamiliarity of the interface one has to navigate to accomplish that task.

* * *

Now, of course, from a user interface design perspective there are a few things that can be done to reduce cognitive load, by attacking the familiarity problem.

One trick is to have a common design language for the interface. By “design language” I’m referring, of course, to what you have to do to manipulate a particular control on the screen. If you always manipulate a thing that looks like a square by clicking on it–and the act of clicking on it causes an ‘x’ to appear in that square or disappear if it was there, then you know that squares on the screen can be clicked on.

And further, if you know that a square with an ‘x’ in it means the item is somehow “selected” or “checked” or “enabled” or whatever–and you know that unchecking something means it’s “unselected” or “un-checked” or “disabled”–then suddenly you have some familiarity: you can quickly see boxes and realize they are check-boxes, and checking them means “turn this on” and unchecking means “turn this off.”

This idea of a design language can even extend to interfaces built strictly using text-only screens: if you see text that looks like [_________], with a blinking square on the first underscore, and typing types into that field–and hitting the tab key moves to the next [_________] symbol on the screen–then you know all you need to know to navigate through a form. Other text symbols can have other meanings as well: perhaps (_) acts like our checkbox example above, or acts like a radio button (a round thing you can select or unselect, which has the side effect of unselecting all other related round button thingies), or whatever.

The point is consistency.

And this consistency extends beyond the simple controls. For example, if you have a type of record in your database that the user can add or remove from a screen, having the “add” and “delete” and “edit” buttons in the same place as on other screens where other records are added or deleted helps the user understand that yes, this is a list of records, and immediately he knows how to add, delete, and edit them.

Visual language provides a way for a user to understand the unfamiliar landscape of a user interface.

* * *

The other trick is selective revelation of the interface: revealing, in a guided and orderly way, the decisions that need to be made.

For example, imagine an order entry system where the type of order must be selected first, then the product being ordered, then product-specific information must be entered. This could be implemented by selectively showing the controls that need to be filled out at each step of the process, or it could be accomplished by a wizard. And notice that unneeded information (such as the size of a clothing item, which is unneeded when ordering a purse) can remain hidden when it is not needed.
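A sketch of the idea (every name here is hypothetical, not from any real system): given what the user has already decided, compute which controls to reveal next.

import java.util.EnumSet;
import java.util.Set;

public class OrderWizard
{
    public enum Field { ORDER_TYPE, PRODUCT, CLOTHING_SIZE, COLOR, GIFT_MESSAGE }

    public enum Product { SHIRT, PURSE }

    public static Set<Field> visibleFields(boolean orderTypeChosen, Product product)
    {
        // Step 1: until the order type is chosen, ask only for the order type.
        if (!orderTypeChosen) return EnumSet.of(Field.ORDER_TYPE);

        // Step 2: next, ask which product is being ordered.
        if (product == null) return EnumSet.of(Field.ORDER_TYPE, Field.PRODUCT);

        // Step 3: ask only the product-specific questions; a purse has no clothing size.
        Set<Field> fields = EnumSet.of(Field.ORDER_TYPE, Field.PRODUCT, Field.COLOR, Field.GIFT_MESSAGE);
        if (product == Product.SHIRT) fields.add(Field.CLOTHING_SIZE);
        return fields;
    }
}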

The goal with this is to help guide the decision making process, to help gather the information in the order needed by the system. And by guiding the decision making process you reduce cognitive load: you ask only the questions that are needed rather than overwhelm the user with a bewildering array of interrelated choices, some of which (such as the clothing size of a purse) are nonsensical.

* * *

The problem with all these user interface tricks (and there are plenty of others: arranging the information on the screen, tips and hints that dynamically come up, on-line help for the first time user, making the interface reflect a consistent cognitive model, reducing short-term memory load by segregating items into 7 +/- 2 items or groups, etc.) is that they all go towards tackling the familiarity problem of the interface. In other words, they only go towards reducing the cognitive load of the interface itself.

And, honestly, most of these design patterns are pretty well known–and only go towards reducing the cognitive load on the first-time user. Once someone has gained familiarity with an interface–even a very bad one–the cognitive load imposed by a poorly designed interface is like the cognitive load imposed by a computer keyboard: eventually you just know how to navigate through the interface to do the job. (To be clear, reducing familiarity cognitive load reduces training costs for an internal interface, and reduces consumer friction and dissatisfaction with an external interface–so it’s still important not to design a bad interface.)

Ultimately the cognitive load of a system comes from the decision points imposed by the interface. And a user interface can only present the information from the underlying system: ultimately it cannot make those decisions on behalf of the operator. (If the user interface could, then the decision wouldn’t need to be made by the operator–and the decision point really isn’t a decision point but an artifact of a badly designed system.)

What this means is that in order to simplify a product, the number of operator decision points must ultimately be reduced–either by prioritizing those decision points (noting which decisions are optional or less important to capture), or through redesigning the entire product.

* * *

Remember: a user interface is not how a product looks. It’s how the product works.

What’s wrong with business reporting of the Computer Industry

You read through a report on computer industry jobs, or you take a test in high school which leads you to believe you may have a future as a “Computer Terminal Operator” or a “Computer Software Analyst.” And the stuff you read makes no sense whatsoever–things like “A Computer Programmer converts symbolic statements of business, scientific or engineering problems to detailed logic flow charts into a computer program using a computer language”, and you think “what?” Or you see something like:

Disappearing Jobs:

Plus, the work of computer programmers requires little localized or specialized knowledge. All you have to know is the computer language.

And you think “WTF?!?”

Really?

Here’s the problem. All of these descriptions are based on an industry classification scheme first created by the Bureau of Labor Statistics, part of the United States Department of Labor. And the descriptions are hopelessly out of date.

In the world of the Bureau of Labor Statistics, this is how a computer program is created, executed, and the results understood:

* * *

First a business (such as a WalMart) decides that it has a business reason to create a new computer program. For example, they decide that they need a computer system in order to determine which products are selling better in one geographic region, so they can adjust their orders and make sure those products flow to that area.

This problem was probably identified by a Management Analyst (B026), whose job is to “analyze business or operating procedures to advise the most efficient methods of accomplishing work.” Or it was identified by an Operations Analyst (A065).

So they work with a Systems Analyst (A064) in order to restate the problem (“find the areas where different SKUs are selling better, compare against current logistical shipping patterns, and adjust future orders to make sure stores are well stocked for future demand”) into a detailed flow chart and requirements documents outlining how this process should work.

Once this flow chart has been agreed upon by the analysts and operations engineers and scientists, they turn the problem over to a Computer Programmer (A229) in order to convert the flow charts describing the problem into a computer program. This computer program is generally written using automated data processing equipment, such as a punch card reader.

The computer programmer verifies his program by working with a Computer Operator (D308) to submit his punched cards to the mainframe and, after a program run completes, returning the printout to the computer programmer’s “in box”, a wooden box used to hold the printouts designated to a specific programmer. (Computer operators “select and load input and output units with materials, such as tapes or disks and printout forms for operating runs.”)

Once the Computer Programmer has verified that his punch card deck is properly functioning, he will then eventually request (depending on the nature of the program) to either have his job run on a regular basis or loaded into the business mainframe so that inputs to his program may be submitted by a Data Entry Keyer (also known as a Computer Terminal Operator D385), or the process may be batch run by management, depending on if the Computer Systems Administrator (B022) permits it.

* * *

Does this sound like the software industry you’re familiar with? No?

Here’s the problem. This is what every organization outside of the computer industry thinks goes on at places like Google, Apple, or Microsoft. Government decisions on education, the Bureau of Labor Statistics reports, and even reporting by places like CNBC are all driven by this image of the computer industry–an image that is about 30 to 40 years out of date.

And no-one has figured it out in the government, because every time they send out a survey on jobs to the computer industry, generally some guy somewhere goes “well, hell, none of this sounds like my guys. So I’m going to guess they’re all in the A064 category, because they need to think about what they’re writing–so they can’t be in the A229 category.”

Until someone tells the Bureau of Labor Statistics their designations are garbage, we’re just going to continue to get garbage out from the BLS, from Government managed school textbooks (which still advise people about professions like “Systems Analyst”, a profession that doesn’t actually exist as such), and from reporting outlets like CNBC–all of whom get their understanding from the BLS occupational classification system.

* * *

Like fish who don’t notice the water they swim in, we don’t really know how much the government and government classification systems affect our thinking in this country.

* * *

Addendum: Even job survey sites tend to use the same BLS classifications for classifying job and salaries, which is why most job sites talk about “Systems Analyst III” or “Computer Programmer II” job designations–which you will never see on a Google or an Apple job listing. It’s why figuring out salary requirements is such a royal pain in the ass for the computer industry as well–because everything is getting classified into buckets that are 30 years out of date.

Really?

Sorry I’m picking on this guy, but he brings up two things that really irritate the living hell out of me about our industry.

Having worked in software development for over 15 years and developing software for nearly 30 years,…

So… you’re counting coding when you were 5?

Look, for all I know this guy was a lawyer who started writing software out of college, but waited to jump ship until he was 35 and he’s now in his 50’s. But most developers I know who want to talk about how much programming experience they have count all the way back to the time when their mommy gave them a toy laptop with a BASIC interpreter.

I’m sorry, but I’d like to call “bullshit” on that.

In every other industry I’m aware of, experience is counted as professional experience, where you had a fucking job (or a volunteer job) where you actually went somewhere and did something for money (or to help with a volunteer organization). It involved regular hours and following directions from another managing supervisor, and it involved delivering stuff on a schedule.

Fucking around when you’re 8 on daddy’s computer with the copy of GW-Basic that came on it no more counts (in my mind) as “experience” than wrapping duct tape over a leaking air duct counts as experience as an HVAC contractor, or fixing a leaky pipe counts as experience as a plumber, or helping a friend nail boards together to tack up a broken piece of trim on their house counts as experience as a finish contractor. Renting a U-Haul doesn’t count as truck driver experience, helping your younger sister with her homework doesn’t count as teaching experience, and convincing a friend he should buy a new phone doesn’t count as sales experience.

So why in the name of Cthulhu does tinkering around with a BASIC interpreter after school in the 2nd grade count as development experience?

Really?!?

To me, experience is professional experience. College education can be counted if called out separately as college, in the same way that taking technical courses to become a plumber can count on your resumé towards being a plumber, if called out separately. Fucking around sniffing pipe glue doesn’t count.

By this metric, I graduated from Caltech in 1988 with a degree in Mathematics, with experience in computer graphics, computational theory, a touch of VLSI design and hardware design, and a touch of mathematical optimization theory. From 1987 to 1988 I worked for a company doing computer graphics for the Macintosh (I needed to take one class to finish my degree, so worked full time while finishing up), and since graduating I’ve worked non-stop as a software developer in a variety of fields.

Which means I have 23 years of professional experience and a 4 year degree from a pretty good school.

I don’t count the time when I got a TRS-80 back in the 70’s, or the time when I was 12 learning Z80 assembler to hack my computer, or the time I spent in high school learning LISP. If I did, I could say things like “well, I have been developing software for nearly 33 years”–or is that longer, since I started tinkering with programmable calculators and electronic circuits well before then?

I call bullshit on the practice. Sorry.

In my opinion development experience starts accumulating the first time you get a real fucking job, either post-college, post-high school, or after dropping out of college. Not from the time your elementary school teacher allowed you to play with the old Commodore 64’s after school if you finished your homework first.

You’ve guessed it, Scrum adresses all of these resulting in 99% – 100% on target delivery. So it’s not due to bad programmers if an agile process can fix this.

I also call bullshit on the “one size fits all management style fixes all the problems in the computer industry.”

Excuse me, but Agile will fix all the ills of wrong estimation, wrong status updates, scope creep, and the like?

Really?

Because Agile will–what, exactly? Reduce the problem set into manageable chunks that can be fed to our steady stream of interchangeable cogs of programmers?

Don’t get me wrong; I think Agile is a reasonable tool for certain problems in a managerial toolkit–along with good bug tracking, a well-organized QA process, motivated developers, and a good project plan. But Agile doesn’t fix a damned thing: it simply creates a regular communications channel between individual contributors and the managers who need to keep track of the bottom line.

It helps, in other words, if you have good managers and good developers. But it won’t do squat if you have dysfunction in the overall team. And while I’ve seen people say that this dysfunction isn’t Agile, I’d argue that if you use a business label to label not just the tool but the overall team result, then you’re no longer describing a tool–you’re describing a condition and attributing it to a tool.

And I despise circular thinking.

Software shipped well before the latest management fad came down the pike, and the introduction of this fad did not make managers good leaders or developers better programmers. And I’d even go so far as to argue that the practice simply altered the workplace, making it harder, not easier, to ship new development efforts which are more properly handled with a waterfall method with constant feedback from the development team.

There is a reason why large companies generally bring new products to market by buying the small companies that develop those products, rather than doing it in-house: all of these managerial fads can never replace a well-motivated small team of people doing something either out of love or out of greed. Large companies take away motivation by greed, and they tend to marginalize those who have love for a project.

And Agile will never replace passion.

Thanks for reading my rant. And apologies to Stephan Schmidt, whose post just set me off this morning–I actually agree with 90% of what he wrote about the whole “bad programmer” debate, which is honestly a different rant for a different time. Though I will note: just because I need a C compiler to be more effective than entering a program via a toggle switch panel, does that mean I’m a bad programmer for needing a tool to help me figure out which bits in program space should be set to 1, and which should be set to 0?

Sometimes you get what you want by accident.

A year ago I had thought to go back and get my Ph.D., thinking that eventually I’d like to turn this into either a teaching gig or into a research gig. I’m nearly 45, you see, and while I probably could go on and do the single contributor thing pretty much the rest of my life (since I love to learn new things all the time), at some point it’d be nice to think of the next generation.

So I talked to a couple of college professors about Computer Science Ph.D.s, and–meh. The problem is most of the interesting research is being driven by corporate need rather than by theoretical considerations. Which makes a certain degree of sense: the Internet and modern CPUs have introduced all sorts of interesting problems that really need solving far more than refining Turing Machines. And university research is often funded by–you guessed it–those corporations generating the problems that need to be solved.

Besides, you do anything for 20+ years, constantly striving to stretch your own talents the entire way, and chances are you probably have quite a bit you could be teaching the professors.

So I punted.

And instead I took a management gig thinking that perhaps I could use the opportunity to act like a college professor–teaching a group of graduate students. Except they’re professional developers with a decade and a half less experience than I, and the problems we’re working on aren’t as theoretical.

But I was hoping no-one would notice that I was spending all my time with one publicly stated purpose: execution–but all the while with one secret purpose: developing my direct reports.

Now I’m reading The Extraordinary Leader as part of a management training program at AT&T. And something interesting stood out: one of the most important things a leader can do is develop and advance the people he leads.

Hmmmm…

I guess my secret agenda doesn’t need to be quite so secret.

So here we go: I think what I’m going to do is (a) when there are tight deadlines, keep the schedule by pushing back at unreasonable requests, setting a schedule that may be aggressive but can be kept, getting the resources we need, protecting my team, and helping out (very selectively) in areas where I can do the most good while leaving most of the work to my team; and (b) when our schedule is not quite so tight, help by creating team building and learning exercises–such as tomorrow’s brainstorming session, where I’m going to turn over the design of a new component to my team and have them come up with several alternative architectural designs, then debate the pros and cons of each.

My theory–and we’ll see if it works–is that by having such exercises–both teaching exercises and team building through constructive discussion of alternate ways of designing something–my team will learn new stuff through practice and become better developers and, if they wish, better leaders.

Wish me luck.

Fortunately management pays far more than a professorship.

Goodbye Far Clipping Plane.

I really wanted to write this up as a paper, perhaps for SigGraph. But I’ve never submitted a paper before, and I don’t know how worthy this would be of a SigGraph paper to begin with. So instead, I thought I’d write this up as a blog post–and we’ll see where this goes.

Introduction

This came from an observation I remember making when I first learned about the perspective transformation matrix in computer graphics. See, the problem basically is this: the perspective transformation matrix converts from model space to screen space, where the visible region of screen space runs from -1 to 1 in the X, Y and Z coordinates.

In order to map from model space to screen space, typically the following transformation matrix is used:

perspective.gif

(Where fovy is the cotangent of half the field of view angle, aspect is the aspect ratio between the vertical and horizontal of the viewscreen, n is the distance to the near clipping plane, and f is the distance to the far clipping plane.)
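Written out–a sketch, assuming the usual gluPerspective-style form acting on column vectors, with aspect taken as width over height–that matrix is:

\[
\begin{pmatrix}
\mathrm{fovy}/\mathrm{aspect} & 0 & 0 & 0 \\
0 & \mathrm{fovy} & 0 & 0 \\
0 & 0 & \frac{f+n}{n-f} & \frac{2fn}{n-f} \\
0 & 0 & -1 & 0
\end{pmatrix}
\]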

As objects in the right handed coordinate space move farther away from the eye, the value of z decreases toward -∞; after being transformed by this matrix, as our object approaches the far clipping plane at distance f, zs approaches 1.0.

Now one interesting aspect of the transformation is that the user must be careful to select the near and far clipping planes: the greater the ratio between far and near, the less effective the depth buffer will be.

If we examine how z is transformed into zs screen space:

derivematrix.gif
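Carrying out the multiplication and dividing through by the resulting w (a sketch of the algebra, assuming the matrix form above), the screen-space depth works out to:

\[
z_s = -\left(\frac{f+n}{n-f} + \frac{2fn}{(n-f)\,z}\right)
\]

which gives -1 at z = -n and +1 at z = -f.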

And if we were to plot values of negative z to see how they land in zs space, for values of n = 1 and f = 5 we get:

zpersgraph.png

That is, as a point moves closer to the far clipping plane, zs moves closer to 1, the screen space far clipping plane.

Notice that as we move closer to the far clipping plane, the screen space depth behaves like 1/z. This is significant when characterizing the relationship between an object’s distance and the accuracy of the zs representation of that distance for drawing purposes.

If we wanted to eliminate the far clipping plane, we could, of course, derive the terms of the above matrix as f approaches ∞. In that case:

farclipinf1.gif

And we have the perspective matrix:

persmatrix2.gif
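Taking the limits term by term (again assuming the form above):

\[
\lim_{f\to\infty}\frac{f+n}{n-f} = -1, \qquad
\lim_{f\to\infty}\frac{2fn}{n-f} = -2n
\]

so the third row of the matrix becomes (0, 0, -1, -2n), and the depth mapping becomes \( z_s = 1 + 2n/z \).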

And the transformation from z to zs looks like:

zpersgraph2.png

IEEE-754

There are two ways we can represent a fractional numeric value: as a fixed point value, or as a floating point value. I’m not interested here in fixed point representations, only in the floating point representation of numbers in the system. Of course, not all implementations of OpenGL support floating point mathematics for representing values in the system.

An IEEE 754 floating point number is represented by storing the fractional significand of the number, along with an exponent (and a sign bit).

ieee754.gif

Thus, the number 0.125 may be represented with the fraction 0 and the exponent -3:

ieee754ex.gif

What is important to remember is that the IEEE-754 representation of a floating point number is not exact, but contains an error term, since the fractional component contains a fixed number of bits. (23 bits for a 32-bit single-precision value, and 52 bits for a 64-bit double-precision value.)

For values approaching 1, the error in a floating point value is determined by the number of bits in the fraction. For a single-precision floating point value, the difference between 1 and the next adjacent floating point value is 1.1920929E-7, which means that as numbers approach 1, the error is on the order of 1.1920929E-7.
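Both effects are easy to see from Java. A quick sketch (the values in the comments are what the IEEE-754 layout predicts):

public class FloatSpacing
{
    public static void main(String[] args)
    {
        // Spacing of single-precision values next to 1.0: 2^-23, about 1.1920929E-7.
        System.out.println(Math.ulp(1.0f));     // 1.1920929E-7

        // Spacing next to a value near zero keeps shrinking as the exponent decreases.
        System.out.println(Math.ulp(0.001f));   // about 1.16E-10

        // The bits of 0.125f: sign 0, biased exponent 124 (i.e. 2^-3), fraction 0.
        System.out.println(Integer.toHexString(Float.floatToIntBits(0.125f)));  // 3e000000
    }
}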

We can characterize the error in model space given the far clipping plane by reworking the formula to find the model space z based on zs:

zspacederive.png

We can then plot the error by the far clipping plane. If we assume n = 1 and zs = 1, then the error in model space zε for objects that are at the far clipping plane can be represented by:

zerrorderive.gif

Graphing for a single precision value, we get:

zerror.png

Obviously we are restricted in the size of the far clipping plane, since as we approach 10^9, the error in model space grows to the same size as the model itself for objects at the far clipping plane.

Clearly, of course, setting the far clipping plane to ∞ means almost no accuracy at all as objects move farther and farther out.

The reason for the error, of course, has to do with the representation of the number 1 in IEEE-754 mathematics. Effectively the exponent for values just below 1 is fixed at 2^-1 = 0.5, meaning that as values approach 1, the fractional significand approaches 2: the number is effectively a fixed-point representation with 24 bits of accuracy (for a single-precision value) over the range from 0.5 to 1.0.

(At the near clipping plane the same can be said for values approaching -1.)

* * *

All values in the representable range of IEEE-754 numbers have the same feature: as we approach the value, the representation behaves as if we had picked a fixed-point representation with 24 (or 53) bits. The only value in the IEEE-754 range which actually exhibits declining representational error as we approach it is zero.

In other words, for values 1-ε, accuracy is fixed to the number of bits in the fractional component. However, for values of ε approaching 0, the exponent can decrease, allowing the full range of bits in the fractional component to maintain the accuracy of values as we approach zero.

With this observation we could, in theory, construct a transformation matrix which sets the far clipping plane to ∞. We can characterize the error for a hypothetical algorithm whose depth value approaches 1 (1-1/z) and for one whose depth value approaches 0 (1/z):

zerrors.png

Overall, the error in model space of 1-1/z approaches the same size as the actual distance itself in model space as the distance grows larger: err/z approaches 1 as z grows larger. And the error grows quickly: the error is as large as the position in model space for single precision values as the distance approaches 10^7, and the error ratio approaches 1 for double precision values as z approaches 10^15.

For 1/z, however, the ratio of the error to the overall distance remains relatively constant at around 10^-7 for single precision values, and around 10^-16 for double-precision values. This suggests we could do away with the far clipping plane entirely; we simply need to modify the transformation matrix so that zs approaches zero instead of 1 as an object goes to ∞.

Source code:

The source code for the above graph is:

public class Error
{
    public static void main(String[] args)
    {
        double z = 1;
        int i;
        
        for (i = 0; i < 60; ++i) {
            z = Math.pow(10, i/3.0d);
            
            for (;;) {
                // Error of the 1/z mapping in double precision: step zs down by one
                // representable value and see how far the recovered distance moves.
                double zs = 1/z;
                double zse = Double.longBitsToDouble(Double.doubleToLongBits(zs) - 1);
                double zn = 1/zse;
                double ze = zn - z;

                // The same error, computed in single precision.
                float zf = (float)z;
                float zfs = 1/zf;
                float zfse = Float.intBitsToFloat(Float.floatToIntBits(zfs) - 1);
                float zfn = 1/zfse;
                float zfe = zfn - zf;

                // Error of the 1-1/z mapping (the traditional matrix) in double precision.
                double zs2 = 1 - 1/z;
                double zse2 = Double.longBitsToDouble(Double.doubleToLongBits(zs2) - 1);
                double z2 = 1/(1-zse2);
                double ze2 = z - z2;

                // And again in single precision.
                float zf2 = (float)z;
                float zfs2 = 1 - 1/zf2;
                float zfse2 = Float.intBitsToFloat(Float.floatToIntBits(zfs2) - 1);
                float zf2n = 1/(1-zfse2);
                float zfe2 = zf2 - zf2n;
                
                if ((ze == 0) || (zfe == 0)) {
                    z *= 1.00012;   // some delta to make this fit
                    continue;
                }

                System.out.println((ze/z) + "\t" + 
                        (zfe/zf) + "\t" + 
                        (ze2/z) + "\t" + 
                        (zfe2/zf));
                break;
            }
        }
        
        for (i = 1; i < 60; ++i) {
            System.out.print("\"1e"+(i/3) + "\",");
        }
    }
}

We use the expression Double.longBitsToDouble(Double.doubleToLongBits(x)-1) to move to the previous double precision value (and the same with Float for single precision values), repeating (with a minor adjustment to z) in the event that floating point error prevents us from properly calculating the error ratio at a particular value.
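For what it’s worth, on positive values the bit trick is equivalent to stepping toward zero with the standard library–a small sanity check, assuming Java 6 or later:

public class NextDown
{
    public static void main(String[] args)
    {
        double zs = 0.37;
        double stepped = Double.longBitsToDouble(Double.doubleToLongBits(zs) - 1);

        // For positive, finite, non-zero values the bit trick is the same as
        // stepping one representable value toward zero:
        System.out.println(stepped == Math.nextAfter(zs, 0.0));   // true
    }
}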

A New Perspective Matrix

We need to formulate an equation for zs that crosses -1 as z crosses -n, and approaches 0 as z approaches -∞. We can easily do this given the observation from the graph above: instead of calculating

zoldformula.gif

We can simply omit the 1 constant and change the scale of the 2n/z term:

znewformula.gif
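In symbols (assuming the f = ∞ mapping above), the change amounts to:

\[
z_s = 1 + \frac{2n}{z} \quad\longrightarrow\quad z_s = \frac{n}{z}
\]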

This has the correct property: zs crosses -1 at z = -n, and approaches 0 as z approaches -∞.

znewgraph.png

From visual inspection, this suggests the appropriate matrix to use would be:

newpersmatrix.gif

Testing the new matrix

The real test, of course, would be to create a simple program that uses both matrices, and compare the difference. I have constructed a simple program which renders two very large, very distant spheres, and a small polygon in the foreground. The large background sphere is rendered with a radius of 4×10^12 units, at a distance of 5×10^12 units from the observer. The smaller sphere is only 1.3×10^12 units in radius, embedded into the larger sphere to show proper z order and clipping. The full spheres (front and back) are drawn.

The foreground polygon, by contrast, is approximately 20 units from the observer.

I have constructed a z-buffer rendering engine which renders depth using 32-bit single-precision IEEE-754 floating point numbers to represent zs. Using the traditional perspective matrix, the depth values become indistinguishable from each other, as their values approach 1. This results in the following image:

rendertest_image_err.png

Notice the bottom half of the large sphere is incorrectly rendered, as are large chunks of the smaller red sphere.

Using the new perspective matrix, this error does not occur in the final rendered product:

rendertest_image_ok.png

The code to render each is precisely the same; the only difference is the perspective matrix:

import java.io.IOException;

public class Main
{
    /**
     * @param args
     */
    public static void main(String[] args)
    {
        Matrix m = Matrix.perspective1(0.8, 1, 1);
        renderTest(m,"image_err.png");
        
        m = Matrix.perspective2(0.8, 1, 1);
        renderTest(m,"image_ok.png");
    }

    private static void renderTest(Matrix m, String fname)
    {
        ImageBuffer buf = new ImageBuffer(450,450);
        m = m.multiply(Matrix.scale(225,225,1));
        m = m.multiply(Matrix.translate(225, 225, 0));
        
        Sphere sp = new Sphere(0,0,-5000000000000d,4000000000000d,0x0080FF);
        sp.render(m, buf);
        
        sp = new Sphere(700000000000d,100000000000d,-1300000000000d,300000000000d,0xFF0000);
        sp.render(m, buf);
        
        Polygon p = new Polygon();
        p.color = 0xFF00FF00;
        p.poly.add(new Vector(-10,-3,-20));
        p.poly.add(new Vector(-10,-1,-19));
        p.poly.add(new Vector(0,0.5,-22));
        p = p.transform(m);
        p.render(buf);
        
        try {
            buf.writeJPEGFile(fname);
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Notice that in main(), we first get the traditional perspective matrix with the far clipping plane set to infinity, then we get the alternate matrix.

The complete sources for the rendering test which produced the above images, including custom polygon renderer, can be found here.

With this technique it would be possible to correctly render large landscapes with very distant objects without having to render the scene twice: once for distant objects and once for near objects. To use this with OpenGL would require adjusting the OpenGL pipeline to allow the far clipping plane to be set to 0 instead of 1 in zs space. This could be done with the glClipPlane call.

Conclusion

For modern rendering engines which represent the depth buffer using IEEE-754 (or similar) floating point representations, using a perspective matrix which converges to 1 makes little sense: as values converge to 1, the magnitude of the error is similar to that of a fixed-point representation. However, because of the nature of the IEEE-754 floating point representation, convergence to 0 has much better error characteristics.

Because of this, a perspective matrix different from the one commonly used should have better rendering accuracy, especially if we move the far clipping plane to ∞.

By using this new perspective matrix we have demonstrated a rendering environment using 32-bit single-precision floating point values for a depth buffer which is capable of representing in the same scene two objects whose size differs by 11 orders of magnitude. We have further shown that the error in representation of the zs depth over the distance of an object should remain linear–allowing us to have even greater orders of magnitude difference in the size of objects. (Imagine rendering an ant in the foreground, a tree in the distance, and the moon in the background–all represented in the correct size in the rendering system, rather than using painter’s algorithm to draw the objects in order from back to front.)

Using this matrix in a system such as OpenGL, for rendering environments that support floating point depth buffers, would be a matter of creating your own matrix (rather than using the built in matrix in the GLU library), and setting a far clipping plane to zs = 0 instead of 1.

By doing this, we can effectively say goodbye to the far clipping plane.

Addendum:

I’m not sure, but I haven’t seen this anywhere else in the literature. If anyone thinks this sort of stuff is worthy of SigGraph and wants to give me any pointers on cleaning it up and publishing, I’d be grateful.

Thanks.

There is nothing new under the sun.

Well, the rant on TechCrunch has gone global: Tech’s Dark Secret, It’s All About Age.

Excuse me while I throw in my two cents, as a 44 year old software developer.

  1. Pretty much all of the useful stuff in Computer Science was invented by the 1960’s or 1970’s. Very little is out there today that is really “new”: MacOS X, for example, is based on Unix–whose underpinnings can be traced back to 1969, with most of the core concepts in place by the early 1980’s.

    Even things like design patterns and APIs and object oriented programming stem from the 70’s and 80’s. Sure, the syntax and calling conventions may have changed over the years, but the principles have stayed the same.

    For example, take the MVC model, first discussed formally 20 years ago. The ideas behind it weren’t “invented” then; the MVC papers from Taligent simply codified common practices that had been evolving in the industry well before then. One can find traces of the idea of separating business logic from presentation logic in things like Curses, or in Xerox PARC’s work from the 1970’s. I remember writing (in LISP) calendrical software for Xerox as a summer intern at Caltech, using the principles of MVC (though not quite called that back then) in 1983.

    Or even take the idea of the view model itself. The idea of a view as a rectangular region in display space, represented by an object which has draw, resize, move, mouse click handler and keyboard focus handler events, can be found in InterLisp, in NextStep, in Microsoft Windows, and on the Macintosh in PowerPlant; hell, I even wrote a C++ wrapper for MacOS 6 called “YAAF” which held the same concepts. The specific names of the specific method calls have changed over the years, but generally there is a draw method (doDraw, -drawRect:, paint, paintComponent, or the like), a mouse down/move/up handler, a resize handler (or a message sent on resize), and the like.

    The idea never changes; only the implementation.

    Or hell, the Java JVM itself is not new: from P-machines running a virtual machine interpreter for Pascal to the D-machine interpreter running InterLisp, virtual machine interpreters have been around longer than I’ve been on this Earth. Hell, Zork ran on a virtual machine interpreter.

  2. I suspect one reason why you don’t see a lot of older folks in the computer industry is because of self-selection. Staying in an industry populated by Nihilists who have to reinvent everything every five years or so (do we really need Google Go?) means that you have to be constantly learning. For some people, the addiction to learning something new is very rewarding. For others, it’s stressful and leads to burnout.

    Especially for those who are smart enough to constantly question why we have to be reinventing everything every five years, but who don’t like the constant stress of it–I can see deciding to punt it all and getting into a job where the barbarians aren’t constantly burning the structures to the ground just because they can.

    I know for a fact that I don’t see a lot of resumes for people in their 40’s and 50’s. I’m more inclined to hire someone in their 40’s as a developer than someone in their 20’s, simply because you pay less per year of experience for someone who is older. (Where I work, there is perhaps an 80% or 90% premium for someone with 4 or 5 times the experience–a great value.)

    But I also know quite a few very smart, bright people who decided they just couldn’t take the merry-go-round another time–and went off to get their MBA so they could step off and into a more lucrative career off the mental treadmill.

    I have to wonder, as well, where I would be if I had children. Would I have been able to devote as much time reading about the latest and greatest trends in Java development or Objective C or the like, if I had a couple of rug-rats running around requiring full-time care? (I probably would have, simply because I’d rather, on the whole, read a book on some new technology than read the morning paper. I would have probably sacrificed my reading on history and politics for time with my children.)

  3. There is also this persistent myth that older people have familial obligations and are less likely to want to work the extra hours “needed to get the job done.” They’re less likely to want to pull the all-nighters needed to get something out the door.

    But in my experience, I have yet to see a development death march with constant overnighters paid off in pizza that didn’t come about because of mismanagement. I don’t know another industry in the world where mismanaging resource sizing, then demanding your workers work overtime to compensate for that failure to do proper resource sizing and advance planning, is seen as a “virtue.”

    And I suspect the older you get, the less likely you are to put up with the bullshit.

    Having seen plenty of products make it to market–and plenty not make it to market–and having lived through several all-nighters and product death marches, I can see a common theme: either a product’s sizing requirements were mismanaged, or (far more commonly) upper management was incapable of counting days backwards from a ship date and properly assessing what could be done.

    The project I’m on, for example, was given nearly a year to complete. And Product Management pissed away 7 of those months trying to figure out what needs to be done.

    The younger you are, the less likely you are to understand that three months is not forever, and that if you need to have something in customer hands by December, you have to have it in QA’s hands by September or October–which means you have to have different modules done by July. If you don’t have the experience to understand how quickly July becomes December, it’s easy to simply piss away the time.

    So I can’t say that it’s a matter of older people not being willing to do what it takes–if upper management also was willing to do what it takes, projects would be properly sized and properly planned. No, it’s more a matter of “younger people don’t have the experience to do proper long-term planning to hit deadlines without working overtime,” combined with “younger people don’t have the experience to call ‘bullshit’.”

  4. There is also, as an aside, a persistent myth that it takes a certain type of intelligence or a certain level of intelligence to be successful in the software industry.

    I’m inclined to believe more in the 10,000 hour rule: if you practice something for 10,000 hours, you will become successful at that thing.

    Intelligence and personality could very well help you gain that 10,000 hours: the first few hours of learning how to write software or learning a new API or a new interface can be quite annoying and stressful. But if you persist, you will get good at it.

    Which means IQ and personality, while perhaps providing a leg up, don’t guarantee success.

    It’s why I’m also inclined to favor more experienced and older developers who have persisted with their craft. If we assume 6 hours of actual development work a day (with the other 2 spent on administrative stuff), then a work year only has about 1,500 hours–meaning 10,000 hours takes about 7 years to accumulate. Assuming you start out of college at 21, this means that anyone under the age of 28 will not have sufficient experience to be good at their craft.

    And that assumes they practiced their craft rather than just going through the motions.

The whole “it’s all about ageism” in the tech industry is an interesting meme–simply because it’s far more complicated than that.