User Interface Truisms.

Fundamentally, how hard a user interface is to use depends on the cognitive load of that interface–that is, on how much thinking you have to do in order to use it.

Now “thinking” is one of those fuzzy things that really needs to be quantified.

As a proxy for user interface complexity (or rather, for how hard an interface is to use), some people count the number of buttons one has to press in order to accomplish a given task. But that is only a proxy: even though it takes 19 button presses (including the shift key) to type “Hello World!”, no one would argue that those 19 button presses are as hard to accomplish as navigating through 19 unknown interface menus.

That’s because the button press is a proxy for a decision point. The real complexity, in other words, comes from making a decision–which goes right back to cognitive load, the amount of thinking you have to do to accomplish a task.

So decision points are clearly a sign of cognitive load.

Now think back when you first saw a computer keyboard, and how mystified you were just to type a single letter: search, search, search, ah! the ‘h’ key! Success! Now, where is that stupid ‘e’ key? Oh, there it is, next to the ‘w’ key–why is it next to the ‘w’ key? Weird. Now, if I were the ‘l’ key, where would… oh, there it is, all the way on the other side of the keyboard. Weird. At least I know to press it twice. And now the ‘o’–ah, right there, next to the ‘l’ key.

Oops. I have a lower case ‘h’. How do I back up? …

Clearly, then, along with decision points we have familiarity with the interface as a factor in how hard a user interface is to use: if we know how to touch type, typing “Hello World!” doesn’t require thought beyond thinking the words and typing them. But for the uninitiated to the mysteries of the computer keyboard, hunting and pecking each of those keys is quite difficult.

Complexity, then, is cognitive load. And cognitive load comes from the difficulty of making the decisions required to accomplish a task, combined with the unfamiliarity of the interface one must search through to find how to accomplish that task.

* * *

Now, of course, from a user interface design perspective there are a few things that can be done to reduce cognitive load, by attacking the familiarity problem.

One trick is to have a common design language for the interface. By “design language” I’m referring, of course, to what you have to do to manipulate a particular control on the screen. If you always manipulate a thing that looks like a square by clicking on it–and clicking causes an ‘x’ to appear in that square, or to disappear if it was already there–then you know that squares on the screen can be clicked on.

And further, if you know that a square with an ‘x’ in it means the item is somehow “selected” or “checked” or “enabled” or whatever–and you know that unchecking something means it’s “unselected” or “un-checked” or “disabled”–then suddenly you have some familiarity: you can quickly see boxes and realize they are check-boxes, and checking them means “turn this on” while unchecking means “turn this off.”

This idea of a design language can even extend to interfaces built strictly using text-only screens: if you see text that looks like [_________], with a blinking square on the first underscore, and typing types into that field–and hitting the tab key moves to the next [_________] symbol on the screen–then you know all you need to know to navigate through a form. Other text symbols can have other meanings as well: perhaps (_) acts like our checkbox example above, or like a radio button (a round thing you can select or unselect, which has the side effect of unselecting all other related round button thingies), or whatever.
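To make this concrete, here is a minimal sketch–in Java, with every name invented for illustration–of how such a text-screen design language might be modeled: one rule for typing, one rule for tab, and one way of drawing a field.

```java
import java.util.ArrayList;
import java.util.List;

class TextForm {
    private final List<StringBuilder> fields = new ArrayList<>();
    private int focus = 0;  // which field owns the blinking cursor

    void addField() { fields.add(new StringBuilder()); }

    // One rule, learned once: typing always goes into the focused field.
    void type(char c) { fields.get(focus).append(c); }

    // One rule, learned once: tab always advances to the next field, wrapping.
    void tab() { focus = (focus + 1) % fields.size(); }

    // Every field is drawn the same way: [text_____], padded with underscores.
    String render() {
        StringBuilder screen = new StringBuilder();
        for (StringBuilder f : fields) {
            screen.append('[')
                  .append(String.format("%-9s", f).replace(' ', '_'))
                  .append("] ");
        }
        return screen.toString();
    }
}
```

Once a user has internalized those three rules, every form built on this model is already familiar.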

The point is consistency.

And this consistency extends beyond the simple controls. For example, if you have a type of record in your database that the user can add or remove from a screen, having the “add” and “delete” and “edit” buttons in the same place as on other screens where other records are added or deleted helps the user understand that yes, this is a list of records, and immediately he knows how to add, delete, and edit them.
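One way to get that kind of consistency is to make it structural rather than a matter of discipline: build the record-list screen once as a reusable component. Here is a hedged Swing sketch of the idea–the class name and callback hooks are my own invention, not from any real library:

```java
import javax.swing.*;
import java.awt.*;

// Build the "list of records" screen once; every screen that reuses this
// panel gets Add/Edit/Delete in the same order and place, so consistency
// is enforced by construction.
class RecordListPanel extends JPanel {
    RecordListPanel(String title, ListModel<String> records,
                    Runnable onAdd, Runnable onEdit, Runnable onDelete) {
        super(new BorderLayout());
        add(new JLabel(title), BorderLayout.NORTH);
        add(new JScrollPane(new JList<>(records)), BorderLayout.CENTER);

        // The button row is always built here, the same way, every time.
        JPanel buttons = new JPanel(new FlowLayout(FlowLayout.RIGHT));
        JButton addBtn = new JButton("Add");
        JButton editBtn = new JButton("Edit");
        JButton delBtn = new JButton("Delete");
        addBtn.addActionListener(e -> onAdd.run());
        editBtn.addActionListener(e -> onEdit.run());
        delBtn.addActionListener(e -> onDelete.run());
        buttons.add(addBtn);
        buttons.add(editBtn);
        buttons.add(delBtn);
        add(buttons, BorderLayout.SOUTH);
    }
}
```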

Visual language provides a way for a user to understand the unfamiliar landscape of a user interface.

* * *

The other trick is selective revelation: exposing the interface in a guided way, revealing the decisions that need to be made in an orderly sequence.

For example, imagine an order entry system where the type of order must first be selected, then the product being ordered, then product-specific information must be entered. This could be implemented by selectively showing the controls that need to be filled out at each step of the process–a wizard, in other words. And notice that irrelevant information (such as the size of a clothing item, which doesn’t apply when ordering a purse) can simply stay hidden.

The goal with this is to help guide the decision making process, gathering the information in the order needed by the system. And by guiding the decision making process you reduce cognitive load: you ask only the questions that are needed, rather than overwhelming the user with a bewildering array of interrelated choices, some of which (such as the clothing size of a purse) are nonsensical.
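As a sketch of how this might look in code–using Swing’s CardLayout, with the step names and the clothing-versus-purse branch invented for illustration:

```java
import javax.swing.*;
import java.awt.*;

// Only one step's controls are visible at a time; steps that don't apply
// (clothing size when ordering a purse) are simply never shown.
class OrderWizard extends JFrame {
    private final CardLayout steps = new CardLayout();
    private final JPanel deck = new JPanel(steps);

    OrderWizard() {
        deck.add(step("What kind of order is this?"), "orderType");
        deck.add(step("Which product?"), "product");
        deck.add(step("What size?"), "size");      // clothing only
        deck.add(step("Confirm the order."), "confirm");
        add(deck);
        pack();
    }

    private JPanel step(String question) {
        JPanel p = new JPanel();
        p.add(new JLabel(question));
        return p;
    }

    // After the product is chosen, ask about size only when it applies;
    // for a purse we jump straight to confirmation.
    void productChosen(boolean isClothing) {
        steps.show(deck, isClothing ? "size" : "confirm");
    }
}
```

Because each card is built independently, the size question physically cannot appear unless the wizard decides it applies.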

* * *

The problem with all these user interface tricks (and there are plenty of others: arranging the information on the screen, tips and hints that dynamically come up, on-line help for the first-time user, making the interface reflect a consistent cognitive model, reducing short-term memory load by segregating items into 7 +/- 2 items or groups, etc.) is that they all go towards tackling the familiarity problem of the interface. In other words, they only go towards reducing the cognitive load of the interface itself.

And, honestly, most of these design patterns are pretty well known–and only go towards reducing the cognitive load of the first-time user. Once someone has gained familiarity with an interface–even a very bad one–the cognitive load imposed by a poorly designed interface is like the cognitive load imposed by a computer keyboard: eventually you just know how to navigate through the interface to do the job. (To be clear, reducing familiarity cognitive load reduces training costs if this is an internal interface, and reduces consumer friction and dissatisfaction with an external interface–so it’s important not to design a bad interface.)

Ultimately the cognitive load of a system comes from the decision points imposed by the interface. And a user interface can only present the information from the underlying system: ultimately it cannot make those decisions on behalf of the operator. (If the user interface could, then the decision wouldn’t need to be made by the operator–and the decision point really isn’t a decision point but an artifact of a badly designed system.)

What this means is that in order to simplify a product, the number of operator decision points must ultimately be reduced–either by prioritizing those decision points (noting which decisions are optional or less important to capture), or through redesigning the entire product.

* * *

Remember: a user interface is not how a product looks. It’s how the product works.

Fixed e-mail

So I broke the outgoing e-mail for this blog.

And it was interesting how I did it: I broke outgoing e-mail by, um, well, switching to Google’s GMail. The problem is that LunarPages (which hosts this account) blocks the outgoing SMTP ports the blog would need in order to send e-mail out through GMail. So reverting back to LunarPages for e-mail hosting fixed the problem.

I don’t know what came over me, switching to GMail…

two_cents++: Apple dropping Java.

Two observations about the whole Apple deprecating Java thing:

(1) One of the biggest problems with Java on the Macintosh is that the Java AWT/Swing stuff is just bloated like crazy. I like Java Swing and the Mac implementation a lot; I have written Macintosh applications in Java that look and feel like Macintosh applications.

But practically speaking it’s just crazy bloated–and I would have been just as happy with something significantly smaller. (Do you really need pluggable LAF libraries? Has anyone you know ever successfully extended the javax.swing.plaf abstract classes to build their own custom look and feel? And would you really want to in the first place?)

The engineering cost is just not worth it.

Had Sun made portability a substantial requirement–requiring, for example, that OpenJDK could be ported to a new windowing environment in less than a few months of one engineer’s time, by simplifying the native dependencies–we wouldn’t be here.

(2) Of all the companies in the world right now using Java, Google is the one doing the most interesting things: they’ve created their own VM for Android, and they’ve created a Java to Javascript cross-compiler in GWT. They seem to be the most devoted to creating a usable subset of Java that allows you to write once (in Eclipse) and run anywhere (on mobile devices, the desktop, and web browsers).

If people really want Java to survive, they should petition Google to purchase the assets behind Java–and let Google maintain Java and extend it into a full-fledged open language.

What’s wrong with business reporting of the Computer Industry

You read through a report on computer industry jobs, or you take a test in high school which leads you to believe you may have a future as a “Computer Terminal Operator” or a “Computer Software Analyst.” And the stuff you read makes no sense whatsoever–things like “A Computer Programmer converts symbolic statements of business, scientific or engineering problems to detailed logic flow charts into a computer program using a computer language”, and you think “what?” Or you see something like:

Disappearing Jobs:

Plus, the work of computer programmers requires little localized or specialized knowledge. All you have to know is the computer language.

And you think “WTF?!?”

Really?

Here’s the problem. All of these descriptions are based on an industry classification scheme first created by the Bureau of Labor Statistics, part of the United States Department of Labor. And the descriptions are hopelessly out of date.

In the world of the Bureau of Labor Statistics, this is how a computer program is created, executed, and the results understood:

* * *

First a business (such as a WalMart) decides that it has a business reason to create a new computer program. For example, they decide that they need a computer system in order to determine which products are selling better in one geographic region, so they can adjust their orders and make sure those products flow to that area.

This problem was probably identified by a Management Analyst (B026), whose job is to “analyze business or operating procedures to advise the most efficient methods of accomplishing work.” Or it was identified by an Operations Analyst (A065).

So they work with a Systems Analyst (A064) in order to restate the problem (“find the areas where different SKUs are selling better, compare against current logistical shipping patterns, and adjust future orders to make sure stores are well stocked for future demand”) into a detailed flow chart and requirements documents outlining how this process should work.

Once this flow chart has been agreed upon by the analysts and operations engineers and scientists, they turn the problem over to a Computer Programmer (A229) in order to convert the flow charts describing the problem into a computer program. This computer program is generally written using automated data processing equipment, such as a punch card reader.

The computer programmer verifies his program by working with a Computer Operator (D308) to submit his punched cards to the mainframe and, after a program run completes, returning the printout to the computer programmer’s “in box”, a wooden box used to hold the printouts designated to a specific programmer. (Computer operators “select and load input and output units with materials, such as tapes or disks and printout forms for operating runs.”)

Once the Computer Programmer has verified that his punch card deck is properly functioning, he will then (depending on the nature of the program) request either to have his job run on a regular basis, or to have it loaded onto the business mainframe so that inputs to his program may be submitted by a Data Entry Keyer (also known as a Computer Terminal Operator, D385)–or the process may be batch run by management, depending on whether the Computer Systems Administrator (B022) permits it.

* * *

Does this sound like the software industry you’re familiar with? No?

Here’s the problem. This is what every organization outside of the computer industry thinks goes on at places like Google, Apple, or Microsoft. Government decisions on education, the Bureau of Labor Statistics reports, and even reporting by places like CNBC are all driven by this image of the computer industry–an image that is about 30 to 40 years out of date.

And no-one in the government has figured it out, because every time they send out a survey on jobs to the computer industry, generally some guy somewhere goes “well, hell, none of this sounds like my guys. So I’m going to guess they’re all in the A064 category, because they need to think about what they’re writing–so they can’t be in the A229 category.”

Until someone tells the Bureau of Labor Statistics their designations are garbage, we’re just going to continue to get garbage out from the BLS, from government managed school textbooks (which still advise people about professions like “Systems Analyst”, a profession that doesn’t actually exist as such), and from reporting outlets like CNBC–all of which get their understanding from the BLS occupational classification system.

* * *

Like fish who don’t notice the water they swim in, we don’t really know how much the government and government classification systems affect our thinking in this country.

* * *

Addendum: Even job survey sites tend to use the same BLS classifications for classifying jobs and salaries, which is why most job sites talk about “Systems Analyst III” or “Computer Programmer II” job designations–designations you will never see on a Google or an Apple job listing. It’s also why figuring out salary requirements is such a royal pain in the ass for the computer industry–because everything is getting classified into buckets that are 30 years out of date.

Really?

Sorry I’m picking on this guy, but he brings up two things that really irritate the living hell out of me about our industry.

Having worked in software development for over 15 years and developing software for nearly 30 years,…

So… you’re counting coding when you were 5?

Look, for all I know this guy was a lawyer who started writing software out of college, but waited to jump ship until he was 35 and he’s now in his 50’s. But most developers I know who want to talk about how much programming experience they have count all the way back to the time when their mommy gave them a toy laptop with a BASIC interpreter.

I’m sorry, but I’d like to call “bullshit” on that.

In every other industry I’m aware of, experience is counted as professional experience, where you had a fucking job (or a volunteer position) where you actually went somewhere and did something for money (or to help a volunteer organization). It involved regular hours and following directions from a managing supervisor, and it involved delivering stuff on a schedule.

Fucking around when you’re 8 on daddy’s computer with the copy of GW-Basic that came with it no more counts (in my mind) as “experience” than wrapping duct tape over a leaking air duct counts as experience as an HVAC contractor, or fixing a leaky pipe counts as experience as a plumber, or helping a friend nail boards together to tack up a broken piece of trim on their house counts as experience as a finish contractor. Renting a U-Haul doesn’t count as truck driving experience, helping your younger sister with her homework doesn’t count as teaching experience, and convincing a friend he should buy a new phone doesn’t count as sales experience.

So why in the name of Cthulhu does tinkering around with a BASIC interpreter after school in the 2nd grade count as development experience?

Really?!?

To me, experience is professional experience. College education can be counted if called out separately as college, in the same way that taking technical courses to become a plumber can count on your resumé towards being a plumber. Fucking around sniffing pipe glue doesn’t count.

By this metric, I graduated from Caltech in 1988 with a degree in Mathematics, with experience in computer graphics, computational theory, a touch of VLSI design and hardware design, and a touch of mathematical optimization theory. From 1987 to 1988 I worked for a company doing computer graphics for the Macintosh (I needed to take one class to finish my degree, so worked full time while finishing up), and since graduating I’ve worked non-stop as a software developer in a variety of fields.

Which means I have 23 years of professional experience and a 4 year degree from a pretty good school.

I don’t count the time when I got a TRS-80 back in the 70’s, or the time when I was 12 learning Z80 assembler to hack my computer, or the time I spent in high school learning LISP. If I did, I could say things like “well, I have been developing software for nearly 33 years”–or is it longer, since I started tinkering with programmable calculators and electronic circuits well before then?

I call bullshit on the practice. Sorry.

In my opinion development experience starts accumulating the first time you get a real fucking job, either post-college, post-high school, or after dropping out of college. Not from the time your elementary school teacher allowed you to play with the old Commodore 64’s after school if you finished your homework first.

You’ve guessed it, Scrum adresses all of these resulting in 99% – 100% on target delivery. So it’s not due to bad programmers if an agile process can fix this.

I also call bullshit on the idea that one management style fits all and fixes all the problems in the computer industry.

Excuse me, but Agile will fix all the ills of wrong estimation, wrong status updates, scope creep, and the like?

Really?

Because Agile will–what, exactly? Reduce the problem set into manageable chunks that can be fed to our steady stream of interchangeable cogs of programmers?

Don’t get me wrong; I think Agile is a reasonable tool for certain problems in a managerial toolkit, along with good bug tracking, a well organized QA process, motivated developers, and a good project plan. But Agile doesn’t fix a damned thing: it simply creates a regular communications channel between individual contributors and managers who need to keep track of the bottom line.

It helps, in other words, if you have good managers and good developers. But it won’t do squat if you have dysfunction in the overall team. And while I’ve seen people say that this dysfunction isn’t Agile, I’d argue that if you use a business label to describe not just the tool but the overall team result, then you’re no longer describing a tool–you’re describing a condition and attributing it to a tool.

And I despise circular thinking.

Software shipped well before the latest management fad came down the pike, and the introduction of this fad did not make managers better leaders or developers better programmers. And I’d even go so far as to argue that the practice simply altered the workplace, making it harder, not easier, to ship new development efforts that are more properly handled with a waterfall method with constant feedback from the development team.

There is a reason why large companies generally bring new products to market by buying the small companies that develop those products, rather than doing it in-house: all of these managerial fads can never replace a well motivated small team of people doing something either out of love or out of greed. Large companies take away motivation by greed, and they tend to marginalize those who have love for a project.

And Agile will never replace passion.

Thanks for reading my rant. And apologies to Stephan Schmidt, whose post just set me off this morning–I actually agree with 90% of what he wrote about the whole “bad programmer” debate, which is honestly a different rant for a different time. Though I will ask: just because I need a C compiler to be more effective than entering a program via a toggle switch panel, does that make me a bad programmer–because I need a tool to help me figure out which bits in program space should be set to 1, and which should be set to 0?

Sometimes you get what you want by accident.

A year ago I had thought to go back and get my Ph.D., thinking that eventually I’d like to turn this into either a teaching gig or into a research gig. I’m nearly 45, you see, and while I probably could go on and do the single contributor thing pretty much the rest of my life (since I love to learn new things all the time), at some point it’d be nice to think of the next generation.

So I talked to a couple of college professors about Computer Science Ph.D.s, and–meh. The problem is most of the interesting research is being driven by corporate need rather than by theoretical considerations. Which makes a certain degree of sense: the Internet and modern CPUs have introduced all sorts of interesting problems that really need solving far more than refining Turing Machines. And university research is often funded by–you guessed it–those corporations generating the problems that need to be solved.

Besides, you do anything for 20+ years, constantly striving to stretch your own talents the entire way, and chances are you probably have quite a bit you could be teaching the professors.

So I punted.

And instead I took a management gig thinking that perhaps I could use the opportunity to act like a college professor–teaching a group of graduate students. Except they’re professional developers with a decade and a half less experience than I, and the problems we’re working on aren’t as theoretical.

But I was hoping no-one would notice that I was spending all my time with one publicly stated purpose–execution–while all the while harboring one secret purpose: developing my direct reports.

Now I’m reading The Extraordinary Leader as part of a management training program at AT&T. And something interesting stood out: one of the most important things a leader can do is develop and advance the people he leads.

Hmmmm…

I guess my secret agenda doesn’t need to be quite so secret.

So here we go. I think what I’m going to do is: (a) when there are tight deadlines, keep the schedule–pushing back at unreasonable requests, setting a schedule that may be aggressive but can be kept, getting the resources we need, protecting my team, and helping out (very selectively) in areas where I can do the most good while leaving most of the work to my team. And (b) when our schedule is not quite so tight, help by creating team building and learning exercises–such as tomorrow’s brainstorming session, where I’m going to turn over the design of a new component to my team and have them come up with several alternative architectural designs, then debate the pros and cons of each.

My theory–and we’ll see if it works–is that through such exercises, both as teaching and as team building through constructive discussion of alternate ways of designing something, my team will learn new things through practice and become better developers and, if they wish, better leaders.

Wish me luck.

Fortunately management pays far more than a professorship.

And then Apple changes the Rules. Again.

So I uploaded J2OC, and had lost interest in it. After all, who needs a second “let’s recompile Java into Objective C” in order to build iPhone and Android applications, if Apple isn’t going to allow it?

Then Apple does this: Statement by Apple on App Store Review Guidelines

In particular, we are relaxing all restrictions on the development tools used to create iOS apps, as long as the resulting apps do not download any code. This should give developers the flexibility they want, while preserving the security we need.

What I would ideally want is a Java VM kernel that can be linked into an iPhone application, one capable of running a jar file. Because ideally I’d like to write model code in Java–so I can port that model code to Android. Yet I don’t want UI bindings into the Apple API–I’d rather just build the UI twice, while the (more complicated) model code remains the same.
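To sketch what that split might look like (all names here are invented for illustration): the model is pure Java with no UI imports, and each platform’s UI observes it through a small listener interface.

```java
// Portable model code: no Android, Swing, or iOS imports anywhere.
public class OrderTotalModel {
    /** Each platform's UI implements this to hear about model changes. */
    public interface Listener {
        void totalChanged(int cents);
    }

    private int totalCents;
    private Listener listener;

    public void setListener(Listener l) { listener = l; }

    // The (possibly complicated) business logic lives here, written once.
    public void addItem(int priceCents, int quantity) {
        totalCents += priceCents * quantity;
        if (listener != null) listener.totalChanged(totalCents);
    }
}
// On Android an Activity would implement Listener directly; on the iPhone
// the cross-compiled model would be driven by hand-written Objective C UI.
```

The more complicated model logic is written once; only the thin UI layer is written twice.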

Thank you Apple. Maybe I’ll document J2OC better and provide some sample programs. It really is a cool little bit of technology. 🙂

Something Funny Happened To Me On The Way To Release.

So I started playing with parsing Java class files, creating a cross compiler capable of converting Java class files into Objective C files. I even had a sufficient amount of Apache Harmony running so I could use a good part of the java.lang and java.util classes–roughly on par with what the GWT cross compiler supports when compiling Java into Javascript.

Then Apple dropped the “no cross compiling” bombshell.

Now, keep in mind that I’m just me, tinkering in my spare time on weekends. I don’t have the desire or the time to go up against Apple. I’d rather allow the XMLVM project (which has a well established ecosystem, or so it seems) to decide to go (or not go) against Apple’s wishes.

Then time went by, and I sort of lost interest in this thing.

So I’ve taken the liberty of posting the source code here: the Java to Objective C Compiler sources, and the J2OC RTL, which contains a subset of the Apache Harmony project implementing the java.lang and java.util classes.

It’s been an interesting project, and hopefully in the next few weeks I’ll document how this all works–including the weirdnesses and pitfalls I came across with the Java VM while getting Apache Harmony to work. (Nothing like working through a very large collection of class files to find all the fringe cases.) The output code was intended to be human readable–but for some expressions it really isn’t.

But I’ll describe that in the next few weeks.

And at some point I’ll post an example iPhone application which includes Java code.

Note that my approach was different from the XMLVM project’s. Instead of providing Java bindings for the iOS libraries, my intent was to allow only the compilation of a computational kernel, then have the user provide the UI elements separately for Android, the iPhone, the iPad, and whatever other target the code was to be compiled for.

So you won’t find a turn-key solution for recompiling Android code and having it run on the iPhone. You should really check out the XMLVM project instead.

All this code, by the way, is being published under a BSD style license: go ahead and use the code, but leave me out of it and don’t blame me if it goes haywire.

* * *

While I don’t intend to get into the functioning of the compiler, I will give a taste of how the code works. The bulk of the .class file parser, which reads and loads the .class file data into memory, is contained in the class ClassFile in com.chaosinmotion.j2oc.vm. This class takes in its constructor an input stream opened to the first byte of a .class file, and loads the entire class file into memory.
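In other words, driving the parser looks roughly like this. The constructor signature is as described above; everything else (the Dump harness, the stream plumbing) is scaffolding I’ve invented for illustration:

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;

import com.chaosinmotion.j2oc.vm.ClassFile;

public class Dump {
    public static void main(String[] args) throws Exception {
        // Open a stream positioned at the first byte of a .class file...
        try (InputStream in = new BufferedInputStream(
                new FileInputStream(args[0]))) {
            // ...and the constructor loads the entire class file into memory.
            ClassFile classFile = new ClassFile(in);
            // From here the getters on ClassFile expose the parsed contents:
            // methods, fields, attributes, and the code itself.
        }
    }
}
```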

Once read, the entire class file can be accessed using the getters associated with that class. The bulk of the code inside the .vm package (and its subpackages) is used to represent the contents of the class file. The .vm.data classes contain the various data types used to store the metadata within a class file (such as the method names, the attribute fields, and the like), and the .vm.code classes contain a code parser that converts the code within the .class files into an array of processed instructions.

Once the instructions are parsed (by the vm.code.Code class), the code in a method is represented as an array of code segments: a run of instructions that starts with an instruction first jumped into by another instruction, and terminates either with the end of the method or with a jump instruction. In other words, a CodeSeg (the Code.CodeSeg class) is a section of instructions that is always entered at the first instruction and executes sequentially to the last instruction in the segment. Additional information is noted as well, such as the list of variables in use when the segment is entered; this is the current state of the Java operand stack as the segment is entered.
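Conceptually a code segment is what compiler texts call a basic block. A simplified, hypothetical rendering of the structure–not the actual Code.CodeSeg fields–might look like:

```java
import java.util.ArrayList;
import java.util.List;

// A simplified stand-in for Code.CodeSeg: a straight-line run of
// instructions entered only at the top and left only at the bottom.
class BasicBlock {
    interface Instruction {}                 // stand-in for parsed VM opcodes

    final List<Instruction> instructions = new ArrayList<>();

    // The state of the Java operand stack at the moment this block is entered.
    final List<String> stackOnEntry = new ArrayList<>();

    BasicBlock jumpTarget;    // target of a final branch instruction, if any
    BasicBlock fallThrough;   // successor when control falls off the end
}
```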

Ultimately the code parser and class file reader represent the code in a .class file in memory in an intermediate form that can then be used to write Objective C with the WriteOCMethod class (in com.chaosinmotion.j2oc.oc). Another class, CodeOptimize (in the .oc package), provides utilities that determine whether code preambles must be written for memory management or for exception handling: the memory management preamble does not need to be written if the method never invokes another method. (This is the case for simple functions which return a field or do simple math.)

The theory is that, in practice, it should be possible to replace the code writer with a writer capable of emitting a different language, such as C++ or C.
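That replacement point might be expressed as an interface like the following–a sketch of the idea only, not the actual J2OC API:

```java
import java.io.IOException;
import java.io.Writer;

/** Placeholder for the in-memory method representation from the .vm packages. */
class ParsedMethod { }

// The parsed, segmented code is handed to a writer; emitting C++ instead of
// Objective C would mean supplying a second implementation of this interface.
interface MethodWriter {
    void writeMethod(ParsedMethod method, Writer out) throws IOException;
}

class ObjectiveCWriter implements MethodWriter {
    @Override
    public void writeMethod(ParsedMethod method, Writer out) throws IOException {
        // Walk the code segments and emit Objective C, as WriteOCMethod does.
    }
}

class CppWriter implements MethodWriter {
    @Override
    public void writeMethod(ParsedMethod method, Writer out) throws IOException {
        // Same traversal of the intermediate form, different output syntax.
    }
}
```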

* * *

In the future, when I have more time, I’ll write more about the J2OC project. But for now, if there are any segments or parts you want to use or play with, be my guest.