Thoughts on Netbooks.

One of the things that the company I’m working for has considered is creating a Netbook version of our location-based software. So I bought a Netbook (The MSI Wind 120) in order to understand what all the fuss is about.

My observations:

(1) This is not a new product category.

To me, a new product category is a product which I interact with differently than with existing products. For example, I interact with my iPhone in a completely different way than I do with my 17″ laptop computer: I keep my iPhone in my pocket and pull it out to look things up quickly, while my laptop gets pulled out of a briefcase and unfolded on a table.

A Netbook is a small laptop. It doesn’t fit in my pocket; I have to take it out and put it down on my lap or a table. I have to boot it up. It’s more convenient: I can use a smaller briefcase. It’ll more easily fit on the airline tray table in coach. But it’s a laptop computer.

(2) The best part about a Netbook computer is not that it is small, but that it is cheap. If you need a second computer or you cannot afford a good desktop computer, and you’re using it primarily for text editing or web browsing, a Netbook computer makes an excellent low-cost choice: I only paid $300 for mine.

(3) Other than this, the keys are cramped (making it a little harder to touch-type on), the screen is small (1024×600 pixels), making it inconvenient to use for text editing, and there is no built-in CD-ROM drive. (The Samsung SE-S084B external USB DVD-writer is around $60 and works great with the MSI Wind.) Thus, while it is an excellent low-cost choice, it’s clear it’s a low-cost choice: you are giving up a lot to save the $500-$1000 span between the 10″ laptop and a well-equipped 12″ laptop.

(4) A cheap laptop will never dethrone the iPhone in the mobile space. On the other hand the eagerness of mobile carriers to come up with something to dethrone the iPhone may force them to consider lowering the price of an all-you-can-eat data plan for laptops, which means a wireless cell card and/or built-in 3G wireless in laptops will undoubtedly be coming down the pike in the near future.

The real question, however, is will it be too little too late: a proliferation of free and cheap Wifi hotspots may make all-you-can-eat 3G wireless for laptops a terrible value proposition unless you need to surf the net out in the boondocks. (On the other hand, if you are in construction or farming, where you routinely work in the boondocks, 3G wireless for laptops will be a god-send.)

(5) A small form-factor touch-screen tablet will be a new product category if it satisfies the following requirements:

(a) Fast boot time. A touch-screen tablet needs to go from off to on in 2 seconds or less.

(b) It should be two to three times the size of the original iPhone. To get an idea of what I mean, here is a picture showing the relative sizes of my iPhone, an HP 50g calculator, the original Kindle, the MSI Wind, and my 17″ MacBook Pro.

The iPhone is an ideal size to fit in a pocket, but once you get to the size of the HP calculator (one of the larger calculators out there), you need to put it in a backpack or a briefcase or purse. Around the size of the MSI Wind, you need a dedicated carrying case for the device.

To me, an ideal “new product category” item would be somewhere between the size of the HP 50g calculator and the Kindle, with the Kindle being the top size for such a device.

(c) Battery capacity should be enough to allow the device to surf the ‘net and use the CPU full-bore for a minimum of 4 hours. The iPhone gets its monumental lifetime between charges from very clever power utilization: when surfing the ‘net, once a page is downloaded to your phone, the CPU is turned off. (It’s why Javascript animation execution stops after 5 seconds.) But if you write software that constantly runs and is not event-driven, especially software that uses the ‘net at the same time, the iPhone battery will drain in less than an hour.

I believe for such a small form factor touch screen device to do the trick it needs about 4 times the battery capacity of the iPhone.

Once you reach this size and have something that is “instant-on”, you now have a device that is big enough to work on where you are–and perhaps balance in one hand while you use it in another–but not so big that you need to find a table at Starbucks to pull it out. In fact, such a device would occupy the same product category space (in terms of size, form factor and how a user could interact with it) as a large calculator.

Which means one application which would be ideal for such a device would be a port of Mathematica or some other calculator software which would put the HP 50g to shame. Another ideal application would be web surfing; ideally such a device would devote more disk caching to web surfing than the iPhone does. Vertical software for engineers and e-book readers would also work.

The idea here is to create a device that straddles the mid point between the iPhone “pull it out, look it up, put it away” 30 second use cycle, and the laptop “gotta find a table at Starbucks so I can pull it out of my briefcase” 1-5 hour use cycle.

And the MSI Wind (and other clamshell shaped cheap laptops) ain’t it.

Update: However, the CrunchPad very well may be the product I’m thinking about, assuming there is a way to install new software on the unit.

The importance of sending a view size changed event on a mobile device.

On Windows Mobile 5 (and I assume the same is true of 6 and 7), the small bar at the bottom which shows the current meaning of the two smart buttons is not a separate chunk of real estate taken away from the rest of the application; instead it is a floating HWND object. It is up to your application to know if that HWND object is present, and to make sure you don’t obscure the HWND.

On the iPhone with the v2.2 and v3.0 OS, the slide-up keyboard is essentially a floating window which overlays your UI. It is your responsibility to know if the keyboard is present, and if so, resize the contents of your view accordingly. That means if you have a table view with an edit text field that is graphically in the area where the keyboard will appear, you have to figure out a way to move the table contents up and down as the keyboard appears and disappears.

I contend that the region of the screen devoted to the keyboard or custom keypad or the like should not be handled as a pop-up window overlaying your user interface. Instead its appearance and disappearance should be handled as events that resize the main “window” of your user interface. And instead of hooking a half-dozen different events that could change the form factor of the screen, all of these events should be handled exactly the same way they’re handled in a desktop application: with a resize event sent to the “window”, which then percolates down the view chain.
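A minimal sketch of the idea, in plain Objective-C using only Foundation and assuming ARC. SketchView, SketchRect and KeyboardDidChangeHeight are hypothetical names invented for illustration; they are not UIKit types, and the "each child fills its parent" layout rule is only a placeholder for whatever layout a real view would do.

```objc
#import <Foundation/Foundation.h>

// Hypothetical rectangle type; a stand-in for CGRect so the sketch needs only Foundation.
typedef struct { double x, y, width, height; } SketchRect;

// Hypothetical view class: every view gets one resize message and forwards it to its children.
@interface SketchView : NSObject
@property (nonatomic, assign) SketchRect frame;
@property (nonatomic, strong) NSMutableArray *subviews;
- (void)addSubview:(SketchView *)view;
- (void)resizeTo:(SketchRect)newFrame;
@end

@implementation SketchView
- (instancetype)init {
    self = [super init];
    if (self) {
        _subviews = [NSMutableArray array];
    }
    return self;
}

- (void)addSubview:(SketchView *)view {
    [self.subviews addObject:view];
}

// Placeholder layout policy: record the new frame, then make each child fill this view.
- (void)resizeTo:(SketchRect)newFrame {
    self.frame = newFrame;
    SketchRect childFrame = { 0, 0, newFrame.width, newFrame.height };
    for (SketchView *child in self.subviews) {
        [child resizeTo:childFrame];
    }
}
@end

// The keyboard appearing (or leaving, with height 0) is reported as a resize of the "window".
static void KeyboardDidChangeHeight(SketchView *window, SketchRect screen, double keyboardHeight) {
    SketchRect visible = screen;
    visible.height = screen.height - keyboardHeight;
    [window resizeTo:visible];
}
```

With this shape, a table view sitting under the window never needs to know a keyboard exists; it only ever sees its own frame change.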

Unfortunately it appears no one agrees with me. And so we’re stuck doing all sorts of complicated stuff–including on Android, which tears down and rebuilds the “Activity” (the thing which manages the views in a UI) rather than simply sending a resize event.

Judging an iPhone Development Contest

This evening I’m going to USC to judge an iPhone development contest being held there. It should be interesting–and a lot of fun. Part of my goal is to keep an eye out for good talent we may want to poach, and part of my goal is to judge the applications according to concept, user experience, implementation and business model.

Now me, I’m not good on the business model side of the fence: I tend to think of a business model as being “make a good product and sell it for a reasonable price”, which means half of the business models used nowadays in the tech industry make no sense to me. But what do I know? On the other hand, user experience and implementation: now all that I’ve got in spades, and I intend to be an opinionated bastard tonight, turning it into a teachable moment if at all possible.

So… How do people use their phones?

Way back in June when I started working for the startup I’m working for now, I made an observation about how people will use their smart phones–a fact which was ignored, and which I believe is hurting our app now. And it reflects how we use different computing form-factors in general.

A desktop is not a laptop is not a pocket computer.

A desktop is an immersive environment. Consisting of a very large display screen and keyboard, a desktop computer is something that generally has a fixed location, but can have a lot of attached or non-portable devices that allow you a great deal of power. Desktop computers are perfect for the daily grind of working at a fixed location.

Laptop and portable computers are less immersive simply because they are smaller. In general, you use a laptop computer by sitting down at a given fixed location, setting up your environment (pull out laptop, open it up, turn it on, find a wireless connection, plug it in), and working. Battery life for a laptop is not all that important unless you set up in an area without an outlet–at which point battery life is invaluable. Battery life is also invaluable if you use a laptop to go from meeting to meeting, where setup time needs to be as short as “take out, flip open.”

Phone computers are the least immersive device, and the one that interests me the most because it is the one that is the least understood. The few people I’ve talked to about phone computers have talked about creating compelling immersive environments for the phone–and tackling the problem of figuring out how to create an immersive environment on a tiny form factor. But phone computers are not immersive.

Certainly there are “immersive” games being designed for the iPhone computer. But there are also a lot of people who seem surprised that some of the most successful applications for the iPhone are things like fart jokes. The reality is that these successful tiny applications are not simply successful because the App Store is improperly structured–though this contributes. No, they are successful because people use the iPhone differently than they use laptop computers.

You see, most people use the phone in the following way:

(1) They pull it out of the pocket and press the “unlock” key.
(2) They tap a few keys. Perhaps they’re getting someone’s phone number, or placing a call, or pulling the finger.
(3) They then put it back into their pockets.

In other words, the tiny, non-immersive form factor combined with a non-existent setup time (as compared to laptops or desktops) means that most interaction with the phone is going to be a very quick cycle of “pull it out, ask a question, get an answer, put it back.”

Now there are times when people are using their phones for an extended period of time. But that extended period of time will stem from one of two sources. First, a user may be stuck on a subway or in an airport and have nothing to do–at which point they’ll use the phone as a source of entertainment. They’ll pull it out and watch a movie or TV show, or browse the web, or play a game. Because they’re in a circumstance where they’re killing time, however, the game or program should be able to save state quickly, and be entertaining immediately. A first person shooter that has no objective but quickly blowing people away makes more sense than a ‘capture the flag’ game or a game which takes some extended period of time to complete the level: you don’t know when the user will be back on the subway or stuck in an airport. And they may only be killing five minutes.

The second reason why a user will continue using an application (IMHO) is when he pulls it out, asks a question–then doesn’t get the answer he’s looking for, so re-asks the question in a different way. But this is not a good model of user interaction: the user is not interacting with your application because he has suddenly gone into an immersive interaction. He’s frustrated, because he’s not getting the answer he wants.

Because of this, information-providing applications should provide information quickly. All the usual rules of computer-human interaction apply: a quarter of a second is the longest acceptable delay between touching something and having it react. Two to four seconds is the longest acceptable delay for doing some computation–and during this time a spinning animation should be shown. If something requires more time (such as getting the GPS location or hitting a server) and takes more than 5 seconds, a status display should be put up, letting the user know exactly what the software is doing. And if the entire interaction takes longer than perhaps 15 seconds, you’d better have a very good reason why–because chances are the user is going to flip the application off and put it away, unless he really needs the information.
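Those thresholds can be written down as a small lookup. This is only a sketch under assumptions: the enum and function names are hypothetical, and the exact cut-offs (0.25, 5 and 15 seconds) are one reading of the rules of thumb above rather than hard limits.

```objc
#import <Foundation/Foundation.h>

// Hypothetical feedback levels for the sketch.
typedef enum {
    SketchFeedbackNone,            // under a quarter second: the touch itself should already have reacted
    SketchFeedbackSpinner,         // short wait: show a spinning animation
    SketchFeedbackStatus,          // past 5 seconds: say exactly what the software is doing
    SketchFeedbackLikelyAbandoned  // past about 15 seconds: the user is probably putting the phone away
} SketchFeedback;

// Maps how long the user has been waiting to the kind of feedback that should be on screen.
static SketchFeedback FeedbackForElapsedTime(NSTimeInterval seconds) {
    if (seconds < 0.25) return SketchFeedbackNone;
    if (seconds <= 5.0) return SketchFeedbackSpinner;
    if (seconds <= 15.0) return SketchFeedbackStatus;
    return SketchFeedbackLikelyAbandoned;
}
```

A caller might poll this from a timer while a request is in flight and swap the spinner for a status line once the result changes.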

But the bottom line is this: if your application provides information, it needs to work in a very quick cycle of “pull the phone out, ask a question, get an answer, put the phone away.” If you are designing your application to be an immersive environment or to function in any way other than in a “pull it out, ask a question, get an answer, put it away” loop, you’re designing your application for failure.

KISS in practice.

I was having a conversation with a co-worker, where I complained that the biggest problem with some of the code we’re working on is that it was over-engineered. I reminded him of Gall’s Law:

“A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system.”

He asked me a question about client-server interaction, and rather than shooting off a 140 character reply it seems to make sense to answer the question more fully here.

To summarize, we are building an application that communicates with a remote server using JSON over HTTP. The remote system is built on top of Drupal. (Honestly it’s not how I would have done it, because there seem to be too many moving parts already–but the folks who built the server assure me that Drupal is God, and who am I to complain?)

Now on an error there are several things that can go wrong. The remote server can return a 4xx or a 5xx HTTP error. It can return a JSON error. Or it could return an HTML formatted error (I guess they have some flag turned on which translates all remote errors into human-readable error messages.) The client is hardened against all of these return possibilities, of course, and is also hardened against a network failure or against a server being unresponsive.

So here’s the question: how should the client respond to this wide variety of errors?

On the server side, of course, there needs to be logging. We even do some device-level logging, and a mechanism is in place on our iPhone client to transmit that log in the background to the server if there is a problem. Because clearly from a developer’s perspective a 404 is not a 501 is not a 505 is not a “server is unresponsive” is not an “illegal character in the API” is not a “network down” error. So clearly we need to know what is going on so we can fix it.

But what does the end user need to know?

In my mind, handling this is very simple: what can the user affect?

In the case of a mobile device, the user can try again later, or he can make sure he’s got good network reception. Period.

So in the event of a failure, the client’s behavior should be simple: if a connection fails, then test to see if the network is up. If the network is down, tell the user “I’m unable to talk to the remote server because your network is down.” And if the network is up, tell the user “Sorry, a problem occurred while talking to the server.”

And that’s it.

If the network is down, the user can try to turn it on: he can take the device out of airplane mode, or move to where he has reception, or turn on the WiFi. But it’s up to the user–we just have to tell him “Um, the network is down.” And let him decide how he wants to handle the problem–or even if he wants to handle the problem: it could be he clicked on our application while at 38,000 feet over the Atlantic.

If we can’t hit the server, but the network is up, then there is nothing the user can do about it. All we can do as software developers is make sure whoever is maintaining the servers knows to investigate the outage. All we can tell the user, however, is “sorry, no dice; try again later.”

And hope to hell our servers are reliable enough we don’t drive users away.

Making it more complex is silly. It doesn’t serve the user who may be stressed out by messages he can do nothing about anyway.
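The policy described above fits in a few lines. This is a sketch, not our actual client code: the failure kinds and the function name are hypothetical, NSLog stands in for whatever device-level logging is in place, and how networkIsUp gets determined (some reachability check) is assumed and not shown.

```objc
#import <Foundation/Foundation.h>

// Hypothetical failure kinds. The developer log keeps them apart; the user never sees them.
typedef enum {
    SketchFailureHTTPStatus,   // 4xx or 5xx response
    SketchFailureJSONError,    // server returned a JSON error payload
    SketchFailureHTMLError,    // server returned an HTML-formatted error page
    SketchFailureTimeout,      // server unresponsive
    SketchFailureNoConnection  // request never left the device
} SketchFailure;

// Logs the detail for developers, then picks one of exactly two user-facing messages
// based only on the one thing the user can affect: whether the network is up.
static NSString *UserMessageForFailure(SketchFailure failure, BOOL networkIsUp) {
    NSLog(@"request failed: kind=%d networkIsUp=%d", (int)failure, (int)networkIsUp);
    if (!networkIsUp) {
        return @"I'm unable to talk to the remote server because your network is down.";
    }
    return @"Sorry, a problem occurred while talking to the server.";
}
```

The failure kind is deliberately ignored when choosing the message; a 404 and a timeout look the same to the user because he can do the same thing about both.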

Managerial Abuse.

One of the most interesting comments I ever read in any documentation was something I encountered years ago in Apple’s documentation. It was a technical note discussing using the Apple Macintosh Toolbox, and the key point was essentially that you should not just follow the rules, but you should go the extra step and follow the “gestalt” of the rules. A similar technical note observed that you should not abuse certain managers and use them for functionality they were not designed to handle. Ultimately it was about writing code that not only used the API according to the letter of the law, but also used the API in a way which observed the spirit (and the underlying design goals) of the API.

I have found this in general to be very sound advice, and it amuses me greatly (in a sick and perverted way) to watch programmers stretch the rules because they thought something should be done one way, and suddenly their stuff stops working because their changes went against the grain of the design rules of the platform they’re writing code on.

The latest example: I’m working on a program which at one point needs to access web pages on the iPhone and display them. The developer I’m working with has some very strong ideas as to how the HTTP protocol should work–and apparently his strong ideas have little to do with common HTTP practice. For example, the two common ways to pass parameters to an HTTP server are (a) ‘GET’ parameters, and (b) ‘POST’ parameters in the body of the request. So what does he want to do? Put the parameters in the HTTP header as ‘X-company-xxx’ extensions. And he wants the iPhone to always send these parameters, even from the UIWebView class within the iPhone.

So how did he want to accomplish this feat?

On the iPhone every web request through UIWebView triggers a call to the -webView:shouldStartLoadWithRequest:navigationType: method of the UIWebView delegate. The request parameter is declared as an NSURLRequest.

But it turns out this is actually an NSMutableURLRequest when the user clicks on a link. So, his solution was to cast the NSURLRequest (as advertised in the interface) to an NSMutableURLRequest, verify that it responds to the setValue:forHTTPHeaderField: selector, then populate the new values he wants to send back:

- (BOOL)webView:(UIWebView *)webView shouldStartLoadWithRequest:(NSURLRequest *)req navigationType:(UIWebViewNavigationType)navigationType
{
    NSMutableURLRequest *request = (NSMutableURLRequest *)req;

    if ([request respondsToSelector:@selector(setValue:forHTTPHeaderField:)]) {
        [request setValue:@"myValue" forHTTPHeaderField:@"X-Company-parameter"];
    }
    return YES;
}

Upon reading the above, if your reaction is “hey, that’s pretty clever”, please leave your name and address in the comments section of my blog so I can make sure I never work with you ever, in any professional capacity. However, if your reaction was “ick”, then your reaction was the same as mine. It took two people to convince me to let it go–with my ultimate reaction being “hey, it’s your code. Just as long as you deal with it when it bites you in the end, go for it.”

And guess what? It bites you in the end. By overriding this method and manipulating the request, for whatever reason, map URLs (that is, http://maps.google.com URLs which would otherwise bring up the Apple Google map application) break in this situation.

Managerial abuse bites you in the ass in the end because you start making assumptions about how the underlying system works. And if the system doesn’t work that way, then a future revision to the operating system will break your application hard, and sometimes in very hard to detect ways.

And in this case, this story abused two things: (1) It abused a channel normally reserved for low-level client/server interaction to send information in HTTP that should otherwise be sent in the traditional way with GET or POST parameters, or that should otherwise be handled with cookies. (This abuse puts additional requirements on how client code should behave when interacting with a web site, and these other requirements may effectively pre-empt porting the client to operating systems which don’t provide a mechanism to stuff HTTP header fields on web browser initiated clicks.) (2) It abused an existing API by coercing a non-mutable object through pointer casting into a mutable object, assuming the underlying object will always be mutable.

Thus, life is made unnecessarily complex, and for no better reason than the fact that a developer wants to invent an ad-hoc standard outside of the “gestalt” or design guidelines of the existing spec–only so he can show off just how clever he can be.
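For the second abuse, one alternative that stays inside the advertised interface is to copy the request instead of casting it. The sketch below assumes ARC; the function name is hypothetical, and the header name and value are the ones from the example above. It only shows how to build the modified request. It does not show how to get the web view to load it, and it does nothing about the first abuse.

```objc
#import <Foundation/Foundation.h>

// Returns a new request carrying the extra header; the original is left untouched.
static NSURLRequest *RequestByAddingCompanyHeader(NSURLRequest *original) {
    // -mutableCopy is part of NSURLRequest's public interface (NSMutableCopying),
    // so this works whether or not the object handed to us happens to be mutable.
    NSMutableURLRequest *copy = [original mutableCopy];
    [copy setValue:@"myValue" forHTTPHeaderField:@"X-Company-parameter"];
    return copy;
}
```

The difference from the cast is that nothing here depends on what class the system happens to pass in today.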

Thoughts on project management.

Why don’t designers and developers have more power at many companies? The thought arises from the observation that newspapers could be saved if designers had more power.

I think the answer is fairly simple: there is (or rather, there should be) a clear difference between the functional requirements for a project, and the artistic and technological choices made to fulfill those requirements. For example, when designing a newspaper, the functional requirements may be to make the cover eye-catching for important news ‘above the fold’, to make the banner clear, and to make the contents easy to navigate. Technical functional requirements may also include information such as the number or placement of color pages, or the paper size itself, and business functional requirements may include the percentage of pages devoted to advertising.

Of course this should be obvious as hell. But in practice it is not, and I see two reasons why things fail.

First, they fail because the project manager or owner does not have a clear idea of the functional problems he is trying to solve, or doesn’t want to commit to a simple list of problems that he wants to solve. For example, a project manager managing a new software package who doesn’t know the target audience or understand the use cases for his software package is simply going to be unable to figure out what the functional requirements are–and a “swiss army knife” of functional requirements or (worse) a fluid and ever-changing list of core requirements spells disaster, because the manager or owner can never be made happy and the product will never ship.

Second, they fail because project managers or owners envision themselves as designers or developers or software architects, and rather than enumerate a list of core requirements and allow their people to do their jobs with some degree of flexibility, they micromanage the process. For example, I suspect most newspapers don’t allow designers to have power over the design of the newspaper because the newspaper owners think the page layout should be done a certain way or the newspaper should have a certain look. They don’t allow the designer any power, having decided that they (the owners) have more design experience than the designer.

Often, people who fall in the first category (they have no clue of the functional requirements) insert themselves into the process and fall into the second category (micromanaging the process, by second-guessing the hired professionals), simply because separating the functional from the design or development process requires experience. There is a difference between “make the newspaper’s headlines readable from a distance” and “use Helvetica 36 point type for the headline”–yet most inexperienced project managers are unable to separate the two.