*sigh*

And then some asshole hacked my blog. The only thing I lost was the theme I had tinkered into something resembling what I liked. Now I’m stuck with this until I can restore the old template formatting.

How do I know what environment variables are on iOS?

Apple’s iOS operating system is built on Unix, which implies that you can take your Unix-based C code, wrap it around some Apple iOS UI goodness, and have an iPhone application.

(Well, it’s more complicated than that. But if you’re a crusty ol’ Unix hacker like myself, there is something gratifying about being able to use fopen() and getenv() instead of Apple’s NS-wrapped calls to these same routines.)

So how do you know what the environment variables are that are available on your phone?

Simple: I wrote a small iOS program which neatly displays all of the available environment variables.

And the variables available appear to be:

  • PATH: The path of available unix commands
  • TMPDIR: The temporary directory path (points to the temp folder in your sandbox)
  • SHELL: /bin/sh, natch.
  • HOME: The home directory of your application (points to the root of your sandbox)
  • USER: mobile, natch.
  • LOGNAME: mobile

and some odd ones I’ve never seen:

  • __CF_USER_TEXT_ENCODING = 0x1F5:0:0
  • CFFIXED_USER_HOME, which appears to be the same as HOME

In a debug environment I’m seeing the additional oddball variables:

  • CFLOG_FORCE_STDERR = YES
  • NSUnbufferedIO = YES
  • DYLD_INSERT_LIBRARIES = some path apparently on my host computer. (?)

Now this is on my iPhone running iOS 5.1; YMMV, which is why I uploaded the program. That said, I would trust that both HOME and TMPDIR will be available and point to the right place, and constructing the paths to the Documents and Library folders is just a matter of concatenating onto the path string returned for HOME. So if you need to write a new file to the Documents folder in the home directory of your application you can write:

char buffer[256];
const char *home = getenv("HOME");	/* should always be set on iOS, but check anyway */
if (home != NULL) {
	snprintf(buffer, sizeof(buffer), "%s/Documents/myfile.txt", home);
	FILE *f = fopen(buffer, "w");
	...
	fclose(f);
}

REST is not secure.

Simple Security Rules

Security by obscurity never works. Assume the attacker has your source code. If you are doing some super cool obscuring of the data (like storing the account number in the URL in some obscured manner like the Citi folks apparently did), someone can and will break your algorithm and breach your system.

Excuse my language, but what on this God’s mother-fucking Earth were the Citi folks thinking? Really? REALLY?!?

But I can see the conversation amongst the developers at Citi now. “We really need to implement our web site using modern REST techniques, meaning we cannot store state on the servers, but have the complete state representation in the client, and use stateless protocols to communicate between the client and server.”

Hell, I’ve been part of this conversation at a number of jobs over the past few years.

But here are the problems with a truly stateless protocol.

Stateless protocols are subject to replay attacks.

If a request sent to the back end is stateless, then (unless there is a timestamp agreed to between the client and server which expires the transactions) it must necessarily follow that the same request, when sent to the server, will engage the same action or obtain the same results. This means if, for example, I send a request to transfer $100 from my checking account to pay a bill, that request, when sent again, will transfer another $100 from my checking account to the bill account. If sent 10 times, $1,000 will be transferred.

And so forth.

The biggest problem with replay attacks, of course, is a black-hat sniffing packets: theoretically, since the protocol is stateless, the black-hat doesn’t even have to be able to decrypt the stateless request. He just needs to echo it over and over a bunch of times to clean out your account.
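The standard countermeasure is to make each request provably fresh: the server hands out a one-time nonce, the client includes it in the request, and the server refuses to honor the same nonce twice. Here is a minimal sketch of that server-side check; the class and method names are mine, not from any particular framework:

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Tracks one-time nonces so that a replayed request is rejected. */
public class NonceRegistry {
	private final Set<String> outstanding = ConcurrentHashMap.newKeySet();

	/** Issue a fresh nonce for the client to include in its next request. */
	public String issue() {
		String nonce = UUID.randomUUID().toString();
		outstanding.add(nonce);
		return nonce;
	}

	/** Returns true the first time a nonce is presented, false on any replay. */
	public boolean consume(String nonce) {
		return outstanding.remove(nonce);
	}
}
```

A replayed transfer request then fails, because its nonce was consumed by the first copy of the request. (In production you would also expire unused nonces after some window of time.)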

Stateless protocols are insecure protocols.

As we saw in the case of Citi, vital information specifying your account was part of the request. This means that, if the bad guys can figure out how the account information was encoded, they can simply resend the request using a different account and–voilà!–instant access to every account at Citi.

Combine that with a reworked request to transfer money–and you can now write a simple script that instantly cleans out every Citi bank and transfers the money to some off-shore account, preferably in a country with banking laws that favor privacy and no extradition treaties.

See, the problem with a stateless protocol is that if the protocol requires authentication, somewhere buried in every request is a flag (or a condition of variables) which says “yes, I’m logged into this account.”

Figure out how to forge that, and you can connect to any account just by saying “um, yeah, sure; I’m logged in.”

Stateless protocols expose vital information.

Similarly, a black-hat who has managed to decrypt a stateless protocol can now derive a lot of information about the people using a service. For Citi, that was bank account information. Other protocols may be sending (either in the clear or in encrypted format) other vital, sensitive information.

But we can just use SSL to guarantee that the packets are only being sent to our servers.

When was the last time you got a successful SSL handshake from a remote web server where you took the time to actually look at the certificate to see that it was signed by the entity you were connecting to? And when was the last time you logged into a secure location, got a “this certificate has expired” warning, and just clicked “ignore?”

SSL only works against a man-in-the-middle attack (where our black-hat inserts himself into the transaction rather than just sniffs the data) when users are vigilant.

And users are not.

But we can use a PKI architecture to secure our packets.

If your protocol is truly stateless, then your client must hold a public key to communicate with your server, and that public key must be a constant: the server cannot issue a key per client because that’s–well, that’s state. Further, you can’t just work around this by having the client hold a private key for information sent by the server; the private key is no longer private when a user can download the key.

So at best PKI encrypts half of the communications. And you have the key to decrypt the data coming off the server. And since all state is held by the client, that implies if you listen to the entire session, you will eventually hear all of the state that the client holds–including account information and login information. You may not know specifically what the client is sending–but then, you have the client, so with state information and detailed information about the protocol (contained in the client code), you’re pretty much in like Flynn.

My own take:

REST with a stateless protocol should be illegal when handling monetary transactions.

For all the reasons above, putting out a web site that allows account holders to access their information via a stateless RESTful protocol is simply irresponsible. It’s compromising account holders’ money in the name of some sort of protocol ideological purity.

Just fucking don’t do it.

You can eliminate most of these risks by maintaining a per-session cookie (or some other per-session token) which ties in with account information server-side.

It’s actually quite easy to use something like HttpSession to store session information. And if your session objects are serializable, many modern Java Servlet engines support serializing state information across a cluster of servers, so you don’t lose scalability.

Personally I would store some vital stateful information with the session, stored as a single object. (For example, I would hold the account number, the login status, the user’s account ID, and the like in that stateful object, and tie it in with a token on the client.) This is information that must not be spoof-able across the wire, and information that is only populated on a successful login request.

The flip side is that you should only store the session information you absolutely need to keep around, for memory considerations: there is no need to keep the user’s username, e-mail address, and other stuff which can be derived from a quick query using an integer user ID. Otherwise, if you have a heavily used system, your server’s memory may become flooded with useless state information.

And don’t just generate some constant token ID associated with the account. That doesn’t prevent against replay attacks, and if your ID is small enough, it just substitutes one account number for another–which is, well, pretty damned close to worthless.
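The per-session-token approach above boils down to very little code, even outside a Servlet container: a random, unguessable token mapped server-side to the sensitive state. Here is a minimal sketch (class and field names are mine, not from any framework):

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Server-side session store: the client only ever sees an opaque token. */
public class SessionStore {
	/** The minimal per-session state: just what cannot be spoofed over the wire. */
	public static class SessionState {
		public final String accountId;
		public SessionState(String accountId) { this.accountId = accountId; }
	}

	private final SecureRandom random = new SecureRandom();
	private final ConcurrentMap<String, SessionState> sessions = new ConcurrentHashMap<>();

	/** Called only after a successful login; returns the token handed to the client. */
	public String create(String accountId) {
		byte[] raw = new byte[32];		// 256 random bits: not guessable, not an account number
		random.nextBytes(raw);
		String token = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
		sessions.put(token, new SessionState(accountId));
		return token;
	}

	/** Null means "not logged in"; the account number never crosses the wire. */
	public SessionState lookup(String token) {
		return sessions.get(token);
	}

	public void invalidate(String token) {
		sessions.remove(token);
	}
}
```

The design point is that the token is random per session, so capturing one token compromises at most one session for its lifetime, rather than an account forever.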

Use a two-stage login process to access an account.

It’s easy enough to develop a protocol similar to the APOP login command in the POP3 protocol.

The idea is simple enough. When the user attempts to log in, a first request is made to the remote server requesting a token. This token can be the timestamp from the server, a UUID generated for the purpose, or some other (ideally cryptographically secure) random number. Client-side, the password is then hashed using a one-way hash (like MD5 or SHA-1) along with the token number. That is, we calculate, client-side: passHash = SHA1(password + "some-random-salt" + servertoken);.

We then send this along with the username to the server, which also performs the same password hash calculation based on the transmitted token. (And please don’t send the damned token back to the server; that simply allows someone to replay the login by replaying the second half of the login protocol rather than requesting a new token for this specific session to be issued from the server.) If the server-side hash agrees with the client-side hash, then the user is logged in.
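The hash step is a few lines with java.security.MessageDigest. SHA-1 matches the text above (today you would reach for something stronger), and the fixed application salt here is an illustrative assumption:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Client-side half of the challenge/response login described above. */
public class LoginHash {
	private static final String SALT = "some-random-salt";	// shared, fixed application salt

	/** passHash = SHA1(password + salt + serverToken), hex-encoded. */
	public static String passHash(String password, String serverToken) {
		try {
			MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
			byte[] digest = sha1.digest(
				(password + SALT + serverToken).getBytes(StandardCharsets.UTF_8));
			StringBuilder hex = new StringBuilder();
			for (byte b : digest) hex.append(String.format("%02x", b));
			return hex.toString();
		} catch (NoSuchAlgorithmException e) {
			throw new IllegalStateException(e);	// SHA-1 is present on every JVM
		}
	}
}
```

The server runs the same computation using its stored password and the token it issued, compares the two hashes, and then retires the token, so the exchange cannot be replayed.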

Create a ‘reaper’ thread which deletes state information on a regular basis.

If you are using the HttpSession object, by default many Servlet containers will expire the object in about 30 minutes or so. If you’re using a different mechanism, then you may consider storing a timestamp with your session information, touching the timestamp on each access, and in the background have a reaper thread delete session objects if they grow older than some window of time.

In the past I’ve built reaper threads which invalidated my HttpSessions, to deal with older JSP containers which apparently weren’t reaping the sessions when I was expecting them to. (I’ve seen this behavior on shared JSP server services such as the ones provided by LunarPages, and better to be safe than sorry.)
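Outside a Servlet container, a reaper is just a last-access timestamp plus a scheduled sweep, exactly as described above. A sketch with invented names, assuming sessions are keyed by token:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Expires idle sessions: touch() on each access, sweep in the background. */
public class SessionReaper {
	private final ConcurrentMap<String, Long> lastAccess = new ConcurrentHashMap<>();
	private final long timeoutMillis;

	public SessionReaper(long timeoutMillis) {
		this.timeoutMillis = timeoutMillis;
		ScheduledExecutorService reaper =
			Executors.newSingleThreadScheduledExecutor(r -> {
				Thread t = new Thread(r, "session-reaper");
				t.setDaemon(true);		// don't keep the JVM alive just for the reaper
				return t;
			});
		reaper.scheduleAtFixedRate(this::sweep, 1, 1, TimeUnit.SECONDS);
	}

	/** Touch the timestamp on each access. */
	public void touch(String token) {
		lastAccess.put(token, System.currentTimeMillis());
	}

	public boolean isLive(String token) {
		return lastAccess.containsKey(token);
	}

	/** Delete any session idle longer than the window. */
	void sweep() {
		long cutoff = System.currentTimeMillis() - timeoutMillis;
		lastAccess.values().removeIf(t -> t < cutoff);
	}
}
```

In a real system the sweep would also invalidate the associated session state, not just the timestamp entry.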

For God’s sake, just say “NO!” to REST. Especially when dealing with banking transactions. Especially if you’re the banking services I use: I don’t want my money to be transferred to some random technologically sophisticated mob syndicate, simply because you needed your fucking ideological purity of a stateless RESTful protocol and didn’t think through the security consequences.

GSM Feature codes that appear to work with the iPhone on the AT&T network.

Some random notes I cribbed together regarding GSM feature codes on the iPhone on the AT&T network. I tried some of them (including the #31# trick), but not all of them. YMMV.

GSM Feature Codes (that appear to work on the iPhone on AT&T)

http://www.geckobeach.com/cellular/secrets/gsmcodes.php

Feature codes 21, 67, 61, 62 appear to work. Note that changing the number for unanswered voice calls from the default that was plugged into the phone may make it harder to reset; meaning if you set up call forwarding to something other than the voice mail number (on my phone 1-253-709-4040), write down what it was before so you can re-enable voice mail.

Feature code 31 appears to work. (#31#number hides your phone number.)

http://cellphonetricks.net/apple-iphone-secret-codes/

All codes work. The two digit call values all accept the same formats documented in the link above, I believe. (Note that some of these codes duplicate functionality that can be found in settings.)

http://www.arcx.com/sites/GsmFeatures.htm

This notes that you can also set a timeout parameter for “Forward if not answered” so as to set the amount of time before your call forwards somewhere. I don’t know if this works. Note that the FIDO code (3436) to send to voice mail doesn’t appear to work on the iPhone. Typing in the full number (1 + area code + number) appears to work.

Parsing the new OpenStreetMaps PBF file format.

I’ve been playing with the new .PBF file format from OpenStreetMaps for encoding their files, and thus far I’m fairly impressed. The new file format is documented here, and uses Google Protocol Buffers as the binary representation of the objects within the file. The overall file is essentially a sequence of objects written to a single data stream, with each of the elements of the stream encoded using the Google Protocol Buffer file format.

Here’s what I had to do to get a basic Java program up and running.

(1) Download the Google Protocol Buffers library and decompress.

(2) You will now need to build the Google Protocol compiler, in order to compile the .proto files for the OSM file. To do this, cd into the directory where the protocol buffers were decompressed, and compile:

./configure
make
make install

Note that this will install Google’s libraries into your /usr/local directory. If you don’t want that, do what I did:

mkdir /Users/woody/protobuf
./configure --prefix=/Users/woody/protobuf
make
make install

(Full disclosure: I’m using MacOS X Lion.)

(3) Download the protocol buffer definitions for OSM.

(4) Compile them.

Full disclosure: I downloaded the above files into ~/protobuf, created in step 2 above. When I did this, compiling the files looked like:

bin/protoc --java_out=. fileformat.proto
bin/protoc --java_out=. osmformat.proto

(5) Compile the src/google/protobuf/descriptor.proto file stored in the downloaded protobuf-2.4.1 directory (created in step 1).

Full disclosure: I copied this file from its location in the protobuf source kit into ~/protobuf, created in step 2. I then compiled it with:

bin/protoc --java_out=. descriptor.proto

(6) Create a new Eclipse project. Into that project’s source folder copy the following:

(a) protobuf-2.4.1/java/src/main/java/*
(b) The product files created in steps (4) and (5) (~/protobuf/crosby…, ~/protobuf/com…)

(7) Test application

Now it turns out, from the description of the OpenStreetMaps PBF file format, that the file is encoded using a 4-byte length which gives the length of the BlobHeader record, followed by the BlobHeader record (which contains the raw length of the contents), and a Blob which contains a stream which decodes into a PrimitiveBlock. The map data is contained in the PrimitiveBlock, and there are multiple PrimitiveBlocks in a single file. So the file looks like a sequence of:

Length (4 bytes)
BlobHeader (encoded using Protocol Buffers)
Blob (encoded using Protocol Buffers)

And the blob object contains a block of data which is either compressed as a zlib deflated stream which can be inflated using the Java InflaterInputStream class, or as raw data.

And there are N of these things.

Given this, here is some sample code which I used to successfully deserialize the data from the stored file us-pacific.osm.pbf:

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.zip.InflaterInputStream;

import crosby.binary.Fileformat.Blob;
import crosby.binary.Fileformat.BlobHeader;
import crosby.binary.Osmformat.HeaderBlock;
import crosby.binary.Osmformat.PrimitiveBlock;

public class Main
{

	/**
	 * @param args
	 */
	public static void main(String[] args)
	{
		try {
			FileInputStream fis = new FileInputStream("us-pacific.osm.pbf");
			DataInputStream dis = new DataInputStream(fis);
			
			for (;;) {
				if (dis.available() <= 0) break;
				
				int len = dis.readInt();
				byte[] blobHeader = new byte[len];
				dis.readFully(blobHeader);	// read() may return fewer bytes than requested
				BlobHeader h = BlobHeader.parseFrom(blobHeader);
				byte[] blob = new byte[h.getDatasize()];
				dis.readFully(blob);
				Blob b = Blob.parseFrom(blob);

				InputStream blobData;
				if (b.hasZlibData()) {
					blobData = new InflaterInputStream(b.getZlibData().newInput());
				} else {
					blobData = b.getRaw().newInput();
				}
				System.out.println("> " + h.getType());
				if (h.getType().equals("OSMHeader")) {
					HeaderBlock hb = HeaderBlock.parseFrom(blobData);
					System.out.println("hb: " + hb.getSource());
				} else if (h.getType().equals("OSMData")) {
					PrimitiveBlock pb = PrimitiveBlock.parseFrom(blobData);
					System.out.println("pb: " + pb.getGranularity());
				}
			}
			
			fis.close();
		}
		catch (Exception ex) {
			ex.printStackTrace();
		}
	}
}

Note that we successfully parse the OSMHeader block and the PrimitiveBlock objects. (Each OSM file contains a header block and N self-contained primitive blocks.)

I’m still sorting out how to handle the contents of a PrimitiveBlock; my goal is to eventually dump this data into my own database with my own database schema for further processing. But for now this gets one in the door to reading .pbf files.

I hope this helps someone out there…

As an aside I know there are more efficient ways to parse the file. This is just something to get off the ground with, with the proviso that the code is short and simple, and hopefully rather clear.

On the passing of Mr. Jobs.

I’ve been turning over in my mind how I feel about the passing of Steve Jobs. After all, he’s a public figure: I did not know him, I did not work for Apple, my sole involvement with the products he’s created involve owning a series of Macintosh systems and writing software using a Macintosh. Oh, and owning a series of iPods, iPhones and an iPad.

And really, at some level, aside from being a consumer of products his multi-billion dollar company created, what does a public figure have to do with my own personal life or how I feel about things? After all, would I even bother opening up MarsEdit and typing in this essay at work (when I should be doing more important things) if, for example, Matthias Müller passed away? Would I care if Muhtar Kent passed away? Would I even notice if John Stumpf passed on? After all, I consume their products as well.

I don’t have many personal heroes: people I look to for inspiration for myself. Most people I know, even most people I look towards in public life, sometimes have their moments. But for the most part they strike me as flawed fools who happen to be in the right place at the right time–at best, entertaining and amusing, occasionally saying something interesting or quote-worthy.

Thinking back, I honestly think I only have two, and they were people who shaped my world.

The first is Ronald Reagan. When I reached high school Jimmy Carter was in office, the embassy in Iran was overrun by students, and he refused to light the Christmas Tree because it was somehow wasteful. We were a country in decline, so President Carter said, and the best we could hope for as a people was for the rest of the world to forgive us our sins and accept us as equals–and that, in a world dominated by the Soviet Union and by a Communist system that had absolutely no respect whatsoever for the individual. Learn Russian, hope to be accepted as a cog in the machine, and perhaps the Politburo would assign you to something that wasn’t so dreadful that it couldn’t be drowned out by some cheap potato vodka.

And Reagan seemed to change all that.

President Reagan claimed to be the Hedgehog: He knew just one Great thing.

It was my first introduction to the idea that a man, consumed with one great and solitary vision, could change the world.

But in many ways Steve Jobs and Apple really set the tone of my own thinking, at least when it comes to development and aesthetics and the things I find insanely great.

I learned the idea of user interface design and the aesthetics of great human-computer interaction at the hands of the Apple Human Interface Guidelines and the various SIGCHI papers published by Apple in human interaction studies.

I was inspired by the visual aesthetics of Apple’s products, including early vision videos they published in the 1980’s which attempted to look forward 20 years to where we would be today.

I watched as Apple’s attention to detail (including Steve Jobs’ famously using a loupe to make sure all the pixels lined up in MacOS system software and on the iPhone) and understanding that design is how a thing works, not just how a thing looks, captured the imagination of the consumer public, first with the iPod, then with the iPhone, and finally with the iPad.

I realized over the years of working on Apple software how the various engineering decisions under the hood were driven by a genuine need to make a computer simpler to use–from the comment in an older version of “Inside Macintosh” about how the tiny flutter of the user’s manual comes at the price of the “thud” the two-volume programming guideline set makes when dropped on the desktop. And how this desire to make insanely great products that people could use drove even fundamental research into things like TCP/IP multicasting, which allows someone to just turn on their printer and see it through Bonjour, without ever realizing just how fucking complex the transactions were behind the scenes to make that magic work.

But ultimately I wanted to do insanely great things as well.

One of my own life’s frustrations has always been that I haven’t been able to do the insanely great, that because of circumstances I’ve been put in places where at best I can do “fairly good”–but product managers with no fucking sense, other developers who have no sense of the “lusers” they deal with, and development managers who hated the idea of a direct report outshining them have stood in my way. But then, I’m not alone in this, and it is the trials and tribulations and setbacks, and how we meet them, that define our character.

And even here I know I’m not alone; I simply need to remember how Steve Jobs was ousted to remember that setbacks happen even to the best of us.

It’s not to say that Steve Jobs did all these things: he didn’t write the SIGCHI papers or the Apple HIG documents or create the original Finder and System software packages that came on my original (1984) Macintosh. But he surrounded himself with brilliant people, inspired and guided them, and drove them to greater and greater levels of perfection.

And in seeking this type of concrete perfection, it always felt to me that it was a way to seek to touch the face of God.

In so many ways my own thinking of the world, and how I look to the world, is formulated by the ideas and philosophies and concepts championed and driven by Ronald Reagan (in politics) and by Steve Jobs (in technology).

And as such, Steve Jobs will be sorely missed.

Random snippets of useful code: MacOS scrolling edition.

The fact that scrolling by default places things in the lower left (instead of upper left) seems brain dead to me. Fortunately it’s easy to flip things internally for your application. Just add the following snippet of code to one of your views, and things will work out just fine:

//	Clip view, causes scroll contents to grow from top

@interface NSClipView (Private)
- (BOOL)isFlipped;
@end

@implementation NSClipView (Private)
- (BOOL)isFlipped
{
	return YES;
}
@end

Slaying dragons, or how to compete against the big boys.

Realize that large corporations came into being by starting as small companies, and by carefully growing their user base. Often large corporations with existing product lines have been around for a while, so they have a number of institutional shortcomings which can be exploited for competitive purposes.

This is especially true in the arena of software, where progress over the past 10 years has left older companies with legacy solutions for problems that no longer exist.

Also realize that most companies, in an effort to fill their niche and exploit as many revenue streams as possible, have expanded their products by adding features that very few people use. Often products like this (such as Microsoft Office) are attempting to tick off as many checkmarks on the product feature list as possible in order to justify their high price. From a cost benefit perspective, however, those additional features cost a tremendous amount to develop, in order to capture an increasingly smaller audience–so the companies are in effect using the existing revenue stream to develop features that add very little value to capture a fractional number of people.

So to slay a dragon, or at least to compete against one:

(1) Look for products where the established player has an expensive and complex product, with complexity that was created to solve problems (such as memory limitations or computational limitations) which no longer exist. For example, Photoshop was designed in an era where 8 megabytes of RAM was unheard of, and where processor speeds were measured in dozens of megahertz. So Photoshop is full of legacy image paging and image management utilities that are just no longer needed in an era where the cheapest throwaway laptops have 100 times more RAM and run 100 times faster.

(2) Look for products which have a tremendous amount of complexity, where that complexity has been in part replaced by advancements in operating system design or which is just no longer needed because of advances in hardware. Again, looking at Photoshop, much of Photoshop’s image processing abilities have been duplicated by MacOS’s CoreGraphics engine, which means what once required a Ph.D. in Computer Graphics to code can now be handled by any noob with a few calls to a well-documented API. (The underlying algorithms are still complicated, but they’re delivered to you on a silver platter.)

(3) Look for products where 90% of the users only use 10% of the features.

When these three things are true, you have a perfect storm: the ability to create a product (like Pixelmator) which is inexpensive (because you don’t have to recreate the entire Photoshop computational stack), which does 10% of what Photoshop does (but the 10% that people want), which can be sold for a fraction of what Photoshop sells for.

The brothers who created that tool made millions.

Of course it takes decades to overthrow an Adobe or a Microsoft Word. But it is doable: we’ve already seen it with Quark XPress and with Framemaker, and I believe we’re seeing it right now with Microsoft Office (at least in the Macintosh ecosystem) being displaced by iWork.

Sometimes you can even tackle a market by creating a product that does 10% of a competitor’s functionality, but adds a few new features a competitor would never add in a million years, to attack a new market no-one is really addressing. For example, Opacity is a brilliant little tool that sorta competes against Illustrator, but is really designed for building the little fiddly bits of artwork used for application development. (To tell you how application-centric that tool is, it even has an export utility which will export Objective C code to draw the paths, and has the ability to insert variables that get expressed in code, so you can draw artwork that then is translated into a function call for drawing a UIView or an NSView.)

It is possible to slay dragons. You just have to be careful how you do it. It takes a long time. It’s risky.

But just because some big guy is in that market already doesn’t mean you can’t take that market over and eventually win. After all, Apple was some stupid PC maker who didn’t know anything about the consumer market or how cell phones are made.

And another curiosity: multi-stream BZip2 files.

It’s “weird bugs” day here on Tiny Toons.

And the “weird bug” I was encountering was using Apache Commons Compress’s BZip2CompressorInputStream class to decompress an OpenStreetMaps Planet file on the fly while parsing it. I kept getting an unbalanced XML exception.

To make a long story short, the bzip2 file is compressed using a multi-stream compression algorithm, which means that, in order to use parallel compression on the file stream, the single file is broken into pieces, each piece is compressed, and the overall file is reassembled from the pieces–with each piece basically being a complete BZip2 file.

The best solution of course is to add multi-stream support to the BZip2CompressorInputStream class. But after spending an hour hacking at the class, I came up with a simpler solution: a wrapper input stream which, when it sees that the BZip2 decompressor has returned an EOF but there is still data in the compressed input data stream, restarts the decompressor.

Here’s that class:

import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.compress.compressors.CompressorInputStream;
import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;

/**
 * Handle multistream BZip2 files.
 */
public class MultiStreamBZip2InputStream extends CompressorInputStream
{
	private InputStream fInputStream;
	private BZip2CompressorInputStream fBZip2;

	public MultiStreamBZip2InputStream(InputStream in) throws IOException
	{
		fInputStream = in;
		fBZip2 = new BZip2CompressorInputStream(in);
	}

	@Override
	public int read() throws IOException
	{
		int ch = fBZip2.read();
		if (ch == -1) {
			/*
			 * If this is a multistream file, there will be more data that
			 * follows that is a valid compressor input stream. Restart the
			 * decompressor engine on the new segment of the data.
			 */
			if (fInputStream.available() > 0) {
				// Make use of the fact that if we hit EOF, the data for
				// the old compressor was deleted already, so we don't need
				// to close.
				fBZip2 = new BZip2CompressorInputStream(fInputStream);
				ch = fBZip2.read();
			}
		}
		return ch;
	}
	
	/**
	 * Read the data from read(). This makes sure we funnel through read so
	 * we can do our multistream magic.
	 */
	@Override
	public int read(byte[] dest, int off, int len) throws IOException
	{
		if ((off < 0) || (len < 0) || (off + len > dest.length)) {
			throw new IndexOutOfBoundsException();
		}
		if (len == 0) return 0;
		
		int i = 1;
		int c = read();
		if (c == -1) return -1;
		dest[off++] = (byte)c;
		while (i < len) {
			c = read();
			if (c == -1) break;
			dest[off++] = (byte)c;
			++i;
		}
		return i;
	}
	
	@Override
	public void close() throws IOException
	{
		fBZip2.close();
		fInputStream.close();
	}
}

Wrapping our FileInputStream object in one of these, and feeding this to the SAX XML parser, seemed to do the trick.

Today I didn’t actually get any work done. Instead, the day was spent looking for weird and unusual bugs and banging my head against them.

Meh.