Computer Languages and Entry Points

Every computer language has what I would call an “entry point”: a set of things you need to ‘grok’ in order to understand that language. They’re fundamental assumptions about the shape of the universe which are just “off” enough that you’ll forever be confused if you don’t get them. (Or at least they’re things that confused me until I got them.)

Here are a few languages I spent time learning and the ‘pivot points’ around those languages.

LISP
* It’s all about recursion. Learn to live by, and love, recursion.
* Because there is no real natural ‘syntax’ per se, pick a ‘pretty print’ style and stick with it.
* There is no such thing as a single “program”; applications just sort of ‘jell’. This is unlike C or Pascal programs, which definitely have a beginning, middle, and end to development. (This fact made me wonder why, besides Paul Graham, no-one used LISP for web development.)

Pascal and C
* The heap is your friend. Understand it, appreciate it, love it.
* The heap is your enemy; it is a mechanism designed to kill your program at every opportunity if given a chance.
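
Both halves of that show up in even the smallest example. Here is a rough sketch, written C-style but compiled as C++; the helper names `copy_to_heap` and `release_block` are made up for illustration. The heap lets the copy outlive the function that made it (friend), and in exchange somebody has to remember to free it exactly once (enemy).

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Returns a heap-allocated copy of 'text'. The caller owns the result and
// must free it exactly once. Returns nullptr if the allocation fails.
char* copy_to_heap(const char* text) {
    assert(text != nullptr);
    std::size_t size = std::strlen(text) + 1;            // +1 for the trailing '\0'
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy == nullptr) {
        return nullptr;                                  // out of memory
    }
    std::memcpy(copy, text, size);
    return copy;
}

// Frees the block and nulls the caller's pointer, so a stale pointer cannot
// be dereferenced or freed a second time by accident.
void release_block(char*& block) {
    std::free(block);                                    // free(nullptr) is a no-op
    block = nullptr;
}
```

Nulling the pointer on release is only one defensive habit among several; it does nothing for other copies of the same pointer, which is exactly how the heap gets you.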

C++
* Objects are your friends. Instead of telling the program the steps to accomplish something, you describe the objects which are doing things. To someone who is used to thinking procedurally, this is a really fantastic brain fuck.
* While a->foo() looks like a call through a pointer to a method, it is really a lookup in a vtable which happens to hold a pointer to a function with an invisible argument (this). If you’re used to thinking about C as a glorified portable assembly language (and can even remember which form of a++ for a pointer a translates into a single assembly instruction on a 680x0 processor), vtables help bridge the gap from “talking to bare metal” to “abstract object oriented programming.”
* Resource Acquisition Is Initialization (RAII): tie each resource to the lifetime of an object, so the destructor gives it back for you.
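
To make the vtable bullet concrete, here is a rough hand-rolled sketch of the kind of thing a compiler might generate for a single virtual method. All the names (`Shape`, `ShapeVTable`, `call_area`) are invented for illustration, and real compilers use their own implementation-specific layouts, so treat this as a mental model rather than a description of any particular ABI.

```cpp
#include <cassert>

struct Shape;  // hypothetical "class" with one virtual method

// The table of function pointers shared by every object of one concrete type.
struct ShapeVTable {
    int (*area)(const Shape* self);  // 'self' plays the role of the invisible 'this'
};

struct Shape {
    const ShapeVTable* vtable;  // the hidden pointer a compiler would normally add
    int width;
    int height;
};

// Two "overrides" of the same virtual method.
int rect_area(const Shape* self) { return self->width * self->height; }
int triangle_area(const Shape* self) { return self->width * self->height / 2; }

const ShapeVTable kRectVTable{&rect_area};
const ShapeVTable kTriangleVTable{&triangle_area};

// Roughly what a->area() turns into: fetch the table, fetch the function
// pointer, then call it with the object itself as the first argument.
int call_area(const Shape* a) {
    assert(a != nullptr && a->vtable != nullptr);
    return a->vtable->area(a);
}
```

With this in place, `Shape r{&kRectVTable, 4, 6};` followed by `call_area(&r)` dispatches to `rect_area`, while the same call on an object pointing at `kTriangleVTable` dispatches to `triangle_area`.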

Java
* The package hierarchy is like a parallel “code-space” file system.
* When learning Java, also learn the run-time library. Love it or hate it, but you should first learn the java.lang package, followed by java.util, java.io, and then java.net. All the rest is icing.

Objective-C
* Learn to love the Smalltalk-like syntax. For someone who cut their eye-teeth on LISP, C++ and Java, the Smalltalk-like syntax leaves you wondering what to call a method; after all, every language except Smalltalk has just one name for a function and like any good language based on math, the name prefixes the parameters. Not Smalltalk, nor Objective-C.
* If you cut your eye-teeth on the rules for AddRef()/Release() in Microsoft’s COM environment, the retain/release rules are fucking odd. With Microsoft, the rule was simple: if something hands you a pointer to an object, you own it, and you release it. If you pass a pointer to an object, you must increment the reference count on the object, because passing the object means passing ownership.

Not so with Objective-C, which in adding ‘autorelease’ has made the rules a little bit more complicated. Now the rules seem to be:

(1) If you are handed a pointer to an object, you must explicitly retain it if you are holding it.
(2) If you pass an object to another object, it will retain it, so you need to do nothing.
(3) If you create an object out of thin air, you own it–and if you don’t need to hold it, you must autorelease it so it gets released.

Now (3) gets real tricky: there is a naming convention which tells you if you ‘created an object out of thin air.’ If the name of the method you called starts with ‘alloc’ or ‘new’ or contains the word ‘copy’, you own it, and you have to let it go with release or autorelease. (Why autorelease rather than ‘release’ for something you are handing back? Because ‘release’ gives up your claim now, which destroys the object immediately if nobody else has retained it, while autorelease causes it to go away “eventually.”)

I long for the simplicity of COM, even though it creates additional verbiage in your code. Though I’d much rather have the garbage collection of Java or LISP: just drop the damned thing, and it’ll go away on its own.
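
The difference between ‘release’ and ‘autorelease’ in the rules above is easier to see in a toy model. This is a sketch in C++ with invented names (`Object`, `AutoreleasePool`, `make_temporary`); it is not the Cocoa API and ignores everything the real runtime does beyond counting.

```cpp
#include <cassert>
#include <vector>

// Toy model only: hypothetical names, not the Cocoa API.
struct Object {
    int retain_count = 1;   // a freshly created object is owned by its creator
    bool* destroyed_flag;   // lets the caller observe when destruction happens
    explicit Object(bool* flag) : destroyed_flag(flag) {}
};

// Rule (1): retain an object you were handed if you intend to hold on to it.
void retain(Object* o) { ++o->retain_count; }

// release takes effect now: when the count hits zero the object dies at once.
void release(Object* o) {
    assert(o->retain_count > 0);
    if (--o->retain_count == 0) {
        *o->destroyed_flag = true;
        delete o;
    }
}

// The pool just remembers objects and sends each one a release when drained.
class AutoreleasePool {
public:
    Object* autorelease(Object* o) {
        pending_.push_back(o);
        return o;
    }
    void drain() {
        for (Object* o : pending_) release(o);
        pending_.clear();
    }
private:
    std::vector<Object*> pending_;
};

// Rule (3): the creator owns the new object. It hands its claim to the pool,
// so a caller who does not keep the object has nothing to clean up.
Object* make_temporary(AutoreleasePool& pool, bool* flag) {
    return pool.autorelease(new Object(flag));
}
```

In this model, a caller who wants to keep the result of `make_temporary` calls `retain` before the pool drains and `release` when finished; a caller who does not keep it simply lets the drain destroy it.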

Multi-threaded programming

While not strictly a language, it does impose its own rules on the universe, and there are a few key things to ‘grok’ to get it:

* There really are only a few “safe” design models for building multi-threaded code. You have thread pools, single-threaded event pumps, and re-entrant data structures (which can range from something as simple as a semaphore granting exclusive access to something as complicated as read/write locks on a complex data structure), and that’s it. In other words, you have one of three things: units of work which can be isolated into a single object and worked in parallel because they’re unrelated (good for a thread pool); a process, such as a user interface, where you shove all your parallelism into a queue which is operated on in a single thread (as is done in most UI libraries); or a specialized case where a re-entrant data structure was designed from the ground up to be multi-thread ready.

And that’s it.

Anything else is a bug waiting to be discovered.
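
Of the three models, the single-threaded event pump is probably the easiest to show in a few lines. The following is a minimal sketch using only the C++ standard library; the class name `EventPump` and its `post` method are my own invention, not taken from any UI toolkit. Any thread may post work, but only the one worker thread ever runs it, so the tasks themselves need no locking.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-threaded event pump: many producers, one consumer.
class EventPump {
public:
    EventPump() : worker_([this] { run(); }) {}

    // Finishes every task already queued, then stops the worker thread.
    ~EventPump() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    // Safe to call from any thread; the task runs later on the worker thread.
    void post(std::function<void()> task) {
        assert(task != nullptr);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;   // only reached when stopping and drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();                           // run outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};
```

The only shared state is the queue, and the only lock is the one guarding it. Everything the tasks touch is effectively single-threaded, which is the whole point of the model.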

* If you need to create a re-entrant data structure that is more complicated than a semaphore-protected object, you really really need to remember to avoid deadlocks by making sure you use a consistent locking order.

* Oh, and it is quite helpful to create a semaphore object or a locking object wrapper, for testing purposes, which times out (and fails noisily) after some set period, such as five minutes. Preferably blowing up by killing all threads waiting on this object with a complete stack trace.
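
The last two points can be sketched together. This is a rough example under my own assumptions: the names `WatchdogLock`, `Account` and `transfer` are hypothetical, the "consistent order" chosen is simply lowest address first, and the lock fails by throwing an exception rather than dumping stack traces for every waiting thread, since capturing a stack trace portably is outside what this sketch attempts.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical guard: waits up to 'limit' for the mutex, then fails loudly
// instead of hanging forever. Unlocks automatically when it goes out of scope.
class WatchdogLock {
public:
    WatchdogLock(std::timed_mutex& mutex, std::chrono::milliseconds limit)
        : mutex_(mutex) {
        if (!mutex_.try_lock_for(limit)) {
            throw std::runtime_error("WatchdogLock: timed out, possible deadlock");
        }
    }
    ~WatchdogLock() { mutex_.unlock(); }  // runs only if the constructor succeeded
    WatchdogLock(const WatchdogLock&) = delete;
    WatchdogLock& operator=(const WatchdogLock&) = delete;
private:
    std::timed_mutex& mutex_;
};

struct Account {  // hypothetical shared object with its own lock
    std::timed_mutex mutex;
    long balance = 0;
};

// Always locks the account with the lower address first, no matter which way
// the money flows, so two opposing transfers cannot deadlock each other.
void transfer(Account& from, Account& to, long amount,
              std::chrono::milliseconds limit) {
    assert(amount >= 0);
    if (&from == &to) return;  // taking the same lock twice would hang
    Account* first = &from;
    Account* second = &to;
    if (std::less<Account*>{}(second, first)) std::swap(first, second);
    WatchdogLock outer(first->mutex, limit);
    WatchdogLock inner(second->mutex, limit);
    from.balance -= amount;
    to.balance += amount;
}

// Hammers two accounts from two threads in opposite directions.
void run_opposing_transfers(Account& x, Account& y, int rounds) {
    const std::chrono::milliseconds limit(5000);
    std::thread forward([&] { for (int i = 0; i < rounds; ++i) transfer(x, y, 1, limit); });
    std::thread backward([&] { for (int i = 0; i < rounds; ++i) transfer(y, x, 1, limit); });
    forward.join();
    backward.join();
}

// Shows the noisy failure: the mutex is already held, so the waiter gives up.
bool contended_lock_times_out() {
    std::timed_mutex mutex;
    mutex.lock();
    bool timed_out = false;
    std::thread waiter([&] {
        try {
            WatchdogLock lock(mutex, std::chrono::milliseconds(50));
        } catch (const std::runtime_error&) {
            timed_out = true;
        }
    });
    waiter.join();
    mutex.unlock();
    return timed_out;
}
```

In `run_opposing_transfers` nothing catches the exception, so a timeout would terminate the whole program, which is about as noisy as failure gets. In a test build you would likely want a much shorter limit than five minutes so that a deadlock shows up while you are still watching.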
