Pruning Non-Determinacy verses Thinking Ahead.

Oh, look; now we have multiple cores inside of modern day computers. Today you can get a laptop with two processor cores in one chip for cheap, and soon Apple will be releasing a computer with 8 processor cores spread across two microprocessor chips. Noticing that the workaround being used today to allow Moore’s Law to get around physical reality is to stuff more processor cores onto a chip–which requires multithreading and multitasking to take full advantage of it–several articles have come across which attempt to illustrate the pitfalls of multithreaded programming.

Except, well, do they?

Exhibit A: The Problem with Threads, which attempts to illustrate the problem with threads by discussing the Observer Pattern:

Consider the observer pattern,5 a very simple and widely used design pattern. Figure 1 shows a Java implementation valid for a single thread. This shows two methods from a class where an invocation of the setValue() method triggers notification of the new value by calling the valueChanged() method of any objects that have been registered by a call to addListener().

The code in Figure 1 is not thread safe, however. That is, if multiple threads can call setValue() or addListener(), the listeners list could be modified while the iterator is iterating through the list, triggering an exception that will likely terminate the program.

To reiterate, because I don’t feel like copying his code across, the observer pattern example given is simple: someone calls “setValue”, and this sets the value for the object as well as tells everyone else about the value that was just set.

Then we go on for several paragraphs talking about why the code is not thread safe, to which all I can really add here is, well, no fucking shit, Sherlock, it isn’t thread safe.

Frankly, not only is it not thread safe, but the solution doesn’t even make any sense in a multi-threaded environment!

What are you trying to do here? Well, when someone comes along and changes the value, you’d like to tell everyone who cares what that value changed to. Simple enough, right? But in a multi-threaded environment, what does it mean when thread 1 changes the value when thread 2 also wants to change the value? Are the objects who want to be notified thread safe? What does it mean when a central value is changed twice by two threads? What is the semantics here?

In a multi-threaded environment, assuming the Observer Pattern should somehow be made to work overlooks a rather obvious question: what does it mean for two threads to change the same value? And the corollary: when we are notified do we just need to be told that something changed so we can refresh something–like a UI display widget–and so we just need to know the last time it changed? Or do we need to know every value transition in the order in which they occurred, because we’re driving a state machine?

In other words, by focusing upon the design model to proclaim that Threading is hard, we’ve avoided the question of what the hell it is we’re doing. And if we could answer the question “what are we trying to accomplish here,” perhaps we could then solve the problem by using an appropriate design pattern rather than trying to extend a design pattern where it doesn’t belong.

So, say instead the problem is you want to just be notified the last time something changed, so you can update a user interface element. We know that Swing runs everything within its own UI thread–so it means our setValue routine will need to use Swing method EventQueue.invokeLater() on a Runnable object which then sends out the notification to our listeners that the value has changed. Then, when the value has changed, we can determine if we’ve inserted an event into the Swing event queue and see if it has fired yet. If it hasn’t, we’re done. If it has, create and insert a new one.

The implementation is straight forward:

public class ValueHolder
{
    private List listeners = new LinkedList();
    private int value;
    private boolean fWaitToFire = false;
    
    public interface Listener 
    {
        public void valueChanged(int newValue);
    }
    
    private class FireListeners implements Runnable
    {
        private List copyOfListeners;

        FireListeners()
        {
            copyOfListeners = new LinkedList(listeners);
        }
        public void run()
        {
            fireListeners(copyOfListeners);
        }
    }
    
    public synchronized void addListener(Listener listener)
    {
        listeners.add(listener);
    }
    
    public synchronized void setValue(int newValue)
    {
        value = newValue;
        if (!fWaitToFire) {
            EventQueue.invokeLater(new FireListeners());
            fWaitToFire = true;
        }
    }
    
    private void fireListeners(List list)
    {
        int localValue;
        
        synchronized(this) {
            /* Grab value and reset wait to fire. If someone else changes me while */
            /* I'm iterating the values, that's okay; I'll just get fired again later */
            /* Since we're invoked from FireListeners internal class, we're already */
            /* in the main Swing thread, so there are no multi-threaded issues with */
            /* the list iterator here. */
            localValue = value;
            fWaitToFire = false;
        }
        
        Iterator it = list.iterator();
        while (it.hasNext()) {
            ((Listener)it.next()).valueChanged(localValue);
        }
    }
}

Notice the two principles I’ve used here: first, keep the amount of code in the synchronized block as short as possible. That way, the size of the non-syncronous section is kept to a minimum, and parallelism is maximized. Second, I’ve used a monitor flag, ‘fWaitToFire’, which is used to determine if we already have an object queued up in the Swing thread that hasn’t been fired yet. This permits me to minimize the number of times we carry out the expensive operation of creating a copy of my listener list. Since we’re only interested in knowing when the value changed, but not being notified every time the value changed, this may only notify us once if the value is changed twice, right in a row.

Suppose, however, that we need to be notified every time the value has changed, and in the order the value has changed to. Then we need to implement a thread queue: an object which doesn’t just store the current value, but a queue of all of the values that were held by this object. A thread then owned by that object runs in the background, peeling off entries in the queue and sending them out to the listener in order.

Because it’s late, I will leave the implementation of a thread queue to the reader.

There is one thing in this article I do agree with wholeheartedly:

I conjecture that most multithreaded general-purpose applications are so full of concurrency bugs that—as multicore architectures become commonplace—these bugs will begin to show up as system failures.

Absolutely.

And if the current thinking shown in the above linked article prevails–of applying inappropriate design models to a multi-threaded environment without giving a single thought as to what the original problem was–then we’re not going to find any good solutions anytime soon.

Because ultimately the goal seems to be tackling the wrong problem. The problem is not

…[adding] mechanisms [to] enrich the programmer’s toolkit for pruning nondeterminacy.

No, the goal should be to think about what the hell it is you’re trying to accomplish in the first place, and using the correct design model so as not to create non-determinacy in the first place.

On that note I’ll leave you with two thoughts. First, we have the article Thread pools and work queues, of which my outline of an alternate mechanism above for sending out every value change as it happens really is just a work queue with a thread pool size of 1 thread.

And the second is not to fear thread deadlocks–just remember the following rule: if you always lock in order and unlock in reverse order, you’ll be fine. That is, if you have three locks, numbered 1 through 3, so long as you always write your code so that lock A locks before lock B if n(A) < n(B), then you will never deadlock. That’s because deadlocks occur if thread A locks 1,2 and thread B tries to lock 2,1: thread A is waiting for 1 to come free while thread B is waiting for 2 to become free. But if you always lock in order–then you will never have code that deadlocks because you will never have the code in thread B: it locks out of order.

Sometimes that can be damned hard: it can involve rethinking how some code is written, locking locks early, for example, or even adding what seems like unnecessary thread work queues or other stuff. But if you follow the rule above you will never deadlock.

And provably correct code wins over “pruning nondeterminacy” every day of the week and twice on Sundays.

With this in mind, you can imagine the constant mental anguish that was going on inside of my head as I read the following article, submitted into evidence as Exhibit B: Barrier. I mean–deliberately trying to create non-determinacy in order to demonstrate pruning techniques? Why not just give some thought as to appropriate multi-threaded design models and techniques for proper design?

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s