iPhone Multitouch

When handling touch events, the events you receive via the UIView touch methods are pretty much the raw hardware events, wrapped in Objective-C objects. If you're dragging one finger around, then touch with a second finger while keeping the first down, you get a new touchesBegan event. Lift the first finger but keep the second down, and you get a touchesEnded event for the first finger, while the move events for the second continue.

I suppose this is what one would expect. But it means that if you drag with two fingers, you will probably receive touchesMoved events for each finger separately, rather than a single touchesMoved event covering both fingers at once.

This implies that if you want to track all the fingers moving around at the same time, you need to maintain the state yourself; that is, you need to keep track of the touches that haven't moved as well as the ones that have.

Here is some code I wrote which keeps the current touches in a secondary set, so I can see and track all the fingers as they move. This isn't the best code in the world, but it proves the concept.

@implementation TestTouch

// Note: 'set' is an NSMutableSet instance variable declared in the class
// interface; it holds every UITouch that is currently down.

- (id)initWithCoder:(NSCoder *)aDecoder
{
	self = [super initWithCoder:aDecoder];
	if (self) {
		self.multipleTouchEnabled = YES;
		set = [[NSMutableSet alloc] initWithCapacity:10];
	}
	return self;
}

- (void)drawRect:(CGRect)rect
{
	[[UIColor whiteColor] setFill];
	UIRectFill(rect);
	
	// Draw a 44x44 black square centered on each active touch
	[[UIColor blackColor] setFill];
	for (UITouch *t in set) {
		CGRect r;
		CGPoint pt = [t locationInView:self];
		r.origin.x = pt.x - 22;
		r.origin.y = pt.y - 22;
		r.size.width = 44;
		r.size.height = 44;
		UIRectFill(r);
	}
}

- (void)dealloc
{
	[set release];
    [super dealloc];
}

- (void)touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event
{
	for (UITouch *t in touches) {
		[set addObject:t];
	}
	[self setNeedsDisplay];
}

- (void)touchesCancelled:(NSSet *)touches withEvent:(UIEvent *)event
{
	[set removeAllObjects];
	[self setNeedsDisplay];
}

- (void)touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event
{
	for (UITouch *t in touches) {
		[set removeObject:t];
	}
	[self setNeedsDisplay];
}

- (void)touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event
{
	// UIKit updates the UITouch objects already in our set in place,
	// so all we need to do here is trigger a redraw.
	[self setNeedsDisplay];
}

@end

If you need to detect when the first touch begins and when the last touch ends, you can do so by testing whether the set (declared as an NSMutableSet in the class) is empty.

For whatever reason I keep wanting to think that the touches set passed in contains all of the fingers, but no: you only get the touches that changed, not the ones that stayed still. Thus, you need a set to capture them all.

Android Things To Remember

1. The onDetachedFromWindow method is your friend, especially when writing image-intensive custom views. (See the sketch after this list.)

2. Always recycle your Bitmaps.

3. The VM is out to kill you; don’t always trust it.

4. Related to 3, the VM is out to kill you because the relationships between activities, intents, views and other objects can easily create a reference cycle that pins huge memory resources to some random and unforeseen global object. It's why singletons are your best friend and your worst enemy.

5. I’d rather be writing iOS apps.
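Here's a minimal sketch of what items 1 and 2 look like in practice; the ThumbnailView class and fBitmap field are my own stand-ins for whatever your view actually caches:

import android.content.Context;
import android.graphics.Bitmap;
import android.graphics.Canvas;
import android.view.View;

public class ThumbnailView extends View {
    private Bitmap fBitmap;    // hypothetical cached image for this view

    public ThumbnailView(Context context) {
        super(context);
    }

    @Override
    protected void onDraw(Canvas canvas) {
        if (fBitmap != null) {
            canvas.drawBitmap(fBitmap, 0, 0, null);
        }
    }

    @Override
    protected void onDetachedFromWindow() {
        super.onDetachedFromWindow();
        // Release the bitmap's memory now rather than trusting the GC to
        // get around to it (see item 3).
        if (fBitmap != null) {
            fBitmap.recycle();
            fBitmap = null;
        }
    }
}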

Drawing scaled images in Android is expensive.

So I wrote a custom view which displays a grid of bitmaps. It works by loading the images from disk, then drawing each one with Canvas.drawBitmap(Bitmap, Rect, Rect, Paint), scaling it to its cell.

And during scrolling it was dog slow.

So what I did was pre-scale the images and cache them in a weak hash map:

    private WeakHashMap<String, Bitmap> fScaledBitmaps = new WeakHashMap<String, Bitmap>();

    private Bitmap getScaledBitmap(String url, int cellWidth, int cellHeight)
    {
        /*
         * Check our cache and return if it's present
         */
        Bitmap bmap = fScaledBitmaps.get(url);
        if (bmap != null) {
            if ((bmap.getWidth() == cellWidth) && (bmap.getHeight() == cellHeight)) return bmap;
            // size different; kill bitmap
            bmap.recycle();
            fScaledBitmaps.remove(url);
        }
        
        bmap = ... get our image from interior cache ...
        if (bmap == null) {
            // bitmap not present; return null
            return null;
        } else {
            // Bitmap loaded. Now grab the scaled version
            Bitmap scale = Bitmap.createScaledBitmap(bmap, cellWidth, cellHeight, true);
            bmap.recycle();

            fScaledBitmaps.put(url, scale);
            return scale;
        }
    }

And this sped up scrolling from a sluggish redraw about once a second to quick and smooth.

Lesson: drawing a scaled image to a canvas is frighteningly expensive: roughly an order of magnitude slower than pre-scaling the bitmap and keeping it in a weak reference or weak hash map.
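For context, here's roughly how the cache gets used from onDraw; the Cell type and fVisibleCells list are stand-ins for whatever your view actually tracks:

    @Override
    protected void onDraw(Canvas canvas)
    {
        for (Cell cell : fVisibleCells) {
            Bitmap b = getScaledBitmap(cell.url, cell.width, cell.height);
            if (b != null) {
                // The bitmap is already at cell size, so this is a straight
                // blit; no Rect-to-Rect rescale happens during the draw.
                canvas.drawBitmap(b, cell.x, cell.y, null);
            }
        }
    }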

Things to Remember: Why cells with an inserted image are taller than the image in GWT

So here’s a mystery.

Suppose you create in GWT a vertical panel or a flex table, and you add an image which is less than 15 pixels tall:

VerticalPanel panel;

...
panel.add(new Image("images/mydot.png"));

But for whatever reason, the cell displays as 15 pixels tall.

Apparently, because of the way the image element is inserted into the table cell that makes up the vertical panel, an extra bit of blank text winds up alongside the image element.

And that blank text has a line height.

If you write the following, you can limit the vertical spacing, allowing for tighter heights:

Image image = new Image("images/mydot.png");
panel.add(image);
DOM.setStyleAttribute(image.getParent().getElement(),"fontSize","1px");

In testing this seems to tighten things up quite a bit.

I need to investigate this further. But apparently when DOM elements are inserted during the construction of GWT widgets, unwanted extra junk (in the form of blank text nodes) gets inserted at the same time.

Things To Remember: GWT Drag and Drop Edition

So I wrote the following code and wired it into the mouse up, down and move listeners so I can do drag and drop:

		@Override
		public void onMouseUp(MouseUpEvent event)
		{
			isDragging = false;
			// Do something to indicate we're done dragging
			
			DOM.releaseCapture(fClickLayer.getElement());
		}
		
		@Override
		public void onMouseMove(MouseMoveEvent event)
		{
			if (isDragging) {
				// Do something with the event to drag
			}
		}

		@Override
		public void onMouseDown(MouseDownEvent event)
		{
			DOM.setCapture(fClickLayer.getElement());
			// Note the initial state to start dragging
		}

And of course, after the first three move events I stopped receiving move events; instead, the widget would get selected.

After banging my head against a brick wall for a couple of hours, I realized I needed to prevent the browser from performing its default behavior on the mouse down and move events. And that's done with the preventDefault() method:

		@Override
		public void onMouseUp(MouseUpEvent event)
		{
			isDragging = false;
			// Do something to indicate we're done dragging
			
			DOM.releaseCapture(fClickLayer.getElement());
		}
		
		@Override
		public void onMouseMove(MouseMoveEvent event)
		{
			if (isDragging) {
				event.preventDefault();		// ADD ME HERE
				// Do something with the event to drag
			}
		}

		@Override
		public void onMouseDown(MouseDownEvent event)
		{
			DOM.setCapture(fClickLayer.getElement());
			event.preventDefault();		// ADD ME HERE
			// Note the initial state to start dragging
		}

Duh.

As an aside, here's a snippet of code you can use to prevent something from being selected in HTML. (You can also do this in CSS, I suppose.) I encountered this snippet of code here.

	private native static void disableSelectInternal(Element e, boolean disable) 
	/*-{
    	if (disable) {
        	e.ondrag = function () { return false; };
        	e.onselectstart = function () { return false; };
        	e.style.MozUserSelect="none"
    	} else {
        	e.ondrag = null;
        	e.onselectstart = null;
        	e.style.MozUserSelect="text"
    	}
	}-*/;

I first tried hooking this up to the class receiving the mouse events, but to no avail.

Snippets of GWT

Strategy for validating text input in a TextBox during input:

One way to validate that the string being entered into a text box is properly formatted is to reject key changes as the user types if the result would be an invalid string. The strategy I'm employing is to add a key press handler which schedules a deferred command; the deferred command then reverts the contents of the text box if they are illegal.

Thus:

public class TestEditorBox extends TextBox
{
    public TestEditorBox()
    {
        super();
        
        /*
         * Hang a handler off key press; process and revert the text if it
         * doesn't look right
         */
        addKeyPressHandler(new KeyPressHandler() {
            @Override
            public void onKeyPress(KeyPressEvent event)
            {
                final String oldValue = getText();
                Scheduler.get().scheduleDeferred(new Scheduler.ScheduledCommand() {
                    @Override
                    public void execute()
                    {
                        String curVal = getText();
                        if (!validate(curVal)) setText(oldValue);
                    }
                });
            }
        });
    }

    private boolean validate(String value)
    {
        // Return true if the string is valid, false if not. As a placeholder
        // rule (substitute your own), accept only strings of digits:
        return value.matches("[0-9]*");
    }
...
}

The things in Android that keep tripping me up.

android:layout_weight

When building a layout, the biggest thing that keeps going through my mind is “how do I get this object to lay itself out so it consumes only what is left in a linear layout flow?”

And the answer to that is android:layout_weight.

If you specify a layout and you want one of the controls to land at the bottom of the screen with a fixed height, then the other control in the LinearLayout should have its height set to "match_parent" and its weight set to 1. This causes it to consume the rest of the space. (Bonus: you can split the view by giving multiple controls different weights; you can even have one control take a third and the other two thirds by using the appropriate weights.) There's a sketch of this below.
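Here's a minimal sketch of the idea in code rather than XML (the class and view names are mine; in XML you'd set android:layout_weight="1" on the stretchy view):

import android.content.Context;
import android.view.View;
import android.widget.LinearLayout;

public class SplitLayoutExample {
    public static LinearLayout build(Context context) {
        LinearLayout layout = new LinearLayout(context);
        layout.setOrientation(LinearLayout.VERTICAL);

        // Top view: weight 1 makes it consume all the leftover vertical space
        View content = new View(context);
        layout.addView(content, new LinearLayout.LayoutParams(
                LinearLayout.LayoutParams.MATCH_PARENT,
                LinearLayout.LayoutParams.MATCH_PARENT, 1f));

        // Bottom view: fixed height and no weight, so it keeps its size
        View bottomBar = new View(context);
        layout.addView(bottomBar, new LinearLayout.LayoutParams(
                LinearLayout.LayoutParams.MATCH_PARENT, 48));

        return layout;
    }
}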

android:gravity

It's the other one I keep forgetting about. It allows you to center something, flush it to the right, or whatever. (Strictly speaking, android:gravity positions a view's own content or children; its sibling android:layout_gravity positions the view inside its parent container.)

I also have some code lying around which helps control multiple activities where a single activity would normally live, cobbled together from reading this post; he goes into how to extend an ActivityGroup to achieve multiple activities within the same tab group item. I think this principle can be extended to support other interesting effects, such as a list view where each row is its own activity. But that's something I need to plug away at to see if I can make it work.

It’s not quite black magic…

Problem: I need to connect to an SSL socket while ignoring the trust certificate chain, so I can connect to a self-signed SSL connection. In particular I want to connect to an LDAP server using the Netscape LDAP library but I keep hitting the error:

“Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target”

I managed to find a snippet of code with the correct solution here, and am basically repeating it here for my own edification.

First, to create a client socket you need to use a socket factory. But in order to connect to an SSL socket while bypassing the trust chain on the certificate, you must first set up the socket factory:

// Imports needed for this snippet and the connection snippet below:
import java.net.InetSocketAddress;
import java.net.Socket;
import java.security.SecureRandom;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;

import javax.net.SocketFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

SocketFactory gSocketFactory;
...
SSLContext sc = SSLContext.getInstance("SSL");

// A trust manager that trusts everything: both check methods do nothing,
// so no certificate chain is ever rejected.
TrustManager[] trustAll = new TrustManager[] {
        new X509TrustManager() {
            @Override
            public void checkClientTrusted(X509Certificate[] arg0,
                    String arg1) throws CertificateException
            {
            }

            @Override
            public void checkServerTrusted(X509Certificate[] arg0,
                    String arg1) throws CertificateException
            {
            }

            @Override
            public X509Certificate[] getAcceptedIssuers()
            {
                return null;
            }
        }
};

sc.init(null, trustAll, new SecureRandom());
gSocketFactory = sc.getSocketFactory();

Once the socket factory has been created, you can make an SSL connection:

Socket socket = gSocketFactory.createSocket();
socket.connect(new InetSocketAddress(host,port));
// Do stuff with the socket

The magic bit of code creates a new trust manager chain and installs it into a new SSLContext object, which is then used to build our socket factory. The chain contains one X509TrustManager which does nothing at all. And thus we can connect to any SSL socket without worrying whether we have the correct certificate installed or whether the remote SSL connection is properly signed.

The usual caveats remain: don't do this unless you must, it creates all sorts of security problems (such as increased susceptibility to man-in-the-middle attacks), your mileage may vary, limited supplies are available, no shoes no shirt no service.

Goodbye Far Clipping Plane.

I really wanted to write this up as a paper, perhaps for SIGGRAPH. But I've never submitted a paper before, and I don't know how worthy of a SIGGRAPH paper this would be to begin with. So instead I thought I'd write it up as a blog post, and we'll see where it goes.

Introduction

This came from an observation I remember making when I first learned about the perspective transformation matrix in computer graphics. The problem is basically this: the perspective transformation matrix converts from model space to screen space, where the visible region of screen space runs from -1 to 1 in each of the X, Y and Z coordinates.

In order to map from model space to screen space, typically the following transformation matrix is used:

$$\begin{pmatrix} \frac{fovy}{aspect} & 0 & 0 & 0 \\ 0 & fovy & 0 & 0 \\ 0 & 0 & \frac{f+n}{n-f} & \frac{2fn}{n-f} \\ 0 & 0 & -1 & 0 \end{pmatrix}$$

(Where fovy is the cotangent of half the field of view angle, aspect is the aspect ratio between the vertical and horizontal of the view screen, n is the distance to the near clipping plane, and f is the distance to the far clipping plane.)

As objects in the right-handed coordinate space move farther from the eye, the value of z decreases toward -∞; after being transformed by this matrix, as our object approaches the far plane at distance f, zs approaches 1.0.

Now one interesting aspect of the transformation is that the user must be careful to select the near and far clipping planes: the greater the ratio between far and near, the less effective the depth buffer will be.

If we examine how z is transformed into zs screen space:

$$z_s = \frac{f+n}{f-n} + \frac{2fn}{(f-n)\,z}$$

And if we were to plot values of negative z to see how they land in zs space, for values of n = 1 and f = 5 we get:

[Graph: zs as a function of z for n = 1, f = 5; zs rises toward 1 as the point approaches the far clipping plane.]

That is, as a point moves closer to the far clipping plane, zs moves closer to 1, the screen space far clipping plane.

Notice that as we move closer to the far clipping plane, the screen-space depth behaves like 1/z. This is significant when characterizing the accuracy of the zs representation of an object's distance for drawing purposes.

If we wanted to eliminate the far clipping plane, we could, of course, derive the terms of the above matrix as f approaches ∞. In that case:

$$\lim_{f\to\infty}\frac{f+n}{n-f} = -1 \qquad\qquad \lim_{f\to\infty}\frac{2fn}{n-f} = -2n$$

And we have the perspective matrix:

$$\begin{pmatrix} \frac{fovy}{aspect} & 0 & 0 & 0 \\ 0 & fovy & 0 & 0 \\ 0 & 0 & -1 & -2n \\ 0 & 0 & -1 & 0 \end{pmatrix}$$

And the transformation from z to zs looks like:

[Graph: zs = 1 + 2n/z; zs crosses -1 at z = -n and approaches 1 as z goes to -∞.]

IEEE-754

There are two ways we can represent a fractional numeric value: as a fixed point value, or as a floating point value. I'm not interested here in a fixed point representation, only in the floating point representation of numbers in the system. Of course, not all implementations of OpenGL support floating point mathematics for representing values in the system.

An IEEE 754 floating point number is represented by the fractional significand of the number, along with an exponent:

$$x = \pm\,(1 + \mathrm{fraction}) \times 2^{\mathrm{exponent}}, \qquad 0 \le \mathrm{fraction} < 1$$

Thus, the number 0.125 may be represented with the fraction 0 and the exponent -3:

$$0.125 = (1 + 0) \times 2^{-3}$$

What is important to remember is that the IEEE-754 representation of a floating point number is not exact, but contains an error term, since the fractional component contains a fixed number of bits (23 bits for a 32-bit single-precision value, and 52 bits for a 64-bit double-precision value).

For values approaching 1, the error in a floating point value is determined by the number of bits in the fraction. For a single-precision floating point value, the difference between 1 and the next adjacent floating point value is 1.1920929E-7, which means that as numbers approach 1, the error is of order 1.1920929E-7.
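You can verify that spacing directly in Java; Math.ulp reports exactly this gap:

public class UlpDemo
{
    public static void main(String[] args)
    {
        // Spacing between 1.0f and the next float up: 2^-23 = 1.1920929E-7
        System.out.println(Math.ulp(1.0f));

        // Near zero the exponent shrinks as well, so the spacing keeps shrinking
        System.out.println(Math.ulp(1.0e-20f));
    }
}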

We can characterize the error in model space given the far clipping plane by reworking the formula to find the model space z based on zs:

$$z = \frac{2fn}{(f-n)\,z_s - (f+n)}$$

We can then plot the error against the far clipping plane. If we assume n = 1 and zs = 1, then the error in model space zε for objects at the far clipping plane can be represented by:

$$z_\varepsilon = \bigl|\,z(1) - z(1-\varepsilon)\,\bigr| \approx \frac{\varepsilon f^2}{2}, \qquad \varepsilon \approx 1.1920929\times 10^{-7}$$

Graphing for a single precision value, we get:

[Graph: model-space error zε at the far clipping plane as a function of f, single precision.]

Obviously we are restricted in the size of the far clipping plane, since as we approach 10^9, the error in model space grows to the same size as the model itself for objects at the far clipping plane.

Clearly, of course, setting the far clipping plane to ∞ means almost no accuracy at all as objects move farther and farther out.

The reason for the error, of course, has to do with the representation of the number 1 in IEEE-754 arithmetic. Effectively the exponent for the IEEE-754 representation is pinned at -1 (2^-1 = 0.5), so as values approach 1, the significand approaches 2: the number is effectively a fixed-point representation with 24 bits of accuracy (for a single-precision value) over the range 0.5 to 1.0.

(At the near clipping plane the same can be said for values approaching -1.)
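To make that concrete: every single-precision value in [0.5, 1.0) shares the exponent -1, so the spacing between adjacent representable values there is a constant

$$2^{-1} \times 2^{-23} = 2^{-24} \approx 5.96\times 10^{-8}$$

which is exactly the behavior of a 24-bit fixed-point format over that interval.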


All values in the representable range of IEEE-754 have the same feature: as we approach any particular value, the representation behaves as if we had picked a fixed-point representation with 24 (or 53) bits. The only value in the IEEE-754 range which actually exhibits declining representational error as we approach it is zero.

In other words, for values 1-ε, accuracy is fixed by the number of bits in the fractional component. For values ε approaching 0, however, the exponent can decrease, allowing the full range of bits in the fractional component to maintain accuracy as we approach zero.

With this observation we could in theory construct a transformation matrix which can set the far clipping plane to ∞. We can characterize the error for a hypothetical algorithm that approaches 1 (1-1/z) and one that approaches 0 (1/z):

[Graph: relative error err/z versus distance for the 1-1/z and 1/z mappings, in single and double precision.]

Overall, the error in model space of 1-1/z approaches the same size as the actual distance itself as the distance grows: err/z approaches 1 as z grows larger. And the error grows quickly: for single precision values the error is as large as the position in model space as the distance approaches 10^7, and for double precision values err/z approaches 1 as z approaches 10^15.

For 1/z, however, the ratio of the error to the overall distance remains relatively constant at around 10^-7 for single precision values, and around 10^-16 for double-precision values. This suggests we could do away with the far clipping plane entirely; we simply need to modify the transformation matrix so that zs approaches zero instead of 1 as an object goes to ∞.

Source code:

The source code for the above graph is:

public class Error
{
    public static void main(String[] args)
    {
        double z = 1;
        int i;
        
        for (i = 0; i < 60; ++i) {
            z = Math.pow(10, i/3.0d);
            
            for (;;) {
                double zs = 1/z;
                double zse = Double.longBitsToDouble(Double.doubleToLongBits(zs) - 1);
                double zn = 1/zse;
                double ze = zn - z;

                float zf = (float)z;
                float zfs = 1/zf;
                float zfse = Float.intBitsToFloat(Float.floatToIntBits(zfs) - 1);
                float zfn = 1/zfse;
                float zfe = zfn - zf;

                double zs2 = 1 - 1/z;
                double zse2 = Double.longBitsToDouble(Double.doubleToLongBits(zs2) - 1);
                double z2 = 1/(1-zse2);
                double ze2 = z - z2;

                float zf2 = (float)z;
                float zfs2 = 1 - 1/zf2;
                float zfse2 = Float.intBitsToFloat(Float.floatToIntBits(zfs2) - 1);
                float zf2n = 1/(1-zfse2);
                float zfe2 = zf2 - zf2n;
                
                if ((ze == 0) || (zfe == 0)) {
                    z *= 1.00012;   // some delta to make this fit
                    continue;
                }

                System.out.println((ze/z) + "\t" + 
                        (zfe/zf) + "\t" + 
                        (ze2/z) + "\t" + 
                        (zfe2/zf));
                break;
            }
        }
        
        for (i = 1; i < 60; ++i) {
            System.out.print(""1e"+(i/3) + "",");
        }
    }
}

We use the expression Double.longBitsToDouble(Double.doubleToLongBits(x)-1) to move to the previous double precision value (and the same with Float for single precision values), repeating (with a minor adjustment) in the event that floating point error prevents us from properly calculating the error ratio at a particular value.

A New Perspective Matrix

We need to formulate an equation for zs that crosses -1 as z crosses -n, and approaches 0 as z approaches -∞. We can easily do this given the observation from the graph above: instead of calculating

$$z_s = 1 + \frac{2n}{z}$$

We can simply omit the 1 constant and change the scale of the 2n/z term:

$$z_s = \frac{n}{z}$$

This has the correct property that we cross -1 at z = -n, and approach 0 as z approaches -∞.

[Graph: zs = n/z; zs crosses -1 at z = -n and rises toward 0 as z goes to -∞.]

From visual inspection, this suggests the appropriate matrix to use would be:

$$\begin{pmatrix} \frac{fovy}{aspect} & 0 & 0 & 0 \\ 0 & fovy & 0 & 0 \\ 0 & 0 & 0 & -n \\ 0 & 0 & -1 & 0 \end{pmatrix}$$
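As a sanity check, running a point (x, y, z, 1) through the bottom two rows of this matrix gives clip coordinates z' = -n and w' = -z, so

$$z_s = \frac{z'}{w'} = \frac{-n}{-z} = \frac{n}{z}$$

which is exactly the formula above.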

Testing the new matrix

The real test, of course, is to create a simple program that uses both matrices and compare the results. I have constructed a simple program which renders two very large, very distant spheres, and a small polygon in the foreground. The large background sphere has a radius of 4×10^12 units and sits at a distance of 5×10^12 units from the observer. The smaller sphere, 3×10^11 units in radius at a distance of about 1.3×10^12 units, is embedded in the larger sphere to show proper z order and clipping. Both the front and back of each sphere are drawn.

The foreground polygon, by contrast, is approximately 20 units from the observer.

I have constructed a z-buffer rendering engine which uses 32-bit single-precision IEEE-754 floating point numbers to represent zs. Using the traditional perspective matrix, the depth values become indistinguishable from each other as they approach 1. This results in the following image:

[Image: image_err.png, the scene rendered with the traditional matrix; depth comparisons fail for the distant spheres.]

Notice the bottom half of the sphere is incorrectly rendered, as are large chunks of the smaller red sphere.

Using the new perspective matrix, this error does not occur in the final rendered product:

[Image: image_ok.png, the same scene rendered with the new matrix; the spheres and polygon render correctly.]

The code to render each is precisely the same; the only difference is the perspective matrix:

public class Main
{
    /**
     * @param args
     */
    public static void main(String[] args)
    {
        // Traditional perspective matrix, with the far clipping plane at infinity
        Matrix m = Matrix.perspective1(0.8, 1, 1);
        renderTest(m,"image_err.png");
        
        // The new perspective matrix derived above, converging on 0 instead of 1
        m = Matrix.perspective2(0.8, 1, 1);
        renderTest(m,"image_ok.png");
    }

    private static void renderTest(Matrix m, String fname)
    {
        ImageBuffer buf = new ImageBuffer(450,450);
        m = m.multiply(Matrix.scale(225,225,1));
        m = m.multiply(Matrix.translate(225, 225, 0));
        
        Sphere sp = new Sphere(0,0,-5000000000000d,4000000000000d,0x0080FF);
        sp.render(m, buf);
        
        sp = new Sphere(700000000000d,100000000000d,-1300000000000d,300000000000d,0xFF0000);
        sp.render(m, buf);
        
        Polygon p = new Polygon();
        p.color = 0xFF00FF00;
        p.poly.add(new Vector(-10,-3,-20));
        p.poly.add(new Vector(-10,-1,-19));
        p.poly.add(new Vector(0,0.5,-22));
        p = p.transform(m);
        p.render(buf);
        
        try {
            buf.writeJPEGFile(fname);
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Notice that in main() we first build the traditional perspective matrix with the far clipping plane set to infinity, then the alternate matrix.

The complete sources for the rendering test which produced the above images, including custom polygon renderer, can be found here.

With this technique it would be possible to correctly render large landscapes with very distant objects without having to render the scene twice: once for distant objects and once for near objects. Using this with OpenGL would require adjusting the OpenGL pipeline to allow the far clipping plane to be set to 0 instead of 1 in zs space. This could be done with the glClipPlane call.

Conclusion

For modern rendering engines which represent the depth buffer using IEEE-754 (or similar) floating point representations, using a perspective matrix which converges to 1 makes little sense: as values converge to 1, the magnitude of the error is similar to that of a fixed-point representation. However, because of the nature of the IEEE-754 floating point representation, convergence to 0 has much better error characteristics.

Because of this, a perspective matrix different from the one commonly used should give better rendering accuracy, especially if we move the far clipping plane to ∞.

By using this new perspective matrix we have demonstrated a rendering environment using 32-bit single-precision floating point values for a depth buffer which is capable of representing in the same scene two objects whose sizes differ by 11 orders of magnitude. We have further shown that the error in the zs depth representation should grow only linearly with an object's distance, allowing even greater orders of magnitude difference in the size of objects. (Imagine rendering an ant in the foreground, a tree in the distance, and the moon in the background, all represented at the correct size in the rendering system, rather than using the painter's algorithm to draw the objects in order from back to front.)

Using this matrix in a system such as OpenGL, for rendering environments that support floating point depth buffers, would be a matter of creating your own matrix (rather than using the built-in one from the GLU library) and setting the far clipping plane to zs = 0 instead of 1.

By doing this, we can effectively say goodbye to the far clipping plane.

Addendum:

I'm not sure, but I haven't seen this anywhere else in the literature before. If anyone thinks this sort of stuff is worthy of SIGGRAPH and wants to give me any pointers on cleaning it up and publishing, I'd be grateful.

Thanks.