3D Clipping in Homogeneous Coordinates.

Recently I got an Arduboy, and I thought it’d be an interesting exercise to blow the dust off my computer graphics knowledge and put a simple 3D vector drawing system together. After all, while by today’s standards a 16 MHz doesn’t sound like a lot, back when I was learning Computer Graphics at Caltech in the 1980’s in Jim Blinn’s computer graphics class, that was quite a bit of computational power.

A limiting factor on the Arduboy is available RAM; after you factor in the off-screen bitmap for drawing, you only have perhaps 1250 bytes left–not a lot for a polygonal renderer. But when a 4×4 transformation matrix fits into 64 bytes, that’s plenty of space for a vector graphics pipeline.

First step in building a vector graphics pipeline, of course, is building a 3D line clipping system; you want to be able to clip lines not just to the left and right, top and bottom, but also to the near and far clipping planes. And you do want to clip before you draw the lines; on the Arduboy, line drawing coordinates are represented as 16-bit signed integers, and with a 3D rendering system it would be very easy to blow those limits.

Sadly, however, I was unable to find a concise description of the homogeneous coordinate system clipping algorithm that I used when I was at Caltech. It’d been 30 years, and I had a vague memory of how we did it–how we combined the Cohen-Sutherland algorithm with Clipping using homogeneous coordinates and some general observations made during class to build an efficient 3D line clipper. And an Internet search yielded very little.

So I thought I’d write an article here, in case in another 20 years I need to dip back into the archives to build yet another line clipping system.

If you want to skip the math and get to the algorithm, scroll down to the section marked “Algorithm.”

Some preliminaries, before we get started.

In Computer Graphics, it is advantageous to represent 3D points using homogeneous coordinates. That is, for any point in 3D space (x,y,z), you add an additional term w, giving (x,y,z,w).

To convert from a homogeneous coordinate (x,y,z,w), you use the following relationship:


Typically when we convert a point from 3D space to a homogeneous coordinates we simply tack a 1 at the end. There are special cases where you may want to do something simple (such as representing stars plotted infinitely far away, but typically you simply tack a 1 on:


The purpose of this extra parameter is to simplify coordinate transformation through transformation matrices. This extra 1 term, for example, allows us to represent translation (that is, moving a point from one location to another) as a simple matrix multiplication. So if we are moving the points in our system over by (5,3,2), we can create a 4×4 translation matrix:


and a point (x,y,z) would be translated through multiplication:


One nice thing about using homogeneous coordinates is that we can represent perspective through a matrix multiplication. I’ll save you the diagrams you see on other countless web sites by noting that to represent point perspective, you divide the location (x,y) by the distance z; objects far away appear smaller and closer objects appear larger.

If we use a perspective matrix which adjusts the size (x,y) to fit our screen (so objects to the far left are at x = -1, objects to the far top are at y = -1, right and bottom are at +1 respectively), then a point at (x,y,z) would map in our perspective transform:


(Side note: we use a right hand rule for coordinates. This means if up is +y and right is +x, then distance is -z. Which brings me to a rule of thumb with computer graphics: if the image looks wrong, look for an incorrect minus sign. While writing this article and testing the code below, I encountered at least 4 places where I had the wrong minus sign somewhere.)

Now if we convert our coordinates using the homogeneous coordinate to 3D point transformation above:


That is, our 3D point (in what we call “screen space”) has the point perspective transformation baked in, and points in 3D space at (x,y,z) show up with perspective on our 2D screen: we just drop the third term and plot points on our screen at the location (ax/-nz,by/-nz).

Notice that we’ve set our boundaries for screen space with the screen space rectangle (-1,-1)-(1,1). It is advantageous for us to clip our drawings at those boundaries, and (for reasons I’ll skip here), it helps to also clip Z coordinates at a near clipping plane (generally z=-1), and a far clipping plane (z=-∞).

The rest of this article will describe in concrete steps how to do this clipping in homogeneous coordinates, using the two articles cited at the start of this article.

First, let us note that to clip in the screen space x/w >= -1, in homogeneous coordinates we really are clipping at x >= -w.

Second, let us define what we mean by clipping. When we say we are clipping a line, what we’re doing is finding the point where a line intersects our clipping plane x >= -w. We then shorten the line at that new point, drawing only the segment that is visible.

So, given two points A and B, we want to find the point C where C intersects the plane x = -w, or rather, where x + w = 0:


We can do this by observing that the dot product of a point and a vector is the distance that point is along the vector, and that our clipping plane x + w = 0 can also be defined as the set of points P where


So to find our point C, we calculate the distance along the vector (1,0,0,1) (that is, x = -w) for the points A and B, and linearly interpolate to find the location C.


There are some other useful observations we can make about the set of relationships above.

First, if x is greater than -w — that is, if we are on the inside of the clipping plane x/w >= -1, then the dot product A(1,0,0,1) will be greater than or equal to 0.

Second, if both A(1,0,0,1) and B(1,0,0,1) are both negative, then trivially our line is not visible, because both endpoints of the line are not visible.

And we can repeat this operation for all the relationships: for x >= -w, x <= w, y >= -w, y <= w, and z >= -w.

The far clipping plane can also be taken care of, by observing that our far clipping plane at infinity is reached as z/w approaches infinity; that is, as w approaches zero. So our far clipping plane would be the relationship z <= 0.

This gives us the following normal vectors representing our 6 clipping planes:


Note that the vectors are set up so that when we are outside of the clipping plane, the value of the dot product is negative: that is, if the relationship x <= w holds, then 0 < w – x, and our clipping plane for that relationship is (-1 0 0 1).

The algorithm

This allows us to start sketching our solution.

Note our definition of a 3D vector is a homogeneous vector:

struct Vector3D
    double x,y,z,w;

First, we need a method to calculate all our clipping dot products. Note that because we’ve transformed our process into clipping against a set of vectors, the routine below could be generalized to add additional clipping planes. (For example, you may want to create a renderer that only renders the stuff in front of a wall in your game.)

static double Dot(int clipPlane, const Vector3D &v)
    switch (clipPlane) {
        case 0:
            return v.x + v.w;        /* v * (1 0 0 1) */
        case 1:
            return - v.x + v.w;      /* v * (-1 0 0 1) */
        case 2:
            return v.y + v.w;        /* v * (0 1 0 1) */
        case 3:
            return - v.y + v.w;      /* v * (0 -1 0 1) */
        case 4:
            return v.z + v.w;        /* v * (0 0 1 1) */
        case 5:
            return - v.z;            /* v * (0 0 -1 0) */
    return 0; /* Huh? */

Now we borrow from the Cohen-Sutherland algorithm, and create a method which generates an “outcode”; this is a bitmap with 6 bits, each bit is set if a point is “outside” our clipping plane–that is, if the dot product for the given coordinate is negative.

static uint8_t OutCode(const Vector3D &v)
    uint8_t outcode = 0;
    for (int i = 0; i < 6; ++i) {
        if (Dot(i,v) < 0) outcode |= (1 << i);
    return outcode;

(Note that we could easily optimize this method by using direct compares rather than by calling into our Dot method. We can also generalize it if the dot product method above is generalized.)

A nice feature about our OutCodes is that if we have the outcodes for points A and B, the following relationships are true:

  • if ((a_outcode & b_outcode) != 0) then both points are on the wrong side of at least one edge, and we can trivially reject this line.
  • if ((a_outcode | b_outcode) == 0) then both points are inside our viewing area, and we can trivially accept this line.

Now if it turns out both relationships are true, then we must clip our line A – B. And we can do that by calculating the alpha coordinates along A and B–if A is outside, we track the alpha on the A side of the line, and if B is outside, we track an alpha along B. And if the two flip–then we have a case like the one shown below:


This allows us to set up our final clipping algorithm.

First, we track the last point that was added, and the out code for that point:

static Vector3D oldPos;      // The last point we moved or draw to
static uint8_t oldOutCode;   // The last outcode for the old point.

We also rely on a drawing routine which actually draws in screen space. This should take the 3D homogeneous point provided, transform (using the transformation to 3D space noted above) to a 3D point, drop the Z component, convert the (x,y) components from the rectangle (-1,-1) – (1,1) to screen pixels, and draw the line as appropriate:

void MoveDrawScreen(bool draw, const Vector3D &v);

One last method we need is a utility method that, given a start point and an end point, and an alpha, calculates the linear interpolation between the start and end points.

static void Lerp(const Vector3D &a, const Vector3D &b, float alpha, Vector3D &out)
    float a1 = 1.0f - alpha;
    out.x = a1 * a.x + alpha * b.x;
    out.y = a1 * a.y + alpha * b.y;
    out.z = a1 * a.z + alpha * b.z;
    out.w = a1 * a.w + alpha * b.w;

Now our method declaration takes a new point and a flag that is true if we’re drawing a line, or false if we are moving to the point:

void MoveDrawClipping(bool draw, const Vector3D &v)
    Vector3D lerp;
    float aold;
    float anew;
    float alpha;
    float olda,newa;
    uint8_t m;
    uint8_t i;
    uint8_t newOutCode;
    uint8_t mask;

     *    Calculate the new outcode

    newOutCode = OutCode(v);

In the above we calculate the outcode for our new point v. This allows us to quickly determine if the point is inside or outside our visible area. This is useful when we want to just move to a point rather than draw to a line:

     *  If we are not drawing, and the point we're moving to is visible,
     *  pass the location upwards.

    if (!draw) {
        if (newOutCode == 0) {
    } else {

If draw is true, then we’re drawing a line, and so we get to the meat of our clipping algorithm. Note that our old out code and old points (stored in the globals above) have already been initialized. (Or rather, we assume they have been with an initial move call.)

So we can borrow from the Cohen-Sutherland algorithm and quickly determine if our line is entirely clipped or entirely visible:

         *  Fast accept/reject, from the Cohen-Sutherland algorithm.

        if (0 == (newOutCode & oldOutCode)) {
             *  The AND product is zero, which means the line may be
             *  visible. Calculate the OR product for fast accept.
             *  We also use the mask to quickly ignore those clipping
             *  planes which do not intersect our line

            mask = newOutCode | oldOutCode;
            if (0 == mask) {
                 *  Fast accept. Just draw the line. Note that our
                 *  code is set up so that the previous visible line has
                 *  already been moved to

            } else {

If we get to the code below, this means we have a value for mask which indicates the clipping planes that we intersect with. So now we want to interate through all the clipping planes, calculating the dot product for only those planes we intersect with:

                 *  At this point mask contains a 1 bit for every clipping
                 *  plane we're clipping against. Calculate C alpha to find
                 *  C, and adjust the endpoints A alpha (aold) and B alpha
                 *  (anew)

                aold = 0;            /* (1 - alpha) * old + alpha * new = pt */
                anew = 0;            /* so alpha = 0 == old, alpha = 1 == new */
                m = 1;
                for (i = 0; i < 6; ++i) {
                    if (mask & m) {

                        /* Calculate alpha; the intersection along the line */
                        /* vector intersecting the specified edge */

                        olda = Dot(i,oldPos);
                        newa = Dot(i,v);
                        alpha = olda/(olda-newa);

And next we need to bring in the side of the line which we know has clipped against the clipping plane. Meaning if A is outside our visible area and B is inside, we want to adjust Aα to the new value calculated in alpha; but if A is inside and B outside, we want to adjust Bα instead.

                        /* Now adjust the aold/anew endpoints according to */
                        /* which side (old or v) is outside. */
                        if (oldOutCode & m) {
                            if (aold < alpha) aold = alpha;
                        } else {
                            if (anew > alpha) anew = alpha;

And finally if somehow the edges cross, we have the circumstance above where the line is not visible but crosses two clipping planes. So we quickly reject that case:

                        if (aold > anew) {
                             *  Our line is not visible, so reject.

The rest simply closes our loop:

                    m <<= 1;

At this point if we have iterated through all six planes, we have a line which is visible. If our starting point in oldPos was not visible, then we need to move to the place where our starting point intersects with the clipping boundaries of our view space:

                if (i >= 6) {
                    /* Ran all clipping edges. */
                    if (oldOutCode) {
                        /* Old line coming into view */

And if our destination point is visible, we draw to it; otherwise, we draw to the point on the line we’re drawing to which intersects our clipping boundaries:

                    /* Draw to the new point */
                    if (newOutCode) {
                    } else {

Our clipping is done; all that is left is to close up the loops and conditionals, and store away in our globals the outcode for the point we just moved to, as well as the location of that point:


     *    Now update the old point to what we just moved to

    oldOutCode = newOutCode;
    oldPos = v;

There are a number of optimizations that can take place with the clipping code above. For example, we can also track an array of dot products so we’re not repeatedly calculating the same dot product over and over again. We could replace our outcode routine with simple compare operations. We could also extend the above routine to handle additional clipping planes.

But this gives us a vector line drawing clipping system that clips in homogeneous coordinates, complete with source code that can be compiled and executed.

What is old is new again; what started as a powerful desktop computer has now appeared as an embedded processor. So it’s good to remember the old techniques as we see more and more microcontrollers show up in the products we use.

How to tell the difference between a Computer Science person and a Software Engineer:

What is Noam Chomsky best known for?

  • His left-wing political positions, especially his positions on the Israeli-Palestinian conflict and his position on modern capitalism.
  • His work on Transformational generative grammars, and his work on formal grammars and the Chomsky-Schützenberger hierarchy.

Playing with CareKit, and a simple demo client/server app.

I’ve started over the past few weeks to play with Apple’s new CareKit API.

And I thought it’d be interesting to put together a demo app using GWT (for a web front-end service), Java and Postgres (on the back end), and iOS (with CareKit and ResearchKit) for the iOS app.

Of course the project is incomplete. It does demonstrate basic account management, as well as mobile onboarding, but at present the code to edit care instructions or gather information from the mobile device is not working.

Important note: The code is not HIPPA compliant. Specifically not logging has been wired in to track access to patient data. This code should not be used in a medical setting, at least without a thorough understanding of the security issues involved with handling patient data, and without adding the necessary code for HIPPA compliance, including finer grained security access, and security auditing.

This does demonstrate a couple of interesting features, which I believe are critical when working on a mobile application in a health care setting:

  • A Diffie-Hellman key exchange which encrypts the data that is exchanged. Private confidential information is encrypted in such a way so that even using a tool like Charles to reveal the contents of the packets will reveal no confidential information. (This algorithm is even implemented within GWT, so confidential information on the front-end web site is preserved in the face of an SSL “man in the middle” attack. Note that a number of firewall/antivirus products will generate a certificate which permits those products to examine all data between a client and server–so simply using SSL is insufficient in securing data between a client and server.)
  • An “Apple-TV” style login process, whereby the user logs in on the web, and adds his mobile device by entering an 8-character string displayed on his phone. This greatly simplifies the process of adding a device to an account–which is important for elder patients who may not have the motor skills to use the built-in keyboard on iOS.

Of course there are a bunch of things I haven’t completed yet, but which I intend to tinker with over the next few weeks and months:

  • Allowing a care plan to be edited in the GWT front end, so that a medical professional can create a care plan, and securely synchronize that plan with a mobile device.
  • Allow compliance and measurement information to be sent from the mobile device to the back-end, but with checks to determine if network access is available. The idea here is to allow information to be synchronized when the network is available, but only occasionally: data is cached and sent perhaps daily rather than at the moment when the measurement is taken.
  • Allow compliance information to be viewed on the web by a care professional.

There are other features that would need to be added to make this a functional application. Specifically:

  • A greater degree of access control and access authentication would need to be provided, in order to allow users to only view patient information they are authorized to view.
  • Logging and auditing information would need to be provided to log all access to sensitive information, and to allow audits of sensitive information to be performed.
  • Ideally this would also be integrated into a larger ERM system for patient care.

I do believe applications like CareKit are the wave of the future, in that it empowers users to have greater control over their own medical care, which is why I started playing with the software in the first place.

Complete sources can be found on GitHub.

Sometimes it’s nice to scratch a small itch.

So I picked up a small Arduboy on Kickstarter, and I tossed together a game.

It’s cool to be playing with a device with only 2.5K of RAM (of which 1K is dedicated to an offscreen drawing bitmap), and only 32K of program space (of which about 8K or so is dedicated to the Arudboy firmware). The way you write software for such a small device is completely different than how you’d write software for a larger device with more memory.

The game is a variation on the box moving puzzle games out there; at present I have 30 levels of varying difficulty. You can download the sources for the game from GitHub.

Makes me want to get involved in the IoT, of course with better security practices than we saw here, or here.

Welcome to the wonderful world of liability.

So the next project I’m taking on requires that I carry general liability insurance. And one of the requirements of the general liability insurances was to take down and stop selling E6B software for pilots.


Well, it’s not like I sold a lot of copies of the software to the public. And heck, the Android version wasn’t even active; the software had been taken down from the Play store because I forgot to renew some contracts with Google Play.

But it is a shame that something like that cannot be offered to the public.

I’ve got a couple of weeks; I’ll probably publish all of the code open source on my personal GitHub account instead, because why not?

Two Factor Authentication

Credential Stealing as an Attack Vector

Basically if someone can steal your password, they often can get into your system and do a lot of damage.

A quick review: authentication “factors” involve who you are, what you know, and what you have. A one-factor authentication technique involves just one of these three things: a key to your front door involves what you have, a password involves what you know. We can bolster the strength of any of these “factors” by requiring longer passwords or better designed keys and key locks, but you only have one thing to bypass to get into the system.

For “two factor” authentication to work, it involves two of these three factors: routinely it involves “what you have” and “what you know.” So your ATM card is a two-factor system: you must first present your ATM card, then enter your password PIN.

What makes two factor authentication more powerful is that now you have two systems (working on concert) to bypass. This also allows you to weaken one factor if the other factor must also be present–such as an ATM PIN which is an easily guessable 4 to 6 digit number.

The nice thing about cell phones is that it easily allows us to distribute software that allows your cell phone to participate in the login process: making “what you have” your cell phone as part of the login process.

And there are a number of applications out there which can be used as part of a two-factor authentication process, such as Google Authenticator, which implements TOTP and HOTP one-time passwords. It should also be relatively easy to implement your own one-time password application, though ideally unless you know what you’re doing, just download and drop in the libraries from someone else’s implementation.

One way to secure your communications…

Security is, in part, about making it more expensive for a hacker to crack your system and obtain secure information.

Yesterday I noted that just because you wrap your protocol in SSL/TLS doesn’t make it secure.

Today I’ve been playing with Diffie-Hellman key exchange, using the 1024 Bit MODP key from RFC 4306 as the constants G and P in the algorithm described in the Wikipedia article. I’ve implemented this in Java using BigInteger, in code that compiles using GWT to compile to Javascript, in order to secure a conversation between a web front end and a server back end. The resulting key generated by the Diffie-Hellman exchange is used to seed a Blowfish encryption scheme which also compiles to GWT; packets are thus encoded using Blowfish and the shared secret from a DH exchange, then sent wrapped in JSON using Base64 encoding.

And just now I got the whole thing to work: I know have secure packets between a web client and a web server back-end.

That is the sort of stuff that makes me happy.

Your protocol is not safe just because you wrap it in SSL/TLS.

A common thing I’ve encountered over the years at various companies is the implicit belief that it doesn’t matter what their protocol looks like; as long as it is wrapped in SSL/TLS (say, by making sure your endpoint is an https:// server instead of http://), then your protocol is secure from hackers and prying eyes.

And if your protocol is secure from prying eyes, then the actual security of the packets being sent is somewhat immaterial; since a hacker cannot discover the format of your packets they cannot hack your system.

The justification, of course, being the standard diagram of Adam talking to Beth while Charlie is in the dark as to what’s going on:

Adam and Beth with Charlie Confused.

… but …

But what if your protocol supports your app, which can be downloaded from the iTunes store for free? What if your protocol is the protocol between your web client and the back-end server? What if, for free, Charlie can get his own client?

Charlie Controls the Client

Well, there are a number of tools out there which Charlie can use to sniff the packets between a client under his control (such as your client app running on his phone, or your web front-end running on his computer), and your back-end server. I’ve used Charles, an HTTP proxy/monitor tool to successfully sniff the contents of a packet–and Charles will even create a self-signed certificate and install it so that the SSL/TLS certificate is considered valid, so that Charles can look inside the SSL/TLS stream to read the contents of the packets.

I suspect this is how the Nissan Leaf was hacked. I’m sure there are countless other applications out there being similarly hacked.

So how do you deal with this attack?

First, realize you have a problem. Realize that simply changing to SSL/TLS will not save a poorly designed protocol. Realize that someone can easily deconstruct the packets being sent between your client and the server, and that may expose you to spoofing.

Second, REST is dead. I’m sorry for those who worship at the altar of REST, but if the server is purely stateless, there is no state which represents the access controls a particular user may have been granted by the server when he originally connected.

It’s not to suggest we shouldn’t strive for a REST-like protocol; if the same command is sent at different times it should yield the same results, all other things being equal. But we need to consider some server-side state in order to regulate who has access to which features or which devices. (By the way, a traditional Unix file system, on which HTTP’s ‘get’, ‘put’ and ‘delete’ verbs are modeled after, and which form the basis of the REST architectural style, also contain access control lists to guarantee someone performing a read (or ‘get’ command) has access to the file (or URI) requested. Consider your logging into your desktop computer “state” from the file system’s perspective.)

Had Nissan incorporated access controls to their application–which could have been as simple as the user logging in and receiving a connection token to be used to verify ownership of the car he is controlling–then the Nissan Leaf hack would never have happened. Simply hiding the protocol behind SSL/TLS doesn’t help, as we’ve seen above.

Third, consider encrypting sensitive data, even if it is being sent via SSL/TLS. The problem is that there are several brands of dedicated firewalls which create their own valid SSL/TLS certificate so the firewall can sniff the contents of all https:// transactions–which means if you are sending your credit card, it is possible someone’s firewall hardware has sniffed your credit card information. And in an environment where there are legal requirements to provide proper access controls (such as mandated by HIPPA), this provides an added layer of security against proxy servers which are sniffing all the packets on your network, and which may be inadvertently logging protected information.

Update: This also applies to the Internet of Things. In fact, it applies even more to the Internet of Things, given the number of devices running software that will seldom (if ever) be updated.

Flaws in Samsung’s “Smart” Home Let Hackers Unlock Doors and Set Off Fire Alarms

I guess 30% faster isn’t bad.

Spent a bunch of time optimizing the SCBigInteger class on iOS to speed up RSA encryption/decryption and key generation in SecureChat. I’ve gotten a 30% increase in speed, which I suppose isn’t too bad, though it appears the Java implementation is still faster. And that makes me wonder what Java is doing.

Check out the latest at GitHub.

SecureChat: an update.

Once you have the infrastructure for secure chatting between clients, expanding it to handle more than just text is a matter of encoding.

And so I’ve extended the client to allow sending and receiving photographs. Securely.

A new version has been pushed to the master branch of GitHub.

The interesting part about all this is learning the limitations of encryption using an RSA public/private key architecture.

The upside: unless the encryption key is generated on your device and shared using a secure mechanism (such as public/private key architecture), your protocol is not secure. Sure, your specifications may claim that the server-generated encryption key used by the clients is not stored on the server after it is generated–but we only have your promise.

RSA public/private key encryption, however, is computationally expensive. Even on Android, where the BigInteger implementation has been seriously optimized over my first pass attempt at building my own BigInteger code, encrypting and decrypting an image can be quite costly. It’s why, I suspect, most commercial applications use a symmetric encryption mechanism–because most symmetric encryption systems are significantly faster (as they involve bit-wise operations rather than calculating modulus products on large integers.

Which is why I’ll be focusing my attention on the iOS implementation of BigInteger.

But even on Android, decrypting a message using a 1024 bit RSA key can be acceptably fast enough, especially if encryption security is more important to you than efficiency.

In practice I can see using something like Diffie-Hellman to mediate exchanges between two devices. Nothing behind the mathematics of that protocol require the protocol to be completed in one session; there is no reason why the server couldn’t store the first part of the exchange (the value A = ga mod p) on the server, so that later another client B couldn’t then complete the exchange. It implies each device stores the shared secret of all the devices it is communicating with–but with today’s modern devices, storing a thousand shared secrets doesn’t represent a lot of memory.

It may also be possible to use a variation of Diffie-Hellman in conjunction with a symmetric encryption algorithm (perhaps by using the shared secret as a key to such an encryption scheme).