Sunday, March 4, 2012

The Things We Throw Away

Upon finishing the development section of a project, I remarked that I had written about three times as much code as finally got shipped. Why is that?

Non-Linear Development

You always hear about FTW moments, when somebody writes some code and it works the first time they run it. Or when somebody is under pressure and he or she comes through in a pinch in record time. Isn't this the common case?

No. That might as well be Hollywood screenwriter glamorizing the world of hackers. This perception works against us in so many ways. Don't get me wrong: I have had many FTW moments. When an attempt goes right and succeeds way past my expectations. When a piece of code that I have written, a non-trivial section of code, works without error the first time. We can relish those moments, for sure. But they are blue moons, people. Lightning strikes. And at most they are tiny little victories.

Development is a non-linear process.

This may not be the case when it comes to writing a small subroutine, or a library function. But when it comes to the difficult, nearly intractable projects, like the ones I prefer to work on, rewriting and rethinking is an essential part of the game. The scope of such projects lends itself to research, which definitely keeps work interesting. By research, I sometimes mean looking for other solutions to the problem. Or related solutions.

I have said that when a problem is deconstructed, it can sometimes come to resemble other problems. We can use alternative approaches, and then reconstruct to provide completely new solutions to the problem that may simply have never been thought of before. This is a bit of a messy way to stay creative. But it has worked time and time again.

But it doesn't work every time I try it.

And this is one reason why development is non-linear: if one method doesn't work, then we have to try another until we do find one that does. This means taking suggestions from others. And trying them out. This means looking at related fields. Turning the problem inside-out, or looking at it backwards: tracing backwards from the solution.

This technique is used when someone else has demonstrated a solution to a problem. The thought goes like this: if they succeeded at solving it, then they would have to have done this. And to do that, they would have to have figured out how to do this other thing. Generally it is a chain of certain reasoning.

But when you are solving a problem for the first time, you are not guaranteed that this technique will lead to a solution. And so development becomes non-linear again. Because you can try to find a solution this way, but you may not succeed.

Hard Problems and Some Ways of Looking At Them

It simply comes down to this: some problems are so hard that you end up grasping at straws to figure out a solution to them. And my least favorite thing is when people say "oh, that should be easy" without even thinking about the problem. And also when they forget that it is not enough to solve the problem, but you must also make it possible for others to easily use what you have created. And that the solution you create has to live in an encomium of other processes as part of a workflow that makes sense and doesn't use too many resources.

The best thing to do is to first imagine what a good solution to the problem would look like. This means knowing how it should be used and in what context it would be most useful. Finally, trying for an interface that has the least number of controls. Here the theory is that the most useful feature is one that people don't actually have to use, because it's applied automatically. This means your implementation must determine what cases there are, and then make an attempt to handle all of them.

The common cases are the first ones that must be addressed. And usually this can lead you to a simpler form of organization that helps you to classify and solve the harder, less common cases. I have learned time and time again that first gearing the solution to the most difficult cases leads to more complex solutions. Instead, try to factor the more complex cases into your simpler model.

An Example

Boolean polygon operations are an example of a complex task that seems to have an easy solution. It all depends upon your imagination as to how easy the operation actually appears. But first, what is this problem?

Diagram 1: polygon
It is the task of computing a union or an intersection between two polygons. Each polygon consists of a finite number of points in the plane, connected sequentially by line segments.

So, we make an informal data structure for a polygon based on the geometry that represents it:

point: float x, y
polygon: int nPoints, array of point

We have used two definitions to make the polygon. The first is a point, represented by an x and a y in the plane. The second is the polygon, represented by a count (of points) and an array (of points).

Diagram 2: two polygons
Now we can see the problem as Boolean operations between polygons. Boolean operations include "and", "or", and "not". You can think of "and" as intersection. You can think of "or" as union. You can think of "not" as negation.

When you apply these to polygons, we can define the operations as applying to the area inside the polygon. Might this be complicated by negation? No, we can factor this in by simply making the definition that negating a polygon is the same as making its outside to be its inside, and its inside to be its outside.

Diagram 3: the intersection
of two polygons
Ironically, Boolean operations are easy to do with pixels (bitmasks), but hard to do with an outline representation.

The intersection between two polygons can be seen to be the area that is inside both polygons. In this case, to represent the intersection as a polygon requires that we compute intersection points.

But this highlights another assumption we need to make: the result of one of these operations is itself a polygon.

Diagram 4: the union
of two polygons
The union between two polygons can correspondingly seen to be the area that is inside either polygon.

So far, to the layman, the union and intersection are operations that are familiar to people who know Venn diagrams.

The notion of inside and outside may seem complicated in itself, but it may easily be formalized by using fairly simple analytic geometry operations. We can define the inside of a polygon to be the area that lies to left of an edge when the edges of a polygon are traced in counterclockwise order.

It turns out that this definition can actually be used to define what is inside and what is outside the polygon. We'll get to that in a bit.

Diagram 5: two concave polygons
But polygons are not all convex. And this leads to the first situation that causes us to throw out previous assumptions.

We see two concave polygons here that touch, so they have a well-defined union and intersection.

The intersection and the union have special properties, both of which may not be allowed for in our definitions. For instance: the definition of a polygon is a closed list of points. And the definition that the result of the Boolean operation is a polygon.

A single polygon.

This is all well and good until we see the results of intersection and union

Diagram 6: the intersection
of two concave polygons
Here we have the intersection of the two concave polygons. The result is actually a set of two disjoint polygons.

So this forces us to revise our assumption of the result of a Boolean operation being a polygon. The result is obviously more correctly stated to be one or more polygons. Of course, if the two polygons don't actually touch, then we should update our assumption of the result of a Boolean operation between polygons to include zero or more polygons.

And this, I think is no big deal, since it involves only the result.  But wait, what if we wish to operate on a result? With polygons A, B, and C, we might what to produce (A intersect B) union C. In the case where a result becomes more than one polygon, can the operation possibly be simply iterated over polygons to produce the result? No. If A intersect B produces two polygons, then the union might bridge them and still produce only one result. There is no way to get this result by simply iterating over polygons. However, another solution does exist: if A intersect B produces the two polygons, D and E, then we can evaluate (D union C) union E to produce the result. But there's no guarantee that D union C won't produce two polygons also, and so we must revise our definition of a polygon.

We define a polygon now to consist of one or more outlines, each consisting of an array of points.

point: float x, y
outline: int nPoints, array of point
polygon: int nOutlines, array of outline

Diagram 7: the union of
two concave polygons
But wait, we haven't considered the union of the two concave polygons yet.

Look! The result is also two polygons, but one polygon is a hole in the other.

And can't a hole also contain an island inside it? It looks like we are going to need a recursive definition for polygon, folks!

We will consider an area to be an outline of the edge of a polygon, and a hole to be another area (by definition negative) inside it:

point: float x, y
outline: int nPoints, array of point
area: outline, int nHoles, array of area
polygon: int nAreas, array of area

To facilitate the determination of what's inside and what's outside, we will define outlines to be oriented counterclockwise for a positive area, and clockwise for a negative area (hole).

So we had to throw out some assumptions, and also some definitions (actually our first definition!) once we realized that we had to represent holes and multiple outlines.

The Operation Itself

Now we get to the Boolean operation itself. How are we going to implement it? It seems like we can use a pretty simple rule, considering the union operation as our guide. The set of points comprising the union of polygons A and B appears to be the set of points of polygon A that are outside of polygon B taken together with the set of points on polygon B that are outside of polygon A. Well, plus the intersection points.

So we invent a rule: pick a point on one polygon that is outside of the other polygon. Trace forwards along it until you come to an intersection point. Then switch to the other polygon. Trace forwards along that polygon. When you come to the start point, you close the output polygon.

This rule comes from a simple underlying assumption: at an intersection point, edges cross. At one edge, a polygon crosses the border of the other polygon, either going from the inside to the outside, or going from the outside to the inside.

This rule gets us one of the polygons of the union shown in diagram 7, but how do we get the other outline? When we output a point to the output polygon, we must mark the point in the source polygon as used. Then we simply look for an unused point in one of the polygons that is outside the other polygon. And start another trace from there.

The process stops when there are no more unused points that qualify.

How does intersection work? We may use DeMorgan's law:

intersection(A, B) = negation(union(negation(A), negation(B)))

Practically, this means we need to reverse both polygons, compute their union, and reverse the result. When we reverse a polygon, its inside and outside become swapped topologically.

This also gives us the way to chop one polygon out of another.

Flies In the Ointment

Diagram 8: self-
intersecting polygons
There are a couple of things that will mess up this neat solution. The first is a self-intersecting polygon.

If we use the winding number rule to define what is inside the polygons, then it probably is useful to first convert these cases to a case that isn't self-intersecting. For the bow-tie, this will still result in two polygons that touch each other at a common point. For the flying-V, this will result in one easy (but concave) polygon.

Also, polygons that touch themselves from the outside of from the inside are a form of self-intersecting polygons, and must be resolved to a form that is less ambiguous.

The other problem case that needs to be considered is this: what happens when the edges of the two polygons become collinear?

I have concocted an example. For readability, I have separated the lines of the two polygons. But just imagine for a moment that they are exactly coincident at edges and points.

A solution to the intersection between these two shapes can easily lead to a polygon that touches itself from the inside. It would be nice to prevent this kind of result from being produced. Or, more simply, we can just resolve it into a non-self-intersecting polygon on output.

The Example Applied

In Painter 4, I implemented the mosaic feature. This capability defined each tile as a polygon. Each polygon was initially a 4-edge quadrilateral. But there was an additional problem of grout. The tiles could never really get closer together than a certain grout distance.

The grout distance was implemented by creating an outset polygon for every existing polygon. So a polygon was outset by the grout distance. Any new tile being placed into the mosaic was chopped by the outsets of the existing tiles so the new tile could come no closer to them than the grout distance. This required the implementation of intersection and negation.

Once a tile was chopped by one neighbor, it might be chopped by another neighbor, etc.

The good thing was that the tile polygons, at least initially, could not self-intersect. This cut down on some computation. Also, they were initially only four points. This cut down on intersection searching.

What I Haven't Actually Described

I haven't actually described the point inside polygon test, the point to left of line test, and the find all intersections between two polygons process. There exist fairly simple processes for all of them.

I also haven't described what happens to the algorithm when a polygon's edges are collinear. This can happen if three successive points fall on a line. Suffice it to say that we must remove the center point.

It's also nice to know how to orient a polygon counterclockwise. The simplest method is to determine its area using trapezoidal integration and arrange it so a positive area implies that the polygon is counterclockwise and a negative area implies that the polygon is clockwise.

To Sum It All Up

All I know is that you can spend quite a bit of time getting this programming solution right, and along the way you might change your assumptions or your internal data structures several times before coming to a fruitful conclusion.

Like I said, non-linear development.

3 comments:

  1. Thanks for describing your boolean polygon algorithms. This will probably be the most practically useful blog for me so far. Imagine extending it to 3D.

    ReplyDelete
    Replies
    1. If you want to do it in 3D you need to look at the 3D convex hull algorithm from Dijkstra. One of his books has it detailed quite clearly in it. That's a good start.

      Most 3D boolean operations happen in ray tracers. So they don't need the facets for the final result. But somebody has this code, I think.

      Delete