Tuesday, March 13, 2012

Hard Problems

I like to solve hard problems, which is good, since that's my job. It involves significant analysis and complex problem solving, time and time again.

I have been sharing some of the details of the complex problem solving required to produce features in Painter, but for once I will discuss a tiny, almost lost bit of work I did at my current job. This involves a problem I have been seeking to solve for at least ten years: the lofting problem.

You might be asking yourself how I can talk about it, since it involves my current work. The answer is that this work is now in the public domain, since the process I am about to describe is detailed quite explicitly in US Patents 7,227,551 and 7,460,129. It is also not particularly secret or unavailable since it can be had by using Core Image, part of Mac OS X, as I will indicate.

The Lofting Problem

A Text Mask to be Shaded
If you have a bit of masked information, like text, it is classic design to create a version of the text that is 3D. I'm not talking about extruded letters, which are kind of passé. I'm talking about surface texture that makes the letters look puffy. And stand out. Here you see some masked text. Optima extra bold.

While working on something quite different, another employee, Kok Chen, and I came across something truly amazing.

We solved the lofting problem.

The Wrong Answer
Now, to introduce you to this hard problem, let me show you what happens when you try to create puffy text in Painter.

Here is the result. To do this, I used Apply Surface Texture using Image Luminance, and cranked up the Softness slider.

But all this can do is soften the edge. It can't make the softness sharper near the edge, which is required to get a really classy rendering.
The Right Answer!
Using the lofting problem solution we thought of, you get a much better rendering. The corners have a shine that goes right into them, because the curvature of the surface continuously approaches a sharp point at the corner, yet the surface remains pleasingly rounded in the center.

Think of a piece of rubber or flexible cloth tacked down to a hard surface at all points along its edge, and then puffed full of air from underneath: lofted.

I think you get the point. The difference is like night and day! Before I implemented this in Core Image, this was not an easy effect to achieve, and probably couldn't even be done without huge amounts of airbrushing.

In Core Image, you might use the Color Invert, Mask To Alpha, Height Field From Mask, and Shaded Material effects to achieve this. And you might have to use a shiny ball rendering as the environment map, like I did. By the way, you can mock up most of this in Core Image Fun House, an application that lives in the Developer directory. Core Image is a great tool to build an imaging application. Just look at Pixelmator!

But How Was This Solved?

Exactly! This was a very interesting problem in constraints. You see, the area outside the text is constrained to be at a height of zero. The area inside the text is initialized to be a height of 1. This is the first image in sequence.

Now blur the text with an 8-pixel radius. This creates softness everywhere, and puts some of the softness outside the text area, because a blur causes shades to bleed a bit. This is the second image in sequence.

Then you clamp the part outside the text to be zero height again. This recreates a hard edge, and creates the third image in sequence. I have recreated this process in Painter. The clamping was done by pasting in the original mask, choosing a compositing layer method of darken, and dropping the layer, collapsing it back onto the canvas.

Next I take that same image and blur it by a 4-pixel radius. This produces the fourth image in sequence.

I clamp it back to the limits of the original text by pasting, choosing a darken layer method, and dropping, producing the fifth image in sequence.

Notice that after the clamp step, the edge remains hard, while the inside conforms more closely to the edge yet remains smooth in its interior.

With the next images in the sequence, I blur with a 2-pixel radius, clamp to the edge of the original mask, then blur with a 1-pixel radius and clamp to the hard edge of the original mask again to get the final lofted mask.
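The blur-and-clamp sequence described above can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions, not the Core Image or patented implementation: the names box_blur, clamp_to_mask and loft are made up for illustration, a simple box blur stands in for whatever blur the real filter uses, and the "darken" composite is modeled as a per-pixel minimum with the original mask.

```python
def box_blur(img, radius):
    """Separable box blur with edge replication (an assumed stand-in blur)."""
    def blur_rows(src):
        out = []
        for row in src:
            n = len(row)
            new_row = []
            for x in range(n):
                total = 0.0
                for k in range(-radius, radius + 1):
                    # Replicate the border pixel when the window runs off the row.
                    total += row[min(max(x + k, 0), n - 1)]
                new_row.append(total / (2 * radius + 1))
            out.append(new_row)
        return out

    horizontal = blur_rows(img)
    # Transpose, blur the former columns as rows, then transpose back.
    transposed = [list(col) for col in zip(*horizontal)]
    blurred = blur_rows(transposed)
    return [list(col) for col in zip(*blurred)]


def clamp_to_mask(img, mask):
    """Per-pixel minimum with the mask: the 'darken' composite step."""
    return [[min(p, m) for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(img, mask)]


def loft(mask, radii=(8, 4, 2, 1)):
    """Alternate blur and clamp, halving the blur radius each pass."""
    height = [row[:] for row in mask]
    for radius in radii:
        height = clamp_to_mask(box_blur(height, radius), mask)
    return height


# Usage: a 20x20 square of height 1 on a 32x32 field of height 0.
mask = [[1.0 if 6 <= x <= 25 and 6 <= y <= 25 else 0.0 for x in range(32)]
        for y in range(32)]
lofted = loft(mask)
```

After the loop, everything outside the square is still exactly zero, while the heights inside rise from the edge toward the middle.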

The last image in sequence is the shaded result, using Apply Surface Texture in Painter.

I should confess that these images are done at a larger scale (3X) and then downsampled for your viewing pleasure. Also, Painter only keeps 8 bits per channel, so the result was a bit coarse, and had to be cleaned up using the Softness slider in the Apply Surface Texture effect.

In Core Image, we use more than 8 bits per component and thus have plenty of headroom to create this cool effect at much higher resolution and with fewer artifacts.
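To see why bit depth matters here, consider a gentle slope stored at 8 bits. The following is a small illustrative sketch (the helper names quantize and slopes are hypothetical, and the ramp is made-up data): the true slope is constant, but after rounding to 256 levels the pixel-to-pixel differences become a mix of zeros and sudden steps.

```python
def quantize(values, bits=8):
    """Round values in [0, 1] to the nearest level representable in `bits` bits."""
    levels = (1 << bits) - 1
    return [round(v * levels) / levels for v in values]


def slopes(values):
    """Differences between neighboring samples."""
    return [b - a for a, b in zip(values, values[1:])]


# A gentle ramp: height rises by 0.001 per pixel, from 0.0 to 0.1.
ramp = [x / 1000.0 for x in range(101)]

exact = slopes(ramp)             # every entry is about 0.001
coarse = slopes(quantize(ramp))  # entries are either 0 or one full 8-bit step
```

A shader that lights the surface from its slope would see flat terraces separated by small cliffs in the 8-bit version, which is the kind of coarseness that extra softening has to hide.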

It becomes clear that the last several frames of this sequence look practically the same. This is because the blurs are smaller and the effect of the clamping against the edge of the original mask is less prominent. Yet these last steps become very important when the result is shaded as a height field, because the derivatives matter.
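A minimal sketch of why the derivatives matter: when a height field is shaded, the brightness at each pixel depends on the surface normal, and the normal comes from the local slopes rather than from the height itself. The code below assumes a single directional light and simple diffuse (Lambertian) shading as a stand-in for the environment-map shading described earlier; the function name shade, the light direction, and the scale factor are all illustrative choices.

```python
import math


def shade(height, light=(-0.5, -0.5, 0.7), scale=4.0):
    """Diffuse shading of a height field using central-difference slopes."""
    rows, cols = len(height), len(height[0])
    lx, ly, lz = light
    length = math.sqrt(lx * lx + ly * ly + lz * lz)
    lx, ly, lz = lx / length, ly / length, lz / length

    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Slopes from neighboring heights (border pixels reuse themselves).
            dzdx = (height[y][min(x + 1, cols - 1)] - height[y][max(x - 1, 0)]) * 0.5 * scale
            dzdy = (height[min(y + 1, rows - 1)][x] - height[max(y - 1, 0)][x]) * 0.5 * scale
            # The normal of z = f(x, y) is (-dz/dx, -dz/dy, 1), normalized.
            inv = 1.0 / math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            nx, ny, nz = -dzdx * inv, -dzdy * inv, inv
            out[y][x] = max(0.0, nx * lx + ny * ly + nz * lz)
    return out


# Usage: a flat field versus a ramp that rises to the right.
flat = shade([[0.0] * 5 for _ in range(5)])
tilted = shade([[0.2 * x for x in range(5)] for _ in range(5)])
```

With the light coming from the upper left, the tilted surface faces the light and shades brighter than the flat one, even where the two share the same height. A small change in slope near the edge therefore shows up directly in the rendering.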

Smart People and Hard Problems

It should seem obvious, but it really does take smart people to solve hard problems. And a lot of work. And trial and error. So I'm going to tender a bit of worship to the real brains that helped shepherd me on my way through life and career. To whom I owe so much. And I will explain their influences on me.

The first was Paul Gootherts. Paul is an extra smart guy, and, during high school, Paul taught me the basics of computer programming. He, in turn, got it from his dad, Jerome Gootherts. Paul was a competent programmer at 16 (or earlier, I suspect), and I owe my livelihood to him and his wisdom. He also conveyed to me something that, I think, his uncle told him: if you are good at one thing, you can find work. But if you can be good at two things, and apply what you know and cross-pollinate the two fields, then you can be a real success. I applied this to Painter, and almost all of my life. And perhaps someday I'll apply it to music as well. Paul helped me with problems in computation and number theory, and his superb competence led me to develop some of my first algorithms in collaboration with him.

The second was Derrick Lehmer, professor emeritus at UC Berkeley. He taught me that, though education is powerful, knowledge and how you apply it is even more powerful. And it's what you do with your life that is going to make the difference. This has guided me in far-reaching ways I can only begin to guess at. Professor Lehmer ("Don't call me Doctor; that's just an honorary degree") helped me with understanding continued fractions and their application to number theory and showed me that prime numbers and factorization are likely to remain the holy grail of mathematical pursuits. Oh, and his wife Emma (Trotskaia) Lehmer, who was also present, provided many hours of interesting conversation on our common subject: factoring repunits (numbers consisting of only ones). I must credit a few of my more clever number theoretic ideas to her.

Other smart people have also provided a huge amount of inspiration. Tom Hedges had a sharp eye and an insightful mind. He taught me that no problem was beyond solving and he showed me that persistence and a quick mind can help you get to the root of a problem. Even when nobody else can solve it. Tom helped me with Painter and validated so much that I did. Without Tom, it would have been hard to ship Painter at all!

I have probably never met as smart a person as Ben Weiss. When the merger between MetaTools and Fractal Design took place, I became aware of his talents and saw firsthand how smart this guy is. He can think of things that I might never think of. He confirmed to me that the power of mathematical analysis is central to solving some of the really knotty problems in computer graphics. And he proceeded to famously solve the problem of making the median filter actually useful to computer graphics: a real breakthrough. I've seen Ben at work since MetaCreations, and he is still just as clever.

While Bob Lansdon introduced me to Fourier-domain representation and convolutions, it was really Kok Chen who helped me understand intuitively the truly useful nature of frequency-domain computations. He introduced me to deconvolution and also showed me that there were whole areas of mathematics that I still needed to tap to solve even harder problems. Kok retired a few years back and I sincerely hope he is doing well up north! Kok basically put up with me when I came to Apple and helped me time and time again to solve the early problems that I was constantly being fed by Peter Graffagnino. When it came time to work with the GPU and later with demosaicing, Kok Chen consistently proved to be indispensable in his guidance.

There are plenty of other people whom I will always refer to as smarter than myself in various areas, but they will have to go unnamed until future posts.

7 comments:

  1. I remember how difficult it was for me to achieve this effect circa 1998:

    http://www.coolpage.com/3Dize_28_wht.gif
    http://www.coolpage.com/

    My solution was a 3D rendering with multiple point light sources. Your solution using only 2D operations is quite interesting.

    I suppose the patent will prevent me from using it in the patented form. I saw in your blog post on Color that you explained that the "allow only printable colors" checkbox for the Painter Color Picker was removed due to "patent trolls".

    Afaik, the patent process takes a lot of time and effort. I wonder if that is a significant drain on creative time in our limited productive lifespan, or whether that legwork gets offloaded to attorneys and technical writers.

    Replies
    1. When you make stuff for a living, it's hard to be altruistic when people steal your stuff and say they invented it. So you have to defend. Especially when your livelihood depends upon it!

      The effect is actually 3D, since it involves a Z height function of X and Y. And it gets shaded in 3D, by using a light ball - a pre-rendered 3D environment map.

      But the unique thing about this effect is the gradients near the edge of the sharp-edged form. When you look at the wrong answer vs. the right answer, it's like night and day!

    2. I didn't study the patent. I was under the impression that the key improvement (the edge gradients) was a 2D blur and masking operation.

      I am interested in the idea that it can cost more to steal than to pay. At 10 cents per song, it would cost more (in lost time) to search for a pirated copy. For many people that is true even at $1. I have purchased probably 1000+ physical CDs in my lifetime and also numerous $1 downloaded songs.

      For algorithms, I am interested in accomplishing the same business model, where developers find it more cost effective to pay for an implementation, than roll their own implementation of a stolen algorithm (and maintain it). This is another reason I am doing research on a programming language for fine grained modularity.

    3. I agree that stealing is wrong, and that the cost of goods, when they are entirely intangible, will go down, but not to zero.

      When an artist makes a song, that takes work. If a lot of people buy it, then it makes money and the artist gets incentive to make more music, and hopefully get better.

      With artwork for this blog, I have taken the approach that I will produce all of it by hand. This makes any copyright claims essentially null. And I have all the originals as well.

      With programming, the situation is slightly more cloudy. Since programs cover how something is done (and not what is done), and are not content per se, they will probably exist in the domain of the patent rather than the copyright.

      Software patents are controversial, though. And I won't go into that.

      You state correctly that the time it takes to design and implement an algorithm has value and is therefore worth money. But intellectual property law trumps that.

      The way something gets done is then owned and subject to license.

    4. "The way something gets done is then owned and subject to license"

      I am contemplating a different monetization model for software algorithms, but I have some doubts whether it is scalable and realistic (so it is not the focus of my work).

      In my model, enforcement against stealing is a social media effect, whereby stolen implementations of algorithms receive less social support, e.g. users, bug reports and fixes, blogs. People only steal when they can do it secretly. Our constitution says we have the right to be judged by a jury of our peers directly, not a proxy. Social media can do that.

      I need to perfect a language for finer grained module reuse first. One issue is I am not sure if it is scalable that an algorithm can be implemented in one (or a limited set of) s/w module, that can be reused in most use cases. Or if the existing patent model of many licensees (varied implementers) is more scalable. I strongly suspect my model is theoretically more scalable (in the overall analysis of career profits and accomplishments), because of positive feedback loops of multifurcated network effects.

      There are other challenges to solve in my model, e.g. the metric for use and compensation.

      The big picture win is that then the unit of fungible money is knowledge itself, since snippets of s/w are the encoding of knowledge. Anyway, I am getting way out into the realm of dreamland... back to more realistic work...

    5. Oh, and this blog is about "Hard Problems", so we can dream, can't we?

    6. It's always OK to dream. I used to dream about algorithms that could be plugged into each other like Legos. This concept is partially realized (by other people) at Apple in the Quartz Composer application. It is a wonderful prototyping tool.

      I still do dream of such things. Any operator-based language can be described by Legos.

      I have an application where the music sections and chords are blocks that can be assembled like letters on a Scrabble crib. And then expanded into real music by the application of texture. Well, a lot of musician-programmers think the same, I believe.

      I'm not sure the idea you have is concrete enough for people to understand. I think it needs to be embedded in an IDE and demonstrated. It might be tough to get people to understand it without some examples and also applications that are programmed in it that do real things.
