Wednesday, March 14, 2012

Who We Are

It's time to fess up. Because in order to move forwards, we must understand who we are.

Almost two years ago I was wrapped up in a cocoon of my job, my life, and no friends. My best friends for life, John Derry and Tom Hedges, had moved away and passed away, respectively. I had taken my past and buried it deep in the back yard, metaphorically. All my notes from the Painter days were on the shelf.

Why I Do Dat?

When I was CEO and main developer for Painter, I was on a roller coaster of positive emotions, of hard work, of constant reassurance from my friends. But when the Metacreations board decided to sell off the software, my baby, and I was chosen to do the deed, I accepted the hard task because I knew that if anyone else did it, all would be lost. This meant closing down the software division, really everything I had built. The people I knew and worked with, all of whom I valued deeply, needed a kind hand to shepherd them through the layoffs and on to whatever came next. Again, if someone else had done it, it might have ended up far worse.

And in the process I was sure nobody would understand the sacrifice, yet some clearly did, which humbled me. So, as I went to consult for Corel for the final 18 months of my work on Painter, I wrapped myself up and closed myself off, and wound up insulated. Even from my friends.

And, as a CEO, I also found that many friends were not really friends at all, but fair-weather friends. I have mentioned in my blog many who were true friends, but still, this realization is a hard fall to take.

So I put my past in boxes and stored them literally on the shelf. Which left me broken.

I can imagine many of us having to do things we don't want to do. Perhaps we do them because they are the lesser of two evils, like going to the dentist rather than letting a cavity get worse. When it is our choice, we must accept the responsibility. We must own it. When, however, it is someone else's choice, we can more easily be scarred by the process.

Hard Problems

So about two years ago a kind friend started me on a path towards reconciling with my past. Because, in this case, the hard problem is me.

In conversation, I talked about my past and the crazy things that have happened to me. And slowly, I became unwrapped, and un-cocooned. Eventually I started this blog and have done my best to embrace my past so I could move forwards.

And you have seen some interesting posts along the way that talk about ideas, creativity, hard problems, intense development, and even the things we throw away. I have also been hoping that my way of looking at things will be of use to you, so I have given you thoughts on three-dimensional thinking, alternate reality, disruptive technology, and even shared with you my memories of Steve Jobs.

Most of these blog posts are decorated with stories from my past, part of the process of embracing it. Part of the healing process. My posts on Painter and the Fractal Design era are my present to the Painter users and the Fractal Design and Metacreations folk, whom I must ask to forgive me.

Moving Forwards

Life is always moving forwards whether we want it to or not. At Apple, I have been moving my development chops forward at an astonishing rate. But I can't really talk about it, except to say that there might be a few patents with my name on them that can point you in the directions that I have gone.

So my job with this blog is to move the rest of me forwards. To continue to develop myself, since the future waits for no one; rust never sleeps. I hope for you, gentle reader, to be entertained, to enjoy the interesting topics I reveal, and to see a few things from my relativistic observation point.

I apologize here and now that you also have to put up with so much stuff that litters my past. It's just part of my therapy.

Some day I hope to figure out who I am. In the meanwhile, I think I understand the ramifications of my past and it has helped me.

But do remember that this blog only reveals one side of me: the part I can talk about. So that may be why it is so heavily strewn with recollection.

Still, I must continue to move forwards.

Let's Vote!

To be responsive to my users, I think now would be a good time to ask for comments on what I should post about. What do you love? What do you hate? What would you like to see? Do you value my point of view? Are my posts on creativity interesting?

Please add a comment on this, dear reader, and help me move forwards.

The relativistic observer,
Mark Zimmer

Tuesday, March 13, 2012

Hard Problems

I like to solve hard problems, which is good since that's my job. This involves significant analysis and complex problem-solving, time and time again.

I have been sharing some of the details of the complex problem solving required to produce features in Painter, but for once I will discuss a tiny, almost lost bit of work I did at my current job. This involves a problem I have been seeking to solve for at least ten years: the lofting problem.

You might be asking yourself how I can talk about it, since it involves my current work. The answer is that this work is now a matter of public record, since the process I am about to describe is detailed quite explicitly in US Patents 7,227,551 and 7,460,129. It is also not particularly secret or unavailable, since it can be had by using Core Image, part of Mac OS X, as I will indicate.

The Lofting Problem

A Text Mask to be Shaded
If you have a bit of masked information, like text, it is classic design to create a version of the text that is 3D. I'm not talking about extruded letters, which are kind of passé. I'm talking about surface texture that makes the letters look puffy. And stand out. Here you see some masked text. Optima extra bold.

While working on something quite different, another employee, Kok Chen, and I came across something truly amazing.

We solved the lofting problem.

The Wrong Answer
Now, to introduce you to this hard problem, let me show you what happens when you try to create puffy text in Painter.

Here is the result. To do this, I used Apply Surface Texture using Image Luminance, and cranked up the Softness slider.

But all this can do is soften the edge. It can't make the softness sharper near the edge, which is required to get a really classy rendering.
The Right Answer!
Using the lofting problem solution we thought of, you get a much better rendering. The corners have a shine that goes right into them, because the curvature of the surface continuously approaches a sharp point at the corner, yet remains pleasingly rounded in the center.

Think of a piece of rubber or flexible cloth tacked down to a hard surface at all points along its edge, and then puffed full of air from underneath: lofted.

I think you get the point. The difference is like night and day! Before I implemented this in Core Image, this was not an easy effect to achieve, and probably couldn't even be done without huge amounts of airbrushing.

In Core Image, you might use the Color Invert, Mask To Alpha, Height Field From Mask, and Shaded Material effects to achieve this. And you might have to use a shiny ball rendering as the environment map, like I did. By the way, you can mock up most of this in Core Image Fun House, an application that lives in the Developer directory. Core Image is a great tool to build an imaging application. Just look at Pixelmator!

But How Was This Solved?

Exactly! This was a very interesting problem in constraints. You see, the area outside the text is constrained to be at a height of zero. The area inside the text is initialized to be a height of 1. This is the first image in sequence.

Now blur the text with an 8-pixel radius. It creates softness everywhere, and puts some of the softness outside the text area, because a blur causes shades to bleed outward a bit. This is the second image in sequence.

Then you clamp the part outside the text to be zero height again. This recreates a hard edge, and creates the third image in sequence. I have recreated this process in Painter. The clamping was done by pasting in the original mask, choosing a compositing layer method of darken, and dropping the layer, collapsing it back onto the canvas.

Next I take that same image and blur it by a 4-pixel radius. This produces the fourth image in sequence.

I clamp it back to the limits of the original text by pasting, choosing a darken layer method, and dropping, producing the fifth image in sequence.

Notice that after the clamp step, the edge remains hard but the inside becomes more accommodating to the edge, yet remains smooth in its interior.

With the next images in the sequence, I blur with a 2-pixel radius, clamp to the edge of the original mask, then blur with a one pixel radius and clamp to the hard edge of the original mask again to get the final lofted mask.

The last image in sequence is the shaded result, using Apply Surface Texture in Painter.

I should confess that these images are done at a larger scale (3X) and then down sampled for your viewing pleasure. Also, Painter only keeps 8 bits per channel, so the result was a bit coarse, and had to be cleaned up using the softness slider in the Apply Surface Texture effect.

In Core Image, we use more than 8 bits per component and thus have plenty of headroom to create this cool effect at much higher resolution and with fewer artifacts.

It becomes clear that the last several frames of this sequence look practically the same. This is because the blurs are smaller and the effect of the clamping against the edge of the original mask is less prominent. Yet these last steps become very important when the result is shaded as a height field, because the derivatives matter.
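
Here is a minimal sketch in C of the blur-and-clamp sequence described above. I'm assuming a single-channel floating-point height buffer and substituting a crude box blur for the real blur, so treat this as an illustration of the structure rather than the actual code; the clamp is just a per-pixel min against the original mask, the same thing the darken layer method does.

/* Minimal sketch of the blur-and-clamp lofting sequence.
   Assumptions (mine): a single-channel float buffer, values 0..1, and a
   crude box blur standing in for a proper Gaussian blur. */
#include <stdlib.h>
#include <string.h>

static void box_blur(const float *src, float *dst, int w, int h, int r)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            float sum = 0.0f;
            int n = 0;
            for (int dy = -r; dy <= r; dy++) {
                for (int dx = -r; dx <= r; dx++) {
                    int sx = x + dx, sy = y + dy;
                    if (sx < 0 || sy < 0 || sx >= w || sy >= h) continue;
                    sum += src[sy * w + sx];
                    n++;
                }
            }
            dst[y * w + x] = sum / (float)n;
        }
    }
}

/* mask: 0 outside the text, 1 inside; height: the resulting lofted height field */
void loft_mask(const float *mask, float *height, int w, int h)
{
    float *tmp = malloc(sizeof(float) * w * h);
    int radii[] = { 8, 4, 2, 1 };

    memcpy(height, mask, sizeof(float) * w * h);
    for (int i = 0; i < 4; i++) {
        box_blur(height, tmp, w, h, radii[i]);   /* soften everywhere */
        for (int p = 0; p < w * h; p++)          /* clamp: per-pixel "darken" against the original mask */
            height[p] = tmp[p] < mask[p] ? tmp[p] : mask[p];
    }
    free(tmp);
}

Shading the resulting height field, as Apply Surface Texture or Shaded Material would, is a separate step.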

Smart People and Hard Problems

It should seem obvious, but it really does take smart people to solve hard problems. And a lot of work. And trial and error. So I'm going to tender a bit of worship to the real brains that helped shepherd me on my way through life and career. To whom I owe so much. And I will explain their influences on me.

The first was Paul Gootherts. Paul is an extra smart guy, and, during high school, he taught me the basics of computer programming. He, in turn, got it from his dad, Jerome Gootherts. Paul was a competent programmer at 16 (or earlier, I suspect), and I owe my livelihood to him and his wisdom. He also conveyed to me something I think his uncle told him: if you are good at one thing, you can find work. But if you can be good at two things, and apply what you know and cross-pollinate the two fields, then you can be a real success. I applied this to Painter, and to almost all of my life. And perhaps someday I'll apply it to music as well. Paul helped me with problems in computation and number theory, and his superb competence led me to develop some of my first algorithms in collaboration with him.

The second was Derrick Lehmer, professor emeritus at UC Berkeley. He taught me that, though education is powerful, knowledge and how you apply it is even more powerful. And it's what you do with your life that is going to make the difference. This has guided me in far-reaching ways I can only begin to guess at. Professor Lehmer ("Don't call me Doctor; that's just an honorary degree") helped me with understanding continued fractions and their application to number theory and showed me that prime numbers and factorization are likely to remain the holy grail of mathematical pursuits. Oh, and his wife Emma (Trotskaia) Lehmer, who was also present, provided many hours of interesting conversation on our common subject: factoring repunits (numbers consisting of only ones). I must credit a few of my more clever number theoretic ideas to her.

Other smart people have also provided a huge amount of inspiration. Tom Hedges had a sharp eye and an insightful mind. He taught me that no problem was beyond solving and he showed me that persistence and a quick mind can help you get to the root of a problem. Even when nobody else can solve it. Tom helped me with Painter and validated so much that I did. Without Tom, it would have been hard to ship Painter at all!

I have probably never met as smart a person as Ben Weiss. When the merger between MetaTools and Fractal Design took place, I became aware of his talents and saw first hand how smart this guy is. He can think of things that I might never think of. He confirmed to me that the power of mathematical analysis is central to solving some of the really knotty problems in computer graphics. And he proceeded to famously solve the problem of making the median filter actually useful to computer graphics: a real breakthrough. I've seen Ben at work since Metacreations, and he is still just as clever.

While Bob Lansdon introduced me to Fourier-domain representation and convolutions, it was really Kok Chen that helped me understand intuitively the truly useful nature of frequency-domain computations. He introduced me to deconvolution and also showed me that there were whole areas of mathematics that I still needed to tap to solve even harder problems. Kok retired a few years back and I sincerely hope he is doing well up north! Kok basically put up with me when I came to Apple and helped me time and time again to solve the early problems that I was constantly being fed by Peter Graffagnino. When it came time to work with the GPU and later with demosaicing, Kok Chen consistently proved to be indispensable in his guidance.

There are plenty of other people whom I will always refer to as smarter than myself in various areas, but they will have to go unnamed until future posts.

Saturday, March 10, 2012

Piano Improv

Software development can cause frustration: large projects run over, thorny problems seem intractable, bugs can be hard to find, leading to hours of hard, painstaking logical analysis.

And that's why I like to turn to music to gain perspective.

You can check out my relationship with music in my blog post, Music. You can learn a little bit about how I write songs in my blog post, Writing Songs. And now I will share with you some of the ways I create music ad libitum.

Years of Piano Playing

I taught myself to play the piano in 1973 and this led to years of constant obsession with piano-playing. I went through the catalogs of the Beatles, Pink Floyd, the Grateful Dead, and many other artists. But most of all, I learned to improvise.

I once read that Ludwig van Beethoven was a master improviser. He could sit at the Hammerklavier and spin tales of music directly from his touch. I always found his music to be inspiring. I also read that Johann Sebastian Bach was a genius at improvisation. And also Schubert, Mozart, Franz Liszt, and others.

In 1973 I wondered if I could ever learn to do that. Just touching the keys of the piano that my parents had bought for my sister Susan to learn piano, but which she had rejected after a few lessons, I saw a tremendous regularity. A devilish simplicity and yet a complexity that defied simple rigorous analysis.

So in that summer when I was left alone in the house in Sunnyvale while the rest of the family went to New Mexico, I had nothing but time to myself.

I have mentioned that the upright piano we had was painted puke green and was tuned a half-semitone flat. The strings were too old to be tuned any sharper, which would have required making them tighter. So it got tuned flatter, so that all the strings could at least be in tune with each other.

And when you teach yourself notes on such a piano, you can forget about having perfect pitch.

After going to college at Caltech, I found a piano in the practice room in Page house. I played it incessantly. And my abilities improved immediately. The quality of the instrument makes a huge difference, I found.

When I went to UC Berkeley in late 1975, it was as a music major, as I have mentioned. But in between assignments, I improvised. Sometimes 8 hours a day. This is what I meant by obsession.

There were a few years when I had to work and push the piano-playing aside, but I always returned to it. There was always just too much value to it, emotionally and intellectually. It's probably good I did, too, given the capacity for frustration in the profession I finally landed in.

I played piano wherever I could find one. I found that the Yamahas and the Kawais were the best pianos. And in that generation, this was true. I played a Bösendorfer once, but I found it to be stiff and unexpressive. I also played Steinways but found them to be unremarkable and sometimes quite muddy. In retrospect, it was probably just how well they were tuned.

In the early 1980s I was improvising quite a bit, with quite a varied style. A little modern, a little freestyle. You can check out the following improvisation to see what I was playing like in September 1981. Often my improvs are titled based on the date they were recorded; in this case, 09-22-81 #1 shows it was the first recording I made on the 22nd of September, 1981.


As the 80s wore on, I continued to play and write pieces. Here is another improvisation that I worked into a full-blown piano piece, called Black Widow. This piece is much more structured and rhythmically interesting. My style was progressing.


Eventually in the late 1980s I got the Yamaha 7-footer that I use today. I also had, for a while, a Yamaha 7-foot Disklavier, but I didn't like the way it worked. Yes, I had two grand pianos for a while. I'm serious about pianos, and I probably always will be. But I also worked with synthesizers, including a weighted-key synthesizer that still works great after so many years: the Roland RD-500. Supposedly it was a roadie's dream because it was so stable and bulletproof. As a digital piano, it also featured various kinds of electric piano styles. In 1990, I played the following improvisation. This was when I was about two months into the secret work on Painter, before another living soul saw what I was cooking up. It is interesting to see the sonorous moods I had in those periods. This improvisation does a lot to tell you the story behind my early work on Painter: intricate, creative, and spontaneous. It's called 11-10-90 #2. Note: sometimes I recorded more than one improv a day, if I was on a roll.


My improvisation recordings are entirely unedited, so if I make a mistake: it's all on me!

Also from this era is an improvisation that features more interesting rhythm, but also a melodic inner section with a quote from one of my songs from the era (which I no longer have a recording of) called keys to your heart. It's not bad. It's called e piano improv, a rather undescriptive title, it's true.


When I moved out by the beach, for a while I had a house with room for two pianos. It was there that I did most of my modern production work with ProTools. But I continued to improvise. Even in a smaller, more manageable house I still have a 7-foot grand. And I play piano every day. When I'm writing songs, I will improvise on the theme of the song. Usually I will work up the song and record it onto my iPhone which at least gets it down for posterity. My worst problem is that I don't record the songs that I compose from day to day.

So, the nature of improvisation is that most of it goes unrecorded, which is a pity when I do a particularly brilliant one. So maybe one in twenty gets recorded.

After 39 years of piano-playing, my fingers and my brain are entirely in sync. Does this mean every note is perfect? No. But it does mean that I can maintain a rhythm while modulating from key to key. All the chords have familiar hand shapes, so I don't really look at the keys any more. Sometimes I play in the dark, but usually only when I am alone.

How I Improvise

So, how do I do it? What is going through my mind when I'm playing?

If you were to watch, you would just see me sit down and play.

With my hands on the keyboard, I focus first on the key I'm playing in. And whether or not I want to start on a chord that is moving or stable. Moving chords are like sevenths: minor sevenths, major sevenths, inversions (particularly those with the seventh in the bass). This is because sevenths typically resolve into more plain chords in a cadence. Here we have an A minor seventh chord resolving into a D major chord. These chords also have very natural hand shapes to me, and I favor them. But, of course, they can be expressed in any key and their hand shapes might be a bit different. Also, suspended fourths are unstable, and tend to resolve fairly quickly.

Playing in different keys is important. At one point, in the mid-80s, I forced myself to improvise in a different key each time I sat down at the piano. I had a little tally sheet. J S Bach used to do this, I suspect, since his Well-Tempered Clavier has two books with a piece in every key. Frédéric Chopin also did this with his Preludes and Études.

Anyway, it's also important to try out a few rhythms with a chord sequence I work out. And also to try a few chord inversions, so the notes on the top can begin to form a cohesive melody.

When it comes to the bassline, I will find ways to work a chromatic descending bass into my improvisation. This sounds very good, and often forms a focal point for a chord sequence. Sometimes I put the third in the bass. Sometimes the seventh goes into the bass.

Here the bass descends to the seventh and down to the third of the C chord, to finally rest on F, with an added 2nd (G) on top. This kind of sequence works best when the bass is very low indeed, perhaps one to two octaves below where I've written it here.

After a number of years playing, I have tried out pretty much all the chord sequences there are. But of course that is wrong because there are plenty of chord sequences that sound terrible. And I naturally avoid those after perhaps making the mistake of trying them. Once.

Another thing to remember is that the melody doesn't have to fit the chord. You may find that the melody contains suspended notes, or even notes that are entirely off-chord. This can be to great effect.

Chord inversions are of interest, when you are playing. They can help add color to the chord. And sometimes inversions are totally necessary in creating the feel you want.

Here I have shown some inversions of the C ninth chord, with the root of the chord (C) taken out. It's reasonable to play only four notes at a time, to avoid too much of a handful. The one I favor lately is the third in the sequence, with the G being played by the thumb.

Ninth chords are in general interesting because they produce a thicker chord. In C alone, there are at least four interesting ninth chords. I have shown some of them here.

They really only differ in whether the E and the B are natural or flatted.

The C major ninth chord (or CM9) is bright, a bit dissonant, and often leads into a C chord by having the top two notes rise diatonically up to C and E.

The C minor ninth chord (Cm9) is soft and brooding. There is less dissonance because there is no B natural. But now the D and the Eb form a direct dissonance.

The C ninth (C9) chord has no semitone dissonance, and has a lush, cheerful sound. It is descended directly from the C seventh chord, but has an added D.

The C minor major ninth (CmM9) chord combines the darkness of the minor chord with the dissonance of a C major ninth chord, and can be a bit ghastly and tragic.

So you can alter the color of your music by using the right chord. This is particularly so in the first chord of your verse. For instance, Pink Floyd, in their song Breathe, starts their verse off with an Em9 chord, resolving to an A chord. It is the minor ninth chord, with its sonorous sound yet tainted by the added 2nd, that creates the ambience of the entire song. In fact, that particular song has some of the most perplexing and powerful chord sequences in the repertoire.

A Question and an Answer

It has always been interesting to me, since improvisation and music generation are so completely hardwired into my brain, whether it might also be possible to write a computer program to do something similar. I think harmony is pretty easy to encode into a computer program. Indeed, in 1975 I did such a thing.

But it is an entirely different thing to encode the structure of music into a computer program. I am, of course, referring to the high-level structure of music: the ABA format, the romantic-period sonata form with its introduction, exposition, development, recapitulation, and coda sections. If we look at modern songs, there is the intro/verse/refrain/bridge/outro structure and all the common rearrangements of it.

My question is this: even if all these forms were to be faithfully duplicated by a songwriting program, would the computer be able to replicate the angst, emotions, and desires of the human composer?

The answer is simple actually. Such a songwriting program would not be able to write the songs. That would be the human user's job. Why do I say this? Because with Painter, our job was to accurately collect the artist's expression, and faithfully reproduce it, allowing the artist's style to come through. Because Painter and the computer running it are enablers for human creativity. They allow you to do more than you could do before.

And this is precisely what a songwriting program would do for a composer: enable them to compose bigger and better pieces, and let their emotions and creativity come through like never before.

And What About Improvisation?

Well, after many years of work at the piano, my brain is that channeler of creativity. That enabler of emotional expression.

Maybe Fractal Design Composer is what's needed. Maybe I can take my improvisational skills and build a program around them.

Maybe that's where music and I come full circle.

Hmm.

Wednesday, March 7, 2012

Marbling

In Painter 4, we came out with several new features. One was mosaics. Another was marbling. Marbling is the application of drops of colored ink onto the surface of a liquid, playing with the ink, and imparting that color imagery onto a piece of paper. Marbled paper was used around the time of the industrial revolution for the endpapers of books.

This technique was a bit of a lost art because the people who created it actually figured out the process and kept it secret in an unusual way. They constructed several rooms. Each one had an artist who knew his part of the process. The artwork (and sometimes a pan of liquid ink) was passed through a slit in the wall between the rooms. After several rooms, the finished piece would emerge. In this way, no single person knew the entire process, and thus its creators could keep it very secret.

The first step was dropping various colors of ink onto a liquid surface called a size. Why it's called that, I will never know. As you drop a bit of ink down, all the other bits of ink are distorted by the new drop. This creates a very interesting pattern, called stone.

Here you see a stone pattern constructed by dropping a lot of drips of ink onto the liquid surface of the size.

The stone pattern is the undistorted pattern that is used as a colorful basis of the marbling pattern. This stuff can be created in a simple way, using a conformal mapping. Let's say the new drop has radius R. We preserve the area of the colors already on the size by displacing them from the center of the drop, altering the distance at which you fetch the color using a function of R, and also of D, the distance of the pixel you are computing from the center of the blob. In this way, you can evaluate the entire stone pattern using a ray-tracing technique where all the drops of ink are taken into account for every pixel of the image.

This is done in the Painter Esoterica:Blobs... effect. That's actually how I created this image.
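
To illustrate the kind of reverse mapping involved, here is a small sketch in C of a single drop being added. One common area-preserving choice is to fetch the source color at distance sqrt(D*D - R*R) along the same ray; I'm using that form below purely as an illustration, and the exact function in the Blobs... effect may well differ.

/* Sketch of dropping one blob of ink onto an existing image by reverse
   mapping. The displacement function s = sqrt(d*d - R*R) is one standard
   area-preserving choice; it is my assumption, not necessarily the exact
   function Painter uses. */
#include <math.h>

typedef struct { float r, g, b; } rgb;

/* src: the size before the drop; dst: after; (cx, cy): drop center; R: radius */
void drop_ink(const rgb *src, rgb *dst, int w, int h,
              float cx, float cy, float R, rgb ink)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            float dx = x - cx, dy = y - cy;
            float d = sqrtf(dx * dx + dy * dy);
            if (d <= R) {
                dst[y * w + x] = ink;                 /* inside the new drop */
            } else {
                /* fetch the color that was pushed out to distance d */
                float s = sqrtf(d * d - R * R);
                int sx = (int)(cx + dx * (s / d));
                int sy = (int)(cy + dy * (s / d));
                if (sx < 0) sx = 0;
                if (sx >= w) sx = w - 1;
                if (sy < 0) sy = 0;
                if (sy >= h) sy = h - 1;
                dst[y * w + x] = src[sy * w + sx];
            }
        }
    }
}

Repeat this, drop after drop, and you get the stone pattern; the ray-tracing formulation mentioned above accounts for all the drops at once when computing each output pixel.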

The second step taken by the artisans (in a separate room, and by a separate artist) was to run pins and rakes through the liquid, further distorting the ink in interesting ways. They usually moved the pins and rakes in a sinusoidal pattern.

Here you see the result of one pass with a rake. The source image was a colorful soft spiral of purples, greens, white, and browns. You can see the rake was passed from left to right. The sinusoids are quite pronounced. The ink sticks to the rake tines and gets pulled in the direction that the rake is pulled.

This is how the marbling patterns are created, by distorting the colors of the image, creating a cool-looking pattern.

This is how it starts, but the various marbling formulas usually feature several passes.

After three passes, and using a very special formula, you can get some very nice patterns, like this bouquet pattern.

It starts with a very fine rake vertically from top to bottom, and with very small tines so the distortion is just enough to create a small high-frequency pattern.

Then two larger vertical sinusoids are used to mix it up, using a widely-spaced rake with large tines. This results in the bouquet pattern you see here.

You have probably seen these marbling patterns in old books. Bookbinders like them a lot.

There are literally dozens of interesting marbling patterns, and each has a cool formula. Many of them are programmed into Painter as presets.

Painter still has this marbling command, Esoterica:Apply Marbling...

Here is another bouquet pattern, with different placement of the widely-spaced vertical rakes. I think I considered this to be one of the better results in the bouquet class of marblings.

You can get thistle patterns by passing one rake vertically, and passing another in between the lines left by the first one, in the opposite direction (this pulls the ink up and down in alternating sections). Then you can bouquet it.

When you use circular patterns with the rake, it gets even weirder, but Painter can't do that with its Apply Marbling... effect. There might just be a bit more to explore in marbling.

This is an experimental design done by using the Blobs... effect. Here, the ink blobs were filled with an image (the pattern source) instead of a flat color. This can create some really interesting patterns, as this shows.

Use the right colors and it pops! It comes alive with vibrance. But it's a little disconcerting, since it seems to be alive and organic somehow.

The implementation of marbling was a bit of a raytracing trick as well.

When distorting an image, all distortion functions must run backwards. This is because you are enumerating all output pixels and trying to find the source pixel for that output pixel. This is the reverse of simply taking a pixel and moving it from its source to its destination.

So the marbling implementation was really just iterating over every pixel and computing a function going from the displayed pixel to the place where it came from.
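
As a sketch of what such a backward-mapped pass looks like, here is a deliberately simplified rake pass in C. The real marbling formulas account for tine spacing and the falloff of the pull around each tine, which I've left out; this just illustrates the structure of going from the displayed pixel back to where it came from.

/* A deliberately simplified backward-mapped "rake" pass: every output pixel
   fetches its color from a point displaced back along the pull direction
   (here, -x), with a sinusoidal profile across the image. Not the actual
   marbling formula, just the reverse-mapping structure. */
typedef struct { float r, g, b; } rgb;

void rake_pass(const rgb *src, rgb *dst, int w, int h,
               float amplitude, float wavelength)
{
    const float twopi = 6.2831853f;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            /* where did this pixel's color come from before the rake pass? */
            float sxf = x - amplitude * sinf(twopi * y / wavelength);
            int sx = (int)sxf;
            if (sx < 0) sx = 0;
            if (sx >= w) sx = w - 1;
            dst[y * w + x] = src[y * w + sx];
        }
    }
}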

Graphics programming can be a bit complicated sometimes.

When Painter 4 came out, as far as I know, it had the first computer implementation of marbling and stone patterns ever.

It was fun figuring it out, and I always did like to show new things.

Tuesday, March 6, 2012

Hackers, Part 2

Today, the FBI took down LulzSec, the splinter group of hackers responsible for so many incursions. Months ago, I speculated that they were known down to the person but I was premature. It turns out that their leader, "sabu" was known, though. That's when the FBI secretly arrested him and turned him into the most interesting mole in hacker history. While, in Hackers Part 1, I speculated that they were disbanded, it turns out that they had turned their efforts onto a new theme, AntiSec. It is good someone got them, because they supposedly had thousands of infected servers at their beck and call (topiary claimed this).

LulzSec, populated with personalities like sabu (Hector Xavier Monsegur of New York), kayla (Ryan Ackroyd of London), topiary (Jake Davis of London, actually arrested last year), pwnsauce (Darren Martyn of Ireland), palladium (Donncha O'Cearrbhail of Ireland), and anarchaos (Jeremy Hammond of Chicago), were responsible for a number of attacks that penetrated systems (mostly using password hacks), stole data and simply posted it (mostly on pastebin.com). Data sometimes included identity information and credit card information. But they really liked to ply DDoS attacks, which are made possible (and apparently popular) using off-the-shelf software like LOIC. The complicated process in finding anarchaos is detailed in this link.

Then things heated up, and in June 2011, other groups started outing LulzSec members. The link points to a pastebin post by the A-Team, a public rival hacking group. Their speculation about topiary was wrong, it appears, but they got sabu right. I wonder how the other members named there, uncommon, laurelai, eekdakat, nigg, madclown, avunit, tflow, and joepie91, are faring. They are listed in this link, some with names and addresses. Maybe more arrests are on the way. But again, if they got topiary wrong and sabu right, then their record isn't exactly perfect.

In either case, Alpha Mike Foxtrot!

So it's clear that, when this happened, the FBI moved in and turned him. That can't be good for the other members.

This is on the heels of two interesting developments in hackerdom. The first is Anonymous and their prying into the international anti-hacking taskforce's conference calls. The second is the spoofing of Anonymous for the insertion of malware into their actual computers.

The FBI kind of got a black eye when Anonymous posted details of a conference call (the Anon-Lulz International Coordination Call) that occurred between the anti-hacking taskforces in both the US and the UK. The FBI recently admitted that this occurred. So that tells us that the posted transcript of the call was actually correct. It was during this call that the Anonymous member tehwongz was outed. Well, he's a 15-year-old kid, so no names were mentioned. He claimed to have hacked Valve's Steam network. The conference call was hacked by palladium (also known as anonsacco) and this is detailed in this link.

The other notable incident seemed to pass by without much notice, although it did get some play on the net. Here, a purported Anonymous tool for hacking, posted on pastebin.com, was actually malware in itself. This means that all the Anonymous sympathizers who downloaded and used this code were infected with malware that would send their dox (identities and other useful information) to someone. This was detected by Symantec very recently.

I speculate that someone wants to know who they are. And get this interesting tidbit of information: the malware was spoof-posted on pastebin right after the MegaUpload raid.

It might be a perfect thing for a government to do to get these guys. First the MegaUpload raid occurs, enraging the hacktivists, then a malware post occurs, then the Anonymous hacktivists decide to use the tool to stage a DDoS attack. And voila! Plenty of names and IP addresses are streaming in.

We will see in the coming weeks and months what comes of this, I think.

Sunday, March 4, 2012

The Things We Throw Away

Upon finishing the development phase of a project, I remarked that I had written about three times as much code as finally got shipped. Why is that?

Non-Linear Development

You always hear about FTW moments, when somebody writes some code and it works the first time they run it. Or when somebody is under pressure and he or she comes through in a pinch in record time. Isn't this the common case?

No. That might as well be a Hollywood screenwriter glamorizing the world of hackers. This perception works against us in so many ways. Don't get me wrong: I have had many FTW moments. When an attempt goes right and succeeds way past my expectations. When a piece of code that I have written, a non-trivial section of code, works without error the first time. We can relish those moments, for sure. But they are blue moons, people. Lightning strikes. And at most they are tiny little victories.

Development is a non-linear process.

This may not be the case when it comes to writing a small subroutine, or a library function. But when it comes to the difficult, nearly intractable projects, like the ones I prefer to work on, rewriting and rethinking is an essential part of the game. The scope of such projects lends itself to research, which definitely keeps work interesting. By research, I sometimes mean looking for other solutions to the problem. Or related solutions.

I have said that when a problem is deconstructed, it can sometimes come to resemble other problems. We can use alternative approaches, and then reconstruct to provide completely new solutions to the problem that may simply have never been thought of before. This is a bit of a messy way to stay creative. But it has worked time and time again.

But it doesn't work every time I try it.

And this is one reason why development is non-linear: if one method doesn't work, then we have to try another until we do find one that does. This means taking suggestions from others. And trying them out. This means looking at related fields. Turning the problem inside-out, or looking at it backwards: tracing backwards from the solution.

This technique is used when someone else has demonstrated a solution to a problem. The thought goes like this: if they succeeded at solving it, then they would have to have done this. And to do that, they would have to have figured out how to do this other thing. Generally it is a chain of reasoning you can follow with some certainty, because you know a solution exists.

But when you are solving a problem for the first time, you are not guaranteed that this technique will lead to a solution. And so development becomes non-linear again. Because you can try to find a solution this way, but you may not succeed.

Hard Problems and Some Ways of Looking At Them

It simply comes down to this: some problems are so hard that you end up grasping at straws to figure out a solution to them. And my least favorite thing is when people say "oh, that should be easy" without even thinking about the problem. And also when they forget that it is not enough to solve the problem: you must also make it possible for others to easily use what you have created. And the solution you create has to live in an ecosystem of other processes, as part of a workflow that makes sense and doesn't use too many resources.

The best thing to do is to first imagine what a good solution to the problem would look like. This means knowing how it should be used and in what context it would be most useful. Finally, it means trying for an interface that has the fewest controls. Here the theory is that the most useful feature is one that people don't actually have to use, because it's applied automatically. This means your implementation must determine what cases there are, and then make an attempt to handle all of them.

The common cases are the first ones that must be addressed. And usually this can lead you to a simpler form of organization that helps you to classify and solve the harder, less common cases. I have learned time and time again that first gearing the solution to the most difficult cases leads to more complex solutions. Instead, try to factor the more complex cases into your simpler model.

An Example

Boolean polygon operations are an example of a complex task that seems to have an easy solution. It all depends upon your imagination as to how easy the operation actually appears. But first, what is this problem?

Diagram 1: polygon
It is the task of computing a union or an intersection between two polygons. Each polygon consists of a finite number of points in the plane, connected sequentially by line segments.

So, we make an informal data structure for a polygon based on the geometry that represents it:

point: float x, y
polygon: int nPoints, array of point

We have used two definitions to make the polygon. The first is a point, represented by an x and a y in the plane. The second is the polygon, represented by a count (of points) and an array (of points).

Diagram 2: two polygons
Now we can see the problem as Boolean operations between polygons. Boolean operations include "and", "or", and "not". You can think of "and" as intersection. You can think of "or" as union. You can think of "not" as negation.

When you apply these to polygons, we can define the operations as applying to the area inside the polygon. Might this be complicated by negation? No, we can factor this in by simply making the definition that negating a polygon is the same as making its outside to be its inside, and its inside to be its outside.

Diagram 3: the intersection
of two polygons
Ironically, Boolean operations are easy to do with pixels (bitmasks), but hard to do with an outline representation.

The intersection between two polygons can be seen to be the area that is inside both polygons. In this case, to represent the intersection as a polygon requires that we compute intersection points.

But this highlights another assumption we need to make: the result of one of these operations is itself a polygon.

Diagram 4: the union
of two polygons
The union between two polygons can correspondingly be seen to be the area that is inside either polygon.

So far, union and intersection are operations that will be familiar to anyone who knows Venn diagrams.

The notion of inside and outside may seem complicated in itself, but it may easily be formalized by using fairly simple analytic geometry operations. We can define the inside of a polygon to be the area that lies to the left of an edge when the edges of the polygon are traced in counterclockwise order.

It turns out that this definition can actually be used to define what is inside and what is outside the polygon. We'll get to that in a bit.

Diagram 5: two concave polygons
But polygons are not all convex. And this leads to the first situation that causes us to throw out previous assumptions.

We see two concave polygons here that touch, so they have a well-defined union and intersection.

The intersection and the union have special properties, both of which may not be allowed for in our definitions. For instance: the definition of a polygon is a closed list of points. And the definition that the result of the Boolean operation is a polygon.

A single polygon.

This is all well and good until we see the results of intersection and union.

Diagram 6: the intersection
of two concave polygons
Here we have the intersection of the two concave polygons. The result is actually a set of two disjoint polygons.

So this forces us to revise our assumption of the result of a Boolean operation being a polygon. The result is obviously more correctly stated to be one or more polygons. Of course, if the two polygons don't actually touch, then we should update our assumption of the result of a Boolean operation between polygons to include zero or more polygons.

And this, I think, is no big deal, since it involves only the result. But wait, what if we wish to operate on a result? With polygons A, B, and C, we might want to produce (A intersect B) union C. In the case where a result becomes more than one polygon, can the operation possibly be simply iterated over polygons to produce the result? No. If A intersect B produces two polygons, then the union might bridge them and still produce only one result. There is no way to get this result by simply iterating over polygons. However, another solution does exist: if A intersect B produces the two polygons, D and E, then we can evaluate (D union C) union E to produce the result. But there's no guarantee that D union C won't produce two polygons also, and so we must revise our definition of a polygon.

We define a polygon now to consist of one or more outlines, each consisting of an array of points.

point: float x, y
outline: int nPoints, array of point
polygon: int nOutlines, array of outline

Diagram 7: the union of
two concave polygons
But wait, we haven't considered the union of the two concave polygons yet.

Look! The result is also two polygons, but one polygon is a hole in the other.

And can't a hole also contain an island inside it? It looks like we are going to need a recursive definition for polygon, folks!

We will consider an area to be an outline of the edge of a polygon, and a hole to be another area (by definition negative) inside it:

point: float x, y
outline: int nPoints, array of point
area: outline, int nHoles, array of area
polygon: int nAreas, array of area

To facilitate the determination of what's inside and what's outside, we will define outlines to be oriented counterclockwise for a positive area, and clockwise for a negative area (hole).
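
In C, the final recursive definition might be sketched like this. It's an illustration of the data layout only, not the actual structures used in Painter:

/* A C sketch of the recursive polygon definition above. */
typedef struct {
    float x, y;
} Point;

typedef struct {
    int    nPoints;
    Point *points;        /* counterclockwise for a positive area,
                             clockwise for a hole                    */
} Outline;

typedef struct Area {
    Outline      outline; /* the outer boundary of this area         */
    int          nHoles;
    struct Area *holes;   /* holes, each of which may contain holes  */
} Area;

typedef struct {
    int   nAreas;
    Area *areas;
} Polygon;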

So we had to throw out some assumptions, and also some definitions (actually our first definition!) once we realized that we had to represent holes and multiple outlines.

The Operation Itself

Now we get to the Boolean operation itself. How are we going to implement it? It seems like we can use a pretty simple rule, considering the union operation as our guide. The set of points comprising the union of polygons A and B appears to be the set of points of polygon A that are outside of polygon B taken together with the set of points on polygon B that are outside of polygon A. Well, plus the intersection points.

So we invent a rule: pick a point on one polygon that is outside of the other polygon. Trace forwards along it until you come to an intersection point. Then switch to the other polygon. Trace forwards along that polygon. When you come to the start point, you close the output polygon.

This rule comes from a simple underlying assumption: at an intersection point, edges cross. At each intersection point, an edge of one polygon crosses the border of the other polygon, either going from the inside to the outside, or going from the outside to the inside.

This rule gets us one of the polygons of the union shown in diagram 7, but how do we get the other outline? When we output a point to the output polygon, we must mark the point in the source polygon as used. Then we simply look for an unused point in one of the polygons that is outside the other polygon. And start another trace from there.

The process stops when there are no more unused points that qualify.

How does intersection work? We may use DeMorgan's law:

intersection(A, B) = negation(union(negation(A), negation(B)))

Practically, this means we need to reverse both polygons, compute their union, and reverse the result. When we reverse a polygon, its inside and outside become swapped topologically.

This also gives us the way to chop one polygon out of another.
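
Here is a sketch of how that DeMorgan composition might look in code, assuming the structs sketched above and a polygon_union routine (hypothetical here, standing in for the tracing algorithm). Negation is just reversing the point order of every outline; memory management is omitted.

/* Sketch of intersection and "chop" built on top of union via DeMorgan's
   law. polygon_union() and polygon_copy() are assumed to exist elsewhere. */
Polygon polygon_union(const Polygon *a, const Polygon *b);  /* assumed */
Polygon polygon_copy(const Polygon *p);                     /* assumed */

static void reverse_outline(Outline *o)
{
    for (int i = 0, j = o->nPoints - 1; i < j; i++, j--) {
        Point t = o->points[i];
        o->points[i] = o->points[j];
        o->points[j] = t;
    }
}

static void reverse_area(Area *a)
{
    reverse_outline(&a->outline);
    for (int i = 0; i < a->nHoles; i++)
        reverse_area(&a->holes[i]);
}

/* negation: swap inside and outside by reversing every outline */
static void polygon_negate(Polygon *p)
{
    for (int i = 0; i < p->nAreas; i++)
        reverse_area(&p->areas[i]);
}

/* intersection(A, B) = negation(union(negation(A), negation(B))) */
Polygon polygon_intersection(const Polygon *a, const Polygon *b)
{
    Polygon na = polygon_copy(a), nb = polygon_copy(b);
    polygon_negate(&na);
    polygon_negate(&nb);
    Polygon result = polygon_union(&na, &nb);
    polygon_negate(&result);
    return result;
}

/* chopping B out of A is intersection(A, negation(B)) */
Polygon polygon_chop(const Polygon *a, const Polygon *b)
{
    Polygon nb = polygon_copy(b);
    polygon_negate(&nb);
    return polygon_intersection(a, &nb);
}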

Flies In the Ointment

Diagram 8: self-
intersecting polygons
There are a couple of things that will mess up this neat solution. The first is a self-intersecting polygon.

If we use the winding number rule to define what is inside the polygons, then it probably is useful to first convert these cases to a case that isn't self-intersecting. For the bow-tie, this will still result in two polygons that touch each other at a common point. For the flying-V, this will result in one easy (but concave) polygon.

Also, polygons that touch themselves from the outside or from the inside are a form of self-intersecting polygons, and must be resolved to a form that is less ambiguous.

The other problem case that needs to be considered is this: what happens when the edges of the two polygons become collinear?

I have concocted an example. For readability, I have separated the lines of the two polygons. But just imagine for a moment that they are exactly coincident at edges and points.

A solution to the intersection between these two shapes can easily lead to a polygon that touches itself from the inside. It would be nice to prevent this kind of result from being produced. Or, more simply, we can just resolve it into a non-self-intersecting polygon on output.

The Example Applied

In Painter 4, I implemented the mosaic feature. This capability defined each tile as a polygon. Each polygon was initially a 4-edge quadrilateral. But there was an additional problem of grout. The tiles could never really get closer together than a certain grout distance.

The grout distance was implemented by creating an outset polygon for every existing polygon. So a polygon was outset by the grout distance. Any new tile being placed into the mosaic was chopped by the outsets of the existing tiles so the new tile could come no closer to them than the grout distance. This required the implementation of intersection and negation.

Once a tile was chopped by one neighbor, it might be chopped by another neighbor, etc.

The good thing was that the tile polygons, at least initially, could not self-intersect. This cut down on some computation. Also, they were initially only four points. This cut down on intersection searching.

What I Haven't Actually Described

I haven't actually described the point-inside-polygon test, the point-to-left-of-line test, or the process of finding all the intersections between two polygons. There exist fairly simple processes for all of them.

I also haven't described what happens to the algorithm when a polygon's edges are collinear. This can happen if three successive points fall on a line. Suffice it to say that we must remove the center point.

It's also nice to know how to orient a polygon counterclockwise. The simplest method is to determine its area using trapezoidal integration and arrange it so a positive area implies that the polygon is counterclockwise and a negative area implies that the polygon is clockwise.
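
A sketch of that orientation test, assuming the Point and Outline structs from before. The sign convention here assumes y increases upward; flip it for a top-down raster coordinate system.

/* Signed (trapezoidal) area of an outline: positive means counterclockwise,
   negative means clockwise. To orient counterclockwise, reverse the point
   order whenever this comes out negative. */
static float signed_area(const Outline *o)
{
    float sum = 0.0f;
    for (int i = 0; i < o->nPoints; i++) {
        const Point *p0 = &o->points[i];
        const Point *p1 = &o->points[(i + 1) % o->nPoints];
        /* twice the signed trapezoid between this edge and the x axis */
        sum += (p1->x - p0->x) * (p1->y + p0->y);
    }
    return -0.5f * sum;   /* negate so counterclockwise comes out positive */
}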

To Sum It All Up

All I know is that you can spend quite a bit of time getting this programming solution right, and along the way you might change your assumptions or your internal data structures several times before coming to a fruitful conclusion.

Like I said, non-linear development.

Saturday, March 3, 2012

Intense Development

There are periods of time during a project when I don't even want to sleep. Others around me get very annoyed. But when I come out the other end, something magical can be seen. This is partly because I, thankfully, work in the realm of computer graphics. And partly because I'm a visual person who can imagine a visual result that others can appreciate.

And it's all in the demo.

There is no sleight of hand in a demo. Not when people are to be impressed. But sometimes people just don't get the value in what you construct. This is where you have to educate them, to show them the value, to connect it to something they can understand. You have to make all that obsessive development time mean something.

You need to become tolerable again.

I have talked about where ideas come from. About the different frames of mind we can be in. About how to foster creativity in the first place. But, once you get the idea and reason out how it can be implemented, there is a new phase that needs to be explored. How does this process unfold, this intense development? How does the large feature or the complex technique get implemented? How can we, as mere humans, even manage something like this? What tools do we use to do the seemingly impossible? What parts of our brains do we have to use to accomplish our goals?

Organization

The best method to tackle a large project is to get organized. I do this by taking notes, drawing pictures, and building tools.

I have found that some of the best notes to take are these:

  • new ideas or features that you would like to explore
  • problems that need to be resolved
  • places to look when updating to some new arrangement of the code
For most people, the note-taking process is a hassle. But you really need to start taking those notes to accomplish a project that is so big you can't keep it all in your head!

When drawing a picture, sometimes a flowchart is useful. Here we have the basic step in constructing a Laplacian pyramid. The objective is to decompose the step into smaller operations, a process known as top-down decomposition.

Here the basic step gets split into reduction, expansion, and difference substeps.

The reduction step is the process of converting an image into another image that is half the size in both width and height. And one which thus does not contain any of the highest-frequency information in the original image. The expansion step is the process of resizing the half-sized image back into full size. This image will be blurrier than the original by definition. The difference step is the process of determining the differences between the original full-sized image and the blurred full-sized image. These differences form the highest frequency detail in the image.

This step can be repeated to create a quarter-sized image and a half-sized detail image.

So not only is the image decomposed into various frequency bands, but the process of decomposing the image has also been decomposed into steps!
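
As a sketch, here is what one such step might look like in C on a single-channel float image. I've used a plain 2x2 average for the reduction and a nearest-neighbor expansion instead of the usual filtered versions, just to show the structure of reduce, expand, and difference.

/* One Laplacian-pyramid step: reduce, expand, difference.
   Assumes width and height are even; filters are simplified stand-ins. */
#include <stdlib.h>

/* half-size image: each output pixel is the average of a 2x2 block */
static void reduce(const float *src, float *dst, int w, int h)
{
    int hw = w / 2, hh = h / 2;
    for (int y = 0; y < hh; y++)
        for (int x = 0; x < hw; x++)
            dst[y * hw + x] = 0.25f * (src[(2*y)   * w + 2*x]   +
                                       src[(2*y)   * w + 2*x+1] +
                                       src[(2*y+1) * w + 2*x]   +
                                       src[(2*y+1) * w + 2*x+1]);
}

/* full-size (blurrier) image from the half-size one */
static void expand(const float *src, float *dst, int w, int h)
{
    int hw = w / 2;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * w + x] = src[(y / 2) * hw + (x / 2)];
}

/* writes the half-size image and the full-size detail (highest-frequency band) */
void pyramid_step(const float *image, float *half, float *detail, int w, int h)
{
    float *blurred = malloc(sizeof(float) * w * h);
    reduce(image, half, w, h);
    expand(half, blurred, w, h);
    for (int p = 0; p < w * h; p++)
        detail[p] = image[p] - blurred[p];
    free(blurred);
}

Calling pyramid_step again on the half-size image yields the quarter-sized image and the half-sized detail image, and so on down the pyramid.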


Rational Processes

Using your rational mind is partly deduction, and partly experience. For instance, when you implement a gradient operation, experience tells you that the center of a line has a zero gradient, and either side of the line has a non-zero gradient. As a practical demonstration of this, consider the Painter brush stroke. It is from an airbrush at high opacity with a 2 pixel diameter: a typical thin line.

If you compute the gradient using a Sobel technique, each 3x3 neighborhood of the image is convolved with two 3x3 kernels. There are variations on this theme, but usually the kernels will look something like this:

 1  2  1             -1  0  1
 0  0  0     and     -2  0  2
-1 -2 -1             -1  0  1

The first kernel is for computing gradients in the y direction (horizontally-oriented edges) and the second kernel is for computing gradients in the x direction (vertically-oriented edges).

Convolution means multiplying each element of the kernel with the corresponding pixel in the actual neighborhood in the image and forming a sum of the products.

You do that for both kernels, producing two sums, which you can imagine to be the x and y value of a vector field. The gradient is simply the magnitude of that vector.
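
Here is a small C sketch of that computation on a single-channel float image; border pixels are simply left at zero.

/* Sobel gradient magnitude, as described above. */
#include <math.h>

void sobel_gradient(const float *src, float *grad, int w, int h)
{
    static const int ky[3][3] = { {  1,  2,  1 },
                                  {  0,  0,  0 },
                                  { -1, -2, -1 } };
    static const int kx[3][3] = { { -1,  0,  1 },
                                  { -2,  0,  2 },
                                  { -1,  0,  1 } };

    for (int p = 0; p < w * h; p++)
        grad[p] = 0.0f;

    for (int y = 1; y < h - 1; y++) {
        for (int x = 1; x < w - 1; x++) {
            float gx = 0.0f, gy = 0.0f;
            for (int j = -1; j <= 1; j++) {
                for (int i = -1; i <= 1; i++) {
                    float v = src[(y + j) * w + (x + i)];
                    gy += ky[j + 1][i + 1] * v;   /* horizontal edges */
                    gx += kx[j + 1][i + 1] * v;   /* vertical edges   */
                }
            }
            grad[y * w + x] = sqrtf(gx * gx + gy * gy);  /* vector magnitude */
        }
    }
}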

The result of this is a gradient like you see here. Notice that the center of the line has an empty space in it, corresponding to the zero gradient there.

My rational mind already knows this through experience. So this means that if I want to use the gradient as a mask, and process the center pixels of the line, I will have to do something to fill in the center of the gradient. Like an annealing operation (a blur followed by an increase of the contrast or exposure of the gradient).

A rational mind mixed with the ability to visualize is probably the best way to get image processing operations done the quickest. But there are times when visualizing is not enough. We must see the intermediate results and check that they are being produced correctly and effectively. This brings us to the next technique: building tools.

Building Tools For Visualizing and Debugging

Any process in image processing, no matter what it is, will have intermediate results. There will be a blurred buffer, morphology applied to something, a gradient, a vector field, some representation that needs to be visualized. And we may need to verify that each step is being accomplished correctly, or verify that the step is even doing what we imagined it would, and is thus useful in the process of finding a solution.

So we need to construct a tool to see the intermediate results, to study them, to inspect them, and to debug their construction when your idea of what they should look like does not match what you get.

I have done this time and time again with large projects I have worked on, and it has enabled me to make much faster progress on a large project. And with a tool such as this, it becomes another thing: your demo environment. Not only can you see what's happening, but others can as well.

In order for a demo to come off smoothly, your implementation has to be fast as well. This means that you will need to implement selective update, and also you will need to make it go as fast as possible through optimization.

It doesn't matter what kind of project you are working on. You will always need to demo to justify your continued work. You will need to show progress. You will need to convince people that it can be done.

Tool construction (a testbed with demo capability) is your best tool to accomplish this!

Choosing the Best System to Build On

When constructing an image processing tool that involves steps, intermediate results, complex staging, or heavy computation, you need to choose a system to build it all on top of. For my purposes, I am considering a Macintosh as my system platform. But there are APIs and methodology that apply to any task.

Core Image is a good API for image processing when your result is constructed one pixel at a time. It can allow you to utilize a GPU or a multi-core CPU to get the job done, and it can render the task of constructing a pass over your data into a simple thing. This is highly desirable when you have a lot of passes to construct. Core Image kernels are pretty easy to construct. You can reference any number of source images, but you may produce only one pixel in the destination image. This works pretty easily for blurs, color operations, compositing operations, and even transitions. You can build Core Image filters on top of your operations, and their parameters are entire images, along with the settings for your operations.

OpenGL is a good system for doing computation and presenting that computation inside a texture on screen. When this texture is transformed in 3D, as in "onto a 3D object," this is the ideal API to accomplish the task. OpenGL may also be used for computing results on 2D flats that are presented using an orthographic projection. The computation can occur using almost any OpenGL operation, or it can occur using a fragment program. This is conceptually the same as Core Image, so there is not much value in going the OpenGL route unless textures are going to be transformed in 3D.

OpenCL is a good system for doing arbitrary computation using the GPU and the CPU. You can support multiple output buffers as well as multiple input buffers. This means that some simulation operations are easier. Also, things like scatter and gather to and from planar color formats are much more natural. For instance, conversion of RGB to YCC, where the Y is kept separate from the CbCr information, can be supported very easily: one RGB image input, and two images, one Y and the other CbCr, output.

Multi-core CPU computation is another good method to get things done fast. Here you can use Grand Central Dispatch to easily queue your computation on multiple CPUs. It has never been easier.
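
For example, a per-row image operation can be spread across cores with dispatch_apply. The process_row function below is just a stand-in for whatever per-row computation you actually need.

/* Sketch of parallelizing a per-row image operation with Grand Central
   Dispatch: dispatch_apply() spreads the row loop across the global
   concurrent queue. */
#include <dispatch/dispatch.h>

static void process_row(float *row, int width)
{
    for (int x = 0; x < width; x++)
        row[x] *= 0.5f;              /* placeholder per-row computation */
}

void process_image(float *pixels, int width, int height)
{
    dispatch_queue_t queue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    dispatch_apply((size_t)height, queue, ^(size_t y) {
        process_row(pixels + y * width, width);
    });
}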


The Dangers of Obsession

You can get buried in a project. It can overcome you. This can have a very confusing effect. Unless you disentangle yourself from it for a while and take a step back, you run the risk of becoming irrevocably lost.

Back in my Caltech days, there were those people who were interested in Dungeons and Dragons (D&D). This sometimes resulted in people becoming obsessed with the rule systems and the immersive game-play.

And sometimes people just got lost. First they forgot to shower, neglecting their basic cleanliness. Then they showed the effects of malnutrition: the endless supply of Coke and little white powdered-sugar donuts. They started talking about fifth-level clerics and trolls. They always carried those little clear twelve- and twenty-sided dice around with them. And one day they didn't come to class. And never appeared again.

These were good, perhaps weak-willed people who were casualties of war. The war against obsession.

Yet I also saw people get obsessed in technical and scientific matters. These were called grad students. They would work on their thesis obsessively, disappearing into a dark cave until they came out with something hard and shiny like a diamond. Obsession, I observed, had its value.

Buried in Complexity

You can add more and more to a program over a period of many months. This is called add-on programming. And it can lead to another problem: complexity. A haphazard programmer can continue to kludge up a piece of code using branching and questionable data structures. This can lead to spaghetti code: twisty passages all alike.

The only solution to this problem is rethinking it: it must be rewritten. There is no other way if it is to be modified in the future. If you were adding more and more stuff to it, then this is a virtual certainty. At this point it is time to develop the right control structures and data structures to render the solution in the most effective and extensible way.

Immersive Programming

At some point you will need to debug what you have created and make it work. This requires total immersion. The better you have organized your code, the easier it will be to understand the processes it uses and thus to figure out which steps are correct and which are incorrect. This is the process of debugging.

It's like putting your head into the code and visiting codeland.

One thing is sure: you better have your head on straight when you debug a large project the first time. This will be when your organization and rethinking of control and data structures will pay off.

Sometimes when debugging a project it becomes clear that there is a logic flaw in the code. This can be a small one, like an off-by-one error, or some statements that are out of order.

Or it can be a very large problem indeed. One with huge ramifications for the code.

My advice is to fix it before going any further, no matter how sweeping the implied changes are.

To Sum It All Up

Once you have been through fifty or so large projects, you begin to see patterns much more clearly. Perhaps you can profit from some of the patterns I have found, and some of the cautionary tales.

All I know is that I mostly had to learn these things the hard way.

Sigh.