
Wednesday, June 4, 2014

Interesting Persons, Part 1

Back in the 1980s I was fascinated by sound synthesis and analysis. The best-known work I did was a little application called SoundCap (for Sound Capture) that was coupled with an analog-to-digital converter initially sold by Fractal Software, my partnership with Tom Hedges, and eventually sold by MacNifty. Fortunately for many of the early Macintosh developers, this box hooked up to the back of a Mac through the serial port. Several sound-producing apps were produced with it, including Airborne! by San Diego's Silicon Beach Software.

Stephen St. Croix was a friend of mine. He contacted me at Fractal Design in the 1990s and wowed me with a few of his wondrous stories. We spoke at length on several occasions about digital sound synthesis, one of my many hobbies. I was surprised to learn that he was one of the inventors, at Marshall Electronics, of the Time Modulator, the box that introduced digital delay line flanging to more than a few famous musicians.

The most interesting story he told me was about the job he did with Lay's. Yes, the people who make the potato chips. It seems that their spokesman, Jack Klugman (of Quincy fame), had lost his voice as a result of throat cancer. This was a real problem for them because his commercials for Lay's potato chips were pulling quite well. After all, he was a very recognizable and well-loved actor. His voice was distinctive. People listened to him.

Stephen informed me that they invented a new kind of voice synthesis device to recreate his voice. It used formant synthesis. Incredibly, they could exactly duplicate the distinctive gravelly sound of his voice in this manner! It seems that the very low-frequency warbling of his vocal cords, though inimitable by human voice impersonators, was entirely imitable by digital synthesis techniques.

At Marshall Electronics, they spent quite some time analyzing sound. They had room analyzers. And so they also had room simulators. But the least known cleverness involved voice analyzers. Imagine picking apart someone's voice, layer by layer. Figuring out the pitch-profiles and the syllabic inflections. Hand-tuning the cadence of the words. My mind was boggled constantly by Stephen's work.

I informed him of my work in music extraction. I had a special application called Do-Re-Mi that allowed you to whistle a tune that could be output over MIDI in key-duration format, complete with amplitude and pitch profiles suitable for modulating a pitch wheel and a volume pedal. It could tell you how many cents (hundredths of a semitone) sharp or flat you were when you whistled. I used a clever technique that involved a time-delta histogram for correlation, pitch-multiple disambiguation, Lagrange peak-finding, and other methods for isolating the pitch accurately. This work was all done in the 1980s, before Fractal Design, as part of Fractal Software's work.
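
The original 68000 code is long gone, so here is only a rough modern sketch of the pitch step (Python with NumPy, not the original algorithm): plain autocorrelation standing in for the time-delta histogram, a Lagrange-style parabolic fit to refine the peak, and the cents computation. The pitch-multiple disambiguation is omitted.

import numpy as np

def estimate_pitch(samples, rate, fmin=100.0, fmax=2000.0):
    # Autocorrelate, then search the lag range where a whistle could fall.
    samples = samples - samples.mean()
    corr = np.correlate(samples, samples, mode='full')[len(samples) - 1:]
    lo, hi = int(rate / fmax), int(rate / fmin)
    lag = lo + int(np.argmax(corr[lo:hi]))
    # Lagrange (parabolic) peak-finding: refine the lag to sub-sample accuracy.
    a, b, c = corr[lag - 1], corr[lag], corr[lag + 1]
    lag += 0.5 * (a - c) / (a - 2 * b + c)
    return rate / lag   # estimated pitch in Hz

def cents_off(freq, a4=440.0):
    # Cents (hundredths of a semitone) sharp (+) or flat (-) of the nearest semitone.
    semis = 12 * np.log2(freq / a4)
    return 100 * (semis - round(semis))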

Tom Hedges, of course, was the hardware designer of the first Macintosh sound sampling box and my contribution was the software, much of it written in Motorola 68000 assembler. Our work with sound continued when we did a bit of work with Bogas Productions, involving Ed Bogas, Ty Roberts, Neil Cormia and others. I met them through a mutual acquaintance, Steve Capps, who was working on the Finder in 1984.

I wrote a sequencing application in 1984 and Tom was fascinated by it. He modified it so it could sequence samples and then proceeded to digitize his piano, note for note. This was in a day when samplers existed, but were quite crude and expensive. He encoded Rhapsody in Blue (he was so proud of playing it) and also a perennial favorite, Wasted on the Way (a thickly vocal-harmonic piece from Crosby, Stills, and Nash). We were both musically literate, but in different ways. I was a composer who played piano and I was fully familiar with sheet music (actually, I had to teach Tom the rudiments of it before he could digitize the songs; it took him a week or so to get each one just right). Tom was a DJ with KZSU Stanford and an advanced audiophile. And he had a very wide understanding of music. His father played piano (which explained Tom's interest in Gershwin's Rhapsody in Blue).

So when I began speaking with Stephen St. Croix, I was very deep into audio analysis and synthesis. And the author of a very popular application for sound manipulation on the coolest new computer around, the Macintosh.

It wasn't a big surprise at all that we spent hours and hours talking about sound synthesis, analysis, music, and the recording business. Crazy times and a really good guy.

Seven Ways

There are seven ways that we best retain information, and five of these ways are tied to our natural innate skills as humans. These five are: typing, handwriting, speaking, seeing, and hearing. Two other ways help you complete the process of learning by semantic cross-tagging: mixing and anchoring.

Typing is a skill that we develop to codify something in symbolic notation: language. When we use keyboards for entry, we give the language center of our brain, the part concerned with coding and symbolization, a workout. But what are these codes and symbols? In language, we break our writing into chapters, chapters into paragraphs, paragraphs into sentences, sentences into words, and words into letters. These symbols, their organization, and their semantic meanings are inherent to symbolic processing. And, as humans, we definitely excel at this.

But there are more kinds of codes and symbols. When we use a musical instrument, we usually produce music in a coded symbolic representation: note for note. We break songs into sections, such as verses, refrains, and bridges. We break sections into chords. We layer melody on top of accompaniment, on top of bass. We accent with drums. We break melodies and chords into notes. We even break notes into tone, duration, and volume. Unlike text, music has quite a number of internal properties of continuity, like staccato and slurred notes. All of these are also kinds of language symbols that our brains use. Clearly we are using our brains' auditory centers when we make music.

I certainly didn't miss that writing is also a bit like music. When we write creatively, we use plots like we use the interrelationship of melodies and leitmotifs. We make characters and develop alternate realities. We use metaphor and hyperbole. We season our writing with alliteration and onomatopoeia. A theme can pervade a novel. The resolution of a character's arc can stir us like a brilliant cadence. But writing struggles in the last chapter to compete with the finality and intense closure of the coda of a great piece of music.

Handwriting is a perfect way to match our muscle memory to our brains. We coordinate our hands and eyes to denote what we hear or what we think. Taking notes can be a compelling way to retain your thoughts. When we combine it with symbolic representations, we can end up with text, mathematical equations, musical notation, or even scribbles, doodles, and drawings. Let's face it, we think a bit more when we are handwriting than when we type, because a different part of our brain is required to do it.

In some ways, handwriting is utilizing the visual center of our brain. Typing does this also because we use our eyes to verify the text we enter. It seems like it is the connections between brain centers that reinforce our understanding of knowledge and help us to retain and memorize.

Speaking is our natural form of expression. We use our voices conversationally and this method of communication is highly generative, using our cognitive powers to express a thought, a concept, to deliver commands, to convince or inform. We use our language processing centers in a different way, and this is evident in the way we often speak very differently than we write: less formally. When we are in front of a group, we speak from memory, following a train of thought. Actors and presenters learn to do this and shade their performances with attitude and gesticulation, making the art of speaking a multi-dimensional task.

When we sing, we are expressing much more than just notes and words. We are using emotion. We link our generative capabilities to our voice when we sing. When we learn to play piano and sing, we are using much more of our brain than we usually might employ.

Seeing is much more than just looking at a photograph or diagram. It's also seeing in the mind's eye. Some people are very visual and can instantly see a concept in their head before they can express it. They can see the directions on a map in their head when they drive. Our eyes are the key to visualizing, certainly. But even blind people can see concepts. We have spatial reasoning to thank for this. When you have a visual memory, you get to see an object when it is described.

There is more to visualizing than just what is real, though. We can thank our imaginations for this fact. We can imagine impossible figures, for instance, and this concisely illustrates that our imaginations can transcend the real.

Perhaps for many people the spark of an idea comes visually. Perhaps concepts are symbolic for others. Perhaps concepts are neither visual nor symbolic for some: just floating in consciousness waiting to be expressed in some way.

Hearing is a natural way to capture and acquire information. But few of us actually hear a sentence and turn it into text in our head. Maybe a few of us turn it into visual information. But most likely hearing is its own domain. Somehow what we hear simply gets directly converted to knowledge. Still, we often must write something down to retain it.

When I am composing or playing piano, I do not generally rely on my ear to remember the tune and the rhythm. Thankfully, I can record what I play. In other situations, I write down what I play (by hand), in common musical notation.

Even so, I can hear quite a bit of music in my head. It even seems like it is playing back. At 17 years old, I used to do this just before going to sleep, in that nebulous state in between waking and sleeping. I would consciously play a piece in my head. One that I was working on, or a familiar song. Or even a symphony. I guess I was practicing the ability to imagine polyphony. I was on the verge of being a composer at that age.

Mixing modes is the most powerful form of memorization. Sight-reading is a great way to commit a piece to memory. Playing a piece I'm composing, to cement its chord and melody structure, is a good way to hone it. Recording it and listening to it later lets me take a step back and form new ideas about where the piece is going.

Listening and taking notes is a good mixture of modes for memorization and retention. But if you really want to cement it into your memory, type up your notes later. Draw diagrams. Learning, though, is much more than memorization. True retention requires application of a concept.

Anchoring is an essential endgame for learning a subject properly. I have a friend who says "I don't want to hire the people who can memorize terms and subjects, I want to hire people that can do something with what they've learned". Memorizing words in a foreign language is useful, but using those same words in sentences is much more powerful because then the words will forever be connected to concepts and subjects in your mind.

Sunday, February 2, 2014

The New Brand

In my notes from 1997 and 1998 I found this graphic from the last days of Fractal Design, immediately after the merger with MetaTools and the start of the newly formed company, which was to be called MetaCreations. It shows my irreverent take on typography, with letters verging on an alien alphabet. Perhaps this was my thinking in those days: clearly influenced by Star Trek: The Next Generation design, and increasingly beginning to suspect that aliens were taking over my company.

The graphic was a last hurrah, buried in my logo-search stack. These were the papers that detailed the search for a new company name and logo, begun as a result of the merger. Those were turbulent days, full of interesting ideas that never made it. Here is another little sketch from that collection of doodles, drawn in those days when the meetings were long and the bickering was uncomfortable. I was already thinking about the metaphors for the idea processor.
Name search

First came the name search. The first edict, from John Wilczak (the MetaTools CEO and soon to be replaced) was that the name should contain "Meta". Once you put that flag in the ground, there are only so many names that can be chosen. We all bought into it.

John Derry and I thought up several meta-rooted names for the company. We centered around various concepts, like making: names like metaforge, metafactory, and metaforce. We also tried words around branding: names like metabrand, metaware, metafactor, and metacraft. Next we covered concept names like metapath, metaform, and metadesign. Of course, we also looked at location names like metaworld, metastage, metasphere, metawave, and metalevel. Combination names sometimes became useful, like metalith and metastar. We were going for simplicity and pith.

We had a hundred names to choose from, and three or four made the top of the list. But it turned out they were all taken by one company or another, and so proved unsuitable for our purposes.

In the end, the root word meta (meaning "on another level") was merged with "create" and we somehow found MetaCreations as our new name. We worked out the typestyle, using a PR branding firm called 30SIXTY, contracted by Sallie Olmstead. The result was a very good type treatment. One of their designs stuck, seen here. MetaCreations passed the trademark search and so we found ourselves in the position of needing a good catchphrase to go with it.

Catchphrase and Logo

At this point, we hired a new CEO and the branding began afresh. This cast us into disarray, with three separate groups pushing in different directions. Let me introduce you to the three groups:

One group was Gary Lauer's group. Gary was the new CEO, hired by the board and taking on the challenge of merging two cultures with a third culture of his own. The second group was Kai Krause's group. Kai was the design thinker from MetaTools and the creative face of the company. The third group was John Derry and myself, Mark Zimmer. But, frankly, I took the lead because I was the representative to the logo group. As you will see, the three groups couldn't agree less. And yet we eventually found a logo. Here I show a doodle from a page drawn during the endless logo meetings.

The catchphrase Gary preferred was staid and traditional: The Visual Computing Software Company. The logos from his group were not unlike the ones from Claris in style. The other two groups saw the logos as pedestrian and frankly uninteresting. Here you can see one of the color schemes of his final logo set. The earlier ones were considerably more amateurish. This one features an M-shape with a bit of a shine nestling into it. My comments on this particular logo are unprintable, sadly: I will leave them to your imagination. Kai felt pretty much the same about this logo.
The catchphrase Kai preferred was genuinely clever: where great ideas are born. Also, John Wilczak, before he left, preferred start the migration, though I'm still not sure where he was going with that one. The logos from Kai's group initially centered on an egg - with the idea of hatching a new idea. Other groups just kept thinking "Meta lays an egg" as the headline. After a brief trademark search, we discovered Software Ventures had an egg with a shadow as its logo, and that was the final crack.
In the sessions for my group, John and I tossed around the creativity concept endlessly. One catchphrase was bringing creativity to you. Another was changing the way people think. Our final try was sparking your creativity. Though ours were fascinating and very ambitious, I still think Kai's catchphrase was best. Our logo designs centered on a hand - for software that was human-centric. The hand was the artist's signature from the days of the cave-painters. Other groups just saw "stop" - a hand telling you not to enter. Here, we placed it inside an oval form to suggest an egg.
All three groups had a basic problem - the other two groups opposed their design. So Gary, thinking his group was more equal than the other two, decided to make a presentation of his logo, allowing us to choose only the color scheme. Ah, that was a rough meeting.

This required Kai and me to work together on a new logo. I dredged up an old design: the trefoil knot. I had made this design in 1983 when working as a consultant for Auto-Trol (I was building them a 3D system for computer-aided engineering). I had resurrected this design when working on Detailer, the 3D paint program, adding a mirrored surface to it. Here we see a small version of this knot, produced in Detailer using a brassy look. Kai's people used Bryce to create a much cleaner, smoother, nicely-tilted version of this knot, and added a slight soft shadow underneath it.

This shadow was eventually omitted and the catchphrase was changed once again.

This time I wasn't asked. As you can see, it became The Creative Web Company. The times were changing, and at this time, before the dot-com boom and collapse, everything had to be naively covered with web-web-web. The trefoil knot had a nice reflection and self-shadowing, though. Kai and I approved the form.

After Kai and I decided to redesign the logo, he had his people design some new forms. Many of them were based on threefold symmetry, which I also tend to prefer. One designer, Athena Kekenes, produced some iconic figures that still hold up today. The first figures were triangular-symmetry organic forms that had a very interesting, yet somehow alien, lilt to them. With tree-like branching properties and spherical ends, it looked a bit like some strange form of seaweed. You can see one of the designs here. We asked for some more ideas.

One was a cube that had a sphere subtracted from it. This, when viewed from the direction of a corner, had a six-fold symmetry that was quite pleasing.

Here you can see the cube, with the sphere subtracted. With Boolean operations in Bryce, this stuff just jumped right out of the imagination onto the page.

When you look at it, it's a bit busy. It has a shadow; the three visible sides of the cube have different shades. The sphere has gradations. The object even shadows itself!

I'm sure this is what Kai and Athena were thinking when they came up with the simpler version of the logo.


Here we can see the flower logo. It's very clean, simple, stylized, and suggestive of 3D.

Negative space is used in two ways: the sphere is negative space when subtracted from the cube, and the flower is the result of looking through the negative space of the 3D form and coloring in the holes.

In some way, though, we found these cube-based logos to be too derivative of the Silicon Graphics logo. I even found the SGI logo in my notes right next to this one.

Kai's group never really gave up on the egg, until the trefoil knot became our focus. By that time we were tired of the process of logo search.

Here is an egg-derived logo that used the shape several times in negative and positive space to form an op-art logo. This held up better because it could be reproduced in black-and-white, as any good logo should be.

But eventually we centered on the trefoil knot. Its iconic form was clear. Before any of Kai's people had a chance to perfect the trefoil with reflections and shadows, he had them do special black-and-white versions of it.

This version is exceptionally clever, using rotated versions of the knot silhouette in alternating colors, then subtracting out the middle.

Though we liked the line-art reproducibility of this form, we thought it looked a bit too much like the Woolmark logo.

In all, we spent too much time working on the logo. Gary, in his desperation, did an end-around and created his own logo and placed it on our products. This was done because, after all, we had to have something to put on shipping products. This was another logo based on the M (and, it seems, on Freddy Krueger).

As you can see, Gary even replaced the typeface we selected! His quest for a unified package design was next. This actually made us mad, because each product was a brand of its own and the entire concept of unified product packaging design seemed wrong.

What did we do to deserve this?

In all, I wasn't really satisfied with what came out of this process. Personally, I doubt Kai was either. Meta continued to create great products, nonetheless. Bryce, Painter, KPT, and Poser saw fantastic new versions. And Ray Dream Designer metamorphosed into Carrara, which was a very ambitious project and a great product in its own right.

And I just kept drawing new iconic logo designs. I knew that someday they might be useful to me. And someday the story would be told.

Thursday, December 12, 2013

The Unstoppable Now

The universe seems to be moving forwards, ever forwards, and there's nothing we can do about it. Or is there? Is the world too tangled to unravel?

Changing political landscapes

We all see the changes in the world. Climate change is the new catchphrase for global warming. Some areas of the world may never sort themselves out: the Koreas, the Middle East, Africa. Yet we can look to the past and see how a divided Germany re-unified, how South Africa eliminated the apartheid government and changed for the better (bless you Nelson Mandela, and may you rest in peace), how Europe has bonded with common currency and economic control.

Good and bad: will Europe solidify or become an economic roller coaster? Will Africa stabilize or continue on its path of tribal and religious genocide? Will Iran become a good neighbor, or will it simply arm itself with nuclear weapons and force a confrontation with Israel?

Despotic secular regimes have been overthrown in the Islamic world (Egypt, Tunisia, and Libya) and social media seems to have become a trigger for change, a tool for inciting revolution. Some regimes are experiencing slight Islamic shifts, like Turkey. But Egypt, having moved in that direction when the Muslim Brotherhood secured the presidency, is now moving away from it in yet another revolution.

The more things change, the more they stay the same.

The reason that social media became an enabler for the changes we are seeing is that people care. Crowdsourced opinion has an increasing effect on government. Imagine that! Democracy in action. Even in countries that have yet to see democracy.

Let's look at one of the biggest enablers for this: the iPhone.

The iPhone and its effect

Yes, this is one of the biggest vehicles for change because it raised the bar on handheld social media, on internet in your pocket, and on the spread of digital photography. The ability to make a difference was propagated with the iPhone and the devices that copied it. Did Steve Jobs know he was starting this kind of change? He knew it was transformative. And he built ecosystems like iTunes, the App store, and the iBookstore to make it all work. Without the App Store, we'd all still be in the dark ages of social media. The mobile revolution is here to stay.

Holding the first iPhone was like holding a bit of the future in your hands. It was that far ahead of the pack. Its amazing glass keyboard was met with skepticism from analysts at first, but the public was quick to decide it was just fine for them. A phone that was just a huge glass screen was more than an innovation. It was a revolution.

It's even remarkable that Steve Ballmer panned the first iPhone when it came out. By doing that, he drew even more attention to the gamble Apple was making, and in retrospect made himself look amazingly short-sighted. And look where it got him! Microsoft's lack of success in the mobile industry seems predictable, once you see this.

Each new iPhone iteration brings remarkable value. Better telephony (3G quickly became 4G and that quickly became LTE), better sensors (accelerometer, GPS, magnetometer, gyroscope, etc.), and better cameras, lenses, flashes, and BSI sensors. Bluetooth connectivity makes it work in our cars. Siri makes it work by voice command. Each new feature is so well-integrated that it just feels like it's been there all along. Now that I have used my iPhone 5S for a while, the fingerprint sensor feels like part of what an iPhone means.

This all-in-one device has led to unprecedented spread of pictures. It and its (ahem, copycat) devices supporting Google's Android and more recently Microsoft's Windows Phone 8 have enabled social media to become ever more present, and influential, in our world.

In 2012, a Nielsen report showed that social media growth is driven largely by mobile devices and the mobile apps made by the social media sites.

Hackers, security, whistleblowers

A battle is being fought in the field of security.

Private hackers have been stealing identities and doing so much more to gain attention, and we know why.

Then hackers began attacking companies and countries, plying their expertise for various causes. The Anonymous and LulzSec groups fought Sony over the restrictiveness of its gaming systems, fought the despotic regime in Iran, and fought banks they believed were evil.

Enter the criminal hacking consortia, which build programs like Zeus for constructing and tasking botnets using rootkit techniques, for perpetrating massive credit card fraud.

Then the nation-state hacking organizations began to do their worst, with targeted viruses like Flame, Stuxnet, and Duqu. Whole military organizations are built, like China's military Unit 61398, with the sole task of hacking foreign businesses and governments.

Is anybody safe?

It is very much a sign of the times that the latest iPhone 5S features Touch ID. You just need your fingerprint to unlock it. Biometrics like fingerprints and iris scans (something only you are) are becoming a good method for security engineering. There are so many public hacker attacks that individual security is quickly becoming a major problem.

New techniques for securing your data, like multi-factor authentication, are becoming increasingly both popular and necessary. Accessing your bank and making a money transfer? Enter the passcode for your account (something only you know), then they send your trusted phone (something only you have) a text message and you enter it into the box. The second factor makes it more secure because it is more certain to be you and not some interloper spoofing you.
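
Texted codes are one kind of second factor. Authenticator apps use a close cousin, the time-based one-time password (TOTP, RFC 6238), where the "something only you have" is a shared secret stored on your phone, combined with the current time. A minimal sketch (the secret below is just an example value):

import base64, hashlib, hmac, struct, time

def totp(secret_b32, digits=6, period=30):
    # HMAC the current 30-second time step with the shared secret (RFC 6238).
    key = base64.b32decode(secret_b32)
    counter = struct.pack(">Q", int(time.time()) // period)
    mac = hmac.new(key, counter, hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): pick 4 bytes, mask the sign bit, keep digits.
    offset = mac[-1] & 0x0F
    code = (struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF) % 10 ** digits
    return str(code).zfill(digits)

print(totp("JBSWY3DPEHPK3PXP"))   # example secret; an authenticator app would agree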

The landscape of security has been forever changed by the whistleblowers. Whole organizations were built to support them (WikiLeaks) and governments, banks, and corporations were targeted. The release of huge sets included confidential data from the US Military, from the Church of Scientology, from the Swiss Bank Julius Baer, from the Congressional Research Service, and from the NSA, via Edward Snowden.

It is notable that WikiLeaks hasn't released secret information from Russia or China. It is most likely that they would be collectively assassinated were that the case. Especially given such events as the death of Alexander Litvinenko.

The founder of WikiLeaks, Julian Assange, is currently a self-imposed captive in the Ecuadorean embassy in London. In an apparent coup, one of the WikiLeaks members, Daniel Domscheit-Berg, decided to leave WikiLeaks, and when he left, he destroyed documents containing America's no-fly list, the collected emails of the Bank of America, insider information from 20 right-wing organizations, and proof of torture in an undisclosed Latin American country (unlikely to be Ecuador, and much more likely to be one of its adversaries, such as Colombia). Domscheit-Berg apparently left to start up his own leaks site, but later decided to merely offer information on how to set one up.

The trend is that the general public (or at least a few highly-vocal people) increasingly expect all secrets to be revealed. And yet, I expect that they would highly value their own secrets. This is why there is such a trend towards protecting individual privacy.

The reality is that organizations like WikiLeaks are proud to reveal secrets from western democracies like America, but are reluctant to do so for America's adversaries like Russia. Since this creates an asymmetric advantage, these organizations can only be viewed as anti-American. Even if they aren't specifically anti-American, they inevitably have this effect.

So they are playing for the Russians whether they believe it or not.

Does the whistleblower movement have the inherent potential for disentangling the world political situation? Perhaps in the sense that knots can be cut, like the Gordian Knot. But disentangled? No.

The only way that the knots can be unraveled is if everybody begins to play nice. And I don't really see that happening.

Perhaps Raul Castro will embrace America as an ally now that we have shaken hands. Perhaps Iran will stop its relentless bunker-protected quest for uranium enrichment. Perhaps the Islamic militias in Africa will declare a policy of live-and-let-live with their Christian neighbors and stop the wholesale slaughter.

It's good to be idealistic. In idealism, when it is peace-oriented, we see a chance for change. In the social media revolution we see a chance for the moderate majority to be heard.

Only we can stop the unstoppable now.

Tuesday, October 22, 2013

Knots, Part 3

Knots are also intertwining, and sometimes present a bit of complexity when rendering them. Separate ends may be intertwined, as when we tie our shoes. But loops can also intertwine, and this creates a kind of impossible figure because they are most difficult to actually make. As with Borromean rings and the Valknut, we can also use twists and loops.

In the previous post on knots, I included what I considered to be the simplest intertwining of a loop containing a twist.

Here a gray four-leaf clover loop with twists at the corners intertwines with a brown loop with inside twists. This creates a form of duality because the brown loop is really a four-leaf clover loop turned inside-out. The over-under rule is used on each thread to produce a maximally tied figure. A bit of woodcut shading is also used.

Now I'd like to show a natural extension of the figure eight intertwined with a simple loop. I designed this form a few days ago, but it took me a few days to get to a proper rendering. I used the same techniques to produce this as I used in the examples from the previous post, except that I used a spatter airbrush on a gel layer to create the shading when one thread passes under another.

I used a simple airbrush on a screen layer to create the ribbon highlights. As always, I wish I had more time to illustrate!

But this figure shows how four loops can become intertwined in an interesting way by twisting each loop once.

Each knot I draw starts out as a thin black line on a page. I don't even worry about the crossings and their precedence. I just try to get the form right. The final result is very complex and simple at the same time.

Knots have their stylistic origins in antiquity. They were used for ornament and symbology by the Celts, the Vikings, and the ancient Chinese.

A purple loop with three twists intertwines with a blue circle in this knot.

The shines were created using a lighter, more saturated color and mixed into the gel layer using the Just Add Water brush in Painter. It's a bit like a Styptic pencil and was one of the first mixing brushes I created in Painter in 1991.

Enjoy!

Saturday, October 12, 2013

Knots, Part 2

In an earlier post, I talked about knots. And knots are entanglement, there is no doubt. They serve to bind, to secure, to tie down, to hang up, and even to keep our shoes on.

In this post I will talk about knots as a way to entangle two threads. I will continue to use the planar method of showing knots, combined with precedence at crossover points. An over-under rule is used to keep the knots maximally entangled.

In addition, I will show how to draw knots using my drawing style, which is a little bit scratchboard-watercolor, a little bit woodcut, and a lot retro. You can find more of my style (and lots more knots) at my Pinterest artwork board.

The over-under rule characterizes one of the best ways to organize the making of a knot. In its simplest form, you can see a less confusing, more iconic representation.

This knot is a clover interleaved with a ring. The ancient name for the clover symbol is the Saint John's Arms. The clover is used to symbolize places of interest on a map, the command key on Macs, and cultural heritage monuments in Nordic countries, Estonia, and a few other places. This symbol has been around for at least 1500 years.

The other day, while working on a complicated programming problem, I absent-mindedly drew such a clover and suddenly realized that I could pass a ring through its loops, hence this figure. When you draw the clover as a knot, it is also called the Bowen knot.

It seemed like the simplest thing at the time. Then I tried to draw it in its current form: not so easy! After a few hours (off and on) with Painter yesterday I finally had this figure smoothed out in nice outlines. Today I shaded and colored it. Sure, maybe the purple is a bit much, but I like the simple forms and the way they intertwine.

After making this figure originally, I went back to my programming. But there was a nagging question in the back of my head. What was the simplest intertwined figure that had a twist in it? I had to think simple, so I drew an infinity as a twisted bit of rope.

Then I wondered how a ring might enter the picture. I tried one way and then it hit me: use the over-under rule.

This is the figure I ended up with. Now that's much simpler than the first, and iconic in its own way, I think. It could be a logo in an even simpler form. O-infinity? Well, there's nothing like a logo created for no particular reason!

But how are such knots created, really? Is there an easy way?

Start with a line drawing showing the paths of the two threads. This is how I started. I put them at an angle because I drew the oval first. This was a natural angle for me to draw it right-handed.

Then I turned the page and drew the infinity so that the oval passed through each of the figure-eight's loops.

It wasn't exactly symmetric. Though I do like symmetry, I like even more to make my drawings a bit imperfect to show that they are hand-drawn. If I were designing for a logo, though, I'm not sure I'd make the same choice.

Next I drew the figure again, but with an indication (by breaking the lines so they don't quite cross over each other) of which thread is on top and which crosses under.

Here is my first attempt.

But there is a basic flaw: if I were to grab the oval and pull it, it would easily come loose from the figure-eight! Needless to say, this wasn't the knot I was looking for, so I redrew it using the tried-and-true over-under rule, which states this: as you pass along a thread, it must pass first over and then under the other threads, alternating in succession.
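
The rule is easy to check mechanically, too. Walk along one thread, record its crossings in order, and verify that they alternate, including the wrap-around for a closed loop. A tiny sketch in Python:

def obeys_over_under(crossings):
    # crossings: a thread's crossings in order along a closed loop, e.g. ['over', 'under', ...]
    return all(a != b for a, b in zip(crossings, crossings[1:] + crossings[:1]))

print(obeys_over_under(['over', 'under', 'over', 'under']))   # True: properly entangled
print(obeys_over_under(['over', 'over', 'under', 'under']))   # False: it would pull loose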

Here is the result of redrawing it. As you can see, it has a much nicer integrity. It seems to be entangled properly.

So now I have a basic plan for the entanglement of the knot. Now I must plan to draw the knot using outlines for each thread. This means that each thread must really be two lines that are parallel to each other. I call this the schematic version.

I use the original line drawing as a guide and draw two lines parallel to the original line, one line on each side. Originally I worked in black ultra-fine Sharpie on thick 32# copy paper.

The wide-line drawing, as you can see, is getting a bit complicated. But fortunately I have a legend for which lines to draw in and which lines to erase: the second hidden-line diagram above.

I use this as a template so I can redraw the image, using only the new wide lines. With this I can create a hidden-line version of the wider knot. It is easy to accomplish this by placing the blank sheet over the original and using it as tracing paper.

Of course when I do this, I avoid drawing the centerline. This keeps the drawing simple. In this way, you can see that the centerline was a for-reference-only diagram for what follows.

Here is the wide hidden-line version. This one is much clearer and certainly much closer to what I was trying to create.

But it is a bit flat, like a road. And the crossings are really dimensionless.

I brought this into Painter and smoothed out the lines, making them a bit more consistent. Then I worked a bit of magic by using my woodcut style.

How do I do that?

I'm glad you asked! At each crossover, I draw three or four lines on the "under" sides of the crossover. Then I draw wedges of black that come very close to the "over" lines. Finally I use a small white brush to sculpt the points of the wedges, making them very pointy.

This simulates what could be created using a V-shaped gouge with linoleum or wood.

Well, this process takes a bit of time. If you count, you can see I had to create about 40 wedges, sculpting each of them into a perfect line or curve. But I am patient.

Sometimes I widened the "under" lines to meet the outermost wedges. This makes a more natural-looking woodcut.

Finally, in Painter I use a gel layer and fill in color on top, filling in each area of the thread using a slightly different color.

This gives me the final result, a unified entanglement of two interesting threads! This result is quite similar to the scratchboard-watercolor look that I like. I used the same technique exactly to create the knot at the top of this post. In past posts, I have used this technique to create many illustrations, of course. I like this look because it's easy to print and it is good for creating logos.

For instance, if I take the plain wide line version and blacken the white background, I get a version that can be manipulated into a logo form. After that, I invert the colors of the image and that gives me a clean black logo on white. Then I use a layer in Screen mode to colorize the black segments of the threads.
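
I did those steps in Painter, but they map onto any imaging tool. Here is a rough equivalent sketch with the Python imaging library, PIL (the filenames are placeholders): invert the blackened version to get black strokes on white, then screen a flat tint over it so only the black strokes pick up the color.

from PIL import Image, ImageChops, ImageOps

art = Image.open("knot_black_background.png").convert("RGB")   # wide lines on a blackened background
logo = ImageOps.invert(art)                        # now a clean black logo on white
tint = Image.new("RGB", logo.size, (96, 0, 160))   # a purple, for example
# Screen mode lightens: white stays white, the black strokes take on the tint.
colored = ImageChops.screen(logo, tint)
colored.save("knot_logo.png")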

Here is a logo version of the knot, expressed in colorful tones. But this won't do for O-infinity at all! It might easily be an O in purple and the figure-eight in navy blue. On black.

But that's not my idea of a good company name, so I will leave it like this!

There are plenty of styles for redrawing this knot that make interesting illustrations.

This one is not a knot, really. But it is an interesting redrawing of the figure.

This is called an inline treatment.

Remember the Neuland Inline font that was used for the movie Jurassic Park?

This figure can be used as the start of about 100 different illustrations, depending upon which crossings you want to black in or erase.

I tried several before I realized that it wasn't the direction I wanted to go with the logo.

Trial-and-error is often the way with creativity!

I have other knots I'd like to draw, but they certainly do take time! It's good to be drawing again.

Sunday, October 6, 2013

Bigger Pixels

What is better? Bigger pixels or more megapixels? In this blog post, I will explain all. The answer may not be what you think it is!

Image sensors

Digital cameras use image sensors, which are rectangular grids of photosites mounted on a chip. Most image sensors today in smartphones and digital cameras (intended for consumers) employ a CMOS image sensor, where each photosite is a photodiode.

Now, images on computers are made up of pixels, and fortunately so are sensors. But in the real world, images are actually made up of photons. This means that, like the rods and cones in our eyes, photodiodes must respond to stimulation by photons. In general, the photodiodes collect photons much the way our rods and cones do, integrating the photons into some kind of electrochemical signal that our vision can interpret.

A photon is the smallest indivisible unit of light. So, if there are no photons, there is no light. But it's important to remember that not all photons are visible. Our eyes (and most consumer cameras) respond only to the visible spectrum of light, roughly between wavelengths 400 nanometers and 700 nanometers. This means that any photon that we can see will have a wavelength in this range.

Color

The light that we can see has color to it. This is because each individual photon has its own energy that places it somewhere on the electromagnetic spectrum. But what is color, really? Perceived color gives us a serviceable approximation to the spectrum of the actual light.

Objects can be colored, and lights can be colored. But, to determine the color of an object, we must use a complicated equation that involves the spectrum of the light from the light source and the absorption and reflectance spectra of the object itself. This is because light can bounce off, be scattered by, or transmit directly through any object or medium.

But it is cumbersome to store light as an entire spectrum. And, since a spectrum is actually continuous, we must sample it. And this is what causes the approximation. Sampling is a process by which information is lost, of course, by quantization. To avoid this loss, we convolve the light spectrum with color component spectra to create the serviceable, reliable color components of red, green, and blue. The so-called RGB color representation is trying to approximate how we sense color with the rods and cones in our eyes.
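
As a toy version of that reduction, take a spectrum sampled every 5 nanometers and three made-up Gaussian sensitivity curves standing in for the real, tabulated red, green, and blue response functions; weighting the spectrum by each curve and integrating collapses it to three numbers:

import numpy as np

wavelengths = np.arange(400, 701, 5, dtype=float)   # visible range, in nanometers

def gaussian(mu, sigma):
    return np.exp(-0.5 * ((wavelengths - mu) / sigma) ** 2)

# Crude stand-ins for the actual response curves (illustration only).
sens = {'r': gaussian(600, 40), 'g': gaussian(550, 40), 'b': gaussian(450, 30)}

def spectrum_to_rgb(spectrum):
    # Weight the spectrum by each sensitivity curve and integrate.
    return {c: float(np.trapz(spectrum * s, wavelengths)) for c, s in sens.items()}

print(spectrum_to_rgb(np.ones_like(wavelengths)))   # a flat, white-ish spectrum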

So think of color as something three-dimensional. But instead of X, Y, and Z, we can use R, G, and B.

Gathering color images

The photons from an image are all mixed up. Each photodiode really just collects photons, so how do we sort out the red photons from the green photons from the blue photons? Enter the color filter array.

Let's see how this works.

Each photosite is really a stack of items. On the very top is the microlens.

The microlenses are a layer of entirely transparent material that is structured into an array of rounded shapes. Bear in mind that the dot pitch is typically measured in microns, so this means that the rounding of the lens is approximate. Also bear in mind that there are millions of them.

You can think of each microlens as rounded on the top and flat on the bottom. As light comes into the microlens, its rounded shape bends the light inwards.

The microlens, as mentioned, is transparent to all wavelengths of visible light. This means that an infrared- and ultraviolet-rejecting filter might be required to get true color; the colors will become contaminated otherwise. It is also possible, with larger pixels, that an anti-aliasing filter, usually consisting of two extremely thin layers of lithium niobate, is sandwiched above the microlens array.

Immediately below the microlens array is the color filter array (or CFA). The CFA usually consists of a pattern of red, green, and blue filters. Here we show a red filter sandwiched below.

The CFA is usually structured into a Bayer pattern, named after Bryce E. Bayer, the Kodak engineer who thought it up. In this pattern, there are two green pixels, one red pixel, and one blue pixel in each 2 x 2 cell.

A microlens' job is to focus the light at the photosite into a more concentrated region. This allows the photodiode to be smaller than the dot pitch, making it possible for smaller fill factors to work. But a new technology, called Back-Side Illumination (BSI), makes it possible to put the photodiode as the next thing in the photosite stack. This means that the fill factors can be quite a bit larger for the photosites in a BSI sensor than for a Front-Side Illumination (FSI) sensor.

The real issue is that not all light comes straight into the photosite. This means that some photons are lost. So a larger fill factor is quite desirable in collecting more light and thus producing a higher signal-to-noise ratio (SNR). Higher SNR means less noise in low-light images. Yep. Bigger pixels mean less noise in low-light situations.

Now, the whole idea of a color filter array consists of a trade-off of color accuracy for detail. So it's possible that this method will disappear sometime in the (far) future. But for now, these patterns look like the one you see here for the most part, and this is the Bayer CFA pattern, sometimes known as an RGGB pattern. Half the pixels are green, the primary that the eye is most sensitive to. The other half are red and blue. This means that there is twice the green detail (per area) as there is for red or blue detail by themselves. This actually mirrors the density of rods vs. cones in the human eye. But in the human eye, the neurons are arranged in a random speckle pattern. By combining the pixels, it is possible to reconstruct full detail, using a complicated process called demosaicing. Color accuracy is, however, limited by the lower count of red and blue pixels and so interesting heuristics must be used to produce higher-accuracy color edges.
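
Real demosaicing uses those edge-aware heuristics, but the naive bilinear version shows the basic reconstruction: keep each channel's measured samples and interpolate the missing ones from their neighbors. A sketch, assuming an RGGB mosaic with red at the top-left:

import numpy as np
from scipy.signal import convolve2d

def demosaic_bilinear(raw):
    # raw: 2D Bayer mosaic, RGGB pattern with red at (0, 0).
    h, w = raw.shape
    masks = np.zeros((h, w, 3), dtype=bool)
    masks[0::2, 0::2, 0] = True      # red photosites
    masks[0::2, 1::2, 1] = True      # green photosites on even rows
    masks[1::2, 0::2, 1] = True      # green photosites on odd rows
    masks[1::2, 1::2, 2] = True      # blue photosites
    kernel = np.array([[0.25, 0.5, 0.25],
                       [0.50, 1.0, 0.50],
                       [0.25, 0.5, 0.25]])
    out = np.empty((h, w, 3))
    for c in range(3):
        known = np.where(masks[..., c], raw, 0.0)
        num = convolve2d(known, kernel, mode='same')
        den = convolve2d(masks[..., c].astype(float), kernel, mode='same')
        # Keep the measured samples; fill in the rest from neighboring samples.
        out[..., c] = np.where(masks[..., c], raw, num / np.maximum(den, 1e-9))
    return out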

How much light?

It's not something you think about every day, but the aperture controls the amount of light let into the camera. The smaller the aperture, the less light the sensor receives. Apertures are measured in f-stops. The lower the f-stop, the larger the aperture. The area of the aperture, and thus the amount of light it lets in, is proportional to the reciprocal of the f-stop squared. For example, after some calculations, we can see that an f/2.2 aperture lets in 19% more light than an f/2.4 aperture.
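
That 19% figure is just the ratio of the squared f-stops:

ratio = (2.4 / 2.2) ** 2   # aperture area scales as 1 / f_stop^2
print(f"{(ratio - 1) * 100:.0f}% more light at f/2.2 than at f/2.4")   # -> 19%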

Images can be noisy. This is generally because there are not enough photons to produce a clear, continuous-tone image, and even more because the arrival time of the photons is random. So, the general rule is this: the more light, the less noise. We can control the amount of light directly by increasing the exposure time. And increasing the exposure time directly lets more photons into the photosites, which dutifully collect them until told not to do so. The randomness of the arrival time is less a factor when the exposure time increases.

Once we have gathered the photons, we can control how bright the image is by increasing the ISO. Now, ISO is just another word for gain: a volume knob for the light signal. We crank up the gain when our subject is dark and the exposure is short. This restores the image to a nominal apparent amount of brightness. But this happens at the expense of greater noise because we are also amplifying the noise with the signal.

We can approximate these adjustments by using the sunny 16 rule: on a sunny day, at f/16, with ISO 100, we use about 1/120 of a second exposure to get a correct image exposure.

The light product is this:

(exposure time * ISO) / (f-stop^2)

This means nominal exposure can be found for a given ISO and f/number by measuring light and dividing out the result to compute exposure time.
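
A small sketch of that bookkeeping, using the sunny 16 numbers above as the reference light product:

SUNNY_16 = (1.0 / 120.0) * 100.0 / 16 ** 2   # light product at 1/120 s, ISO 100, f/16

def exposure_time(f_stop, iso, light_product=SUNNY_16):
    # Solve (t * ISO) / f_stop^2 = light_product for t.
    return light_product * f_stop ** 2 / iso

print(1.0 / exposure_time(2.2, 100))   # same sunny scene at f/2.2: about 1/6300 s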

If you have the exposure time as a fixed quantity and you are shooting in low light, then the ISO gets increased to keep the image from being underexposed. This is why low-light images have increased noise.

Sensor sensitivity

The pixel size actually does have some effect on the sensitivity of a single photosite in the image sensor. But really it's more complicated than that.

Most sensors list their pixel sizes by the dot pitch of the sensor. Usually the dot pitch is measured in microns (a micron is a millionth of a meter). When someone says their sensor has a bigger pixel, they are referring to the dot pitch. But there are more factors affecting the photosite sensitivity.

The fill factor is an important thing to mention, because it has a complex effect on the sensitivity. The fill factor is the fraction of the array unit within the image sensor that is devoted to the surface of the photodiode. This can easily be only 50%.

The quantum efficiency is the percentage of photons arriving at the photosite that are actually captured. A higher quantum efficiency results in more photons captured and a more sensitive sensor.

The light-effectiveness of a pixel can be computed like this:

DotPitch^2 * FillFactor * QuantumEfficiency

Here the dot pitch squared represents the area of the array unit within the image sensor. Multiply this by the fill factor and you get the actual area of the photodiode. Multiply that by the quantum efficiency and you get a feeling for the effectiveness of the photosite, in other words, how sensitive the photosite is to light.
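
Plugging in some illustrative numbers (the fill factors here are assumptions, not measured values) shows why a BSI photosite of a similar dot pitch can gather roughly twice the light of an FSI one:

def effectiveness(dot_pitch_um, fill_factor, quantum_efficiency):
    # Photodiode area times the fraction of arriving photons actually captured.
    return dot_pitch_um ** 2 * fill_factor * quantum_efficiency

fsi = effectiveness(1.4, 0.5, 0.5)   # front-side illumination, assumed 50% fill factor
bsi = effectiveness(1.5, 0.9, 0.5)   # back-side illumination, assumed 90% fill factor
print(bsi / fsi)                     # about 2.1x the light per photosite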

Megapixel mania

For years it seemed like the megapixel count was the holy grail of digital cameras. After all, the more megapixels the more detail in an image, right? Well, to a point. Eventually, the amount of noise begins to dominate the resolution. And a little thing called the Airy disc.

But working against the megapixel mania effect is the tiny sensor effect. Smartphones are getting thinner and thinner. This means that there is only so much room for a sensor, depth-wise, owing to the fact that light must be focused onto the plane of the sensor. This affects the size of the sensor package.

The granddaddy of megapixels in a smartphone is the Nokia Lumia 1020, which has a 41MP sensor with a dot pitch of 1.4 microns. This increased sensor size means the phone has to be 10.4mm thick, compared to the iPhone 5S, which is 7.6mm thick. The extra glass in the Zeiss lens means it weighs in at 158g, compared to the iPhone 5S, which is but 115g. The iPhone 5S features an 8MP BSI sensor, with a dot pitch of 1.5 microns.

While 41MP is clearly overkill, they do have the ability to combine pixels, using a process called binning, which means their pictures can have lower noise still. The iPhone 5S gets lower noise by using a larger fill factor, afforded by its BSI sensor.

But it isn't really possible to make the Lumia 1020 thinner because of the optical requirements of focusing on the huge 1/1.2" sensor. Unfortunately, thinner, lighter smartphones are definitely the trend.

But, you might ask, can't we make the pixels smaller still and increase the megapixel count that way?

There is a limit, where the pixel size becomes effectively shorter than the wavelength of light. This is called the sub-diffraction limit. In this regime, the wave characteristics of light begin to dominate and we must use wave guides to improve the light collection. The Airy disc creates this resolution limit. This is the diffraction pattern from a perfectly focused, infinitely small spot. This (circularly symmetric) pattern defines the maximum amount of detail you can get in an image from a perfect lens using a circular aperture. The lens being used in any given (imperfect) system will have a larger Airy disc.

The size of the Airy disc defines how many more pixels we can have with a specific size sensor, and guess what? It's not many more than the iPhone has. So the Lumia gets more pixels by growing the sensor size. And this grows the lens system requirements, increasing the weight.
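
To put numbers on that: the classic Airy radius is 1.22 times the wavelength times the f-number. With green light and an f/2.2 lens (assumed here as typical smartphone values), the radius lands right around the iPhone's 1.5 micron dot pitch:

wavelength_um = 0.55   # green light, in microns
f_number = 2.2         # assumed smartphone aperture
airy_radius_um = 1.22 * wavelength_um * f_number
print(airy_radius_um)  # ~1.48 microns: about one dot pitch, so little headroom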

It's also notable that, because of the Airy disc, decreasing the size of the pixel may not increase the resolution of the resultant image. So you have to make the sensor physically larger. And this means: more pixels eventually must also mean bigger pixels and much larger cameras. Below a 0.7 micron dot pitch, the wavelength of red light, this is certainly true.

The human eye

Now, let's talk about the actual resolution of the human eye, computed by Clarkvision to be about 576 megapixels.

That seems like too large a number; actually, it seems ridiculously high. Well, there are about 100 million rods and only about 6-7 million cones. The rods work best in our night vision because they are so incredibly low-light adaptive. The cones are tightly packed in the foveal region, and really only work in lighted scenes. This is the area we see the most detail with. There are three kinds of cones, and there are more red-sensitive cones than any other kind. Cones are usually called L (for long wavelengths), M (for medium wavelengths), and S (for short wavelengths). These correspond to red, green, and blue. The color sensitivity is at a maximum between 534 and 564 nanometers (the region between the peak sensitivities of the M and L cones), which corresponds to the colors between lime green and reddish orange. This is why we are so sensitive to faces: the face colors are all there.

I'm going to do some new calculations to determine how many pixels the human eye actually does see at once. I am defining pixels to be rods and cones, the photosites of the human eye. The parafoveal region is the part of the retina that gives you the most accurate and sharp detail, covering about 10 degrees of diameter in your field of view. At the fovea, the place with the highest concentration, there are 180,000 rods and cones per square millimeter. This drops to about 140,000 rods and cones per square millimeter at the edge of the parafoveal region.

One degree in our vision maps to about 288 microns on the retina. This means that 10 degrees maps to about 2.88 mm on the retina. It's a circular field, so this amounts to 6.51 square millimeters. At maximum concentration with one sensor per pixel, this would amount to 1.17 megapixels. The 10 degrees makes up about 0.1 steradians of solid angle. The human field of vision is about 40 times that at 4 steradians. So this amounts to 46.9 megapixels. But remember that the concentration of rods and cones falls off at a steep rate with the distance from the fovea. So there are at most 20 megapixels captured by the eye in any one glance.

It is true that the eye "paints" the scene as it moves, retaining the information for a larger field of view as the parafoveal region sweeps over the scene being observed. It is also true that the human visual system has sophisticated pattern matching and completion algorithms wired in. This probably increases the perceived resolution, but not by more than a factor of two by area.

So it seems unlikely that the human eye's resolution can exceed 40 megapixels. But of course we have two eyes and there is a significant overlap between them. Perhaps we can increase the estimate by 20 percent, to 48 megapixels.

If you imagine a retina display filling the whole field of view and extrapolate its pixel density, this is pretty close to what we would get.

So this means that a camera that captures the entire field of view that a human eye can see (some 120 degrees horizontally and 100 degrees vertically, in a sort of oval shape) could have 48 megapixels and you could look anywhere on the image and be fooled. If the camera were square, it would probably have to be about 61 megapixels to hold a 48 megapixel oval inside. So that's my estimate of the resolution required to fool the human visual system into thinking it's looking at reality.
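
Here is the whole chain of estimates reproduced as arithmetic; every constant is one of the assumptions stated above, not a standard figure:

import math

microns_per_degree = 288.0
d_mm = 10 * microns_per_degree / 1000.0    # 10-degree field -> 2.88 mm on the retina
area_mm2 = math.pi * (d_mm / 2) ** 2       # ~6.51 square millimeters
parafoveal_mp = area_mm2 * 180_000 / 1e6   # ~1.17 MP at peak sensor density
field_mp = parafoveal_mp * (4.0 / 0.1)     # scale a ~0.1 sr patch to a ~4 sr field: ~46.9 MP
glance_mp = 20.0                           # density falloff: at most ~20 MP in one glance
painted_mp = glance_mp * 2                 # "painting" and pattern completion: at most 2x
both_eyes_mp = painted_mp * 1.2            # ~20% more for binocular overlap: 48 MP
square_mp = both_eyes_mp / (math.pi / 4)   # square sensor holding a 48 MP oval: ~61 MP
print(round(field_mp, 1), round(square_mp, 1))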

Whew!

That's a lot of details about the human eye and sensors! Let's sum it all up. To make a valid image with human-eye resolution, due to Airy disc size and lens capabilities, would take a camera and lens system about the size and depth of the human eye itself! Perhaps by making sensors smaller and improving optics to be flexible like the human eye, we can make it twice as good and half the size.

But we won't be able to put that into a smartphone, I'm pretty sure. Still, improvements in lens quality, BSI sensors, wave guide technology, noise reduction, and signal processing continue to push our smartphones to ever-increasing resolution and clarity in low-light situations. Probably we will have to have cameras with monochromatic (rod-like) sensors to be able to compete with the human eye in low-light scenes. The human retinal system we have right now is so low-light adaptable!

Apple and others have shown that cameras can be smaller and smaller, such as the excellent camera in the iPhone 5S, which has great low-light capabilities and a two-color flash for better chromatic adaptation. Nokia has shown that a high-resolution sensor can be placed in a bigger, thicker, heavier phone that has the flexibility for binning and the better optics that push smartphone cameras ever closer to human-eye capabilities.

Human eyes are hard to fool, though, because they are connected to pattern-matching systems inside our visual system. Look for image interpretation and clarification algorithms to make the next great leap in quality, just as they do in the human visual system.

So is it bigger pixels or simply more of them? No, the answer is better pixels.