Friday, January 6, 2012

Image Sensors

I used to carry a heavy camera in a Tamrac bag (with a Swiss Army knife in my pocket) wherever I went. Why is it now an iPhone 4S? It's all about priorities, and image sensors.
The tech Swiss Army knife.
Image taken in natural light with an iPhone 4S.
The iPhone 4S is my current touchstone, leash, and vital connector with the world. The giant Tamrac bag with the Canon 1Ds Mark II has been replaced by a messenger bag (or man-purse, as some people call it - ahem) with an iPad 2 as its main treasure. Now the iPhone is in the pocket and the Swiss Army knife has moved to the bag. With a bottle of Litozin. And the priorities: the iPhone 4S is now a camera, phone, instant messenger, browser, map, compass, email reader, dictation device, and personal assistant all rolled into one. It has become much more useful to me than a Swiss Army knife, largely because of advances in miniaturization, the capabilities of MEMS, and the huge progress in image sensors.

We are beginning to replace many of our cameras with a camera phone. These flickr camera usage stats certainly point to that. Let's talk now about the single most important aspect of the camera phone that has altered the status quo: the CMOS image sensor.

So, what's in an image sensor? Colors! Right? Well, that is only partially right, since the main ingredient is silicon. But, how can something so small record colors so well, and in such good focus?

It's All in the Pixels

A CMOS image sensor in a smartphone consists of a very small piece of a silicon wafer that has been fabricated in a special design that senses photons, converts them to electrons, transports them to analog-to-digital converters (ADCs), converts them to digital information, and transmits them as an array of data.

To the right we see a hand-drawn cutaway of a small portion of an image sensor built with today's techniques. At the very top, a set of transparent microlenses is the sensor's first point of contact with the photons, focusing them onto a smaller area. This is necessary because the photodetectors (photodiodes) don't fill the complete area of each pixel. The fraction of the pixel area that they do fill is called the fill factor. Between the microlenses and the photodiodes lies the color filter array (CFA). This rectangular array contains a separate filter for each photodetector, passing only the part of the spectrum that matches the color assigned to that photosite. This behaves similarly to the cones in our own eyes, and is the reason we see trichromatic color. Each photosite takes either a red, a green, or a blue measurement.

The Swedish Natural Color System
Our vision system also works in red, green, and blue. Well, not exactly. Close to the fovea, the small central patch of the retina where our vision is sharpest, we have a predominance of red and green cones. Blue cones are few compared with red and green ones (perhaps only 2%) and are located farther from the fovea. Oddly, the red cones outnumber the green cones by about 2 to 1. But we have far more rods than cones. The rods respond to luminance, the panchromatic (broad-spectrum) brightness of the light. And because green contributes roughly 60% of what we perceive as luminance, green is the most important color to our sight.
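To make that concrete, here is a tiny Python sketch using the Rec.601 luma weights (one common standard; other standards weight the channels slightly differently): green alone accounts for almost 60% of the computed luminance of white light.

```python
# Rough illustration of green's dominant contribution to perceived brightness,
# using the Rec.601 luma weights (one common convention among several).
REC601_WEIGHTS = {"red": 0.299, "green": 0.587, "blue": 0.114}

def luma(r, g, b):
    """Approximate perceived luminance of an RGB triple in [0, 1]."""
    w = REC601_WEIGHTS
    return w["red"] * r + w["green"] * g + w["blue"] * b

# For pure white light, green alone carries almost 60% of the luminance.
print(luma(0, 1, 0) / luma(1, 1, 1))   # -> 0.587
```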

Our nervous systems process this raw information into something closer to the Swedish Natural Color System, which organizes red and green, and yellow and blue (really indigo) as opponent colors (diagram shown above). This helps to explain why different kinds of colorblindness occur. And, by the way, it's a really nice way to organize color. And it's Swedish. Can't overemphasize that.

In a color filter array, the photosites are arranged in a rectangular lattice of pixels in what is commonly called a Bayer pattern. This was named after Eastman Kodak scientist Bryce E. Bayer, who first thought it up.

The CFA Bayer pattern has 25% red pixels, 50% green pixels, and 25% blue pixels. Being able to separate them into color components makes it easier to construct an image sensor. But this also means that many of the photons are filtered out and thus the sensor is less efficient than it would be if it were only sensing panchromatic light.
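Here is a minimal Python sketch (assuming the common RGGB arrangement) that simulates what a Bayer CFA does: each pixel keeps only one of the three color samples, and the red, green, and blue pixels come out to 25%, 50%, and 25% of the array.

```python
import numpy as np

def bayer_mosaic(rgb):
    """Simulate an RGGB Bayer CFA: keep only one color sample per pixel.

    rgb: float array of shape (H, W, 3); H and W are assumed even.
    Returns a single-channel 'raw' image plus the per-channel masks.
    """
    h, w, _ = rgb.shape
    r_mask = np.zeros((h, w), bool); r_mask[0::2, 0::2] = True   # 25% of pixels
    b_mask = np.zeros((h, w), bool); b_mask[1::2, 1::2] = True   # 25% of pixels
    g_mask = ~(r_mask | b_mask)                                  # remaining 50%

    raw = np.where(r_mask, rgb[..., 0],
          np.where(g_mask, rgb[..., 1], rgb[..., 2]))
    return raw, (r_mask, g_mask, b_mask)

rgb = np.random.rand(4, 6, 3)
raw, (r_mask, g_mask, b_mask) = bayer_mosaic(rgb)
print(r_mask.mean(), g_mask.mean(), b_mask.mean())   # 0.25 0.5 0.25
```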

Foveon sensors, first invented by Carver Mead and friends of Caltech, actually use all of the photons coming into a photosite, using partially transparent layers to stop each light component in a separate layer. Clever idea!

But, as it happens, almost all smartphones use Bayer-pattern CFAs for their image sensor packages. The color pattern is converted to a full RGB color at each pixel through a complex and mysterious process called demosaicing. More on that in a later post, if I'm allowed.
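Just to give a flavor of the simplest possible approach, here is a naive bilinear demosaic in Python: each missing color sample is filled in by averaging its nearest neighbors of that color. Real smartphone image pipelines use far more sophisticated (and often proprietary) algorithms; this is only a sketch, building on the masks from the mosaic example above.

```python
import numpy as np
from scipy.ndimage import convolve

def bilinear_demosaic(raw, r_mask, g_mask, b_mask):
    """Naive bilinear demosaic: fill each color plane by averaging the
    nearest samples of that color. Real ISPs do something much smarter."""
    # Kernels that interpolate missing samples from their neighbors.
    k_g  = np.array([[0, 1, 0],
                     [1, 4, 1],
                     [0, 1, 0]], float) / 4.0
    k_rb = np.array([[1, 2, 1],
                     [2, 4, 2],
                     [1, 2, 1]], float) / 4.0

    r = convolve(raw * r_mask, k_rb, mode="mirror")
    g = convolve(raw * g_mask, k_g,  mode="mirror")
    b = convolve(raw * b_mask, k_rb, mode="mirror")
    return np.dstack([r, g, b])

# Using the raw image and masks from the mosaic sketch above:
# rgb_estimate = bilinear_demosaic(raw, r_mask, g_mask, b_mask)
```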

Source: Eric Fossum, Dartmouth.
Once the photons are focused and filtered, they hit the photodetector. To right, you see the focus process. The photodetector converts the photons to electrons, and stores them in a charge well. A switch at the photosite controls the transfer of charge, which drains the charge well. An amplifier at the photosite converts the electrons to voltages after the charge transfer. Then, the voltages propagate down the wire as signals and are converted to digital values by column ADCs. To cut down on cost, the ADCs are time-multiplexed usually. At the end of the process, each pixel gets a 10-bit value (ranging from 0 for black to 1023 for a full charge well). In DSLR sensors, with much bigger photosites, this can be a 12- or 14-bit number.

One thing to remember is this: the bigger the photosite, the shorter the exposure can be for a decent, low-noise signal, because a larger photosite collects more photons per unit time. The size of a photosite is measured by the distance between adjacent photosites on the sensor, called the pixel pitch. Anybody interested in how much pixel pitch matters to image quality can check out Roger N. Clark's page on sensor size.
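To see why photosite size matters so much, consider a shot-noise-only model, where the signal-to-noise ratio is the square root of the number of photons collected and the photon count scales with photosite area. The photon flux below is a made-up number; only the ratio between the two pixel sizes matters.

```python
import math

def shot_noise_snr(pixel_pitch_um, photons_per_um2):
    """Shot-noise-limited SNR for one pixel (photon counting only;
    ignores read noise, dark current, fill factor, and QE)."""
    photons = photons_per_um2 * pixel_pitch_um ** 2   # photons scale with area
    return math.sqrt(photons)                          # SNR = N / sqrt(N)

flux = 200  # photons per square micron for some fixed exposure (made-up)
print(shot_noise_snr(1.4, flux))   # phone-sized pixel -> SNR ~ 20
print(shot_noise_snr(7.2, flux))   # DSLR-sized pixel  -> SNR ~ 100
```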

Smartphones usually have very small sensors, with pixel pitches of around 1.4 microns. For comparison, the wavelength of visible light ranges from about 0.4 microns for violet to 0.7 microns for red.

Latest Improvements in Sensors

Helen obsessively uses an iPhone 4
for social media, even while at dinner!
The latest improvement in sensor technology is Back Side Illumination (BSI). In a typical Front Side Illumination (FSI) sensor, the metal traces and amplifiers that carry the voltages to the ADCs sit on the same side of the wafer as the photodetector, facing the light. This hurts the fill factor, because the metal traces take up space in the photosite; even when they are stacked vertically, they still scatter and reflect photons. In a BSI sensor, the wafer is flipped and thinned so that light enters from the back side, where the microlenses, the color filter array, and the metal masking layer now sit, while the metal wiring ends up below the photodiodes, out of the light path. This leaves more room for the photodetector, and thus a higher fill factor.

And with a higher fill factor, the detectors cover a larger fraction of the pixel area and so collect more photons per unit time. All this means that, with BSI sensors, you can get a better (less noisy) exposure in low light.
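A rough way to see the benefit: for the same pixel pitch and the same dim scene, the photons collected scale with the fill factor, so the exposure needed to reach a target photon count drops accordingly. The fill factors and photon flux below are illustrative guesses, not measured values.

```python
def exposure_needed_ms(target_photons, pixel_pitch_um, fill_factor,
                       flux_photons_per_um2_per_ms):
    """Exposure time (ms) needed to collect a target photon count,
    assuming photons scale with the photodiode's share of the pixel area."""
    collecting_area = fill_factor * pixel_pitch_um ** 2
    return target_photons / (collecting_area * flux_photons_per_um2_per_ms)

# Same 1.4 micron pixel, same dim scene; only the fill factor changes.
flux = 20                                        # photons/um^2/ms (made-up)
print(exposure_needed_ms(1000, 1.4, 0.4, flux))  # FSI-ish fill factor -> ~64 ms
print(exposure_needed_ms(1000, 1.4, 0.7, flux))  # BSI-ish fill factor -> ~36 ms
```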

The iPhone 4S (according to chipworks.com) has an 8-megapixel BSI Sony image sensor with a 1.4 micron pixel pitch. According to iSuppli, the camera module uses a five-element lens, which apparently results in better sharpness. To the right: an image captured at a restaurant in available light shows Helen with her face illuminated primarily by an iPhone. And, oh yes, those are Dr. Dre Monster Beats.

Smaller and Smaller?

What happens when pixels get so small that the pixel pitch is about the same as a wavelength of light? Such pixels are called sub-diffraction-limit (SDL) pixels: they are smaller than the blur spot the lens itself can resolve. As it turns out, this can actually help, because at this scale it is not necessary to have an anti-aliasing filter (present in larger sensors). Most smartphones do not have an anti-aliasing filter.
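A quick back-of-the-envelope check with the standard Airy-disk formula (diameter of roughly 2.44 × wavelength × f-number) shows why: for green light through an f/2.4 lens, which is roughly the aperture of a phone camera of this class, the diffraction blur spot already spans more than two 1.4 micron pixels.

```python
def airy_disk_diameter_um(wavelength_um, f_number):
    """Diameter of the diffraction-limited Airy disk (to the first dark ring)."""
    return 2.44 * wavelength_um * f_number

# Green light through an f/2.4 phone lens (roughly this class of camera):
spot = airy_disk_diameter_um(0.55, 2.4)
print(spot)          # ~3.2 microns
print(spot / 1.4)    # the blur spot covers more than two 1.4 micron pixels
```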

So it's a good thing, right? Not completely. Remember, the smaller the pixel, the fewer photons it can collect per unit time, which translates to more noise. So there is an informational limit that we reach. And past the sub-diffraction limit, there is some thinking that the resolution of the image doesn't actually go up; it just gets more blurry. It is this blur that helps eliminate the need for an anti-aliasing filter in the first place. So you can look at it as a point of diminishing returns.

Perhaps deconvolution could correct this, but such operations are expensive and possibly beyond the capabilities of modern hardware.

Perhaps the most important consequence is that, to get more megapixels without shrinking the pixels further, the sensor has to be physically larger, and with it the focal length of the camera module must also grow.
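A rough geometric argument: to keep the same field of view, the focal length must scale with the sensor diagonal, so a physically larger sensor means a longer (thicker) camera module. The field of view and sensor sizes below are illustrative, not any phone's actual specs.

```python
import math

def focal_length_mm(sensor_diagonal_mm, fov_degrees):
    """Focal length needed for a given diagonal field of view."""
    half_fov = math.radians(fov_degrees) / 2
    return (sensor_diagonal_mm / 2) / math.tan(half_fov)

fov = 65  # a typical phone-camera diagonal field of view (assumed)
print(focal_length_mm(5.7, fov))    # small phone-class sensor -> ~4.5 mm
print(focal_length_mm(11.4, fov))   # double the diagonal      -> ~8.9 mm
```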

For more ideas on the possible future of image sensors, check out Eric Fossum's slides on image sensors.

The Whole Package

There is more to a smartphone camera than just an image sensor. The lens matters as well, and in particular the speed of autofocus. A Norwegian company called poLight AS has developed a new lens system, called TLens, that autofocuses in mere milliseconds using piezoelectrics. Such a system could be used to keep a video in focus in real time.

Or it could be used to focus on multiple subjects when capturing a frame, for all-in-focus applications.

If you are not faint of heart, you can look at this EETimes-Europe report on poLight lens systems.

Another company with a novel lens approach is e-Vision, which has developed a liquid crystal lens that can be refocused electrically. This particular technology could, I believe, be programmed for image stabilization.

Both these technologies can be used in smartphone cameras, because they are amenable to miniaturization, and could replace the current voice coil motor (VCM) technology.
