French Fried Files

Chances are, you've heard that computers "think" in 1s and 0s. By itself, a single 1 or 0 (a "bit") doesn't hold much information, so programmers group these bits into larger units, such as bytes (eight bits each) and kilobytes.

Many things can easily fit into a small number of bytes; text, for example, doesn't need much room, as every letter can be represented by a single byte. Pictures, on the other hand, require an awful lot of those 1s and 0s. Sound and video files use even more space, but for our purposes we'll just focus on images.

The simplest, and the most wasteful, way to store an image digitally is to use four bytes for each pixel. These bytes represent the Red, Green, Blue, and Alpha channels that make up the pixel's color. Your typical 16-megapixel photograph is 4,920 pixels wide by 3,264 pixels high, so we'd need to store a little over 16 million pixels. That works out to roughly 64 megabytes, or over 512,000,000 of those 1s and 0s, just to save the image to your computer.
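To check that arithmetic, here is a minimal Python sketch. The function name raw_image_size_bytes is made up for illustration, and it assumes exactly four one-byte channels per pixel with no file header or padding.

```python
def raw_image_size_bytes(width, height, bytes_per_pixel=4):
    """Size of an uncompressed image: one fixed-size record per pixel."""
    return width * height * bytes_per_pixel


# Dimensions taken from the example photograph above.
width, height = 4920, 3264

pixel_count = width * height
size_bytes = raw_image_size_bytes(width, height)
size_bits = size_bytes * 8  # eight bits in every byte

print(f"{pixel_count:,} pixels")  # 16,058,880 pixels
print(f"{size_bytes:,} bytes")    # 64,235,520 bytes
print(f"{size_bits:,} bits")      # 513,884,160 bits
```

Dropping the alpha channel (bytes_per_pixel=3) would cut the total by a quarter, which is still far too large to be practical.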

Big numbers like that can be hard to visualize, so let's put that in perspective. If you're old enough to remember using floppy disks, you'd need around 45 of the common 1.44 MB kind to store a single photo. Alternatively, you could store only about 60 images on a device with 4 GB of storage, such as an early iPhone.

Obviously, the people who made these devices aren't storing photos using this method, so what's happening?

Simple: the information is being compressed.

When talking about computers, compression refers to the techniques used to make information take up less space. There are many, many different ways to do this, but they all fall into one of two camps: they are either lossless or lossy.

Lossless compression systems preserve every piece of the original data, even though they use tricks to store it more compactly. Images, for example, can be stored in smaller files by listing all of the colors used in the image, and then representing each pixel as an index into that list. Images with few colors can really save space this way: an image with 16 or fewer colors can store two pixels in a single byte, while a two-color (monochrome) image can use a single 1 or 0 for each pixel.
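The color-list trick can be sketched in a few lines of Python. This is only an illustration, not how any particular file format does it: the helper names (build_palette, pack_4bit, unpack_4bit) and the tiny seven-pixel "image" are assumptions made up for the example.

```python
def build_palette(pixels):
    """Return (palette, indices): unique colors in first-seen order,
    plus one palette index per pixel."""
    palette, lookup, indices = [], {}, []
    for color in pixels:
        if color not in lookup:
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    return palette, indices


def pack_4bit(indices):
    """Pack two 4-bit palette indices into each byte, high nibble first."""
    if any(i > 15 for i in indices):
        raise ValueError("4-bit packing needs a palette of at most 16 colors")
    padded = indices + [0] * (len(indices) % 2)  # pad odd lengths with a 0
    return bytes((padded[i] << 4) | padded[i + 1]
                 for i in range(0, len(padded), 2))


def unpack_4bit(data, count):
    """Reverse pack_4bit, dropping any padding index at the end."""
    out = []
    for byte in data:
        out.append(byte >> 4)
        out.append(byte & 0x0F)
    return out[:count]


# A made-up image: seven RGBA pixels using only three colors.
RED, WHITE, BLUE = (255, 0, 0, 255), (255, 255, 255, 255), (0, 0, 255, 255)
pixels = [RED, RED, WHITE, BLUE, WHITE, RED, RED]

palette, indices = build_palette(pixels)
packed = pack_4bit(indices)
restored = [palette[i] for i in unpack_4bit(packed, len(pixels))]

print(len(pixels) * 4, "bytes raw,", len(packed), "bytes of packed indices")
print("identical after round trip:", restored == pixels)
```

The round trip gives back exactly the original pixels, which is what makes the method lossless. Note that the color list itself has to be stored too, so the savings only show up once an image has many more pixels than colors.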

Lossy compression takes a very different approach: instead of saving everything, it discards information that might not be relevant. You wouldn't use a lossy method to store a novel or spreadsheet, but it's a lifesaver for pictures, video, and sound.

As it turns out, our senses aren't as precise as computer software. While a computer can tell the difference between pure red and almost pure red, they'll look the same to us. This means our image files don't need to record every tiny change in color; an approximation will do just fine.
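One very simple way to approximate colors is to throw away the low bits of each channel, so that nearby shades collapse into the same stored value. The sketch below assumes 8-bit channels and uses hypothetical helper names; it is a stand-in for the idea of discarding detail, not the method real photo formats use.

```python
def quantize_channel(value, bits=4):
    """Keep only the top `bits` bits of an 8-bit channel value."""
    shift = 8 - bits
    return (value >> shift) << shift


def quantize_pixel(pixel, bits=4):
    """Apply the same coarse rounding to every channel of a pixel."""
    return tuple(quantize_channel(c, bits) for c in pixel)


pure_red = (255, 0, 0)
almost_pure_red = (250, 0, 0)

# The computer sees two different colors before quantizing...
print(pure_red == almost_pure_red)
# ...but only one afterwards, and each channel now needs 4 bits, not 8.
print(quantize_pixel(pure_red), quantize_pixel(almost_pure_red))
```

Both reds end up as (240, 0, 0). The original values can't be recovered from that, which is exactly what "lossy" means.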

This isn't foolproof, however. If too much is discarded, the image will become visibly degraded. Areas of noticeable distortion caused by the compression are known as "artifacts", and images with too many artifacts are generally unusable.
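To get a feel for how discarding too much leads to artifacts, consider a smooth black-to-white ramp. The sketch below is a toy model under stated assumptions: it uses a made-up helper, keep_top_bits, and simply drops low bits from each 8-bit value, which produces the kind of distortion usually called banding. Real lossy formats discard information in more sophisticated ways and produce other kinds of artifacts as well.

```python
def keep_top_bits(value, bits):
    """Keep only the top `bits` bits of an 8-bit value (hypothetical helper)."""
    shift = 8 - bits
    return (value >> shift) << shift


gradient = list(range(256))  # a smooth ramp with 256 distinct shades

results = {}
for bits in (6, 4, 2, 1):
    banded = [keep_top_bits(v, bits) for v in gradient]
    shades_left = len(set(banded))
    worst_error = max(abs(a - b) for a, b in zip(gradient, banded))
    results[bits] = (shades_left, worst_error)
    print(f"{bits} bits kept: {shades_left} shades left, "
          f"worst error {worst_error}")
```

With 6 bits kept, the ramp still has 64 shades and no value is off by more than 3, which is hard to spot. With 1 bit kept, only two shades survive and some values are off by 127, so the smooth ramp turns into two flat blocks.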

French Fried Files is a study of what happens when lossy compression is used too aggressively on images.

What is Lossy Compression?