PBM, the crayons of image formats

Image standards, so many to choose from…

Even before people had powerful graphics abilities in their PCs (remember that until VGA was widely adopted in the ’90s most PCs only supported 1, 4 or 16 colors) there was a literal sea of image formats. Most of them you will never have seen, heard of or used. Many were specific to an application, a piece of hardware or a system.

In 1994, when O’Reilly was doing what they did best, they published an “Encyclopedia of Graphics File Formats”. This 900+ page monster of a book sought to be the go-to book for understanding and writing libraries for image formats. For example, pages 392-394 cover the Lotus PIC binary format down to the headers and encoding. You will be excused if you never knew there was a spreadsheet-specific image format (you are not excused if you didn’t realize there was a spreadsheet before Excel, let alone 2 major ones).

1994 reference for graphics formats

All of these image formats brought many problems to the early world of advanced graphics. Here are the primary ones that I remember from when I embarked on image processing back then.

  1. Many were fairly complicated and specific to their use.
  2. You needed a lot of converters, or the converters got bigger than we could run on those old machines.
  3. Some weren’t appropriate for some uses.
    1. JPEG was lossy and inappropriate for data in a research context.
    2. GIF only stored 256 colors which prevented storing dynamic ranges that were on par with the human eye or available sensors.
    3. TIFF and GIF often used compression algorithms that were patented and hard to legally work with.

One format to connect them all.

As the number of image formats grew, a quadratic growth problem was forming in the converters. If I had a program for every pair of image formats I would need roughly N * N, or N^2, programs. If someone wrote a single program to handle them all, it would be too large for the memory of the systems of that day. The code would also get very hard to maintain.

The solution was to have a single file format that every image format could be converted to and from. It needed to be simple and easy to work with. Back in the day it was also a plus if it could be emailed to someone (we didn’t have attachments for binary files and most mail systems only supported 7-bit ASCII) without processing the file on either side for special text encoding.
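To put numbers on the converter problem, here is a small sketch. The function names are made up for illustration, and it counts one-way converters between distinct formats, which gives N * (N - 1), close to the N^2 figure above. With a common hub format each image format only needs one converter in and one converter out.

```python
def direct_converters(n: int) -> int:
    """One-way converters needed to cover every ordered pair of n distinct formats."""
    return n * (n - 1)


def hub_converters(n: int) -> int:
    """Converters needed when every format only converts to and from one hub format."""
    return 2 * n  # one converter into the hub, one out of it


for n in (5, 20, 100):
    # Columns: number of formats, direct pairwise converters, hub converters.
    print(n, direct_converters(n), hub_converters(n))
```

At 20 formats that is 380 pairwise programs against 40 small ones, and the gap only widens from there.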

Jef Poskanzer created PBMplus, which met all of those goals. It was such a useful tool that when he stopped working on it in 1991 others took it up as NETPBM and have continued developing it to this day. The idea was very simple. So simple that I can describe it in a small space here, and hopefully you can then create an image in your text editor (really!).

PBM format

Technically there were 3 formats, but the other 2 are variants of the first and each more advanced one can hold the simpler ones with some cost in storage. You’ll see.

The first 2 bytes of the file are the letter P and a digit. These are the file magic values that identify the file type, and the digit tells you which type of PBM file it is.

Magic (first 2 bytes)   Typical file extension   Type
P1                      .pbm                     A black and white bitmap (PBM)
P2                      .pgm                     A grayscale image (PGM)
P3                      .ppm                     An RGB image (PPM)
P4                      .pbm                     Binary version of PBM, added later
P5                      .pgm                     Binary version of PGM, added later
P6                      .ppm                     Binary version of PPM, added later

The next entry was often on a separate line for ease of use, although there were implementations that just used any old white space. Either way it would be 2 integers, the first being width and the second height. Now we have the dimensions of the image.

For PGM and PPM there was a third line/entry, also an integer. That integer was the maximum value a pixel in the image could take. It’s a handy way of describing the dynamic range of the image before reading it in.

Technically any line starting with a ‘#’ is a comment and the remainder of the line was to be ignored by the machine. Humans ignoring it was left to the reader of the file.

The actual image was just a list of numbers after this header. For a bitmap it was just a sequence of 0’s and 1’s. Independent of white space or line length in the file, the program was to read width entries for a row and repeat that height times. For PGM the entries were numbers from 0 to the max value, and for PPM they were triplets of numbers from 0 to the max value, with the triplet values mapping to Red, Green and Blue respectively. White space was required between numbers in both of these formats.
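The rules above fit in a short reader. What follows is a minimal sketch in Python, not the NETPBM library code: the names Image and read_plain_pnm are made up for this post, it only handles the text types P1 to P3, and it treats everything from a ‘#’ to the end of the line as a comment, which is slightly more permissive than "lines starting with #".

```python
from dataclasses import dataclass


@dataclass
class Image:
    magic: str     # "P1", "P2" or "P3"
    width: int
    height: int
    maxval: int    # 1 for PBM, otherwise the value read from the header
    samples: list  # flat list of ints, row by row (R, G, B interleaved for P3)


def read_plain_pnm(text: str) -> Image:
    # Strip comments: everything from '#' to the end of each line.
    cleaned = "\n".join(line.split("#", 1)[0] for line in text.splitlines())
    tokens = cleaned.split()  # any run of white space separates entries
    magic = tokens[0]
    if magic not in ("P1", "P2", "P3"):
        raise ValueError(f"not a plain PBM/PGM/PPM file: {magic!r}")
    width, height = int(tokens[1]), int(tokens[2])
    if magic == "P1":
        maxval = 1
        # Bitmap digits may be run together, so split tokens into single digits.
        values = [int(ch) for tok in tokens[3:] for ch in tok]
    else:
        maxval = int(tokens[3])
        values = [int(tok) for tok in tokens[4:]]
    channels = 3 if magic == "P3" else 1
    expected = width * height * channels
    if len(values) < expected:
        raise ValueError(f"expected {expected} samples, found {len(values)}")
    values = values[:expected]
    if any(v < 0 or v > maxval for v in values):
        raise ValueError("sample outside the range 0..maxval")
    return Image(magic, width, height, maxval, values)


img = read_plain_pnm("P1\n# a tiny X\n3 3\n1 0 1\n0 1 0\n1 0 1\n")
print(img.width, img.height, img.samples)
```

Keeping the samples as one flat list mirrors how the file itself is laid out; row y of a grayscale image is simply samples[y * width:(y + 1) * width].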

Writing code for this format was so easy you could get a few of the functions on a sheet of paper with nice spacing and comments in languages like C. Implementing a new reader or writer in a language like Python could be done in a few minutes from memory.
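As an illustration of how little a writer needs, here is a hedged sketch. The function name write_plain_pnm and its argument order are my own choices, not anything from NETPBM. It emits one image row per text line, which is fine for small images; as far as I recall the NETPBM documentation asks writers to keep text lines under 70 characters, so a production writer would wrap wide rows.

```python
def write_plain_pnm(magic, width, height, samples, maxval=1):
    """Return the text of a plain PBM (P1), PGM (P2) or PPM (P3) file.

    samples is a flat, row-by-row list of ints; for P3 it holds
    R, G, B triplets, so it is three times as long.
    """
    if magic not in ("P1", "P2", "P3"):
        raise ValueError(f"unsupported type: {magic!r}")
    channels = 3 if magic == "P3" else 1
    if len(samples) != width * height * channels:
        raise ValueError("sample count does not match width and height")
    lines = [magic, f"{width} {height}"]
    if magic != "P1":
        lines.append(str(maxval))  # bitmaps have no max value entry
    per_row = width * channels
    for y in range(height):
        row = samples[y * per_row:(y + 1) * per_row]
        lines.append(" ".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


print(write_plain_pnm("P2", 2, 2, [0, 20, 40, 0], maxval=40), end="")
```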

Here are a few examples to prove the point. See if you can decode them in your head. It isn’t hard.

P1
3 3
1 0 1
0 1 0
1 0 1

P2
5 5
40
10 0 0 0 10
0 20 0 20 0
0 0 40 0 0
0 20 0 20 0
10 0 0 0 10

P3
5 5
255
0 0 0  0 0 0  0 0 0  0 0 0  255 0 255
0 0 0  0 0 0  0 0 0  255 0 0  0 0 255
0 0 0  0 0 0  255 0 0  0 0 0  0 0 255
0 0 0  255 0 0  0 0 0  0 0 0  0 0 255
255 0 0  0 0 0  0 0 0  0 0 0  0 0 255

I added some extra spaces to make them easier to read, which is allowed. I could have used tabs and really made it more like ASCII art, but that would make the code harder to implement.

The binary version of each file type is the same header, but instead of the numbers for the image data it is a set of bits or bytes (depending on the type of file) with each holding the same value as the text. This is a bit smaller and runs faster, but without compression in general these files are fairly big.
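To make "bits or bytes" concrete, here is a sketch of the two simplest binary writers. It reflects my understanding of the NETPBM format descriptions rather than the library source: the header is followed by a single white space character, P5 stores one byte per pixel when the max value is below 256, and P4 packs 8 pixels into a byte, most significant bit first, padding each row out to a whole byte. The function names are hypothetical.

```python
def to_raw_pgm(width, height, maxval, samples) -> bytes:
    """Binary PGM (P5), one byte per pixel; this sketch requires maxval < 256."""
    if not 0 < maxval < 256:
        raise ValueError("this sketch only handles one byte per sample")
    if len(samples) != width * height:
        raise ValueError("sample count does not match width and height")
    header = f"P5\n{width} {height}\n{maxval}\n".encode("ascii")
    return header + bytes(samples)


def to_raw_pbm(width, height, bits) -> bytes:
    """Binary PBM (P4): 8 pixels per byte, each row padded to a byte boundary."""
    if len(bits) != width * height:
        raise ValueError("bit count does not match width and height")
    header = f"P4\n{width} {height}\n".encode("ascii")
    body = bytearray()
    for y in range(height):
        row = bits[y * width:(y + 1) * width]
        for x in range(0, width, 8):
            chunk = row[x:x + 8]
            byte = 0
            for bit in chunk:
                byte = (byte << 1) | bit
            byte <<= 8 - len(chunk)  # zero padding at the end of a short row
            body.append(byte)
    return header + bytes(body)


raw = to_raw_pbm(3, 3, [1, 0, 1, 0, 1, 0, 1, 0, 1])
print(len(raw), raw)
```

The 3 by 3 bitmap needs 17 characters of pixel data as text (digits plus separators) and only 3 bytes in binary, one per row.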

So we have a format that also maps to a simple C structure in memory. Write code to read or write to that structure and you have an easy time making a converter to and from a PBM format. You can also write generic image processing routines on the PBM versions of the files. Now it becomes easy to pipeline commands together in Unix for batch image processing.

In grad school it was very common for me to write a script that did something like this.

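#!/bin/sh
# Usage: pass the file name without its .tiff extension
# pnmhisteq is the NETPBM histogram equalization tool
tifftopnm "$1.tiff" | ppmtopgm | pnmhisteq | pnmtops > "$1.ps"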

This would take the name and convert the TIFF file to a PPM, take the intensity by mapping RGB to Y (think of this as converting it to grayscale, but it has more meaning), perform histogram equalization to bring out detail and output a printer-ready PostScript file of the same name.
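For readers who have not met the two middle steps, here is a rough sketch of what they do on flat lists of pixel values. It is not the NETPBM implementation. The luminance weights are the common Rec. 601 ones, which I am assuming are close to what ppmtopgm uses, and the equalization is the textbook cumulative histogram remap. The function names are made up.

```python
def rgb_to_gray(rgb_samples):
    """Collapse R, G, B triplets into one luminance (Y) value each."""
    gray = []
    for i in range(0, len(rgb_samples), 3):
        r, g, b = rgb_samples[i:i + 3]
        # Assumed Rec. 601 weights; adding 0.5 rounds to the nearest integer.
        gray.append(int(0.299 * r + 0.587 * g + 0.114 * b + 0.5))
    return gray


def equalize(gray, maxval):
    """Textbook histogram equalization over values in 0..maxval."""
    if not gray:
        return []
    hist = [0] * (maxval + 1)
    for v in gray:
        hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)  # cdf[v] = number of pixels with value <= v
    cdf_min = next(c for c in cdf if c > 0)
    total = len(gray)
    if total == cdf_min:  # every pixel has the same value, nothing to stretch
        return list(gray)
    scale = maxval / (total - cdf_min)
    return [round((cdf[v] - cdf_min) * scale) for v in gray]


print(rgb_to_gray([255, 0, 0, 255, 255, 255, 0, 0, 0]))
print(equalize([10, 10, 20, 40], 40))
```

Because both steps read and write the same flat pixel layout, they chain together the same way the shell pipeline does.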

If I needed to implement a new experimental algorithm I could do it using the library I already had and just bring in common routines from existing work on NETPBM.

The need dwindled…

Over time the formats settled to a few primaries and a few specialty types. We see JPEG, GIF and PNG used very commonly. TIFF still gets used where it has advantages, but is becoming rarer all the time. The problem is that all of those formats are rather involved to implement. Almost everyone uses the same libraries for each because there just aren’t that many options.

NETPBM, though, has not dwindled at all. While it never rose to the prominence of the others, it is still at the heart of many processing pipelines on the Internet. The developers know it and trust it for its sheer simplicity.

Archival format

Because of its sheer simplicity I recommend the NETPBM file types despite their overall size. Storage has gotten cheaper faster than images have gotten larger. A large collection might require a large storage array, but if you need to know the exact value of a pixel there is no better format to trust. In a century I’m pretty sure a programmer could implement a new reader with a little trial and error, and no documentation, in a few hours.