The AIIM Blog - Overcoming Information Chaos

8 Ways to Reduce Your Storage and Bandwidth Costs for Document Imaging Solutions

Written by John Mancini | Jul 7, 2009 6:00:00 AM

Enterprise Imaging applications can be challenging to run efficiently. Unlike other data, document images are usually large, which means they take up a lot of memory, use a lot of disk storage, and take a long time to process or send over a network.

However, advanced image processing techniques can often get you an order of magnitude improvement in size, speed, or bandwidth. If you start using a few of these techniques, you’ll see how you can reduce your hardware budget (which, incidentally, will also reduce power consumption, maintenance, and downtime).

  1. Resample the image to a smaller size and adjust the DPI so that it prints to the same size.

    A 24-bit color scan of US letter size paper at 300 DPI is 2,550 pixels wide by 3,300 pixels long, for a total of about 25MB uncompressed. If you resample so that you cut each dimension in half, and then adjust the DPI to 150, your image takes up just over 6MB, or 25% of the original. This will reduce the storage size and the bandwidth needed to transmit the image over a network.
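The arithmetic is easy to check with a few lines of Python. This is only a sketch: it assumes uncompressed pixels at 24 bits (3 bytes) each, and `raw_size_bytes` is a hypothetical helper written for this example, not part of any library.

```python
def raw_size_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Uncompressed size in bytes of a scan of the given paper size."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# US letter is 8.5 x 11 inches.
full = raw_size_bytes(8.5, 11, 300)  # 2,550 x 3,300 pixels
half = raw_size_bytes(8.5, 11, 150)  # 1,275 x 1,650 pixels

print(full)         # 25245000 bytes, about 25MB
print(half)         # 6311250 bytes, just over 6MB
print(half / full)  # 0.25
```

Halving each dimension quarters the pixel count, so the saving is the same whatever the paper size.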

  2. Convert to grayscale or black and white.

    If your document is using 24-bit color, but you don’t mind losing color, you can convert to 8-bit grayscale, which uses about 33% of the space. If you can convert to 1-bit (black and white) without losing meaning, your documents will be about 4% of the original size. This will reduce the storage size and the bandwidth needed to transmit the image over a network.
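Here is a minimal sketch of both conversions on a single row of pixels. It assumes the common ITU-R BT.601 luma weights for grayscale and a fixed threshold of 128 for black and white; real products usually pick the threshold adaptively. All function names are made up for this example.

```python
def to_gray(pixel):
    """Convert an (R, G, B) pixel to one 8-bit gray value (BT.601 weights)."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def to_bilevel(gray, threshold=128):
    """Map a gray value to 1 (white) or 0 (black) with a fixed threshold."""
    return 1 if gray >= threshold else 0


def pack_row(bits):
    """Pack 0/1 values into bytes, 8 pixels per byte, leftmost pixel first."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


# Nine pixels: four white followed by five black.
row = [(255, 255, 255)] * 4 + [(0, 0, 0)] * 5
gray_row = [to_gray(p) for p in row]
packed = pack_row([to_bilevel(g) for g in gray_row])

print(len(row) * 3, len(gray_row), len(packed))  # 27 9 2 (bytes per row)
```

On a full page, the same steps take 3 bytes per pixel down to 1 byte, and then to 1 bit.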

  3. Use a better compression algorithm.

    Advanced compression algorithms like JBIG2 (for black and white images) and JPEG2000 (for color and grayscale images) can result in smaller files without sacrificing quality. You might not have an easy way of viewing these images directly, but PDF supports both as a way to compress its images, so put them in a PDF, and anyone with Acrobat Reader can view them.

  4. Use tiled formats.

    If you often need just part of an image, use a tiled format, such as Tiled-TIFF, which makes getting regions of the image faster. If you have web-based viewers that request images tile by tile, storing the images already tiled means the server spends fewer resources cutting them up before sending them.
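To see why tiling helps, here is a rough sketch of how a viewer could work out which tiles cover a requested region. The 256-pixel tile size and the function name are assumptions for illustration; a tiled file records its own tile dimensions.

```python
def tiles_for_region(x, y, width, height, tile_w=256, tile_h=256):
    """Return (row, col) indexes of the tiles that overlap a pixel region."""
    first_col = x // tile_w
    last_col = (x + width - 1) // tile_w
    first_row = y // tile_h
    last_row = (y + height - 1) // tile_h
    return [
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# A 400 x 200 pixel region starting at (300, 100).
needed = tiles_for_region(300, 100, 400, 200)
print(needed)  # [(0, 1), (0, 2), (1, 1), (1, 2)]

# A 2,550 x 3,300 pixel page holds 10 x 13 = 130 tiles of this size,
# so only 4 of the 130 tiles have to be read and decoded.
```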

  5. Use automated border crop.

    Some scans, especially of smaller items, like checks, have a large dark border around the edges. Use an algorithm that can detect and remove this, leaving you with just the important part of the image. Incidentally, this will save you ink if you print these documents.
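One simple way to detect the border, sketched below, is to treat a row or column as border when almost none of its pixels are bright. It assumes an 8-bit grayscale image stored as a list of rows, a dark border, and lighter content; the thresholds and function names are illustrative guesses, not values from any product.

```python
def content_box(gray, dark_threshold=64, min_content=0.05):
    """Return (left, top, right, bottom) of the area inside a dark border.

    right and bottom are exclusive. Returns None if the whole image is dark.
    """
    def is_border(values):
        bright = sum(1 for v in values if v >= dark_threshold)
        return bright / len(values) < min_content

    height, width = len(gray), len(gray[0])
    rows = [y for y in range(height) if not is_border(gray[y])]
    cols = [x for x in range(width)
            if not is_border([row[x] for row in gray])]
    if not rows or not cols:
        return None
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


def crop(gray, box):
    """Cut the image down to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in gray[top:bottom]]


# A tiny 8 x 6 "scan": light paper surrounded by a dark border.
DARK, PAPER = 10, 240
scan = [[PAPER if 1 <= y <= 3 and 2 <= x <= 6 else DARK for x in range(8)]
        for y in range(6)]

box = content_box(scan)
cropped = crop(scan, box)
print(box)                            # (2, 1, 7, 4)
print(len(cropped[0]), len(cropped))  # 5 3 (width, height)
```

Skewed or noisy scans need more than this, such as deskewing first or tolerating specks in the border.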

  6. Remove blank pages.

    If you are scanning two-sided documents, you probably have some blank pages. Detect and remove them.
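A common heuristic for this, sketched here, is to measure how much of the page is "ink" and call the page blank when the share is tiny. It assumes an 8-bit grayscale page stored as a list of rows; the 0.5% cutoff is an illustrative guess that you would tune against your own scans.

```python
def ink_ratio(gray, ink_threshold=128):
    """Fraction of pixels darker than the threshold."""
    total = sum(len(row) for row in gray)
    ink = sum(1 for row in gray for v in row if v < ink_threshold)
    return ink / total


def is_blank(gray, max_ink=0.005):
    """Treat a page as blank when almost none of it is ink."""
    return ink_ratio(gray) <= max_ink


# A light page with a few specks of scanner noise.
blank_page = [[250] * 100 for _ in range(100)]
for i in range(10):
    blank_page[i * 7][i * 3] = 0

# A page with a dark stroke in every 20th column, standing in for text.
text_page = [[0 if x % 20 == 0 else 250 for x in range(100)]
             for _ in range(100)]

print(is_blank(blank_page))  # True  (0.1% ink)
print(is_blank(text_page))   # False (5% ink)
```

Pages where print shows through from the other side are the hard case, so it is safer to flag borderline pages for review than to delete them.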

  7. Remove unneeded metadata.

    Images often carry around extra metadata that was put in by the device or software that created them. If you don’t need it, remove it. You’ll save storage and bandwidth. If you need the data, it might be better to extract it and store it separately.
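As an illustration for JPEG files, metadata lives in marker segments that come before the compressed image data, so it can be dropped without re-encoding the picture. The sketch below removes only APP1 (Exif/XMP), APP13 (Photoshop/IPTC), and comment segments. It assumes a well-formed file, and it deliberately keeps other segments, such as color profiles, because they can affect how the image displays. `strip_jpeg_metadata` is a hypothetical helper, not a library function.

```python
import struct

# Marker codes to drop: APP1 (Exif/XMP), APP13 (Photoshop/IPTC), COM.
DROP = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return the JPEG stream without the metadata segments in DROP."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: everything after this is image data.
            out += data[pos:]
            return bytes(out)
        # The length field counts itself (2 bytes) but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker not in DROP:
            out += data[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no scan data found")


# A tiny synthetic stream: SOI, APP0, APP1 ("Exif"), SOS, data, EOI.
sample = (b"\xff\xd8"
          b"\xff\xe0\x00\x04JF"
          b"\xff\xe1\x00\x06Exif"
          b"\xff\xda\x00\x02\x12\x34"
          b"\xff\xd9")
cleaned = strip_jpeg_metadata(sample)
print(len(sample), len(cleaned))  # 24 16
```

If you need the metadata, read it from the dropped segments and store it in your database before discarding them.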

  8. Create thumbnails on the server, and send them on demand.

    If you are preparing a web page of thumbnails, then make them on the server (don’t send full-size images and rely on browser features to resize them). Detect whether each thumbnail is actually visible on the page, and request it only then. This will lower bandwidth requirements and make the pages load faster.
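To get a feel for the saving, this sketch computes the dimensions of a server-side thumbnail that keeps the aspect ratio, and compares raw pixel data against the full-size page. The 200-pixel limit and the function name are assumptions for illustration.

```python
def thumbnail_size(width, height, max_side=200):
    """Scale (width, height) to fit within max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale
    return (max(1, round(width * max_side / longest)),
            max(1, round(height * max_side / longest)))


# A US letter page scanned at 300 DPI.
thumb_w, thumb_h = thumbnail_size(2550, 3300)
print(thumb_w, thumb_h)  # 155 200

full_bytes = 2550 * 3300 * 3       # uncompressed 24-bit page
thumb_bytes = thumb_w * thumb_h * 3
print(full_bytes, thumb_bytes)     # 25245000 93000
```

Even before compression, the thumbnail is well under 1% of the full page, which is why letting the browser shrink full-size images is so wasteful.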