minidjvu: how to compress into DjVu

This file describes the minidjvu 0.3 library; it applies equally to version 0.33.

The library interface is unstable.

See also: how to decode a DjVu page


Step 0: get things working

Add this include line to your source files that use minidjvu:
    #include <minidjvu.h>
I'll assume that your compiler can find the minidjvu headers and your linker can link against the library. If not, read INSTALL or README, or try adding the parent release directory to the header search path.

These examples also require

    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>

Step 1: get the bitmap to compress

There are many ways to get a bitmap, but only loading from files is demonstrated here.

To load a Windows BMP file, use mdjvu_load_bmp(); here's the example with error handling:

    const char *input = "your_input_file_name_here.bmp";
    mdjvu_error_t error;
    mdjvu_bitmap_t bitmap = mdjvu_load_bmp(input, &error);
    if (!bitmap)
    {
        fprintf(stderr, "%s: %s\n", input, mdjvu_get_error_message(error));
        exit(1);
    }
PBM files are read in the same way with mdjvu_load_pbm():
    const char *input = "your_input_file_name_here.pbm";
    mdjvu_error_t error;
    mdjvu_bitmap_t bitmap = mdjvu_load_pbm(input, &error);
    if (!bitmap)
    {
        fprintf(stderr, "%s: %s\n", input, mdjvu_get_error_message(error));
        exit(1);
    }
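
If you have no PBM file at hand to try the loader on, you can generate one. The following is a minimal sketch, not part of minidjvu: write_test_pbm() is a hypothetical helper that writes a small checkerboard in the raw PBM ("P4") format using only the C standard library, for widths up to 512 pixels.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a minidjvu function.
 * Writes a width x height checkerboard as a raw PBM ("P4") file.
 * Returns 0 on success, -1 on failure. */
int write_test_pbm(const char *path, int width, int height)
{
    unsigned char row[64];            /* enough for widths up to 512 */
    int row_bytes = (width + 7) / 8;  /* each row is padded to a whole byte */
    FILE *f;
    int x, y;

    if (width <= 0 || height <= 0 || row_bytes > (int) sizeof(row))
        return -1;
    f = fopen(path, "wb");
    if (!f)
        return -1;

    /* Header: magic number, then width and height in decimal. */
    fprintf(f, "P4\n%d %d\n", width, height);

    for (y = 0; y < height; y++)
    {
        memset(row, 0, sizeof(row));
        for (x = 0; x < width; x++)
        {
            /* A set bit means black; the leftmost pixel is the top bit. */
            if ((x + y) % 2 == 0)
                row[x / 8] |= (unsigned char) (0x80 >> (x % 8));
        }
        if (fwrite(row, 1, (size_t) row_bytes, f) != (size_t) row_bytes)
        {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Calling write_test_pbm("test.pbm", 64, 64) should give a file that can then be passed to mdjvu_load_pbm() as shown above.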
TIFF files are a bit different: the function mdjvu_load_tiff() takes an additional argument, a pointer to the resolution:
    const char *input = "your_input_file_name_here.tiff";
    mdjvu_error_t error;
    int32 resolution = 300; // always set a default value in case TIFF has no dpi recorded
    mdjvu_bitmap_t bitmap = mdjvu_load_tiff(input, &resolution, &error);
    if (!bitmap)
    {
        fprintf(stderr, "%s: %s\n", input, mdjvu_get_error_message(error));
        exit(1);
    }

Step 2 (optional): smooth the bitmap

"Smoothing" is a filter applied to the bitmap before it is split into letters. The idea is to remove pixels that are probably noise. Right now the implementation is very simple, but it still saves up to 5% of file size (on scanned documents). Use
    mdjvu_smooth(bitmap);
to smooth it.
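
To give a feeling for what "removing pixels that are probably noise" can mean, here is a small self-contained sketch. It is not the algorithm used by mdjvu_smooth(), whose details are not described here; it is a deliberately naive stand-in that clears every black pixel with no black neighbours. The function name and the one-byte-per-pixel layout are assumptions made for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Illustration only, not minidjvu code.
 * pixels: width * height bytes, row-major, 1 = black, 0 = white.
 * Clears every black pixel that has no black pixel among its 8 neighbours.
 * Returns the number of pixels removed, or -1 if memory runs out. */
int remove_isolated_pixels(unsigned char *pixels, int width, int height)
{
    size_t size;
    unsigned char *copy;
    int x, y, dx, dy, removed = 0;

    if (width <= 0 || height <= 0)
        return 0;
    size = (size_t) width * (size_t) height;
    copy = malloc(size);
    if (!copy)
        return -1;
    memcpy(copy, pixels, size);  /* decide from the unmodified input */

    for (y = 0; y < height; y++)
    {
        for (x = 0; x < width; x++)
        {
            int neighbours = 0;
            if (!copy[y * width + x])
                continue;  /* white pixels are left alone */
            for (dy = -1; dy <= 1; dy++)
            {
                for (dx = -1; dx <= 1; dx++)
                {
                    int nx = x + dx, ny = y + dy;
                    if ((dx || dy) && nx >= 0 && nx < width
                                   && ny >= 0 && ny < height)
                        neighbours += copy[ny * width + nx];
                }
            }
            if (neighbours == 0)
            {
                pixels[y * width + x] = 0;
                removed++;
            }
        }
    }
    free(copy);
    return removed;
}
```

Working from a copy keeps the result independent of the scan order: each pixel is judged against the original bitmap, not against a partly cleaned one.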

Step 3: split the bitmap

We have the bitmap now; but we need a split image. A split image, or simply an image, is a sequence of commands "put (a bitmap) at point x = (an integer), y = (an integer)". An image is obtained from a bitmap by splitting.

You have to supply the resolution (in dots per inch) and a pointer to options, which may be NULL.

    int32 dpi = 300;    // change the resolution if necessary
    mdjvu_image_t image = mdjvu_split(bitmap, dpi, NULL);
    assert(image);
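
The "put a bitmap at x, y" description of a split image can be pictured with the following sketch. The types shape_t and blit_t and the function render_blits() are hypothetical names invented for this illustration; they are not minidjvu's own data structures, which stay hidden behind mdjvu_image_t.

```c
#include <string.h>

/* Hypothetical types for illustration only; these are not minidjvu's. */
typedef struct
{
    int width, height;
    const unsigned char *pixels;  /* row-major, 1 = black, 0 = white */
} shape_t;

typedef struct
{
    const shape_t *shape;  /* which bitmap to put */
    int x, y;              /* where its top-left corner goes */
} blit_t;

/* Replay the commands onto a blank canvas to rebuild the page bitmap.
 * Parts of a shape that fall outside the canvas are skipped. */
void render_blits(const blit_t *blits, int count,
                  unsigned char *canvas, int width, int height)
{
    int i, sx, sy;

    memset(canvas, 0, (size_t) width * (size_t) height);
    for (i = 0; i < count; i++)
    {
        const shape_t *s = blits[i].shape;
        for (sy = 0; sy < s->height; sy++)
        {
            for (sx = 0; sx < s->width; sx++)
            {
                int px = blits[i].x + sx;
                int py = blits[i].y + sy;
                if (s->pixels[sy * s->width + sx]
                    && px >= 0 && px < width && py >= 0 && py < height)
                    canvas[py * width + px] = 1;
            }
        }
    }
}
```

In this picture, two commands may point at the same shape. That is presumably what makes the representation useful for compression: a letter that occurs many times on a page needs its bitmap stored only once.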

Step 4: call compression routine

The main compression function is called mdjvu_compress_image(). It takes two arguments: the image and the options. For lossless compression, passing NULL as the options will do:
    mdjvu_compress_image(image, NULL);
Lossy compression is trickier: you have to create a compression options structure and a separate options structure for the pattern matcher. Suppose you want to compress with an aggression of 110, with cleaning enabled and verbose messages printed to stdout; here's an example of doing it:
    mdjvu_matcher_options_t m_options = mdjvu_matcher_options_create();
    mdjvu_compression_options_t options = mdjvu_compression_options_create();
    mdjvu_set_aggression(m_options, 110);
    mdjvu_set_matcher_options(options, m_options);
    mdjvu_set_clean(options, 1);
    mdjvu_set_verbose(options, 1);
    
    mdjvu_compress_image(image, options);

    mdjvu_compression_options_destroy(options);
You don't have to destroy the matcher options, since destroying the compression options destroys them as well.

Step 5: save the image

Just one call to mdjvu_save_djvu_page() does the job. The file is silently overwritten if it exists.

The function mdjvu_save_djvu_page() takes an extra parameter: an erosion flag.

Here's an example of dealing with possible errors:

    int erosion = 0;
    const char *output = "your_output_file_name_here.djvu";
    mdjvu_error_t error;
    if (!mdjvu_save_djvu_page(image, output, &error, erosion))
    {
        fprintf(stderr, "%s: %s\n", output, mdjvu_get_error_message(error));
        exit(1);
    }
For the sake of completeness, this example declares error a second time. Remove that declaration if you compile it together with the loading example from Step 1.

Step 6: clean up

If you no longer need the image and the bitmap, destroy them:
    mdjvu_image_destroy(image);
    mdjvu_bitmap_destroy(bitmap);
You could just as well destroy the bitmap immediately after splitting it.