828 lines
37 KiB
Text
828 lines
37 KiB
Text
|
|
Using the JBIG-KIT library
|
|
--------------------------
|
|
|
|
Markus Kuhn -- 2020-08-03
|
|
|
|
|
|
This text explains how to use the functions provided by the JBIG-KIT
|
|
portable image compression library jbig.c in your application
|
|
software. The jbig.c library is a full-featured implementation of the
|
|
JBIG1 standard aimed at applications that can hold the entire
|
|
uncompressed and compressed image in RAM.
|
|
|
|
[For applications that require only the single-bit-per-pixel "fax
|
|
subset" of the JBIG1 standard defined in ITU-T Recommendation T.85
|
|
<http://www.itu.int/rec/T-REC-T.85/en>, the alternative implementation
|
|
found in jbig85.c may be preferable. It keeps not more than three
|
|
lines of the uncompressed image in RAM, which makes it particularly
|
|
suitable for embedded applications. For information on how to use
|
|
jbig85.c, please refer to the separate documentation file jbig85.txt.]
|
|
|
|
|
|
1 Introduction to JBIG
|
|
|
|
We start with a short introduction to JBIG1. More detailed information
|
|
is provided in the "Introduction and overview" section of the JBIG1
|
|
standard. Information on how to obtain a copy of the standard is
|
|
available from <http://www.itu.int/rec/T-REC-T.82/en> or
|
|
<http://www.iso.ch/>.
|
|
|
|
Image data encoded with the JBIG algorithm is separated into planes,
|
|
layers, and stripes. Each plane contains one bit per pixel. The number
|
|
of planes stored in a JBIG data stream is the number of bits per
|
|
pixel. Resolution layers are numbered from 0 to D with 0 being the
|
|
layer with the lowest resolution and D the one with the highest. Each
|
|
next higher resolution layer has twice the number of rows and columns.
|
|
Layer 0 is encoded independently of any other data, all other
|
|
resolution layers are encoded as only the difference between the next
|
|
lower and the current layer. For applications that require very quick
|
|
access to parts of an image, it is possible to divide an image into
|
|
several horizontal stripes. All stripes of one resolution layer have
|
|
equal size, except perhaps the final one. The number of stripes of an
|
|
image is equal in all resolution layers and in all bit planes.
|
|
|
|
The compressed data stream specified by the JBIG standard is called a
|
|
bi-level image entity (BIE). A BIE consists of a 20-byte header,
|
|
followed by an optional 1728-byte table (usually not present, except
|
|
in special applications) followed by a sequence of stripe data
|
|
entities (SDE). Each SDE encodes the content of one single stripe in
|
|
one plane of one resolution layer. Between the SDEs, other information
|
|
blocks (called floating marker segments) can also be present. They are
|
|
used to change certain parameters of the algorithm in the middle of an
|
|
image or contain additional application specific information. A BIE
|
|
looks like this:
|
|
|
|
|
|
+------------------------------------------------+
|
|
| |
|
|
| 20-byte header (with image size, #planes, |
|
|
| #layers, stripe size, first layer, options, |
|
|
| SDE ordering, ...) |
|
|
| |
|
|
+------------------------------------------------+
|
|
| |
|
|
| optional 1728-byte table |
|
|
| |
|
|
+------------------------------------------------+
|
|
| |
|
|
| optional floating marker segments |
|
|
| |
|
|
+------------------------------------------------+
|
|
| |
|
|
| stripe data entity |
|
|
| |
|
|
+------------------------------------------------+
|
|
| |
|
|
| optional floating marker segments |
|
|
| |
|
|
+------------------------------------------------+
|
|
| |
|
|
| stripe data entity |
|
|
| |
|
|
+------------------------------------------------+
|
|
...
|
|
+------------------------------------------------+
|
|
| |
|
|
| stripe data entity |
|
|
| |
|
|
+------------------------------------------------+
|
|
|
|
|
|
One BIE can contain all resolution layers of an image, but it is also
|
|
possible to store various resolution layers in several BIEs. The BIE
|
|
header contains the number of the first and the last resolution layer
|
|
stored in this BIE, as well as the size of the highest resolution
|
|
layer stored in this BIE. Progressive coding is deactivated by simply
|
|
storing the image in one single resolution layer.
|
|
|
|
Different applications might have different requirements for the order
|
|
in which the SDEs for stripes of various planes and layers are stored
|
|
in the BIE, so all possible sensible orderings are allowed by the
|
|
standard and indicated by four bits in the header.
|
|
|
|
It is possible to use the raw BIE data stream as specified by the JBIG
|
|
standard directly as the format of a file used for storing images.
|
|
This is what the pbmtojbg, jbgtopbm, pbmtojbg85, and jbgtopbm85
|
|
conversion tools do that are provided in this package as demonstration
|
|
applications. However, as the BIE format has been designed for a large
|
|
number of very different applications, and to allow efficient direct
|
|
processing by special JBIG hardware chip implementations, the BIE
|
|
header contains only the minimum amount of information absolutely
|
|
required by the decompression algorithm. Many features expected from a
|
|
good file format are missing in the BIE data stream:
|
|
|
|
- no "magic code" in the first few bytes to allow identification
|
|
of the file format on a typeless file system and to allow
|
|
automatic distinction from other compression algorithms
|
|
|
|
- no standardized way to encode additional information such as a
|
|
textual description, information about the meaning of various bit
|
|
planes, the physical size and resolution of the document, etc.
|
|
|
|
- a checksum to ensure image integrity
|
|
|
|
- encryption and signature mechanisms
|
|
|
|
- many things more
|
|
|
|
Raw BIE data streams alone may therefore not be a suitable format for
|
|
document archiving and exchange. A standard format for this purpose
|
|
would typically combine a BIE representing the image data with an
|
|
additional header providing auxiliary information into one file.
|
|
Existing established multi-purpose file formats with a rich set of
|
|
auxiliary information attributes like TIFF could be extended easily to
|
|
also hold JBIG compressed data.
|
|
|
|
On the other hand, in e.g. database applications, a BIE might be
|
|
stored directly in a binary variable-length field. Auxiliary
|
|
information would then be stored in other fields of the same record,
|
|
to simplify search operations.
|
|
|
|
|
|
2 Compressing an image
|
|
|
|
2.1 Format of the source image
|
|
|
|
To be processed by the jbig.c encoder, the image has to be present in
|
|
memory as separate bitmap planes. Each byte of a bitmap contains eight
|
|
pixels, where the most significant bit represents the leftmost of
|
|
these. Each line of a bitmap has to be stored in an integral number of
|
|
bytes. If the image width is not an integral multiple of eight, then
|
|
the final byte has to be padded with zero bits.
|
|
|
|
For example the 23x5 pixels large single plane image:
|
|
|
|
.XXXXX..XXX...X...XXX..
|
|
.....X..X..X..X..X.....
|
|
.....X..XXX...X..X.XXX.
|
|
.X...X..X..X..X..X...X.
|
|
..XXX...XXX...X...XXX..
|
|
|
|
is represented by the 15 bytes
|
|
|
|
01111100 11100010 00111000
|
|
00000100 10010010 01000000
|
|
00000100 11100010 01011100
|
|
01000100 10010010 01000100
|
|
00111000 11100010 00111000
|
|
|
|
or in hexadecimal notation
|
|
|
|
7c e2 38 04 92 40 04 e2 5c 44 92 44 38 e2 38
|
|
|
|
This is the format used in binary PBM files and it can also be handled
|
|
directly by the Xlib library of the X Window System.
|
|
|
|
As JBIG can also handle images with multiple bit planes, the jbig.c
|
|
library functions accept and return always arrays of pointers to
|
|
bitmaps with one pointer per plane.
|
|
|
|
For single-plane images, the standard recommends that a 0 pixel
|
|
represents the background and a 1 pixel represents the foreground
|
|
colour of an image, in other words, 0 is white and 1 is black for
|
|
scanned paper documents. For images with several bits per pixel, the
|
|
JBIG standard makes no recommendations about how various colours should
|
|
be encoded.
|
|
|
|
For grey-scale images, by using a Gray code instead of a simple binary
|
|
weighted representation of the pixel intensity, some increase in
|
|
coding efficiency can be reached.
|
|
|
|
A Gray code is also a binary representation of integer numbers, but it
|
|
has the property that the representations of the integer numbers i and
|
|
(i+1) always differ in exactly one bit. For example, the numbers 0 to
|
|
7 can be represented in normal binary code and Gray code as in the
|
|
following table:
|
|
|
|
normal
|
|
number binary code Gray code
|
|
---------------------------------------
|
|
0 000 000
|
|
1 001 001
|
|
2 010 011
|
|
3 011 010
|
|
4 100 110
|
|
5 101 111
|
|
6 110 101
|
|
7 111 100
|
|
|
|
The form of Gray code shown above has the property that the second
|
|
half of the code (numbers 4 - 7) is simply the mirrored first half
|
|
(numbers 3 - 0) with the first bit set to one. This way, arbitrarily
|
|
large Gray codes can be generated quickly by mirroring the above
|
|
example and prefixing the first half with zeros and the second half
|
|
with ones as often as required. In grey-scale images, it is common
|
|
practise to use the all-0 code for black and the all-1 code for white.
|
|
|
|
No matter whether a Gray code or a binary code is used for encoding a
|
|
pixel intensity in several bit planes, it always makes sense to store
|
|
the most significant (leftmost) bit in plane 0, which is transmitted
|
|
first. This way, a decoder could increase the precision of the
|
|
displayed pixel intensities while data is still being received and the
|
|
basic structure of the image will become visible as early as possible
|
|
during the transmission.
|
|
|
|
|
|
2.2 A simple compression application
|
|
|
|
In order to use jbig.c in your application, just link your executable
|
|
with both jbig.c and jbig_ar.c. Both are already combined into
|
|
libjbig.a by the provided Makefile, so on Unix systems, just add
|
|
-ljbig and -L. to the command line options of your compiler. On other
|
|
systems you will probably have to write a new Makefile. Make sure your
|
|
compiler can find the file jbig.h and put the line
|
|
|
|
#include "jbig.h"
|
|
|
|
into your source code.
|
|
|
|
The library interface follows object-oriented programming principles.
|
|
You have to declare a variable (object)
|
|
|
|
struct jbg_enc_state s;
|
|
|
|
which contains the current status of an encoder. Then you initialize
|
|
the encoder by calling the constructor function
|
|
|
|
void jbg_enc_init(struct jbg_enc_state *s, unsigned long x, unsigned long y,
|
|
int pl, unsigned char **p,
|
|
void (*data_out)(unsigned char *start, size_t len,
|
|
void *file),
|
|
void *file);
|
|
|
|
The parameters have the following meaning:
|
|
|
|
s A pointer to the jbg_enc_state structure that you want
|
|
to initialize.
|
|
|
|
x The width of your image in pixels.
|
|
|
|
y The height of your image in pixels (lines).
|
|
|
|
pl the number of bitmap planes you want to encode.
|
|
|
|
p A pointer to an array of pl pointers, where each is again
|
|
pointing to the first byte of a bitmap as described in
|
|
section 2.1.
|
|
|
|
data_out This is a call-back function that the encoder will
|
|
call during the compression process by in order to
|
|
deliver the BIE data to your application. The
|
|
parameters of the function data_out are a pointer
|
|
start to the new block of data being delivered, as
|
|
well as the number len of delivered bytes. The pointer
|
|
file is transparently delivered to data_out, as
|
|
specified in jbg_enc_init(). Typically, data_out will
|
|
write the BIE portion to a file, send it to a network
|
|
connection, or append it to some memory buffer.
|
|
|
|
file A pointer parameter that is passed on to data_out()
|
|
and can be used, for instance, to allow data_out() to
|
|
distinguish by which compression task it has been
|
|
called in multi-threaded applications.
|
|
|
|
In the simplest case, the compression is then started by calling the
|
|
function
|
|
|
|
void jbg_enc_out(struct jbg_enc_state *s);
|
|
|
|
which will deliver the complete BIE to data_out() in several calls.
|
|
After jbg_enc_out has returned, a call to the destructor function
|
|
|
|
void jbg_enc_free(struct jbg_enc_state *s);
|
|
|
|
will release any heap memory allocated by the previous functions.
|
|
|
|
|
|
A minimal example application, which sends the BIE of the above bitmap
|
|
to stdout, looks like this:
|
|
|
|
---------------------------------------------------------------------------
|
|
/* A sample JBIG encoding application */
|
|
|
|
#include <stdio.h>
|
|
#include "jbig.h"
|
|
|
|
void output_bie(unsigned char *start, size_t len, void *file)
|
|
{
|
|
fwrite(start, 1, len, (FILE *) file);
|
|
|
|
return;
|
|
}
|
|
|
|
int main()
|
|
{
|
|
unsigned char bitmap[15] = {
|
|
/* 23 x 5 pixels, "JBIG" */
|
|
0x7c, 0xe2, 0x38, 0x04, 0x92, 0x40, 0x04, 0xe2,
|
|
0x5c, 0x44, 0x92, 0x44, 0x38, 0xe2, 0x38
|
|
};
|
|
unsigned char *bitmaps[1] = { bitmap };
|
|
struct jbg_enc_state se;
|
|
|
|
jbg_enc_init(&se, 23, 5, 1, bitmaps,
|
|
output_bie, stdout); /* initialize encoder */
|
|
jbg_enc_out(&se); /* encode image */
|
|
jbg_enc_free(&se); /* release allocated resources */
|
|
|
|
return 0;
|
|
}
|
|
---------------------------------------------------------------------------
|
|
|
|
This software produces a 42 byte long BIE. (JBIG is not very good at
|
|
compressing extremely small images like in this example, because the
|
|
arithmetic encoder requires some startup data in order to generate
|
|
reasonable statistics which influence the compression process and
|
|
because there is some header overhead.)
|
|
|
|
|
|
2.3 More about compression
|
|
|
|
If jbg_enc_out() is called directly after jbg_enc_init(), the
|
|
following default values are used for various compression parameters:
|
|
|
|
- Only one single resolution layer is used, i.e. no progressive
|
|
mode.
|
|
|
|
- The number of lines per stripe is selected so that approximately
|
|
35 stripes per image are used (as recommended in annex C of the
|
|
standard together with the suggested adaptive template change
|
|
algorithm). However, not less than 2 and not more than 128 lines
|
|
are used in order to stay within the suggested minimum parameter
|
|
support range specified in annex A of the standard).
|
|
|
|
- All optional parts of the JBIG algorithm are activated (TPBON,
|
|
TPDON and DPON).
|
|
|
|
- The default resolution reduction table and the default deterministic
|
|
prediction table are used
|
|
|
|
- The maximal vertical offset of the adaptive template pixel is 0
|
|
and the maximal horizontal offset is 8 (mx = 8, my = 0).
|
|
|
|
In order to change any of these default parameters, additional
|
|
functions have to be called between jbg_enc_init() and jbg_enc_out().
|
|
|
|
In order to activate progressive encoding, it is possible to specify
|
|
with
|
|
|
|
void jbg_enc_layers(struct jbg_enc_state *s, int d);
|
|
|
|
the number d of differential resolution layers which shall be encoded
|
|
in addition to the lowest resolution layer 0. For example, if a
|
|
document with 60-micrometer pixels has to be stored, and the lowest
|
|
resolution layer shall have 240-micrometer pixels, so that a screen
|
|
previewer can directly decompress only the required resolution, then a
|
|
call
|
|
|
|
jbg_enc_layers(&se, 2);
|
|
|
|
will cause three layers with 240, 120 and 60 micrometers resolution to
|
|
be generated.
|
|
|
|
If the application does not know what typical resolutions are used and
|
|
simply wants to ensure that the lowest resolution layer will fit into
|
|
a given maximal window size, then as an alternative, a call to
|
|
|
|
int jbg_enc_lrlmax(struct jbg_enc_state *s, unsigned long mwidth,
|
|
unsigned long mheight);
|
|
|
|
will cause the library to automatically determine the suitable number
|
|
of resolutions so that the lowest resolution layer 0 will not be
|
|
larger than mwidth x mheight pixels. E.g. if one wants to ensure that
|
|
systems with a 640 x 480 pixel large screen can decode the required
|
|
resolution directly, then call
|
|
|
|
jbg_enc_lrlmax(&se, 640, 480);
|
|
|
|
The return value is the number of differential layers selected.
|
|
|
|
After the number of resolution layers has been specified by calls to
|
|
jbg_enc_layers() or jbg_enc_lrlmax(), by default, all these layers
|
|
will be written into the BIE. This can be changed with a call to
|
|
|
|
int jbg_enc_lrange(struct jbg_enc_state *s, int dl, int dh);
|
|
|
|
Parameter dl specifies the lowest resolution layer and dh the highest
|
|
resolution layer that will appear in the BIE. For instance, if layer 0
|
|
shall be written to the first BIE and layer 1 and 2 shall be written
|
|
to a second one, then before writing the first BIE, call
|
|
|
|
jbg_enc_lrange(&se, 0, 0);
|
|
|
|
and before writing the second BIE with jbg_enc_out(), call
|
|
|
|
jbg_enc_lrange(&se, 1, 2);
|
|
|
|
If any of the parameters is negative, it will be ignored. The return
|
|
value is the total number of differential layers that will represent
|
|
the input image. This way, jbg_enc_lrange(&se, -1, -1) can be used to
|
|
query the layer of the full image resolution.
|
|
|
|
A number of other more exotic options of the JBIG algorithm can be
|
|
modified by calling
|
|
|
|
void jbg_enc_options(struct jbg_enc_state *s, int order, int options,
|
|
long l0, int mx, int my);
|
|
|
|
before calling jbg_enc_out().
|
|
|
|
The order parameter can be a combination of the bits JBG_HITOLO,
|
|
JBG_SEQ, JBG_ILEAVE and JBG_SMID and it determines in which order
|
|
the SDEs are stored in the BIE. The bits have the following meaning:
|
|
|
|
JBG_HITOLO Usually, the lower resolution layers are stored before
|
|
the higher resolution layers, so that a decoder can
|
|
already start to display a low resolution version of
|
|
the full image once a prefix of the BIE has been
|
|
received. When this bit is set, however, the BIE will
|
|
contain the higher layers before the lower layers. This
|
|
avoids additional buffer memory in the encoder and is
|
|
intended for applications where the encoder is connected
|
|
to a database which can easily reorder the SDEs before
|
|
sending them to a decoder. Warning: JBIG decoders are
|
|
not expected to support the HITOLO option (e.g. the
|
|
jbig.c decoder currently does not) so you should
|
|
normally not use it.
|
|
|
|
JBG_SEQ Usually, at first all stripes of one resolution layer
|
|
are written to the BIE and then all stripes of the next
|
|
layer, and so on. When the SEQ bit is set however, then
|
|
all layers of the first stripe will be written,
|
|
followed by all layers of the second stripe, etc. This
|
|
option also should normally never be required and is
|
|
not supported by the current jbig.c decoder.
|
|
|
|
JBG_SMID In case there exist several bit planes, then the order of
|
|
the stripes is determined by three loops over all stripes,
|
|
all planes and all layers. When SMID is set, the loop
|
|
over all stripes is the middle loop.
|
|
|
|
JBG_ILEAVE If this bit is set, then at first all layers of one
|
|
plane are written before the encoder starts with the next
|
|
plane.
|
|
|
|
The above description may be somewhat confusing, but the following
|
|
table (see also Table 11 in ITU-T T.82) clarifies how the three bits
|
|
JBG_SEQ, JBIG_ILEAVE and JBG_SMID influence the ordering of the loops
|
|
over all stripes, planes and layers:
|
|
|
|
|
|
Loops:
|
|
JBG_SEQ JBG_ILEAVE JBG_SMID | Outer Middle Inner
|
|
------------------------------------+---------------------------
|
|
0 0 0 | p d s
|
|
0 1 0 | d p s
|
|
0 1 1 | d s p
|
|
1 0 0 | s p d
|
|
1 0 1 | p s d
|
|
1 1 0 | s d p
|
|
|
|
p: plane, s: stripe, d: layer
|
|
|
|
|
|
By default, the order combination JBG_ILEAVE | JBG_SMID is used.
|
|
|
|
The options value can contain the following bits, which activate
|
|
some of the optional algorithms defined by JBIG:
|
|
|
|
JBG_LRLTWO Normally, in the lowest resolution layer, pixels
|
|
from three lines around the next pixel are used
|
|
in order to determine the context in which the next
|
|
pixel is encoded. Some people in the JBIG committee
|
|
seem to have argued that using only 2 lines will
|
|
make software implementations a little bit faster,
|
|
however others have argued that using only two lines
|
|
will decrease compression efficiency by around 5%.
|
|
As you might expect from a committee, now both
|
|
alternatives are allowed and if JBG_LRLTWO is set,
|
|
the slightly faster but 5% less well compressing two
|
|
line alternative is selected. God bless the committees.
|
|
Although probably nobody will ever need this option,
|
|
it has been implemented in jbig.c and is off by
|
|
default.
|
|
|
|
JBG_TPDON This activates the "typical prediction" algorithm
|
|
for differential layers which avoids that large
|
|
areas of equal colour have to be encoded at all.
|
|
This is on by default and there is no good reason to
|
|
switch it off except for debugging or preparing data
|
|
for cheap JBIG hardware that might not support this
|
|
option.
|
|
|
|
JBG_TPBON Like JBG_TPDON this activates the "typical prediction"
|
|
algorithm in the lowest resolution layer. Also activated
|
|
by default.
|
|
|
|
JBG_DPON This bit activates for the differential resolution
|
|
layers the "deterministic prediction" algorithm,
|
|
which avoids that higher resolution layer pixels are
|
|
encoded when their value can already be determined
|
|
with the knowledge of the neighbour pixels, the
|
|
corresponding lower resolution pixels and the
|
|
resolution reduction algorithm. This is also
|
|
activated by default and one reason for deactivating
|
|
it would be if the default resolution reduction
|
|
algorithm were replaced by another one.
|
|
|
|
JBG_DELAY_AT Use a slightly less efficient algorithm to determine
|
|
when an adaptive template change is necessary. With
|
|
this bit set, the encoder output is compatible to the
|
|
conformance test examples in cause 7.2 of ITU-T T.82.
|
|
Then all adaptive template changes are delayed until
|
|
the first line of the next stripe. This option is by
|
|
default deactivated and is only required for passing a
|
|
special compatibility test suite.
|
|
|
|
In addition, parameter l0 in jbg_enc_options() allows you to specify
|
|
the number of lines per stripe in resolution layer 0. The parameters
|
|
mx and my change the maximal offset allowed for the adaptive template
|
|
pixel. JBIG-KIT now supports the full range of possible mx values up
|
|
to 127 in the encoder and decoder, but my is at the moment ignored and
|
|
always set to 0. As the standard requires of all decoder
|
|
implementations only to support maximum values mx = 16 and my = 0,
|
|
higher values should normally be avoided in order to guarantee
|
|
interoperability. The ITU-T T.85 profile for JBIG in fax machines
|
|
requires support for mx = 127 and my = 0. Default is mx = 8 and my =
|
|
0. If any of the parameters order, options, mx or my is negative, or
|
|
l0 is zero, then the corresponding current value remains unmodified.
|
|
|
|
The resolution reduction and deterministic prediction tables can also
|
|
be replaced. However as these options are anyway only for experts,
|
|
please have a look at the source code of jbg_enc_out() and the struct
|
|
members dppriv and res_tab of struct jbg_enc_state for the details of
|
|
how to do this, in case you really need it. The functions
|
|
jbg_int2dppriv and jbg_dppriv2int are provided in order to convert the
|
|
DPTABLE data from the format used in the standard into the more
|
|
efficient format used internally by JBIG-KIT.
|
|
|
|
If you want to encode a grey-scale image, you can use the library
|
|
function
|
|
|
|
void jbg_split_planes(unsigned long x, unsigned long y, int has_planes,
|
|
int encode_planes,
|
|
const unsigned char *src, unsigned char **dest,
|
|
int use_graycode);
|
|
|
|
It separates an image in which each pixel is represented by one or
|
|
more bytes into separate bit planes. The dest array of pointers to
|
|
these bit planes can then be handed over to jbg_enc_init(). The
|
|
variables x and y specify the width and height of the image in pixels,
|
|
and has_planes specifies how many bits per pixel are used. As each
|
|
pixel is represented by an integral number of consecutive bytes, of
|
|
which each contains up to eight bits, the total length of the input
|
|
image array src[] will therefore be x * y * ((has_planes + 7) / 8)
|
|
bytes. The pixels are stored as usually in English reading order, and
|
|
for each pixel the integer value is stored with the most significant
|
|
byte coming first (Bigendian). This is exactly the format used in raw
|
|
PGM files. In encode_planes, the number of bit planes that shall be
|
|
extracted can be specified. This allows for instance to extract only
|
|
the most significant 8 bits of a 12-bit image, where each pixel is
|
|
represented by two bytes, by specifying has_planes = 12 and
|
|
encode_planes = 8. If use_graycode is zero, then the binary code of
|
|
the pixel integer values will be used instead of the Gray code. Plane
|
|
0 contains always the most significant bit.
|
|
|
|
|
|
3 Decompressing an image
|
|
|
|
Like with the compression functions, if you want to use the jbig.c
|
|
library, you have to put the line
|
|
|
|
#include "jbig.h"
|
|
|
|
into your source code and link your executable with libjbig.a.
|
|
|
|
The state of a JBIG decoder is stored completely in a struct and you
|
|
will have to define a variable like
|
|
|
|
struct jbg_dec_state sd;
|
|
|
|
which is initialized by a call to
|
|
|
|
void jbg_dec_init(struct jbg_dec_state *s);
|
|
|
|
After this, you can directly start to pass data from the BIE to the decoder
|
|
by calling the function
|
|
|
|
int jbg_dec_in(struct jbg_dec_state *s, unsigned char *data, size_t len,
|
|
size_t *cnt);
|
|
|
|
The pointer data points to the first byte of a data block with length
|
|
len, which contains bytes from a BIE. It is not necessary to pass a
|
|
whole BIE at once to jbg_dec_in(), it can arrive fragmented in any way
|
|
by calling jbg_dec_in() several times. It is also possible to send
|
|
several BIEs concatenated to jbg_dec_in(), however these then have to
|
|
fit together. If you send several BIEs to the decoder, the lowest
|
|
resolution layer in each following BIE has to be the highest
|
|
resolution layer in the previous BIE plus one and the image sizes and
|
|
number of planes also have to fit together, otherwise jbg_dec_in()
|
|
will return the error JBG_ENOCONT after the header of the new BIE has
|
|
been received completely.
|
|
|
|
If pointer cnt is not NULL, then the number of bytes actually read
|
|
from the data block will be stored there. In case the data block did
|
|
not contain the end of the BIE, then the value JBG_EAGAIN will be
|
|
returned and *cnt equals len.
|
|
|
|
Once the end of a BIE has been reached, the return value of
|
|
jbg_dec_in() will be JBG_EOK. After this has happened, the functions
|
|
and macros
|
|
|
|
unsigned long jbg_dec_getwidth(struct jbg_dec_state *s);
|
|
unsigned long jbg_dec_getheight(struct jbg_dec_state *s);
|
|
int jbg_dec_getplanes(struct jbg_dec_state *s);
|
|
unsigned char *jbg_dec_getimage(struct jbg_dec_state *s, int plane);
|
|
unsigned long jbg_dec_getsize(struct jbg_dec_state *s);
|
|
|
|
can be used to query the dimensions of the now completely decoded
|
|
image and to get a pointer to all bitmap planes. The bitmaps are
|
|
stored as described in section 2.1. The function jbg_dec_getsize()
|
|
calculates the number of bytes which one bitmap requires.
|
|
|
|
The function
|
|
|
|
void jbg_dec_merge_planes(const struct jbg_dec_state *s, int use_graycode,
|
|
void (*data_out)(unsigned char *start, size_t len,
|
|
void *file), void *file);
|
|
|
|
allows you to merge the bit planes that can be accessed individually
|
|
with jbg_dec_getimage() into an array with one or more bytes per pixel
|
|
(i.e., the format provided to jbg_split_planes()). If use_graycode is
|
|
zero, then a binary encoding will be used. The output array will be
|
|
delivered via the callback function data_out, exactly in the same way
|
|
in which the encoder provides the BIE. The function
|
|
|
|
unsigned long jbg_dec_getsize_merged(const struct jbg_dec_state *s);
|
|
|
|
determines how long the data array delivered by jbg_dec_merge_planes()
|
|
is going to be.
|
|
|
|
Before calling jbg_dec_in() the first time, it is possible to specify with
|
|
a call to
|
|
|
|
void jbg_dec_maxsize(struct jbg_dec_state *s, unsigned long xmax,
|
|
unsigned long ymax);
|
|
|
|
an abort criterion for progressively encoded images. For instance if an
|
|
application will display a whole document on a screen which is 1024 x
|
|
768 pixels large, then this application should call
|
|
|
|
jbg_dec_maxsize(&sd, 1024, 768);
|
|
|
|
before the decoding process starts. If the image has been encoded in
|
|
progressive mode (i.e. with several resolution layers), then the
|
|
decoder will stop with a return value JBG_EOK_INTR after the largest
|
|
resolution layer that is still smaller than 1024 x 768. However this
|
|
is no guarantee that the image which can then be read out using
|
|
jbg_dec_getimage(), etc. is really not larger than the specified
|
|
maximal size. The application will have to check the size of the
|
|
image, because the decoder does not automatically apply a resolution
|
|
reduction if no suitable resolution layer is available in the BIE.
|
|
|
|
If jbg_dec_in() returned JBG_EOK_INTR or JBG_EOK, then it is possible
|
|
to continue calling jbg_dec_in() with the remaining data in order to
|
|
either decode the remaining resolution layers of the current BIE or in
|
|
order to add another BIE with additional resolution layers. In both
|
|
cases, after jbg_dec_in() returned JBG_EOK_INTR or JBG_EOK, *cnt is
|
|
probably not equal to len and the remainder of the data block which
|
|
has not yet been processed by the decoder has to be delivered to
|
|
jbg_dec_in() again.
|
|
|
|
If any other return value than JBG_EOK, JBG_EOK_INTR or JBG_EAGAIN
|
|
has been returned by jbg_dec_in(), then an error has occurred and
|
|
|
|
void jbg_dec_free(struct jbg_dec_state *s);
|
|
|
|
should be called in order to release any allocated memory. The
|
|
destructor jbg_dec_free() should of course also be called, once the
|
|
decoded bitmap returned by jbg_dec_getimage() is no longer required
|
|
and the memory can be released.
|
|
|
|
The function
|
|
|
|
const char *jbg_strerror(int errnum);
|
|
|
|
returns a pointer to a short single line test message that explains
|
|
the return value of jbg_dec_in(). This message can be used in order to
|
|
provide the user a brief informative message about what when wrong
|
|
while decompressing a JBIG image. The po/ subdirectory contains *.po
|
|
files that translate the English ASCII strings returned by
|
|
jbg_strerror() into other languages (e.g., for use with GNU gettext).
|
|
The four least-significant bits of the return value of jbg_dec_in()
|
|
may contain additional detailed technical information about the exact
|
|
test that spotted the error condition (see source code for details),
|
|
i.e. more than the text message returned by jbg_strerror() reveals.
|
|
Therefore it may be useful to display the return value itself as a
|
|
hexadecimal number, in addition to the string returned by
|
|
jbg_strerror().
|
|
|
|
The current implementation of the jbig.c decoder has the following
|
|
limitations:
|
|
|
|
- The maximal vertical offset MY of the adaptive template pixel
|
|
must be zero.
|
|
|
|
- HITOLO and SEQ bits must not be set in the order value.
|
|
|
|
- Not more than JBG_ATMOVES_MAX (currently set to 64) ATMOVE
|
|
marker segments can be handled per stripe.
|
|
|
|
- the number D of differential layers must be less than 32
|
|
|
|
None of the above limitations can be exceeded by a JBIG data stream
|
|
that conforms to the ITU-T T.85 application profile for the use of
|
|
JBIG1 in fax machines.
|
|
|
|
The maximum image size that a BIE header (BIH) can indicate is X_D =
|
|
2^32-1 pixels wide, Y_D = 2^32-1 lines high, with P = 255 bits per
|
|
pixel. Such an image would, in uncompressed form, require about 588
|
|
exabytes. Once jbg_dec_in() has received the 20-byte long BIH at the
|
|
start of the BIE, it will call malloc() to allocate enough memory to
|
|
hold the uncompressed image planes. Users may, therefore, want to
|
|
defend their application against excessive image-size parameters in a
|
|
received BIH, by checking X_D, Y_D, and P against appropriate safety
|
|
limits before handing over the BIE header to jbg_dec_in(). BIE headers
|
|
indicating too large images might be abused for denial of service
|
|
attacks, to exhaust the memory of a system (e.g., CVE-2017-9937). To
|
|
manage this risk, the jbig.c decoder will now, by default, return "Not
|
|
enough memory available" (JBG_ENOMEM) if the resulting final image
|
|
layer would occupy more than 2 gigabytes. Users can adjust this limit
|
|
by changing sd->maxmem right after having called jbg_dec_init(&sd).
|
|
The actual amount of memory allocated with malloc() calls during the
|
|
decoding process is somewhat higher (at least 25%) than the limit set
|
|
in sd->maxmem, as the decoder requires additional heap memory that
|
|
depends on the image dimensions.
|
|
|
|
The jbg_dec_in() function will return "Input data stream uses
|
|
unimplemented JBIG features" (JBG_EIMPL | 1) if Y_D equals 0xffffffff,
|
|
which is an extreme value commonly used to encode images according to
|
|
ITU-T T.85 where the height was unknown when the BIH was emitted.
|
|
|
|
All malloc(), realloc() and free() functions called by jbig.c are
|
|
wrapped by the functions checked_malloc(), checked_realloc() and
|
|
checked_free(). These simply call abort() when memory allocation
|
|
fails. Developpers of embedded systems may want to replace them with
|
|
alternative forms of exception handling.
|
|
|
|
There are two more limitations of the current implementation of the
|
|
jbig.c decoder that might cause problems with processing JBIG data
|
|
stream that conform to ITU-T T.85:
|
|
|
|
- The jbig.c decoder was designed to operate incrementally.
|
|
Each received byte is processed immediately as soon as it arrives.
|
|
As a result, it does not look beyond the SDRST/SDNORM at the end
|
|
of all stripes for any immediately following NEWLEN marker that
|
|
might reduce the number of lines encoded by the current stripe.
|
|
However section 6.2.6.2 of ITU-T T.82 says that a NEWLEN marker
|
|
segment "could refer to a line in the immediately preceding stripe
|
|
due to an unexpected termination of the image or the use of only
|
|
such stripe", and ITU-T.85 explicitly suggests the use of this
|
|
for fax machines that start transmission before having encountered
|
|
the end of the page.
|
|
|
|
- The image size initially indicated in the BIE header is used to
|
|
allocate memory for a bitmap of this size. This means that BIEs
|
|
that set initially Y_D = 0xffffffff (as suggested in ITU-T T.85
|
|
for fax machines that do not know the height of the page at the
|
|
start of the transmission) cannot be decoded directly by this
|
|
version.
|
|
|
|
For both issues, there is a workaround available:
|
|
|
|
If you encounter a BIE that has in the header the VLENGTH=1 option bit
|
|
set, then first wait until you have received the entire BIE and stored
|
|
it in memory. Then call the function
|
|
|
|
int jbg_newlen(unsigned char *bie, size_t len);
|
|
|
|
where bie is a pointer to the first byte of the BIE and len its length
|
|
in bytes. This function will scan the entire BIE for the first NEWLEN
|
|
marker segment. It will then take the updated image-height value YD
|
|
from it and use it to overwrite the YD value in the BIE header. The
|
|
jbg_newlen() can return some of the same error codes as jbg_dec_in(),
|
|
namely JBG_EOK if everything went fine, JBG_EAGAIN is the data
|
|
provided is too short to be a valid BIE, JBG_EINVAL if a format error
|
|
was encountered, and JBG_EABORT if an ABORT marker segment was found.
|
|
After having patched the image-height value in the BIE using
|
|
jbg_newlen(), simply hand over the BIE as usual to jbg_dec_in().
|
|
|
|
In general, for applications where NEWLEN markers can appear, in
|
|
particular fax reception, you should consider using the jbig85.c
|
|
decoder instead, as it can process BIEs with NEWLEN markers in a
|
|
single pass.
|
|
|
|
A more detailed description of the JBIG-KIT implementation is
|
|
|
|
Markus Kuhn: Effiziente Kompression von bi-level Bilddaten durch
|
|
kontextsensitive arithmetische Codierung. Studienarbeit, Lehrstuhl
|
|
für Betriebssysteme, IMMD IV, Universität Erlangen-Nürnberg,
|
|
Erlangen, July 1995. (German, 62 pages)
|
|
<http://www.cl.cam.ac.uk/~mgk25/kuhn-sta.pdf>
|
|
|
|
Please quote the above if you use JBIG-KIT in your research project.
|
|
|
|
*** Happy compressing ***
|
|
|
|
[end]
|