Compress Compound Images in H.264/MPEG-4 AVC by Fully Exploiting Spatial Correlation

ABSTRACT:

Compound images consist of text, graphics and natural images, and therefore exhibit strongly anisotropic features. This makes existing image coding standards inefficient at compressing them. To solve the problem, this paper proposes a novel coding scheme based on H.264 intra-frame coding. Two new intra modes are proposed to better exploit spatial correlations in compound images. The first is the residual scalar quantization (RSQ) mode, in which intra-predicted residues are directly quantized and entropy coded, without a transform. The second is the base colors and index map (BCIM) mode, which can be viewed as an adaptive vector quantization. In this mode, an image block is represented by several representative colors, called base colors, together with an index map, and these are what get compressed. For each block, the rate-distortion optimization (RDO) method selects among the two new modes and the existing H.264 intra modes. Experimental results show that the proposed scheme improves coding efficiency by more than 10 dB for compound images while keeping performance similar to H.264 for natural images.
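The BCIM idea described above can be illustrated with a minimal sketch. The abstract does not say how base colors are chosen or how the index map is entropy coded, so everything here is an assumption for illustration: the function names `bcim_encode` and `bcim_decode` are hypothetical, the block is single-channel, and a simple 1-D k-means (Lloyd) iteration stands in for whatever selection method the paper actually uses.

```python
def bcim_encode(block, num_colors=2, iterations=10):
    """Represent a block (list of rows of ints) by base colors plus an index map.

    Assumption: base colors are found with a plain 1-D k-means iteration;
    the paper's actual selection method is not described in this excerpt.
    """
    pixels = [p for row in block for p in row]
    lo, hi = min(pixels), max(pixels)
    # Spread the initial base colors evenly over the block's value range.
    bases = [lo + (hi - lo) * i / max(num_colors - 1, 1) for i in range(num_colors)]
    for _ in range(iterations):
        # Assign each pixel to its nearest base color.
        groups = [[] for _ in bases]
        for p in pixels:
            k = min(range(len(bases)), key=lambda i: abs(p - bases[i]))
            groups[k].append(p)
        # Move each base color to the mean of its group (keep it if the group is empty).
        bases = [sum(g) / len(g) if g else b for g, b in zip(groups, bases)]
    bases = [round(b) for b in bases]
    # The index map stores, for every pixel, the index of its nearest base color.
    index_map = [
        [min(range(len(bases)), key=lambda i: abs(p - bases[i])) for p in row]
        for row in block
    ]
    return bases, index_map


def bcim_decode(bases, index_map):
    """Rebuild the block by looking up each index in the base color list."""
    return [[bases[i] for i in row] for row in index_map]


# A text-like 4x4 block: a dark stroke on a white background.
block = [
    [255, 255, 0, 255],
    [255, 0, 0, 255],
    [255, 255, 0, 255],
    [255, 255, 0, 255],
]
bases, index_map = bcim_encode(block)
reconstructed = bcim_decode(bases, index_map)
```

For a block with only a few distinct colors, as is typical of text, the base colors and index map reproduce the block exactly, which hints at why such a mode could suit text and graphics better than a transform.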

Besides natural images, millions of artificial visual contents are generated by computers every day, such as Web pages, PDF files, captured screens, online games, slides, and so on. They are usually complicated and consist of text, graphics, and natural images, which is why they are called compound images. When they are displayed on remote clients, wireless projectors or thin clients, compressing them efficiently becomes a prevalent and critical problem in many applications.

It is well recognized that the state-of-the-art image coding standards (e.g., JPEG, JPEG2000 and the intra-frame coding of H.264) are all designed for natural images. They exploit the correlation among samples mainly through a transform (e.g., 2D DCT or wavelet), which generates a compact description in the frequency domain if the correlation of samples can be modeled by an isotropic autocorrelation function. However, compound images cannot be modeled as purely isotropic.
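A small numeric experiment can make the transform argument more concrete. The sketch below is not taken from the paper: it uses a hand-written 1-D orthonormal DCT-II and two made-up 8-sample rows (a smooth ramp and a sharp stroke), and it counts how many coefficients are needed to hold 99% of the signal energy. The helper names `dct_1d` and `coeffs_for_energy` and the 99% threshold are assumptions chosen for illustration. It shows the related effect that sharp edges spread energy across many coefficients; it is not a test of isotropy itself.

```python
import math


def dct_1d(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(
            scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        )
    return out


def coeffs_for_energy(x, fraction=0.99):
    """Number of largest DCT coefficients needed to reach `fraction` of the energy."""
    energies = sorted((c * c for c in dct_1d(x)), reverse=True)
    total = sum(energies)
    running = 0.0
    for count, e in enumerate(energies, start=1):
        running += e
        if running >= fraction * total:
            return count
    return len(energies)


# Hypothetical sample rows: a smooth gradient (natural-image-like)
# and a sharp stroke on a flat background (text-like).
smooth = [100, 102, 104, 106, 108, 110, 112, 114]
edge = [0, 0, 0, 255, 255, 0, 0, 0]

smooth_count = coeffs_for_energy(smooth)
edge_count = coeffs_for_energy(edge)
print(smooth_count, edge_count)
```

The smooth row concentrates its energy in far fewer coefficients than the stroke, which is consistent with the claim that transform coding is a good fit for natural content and a poor fit for text and graphics.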

The text and graphics parts are considerably anisotropic. This makes these image coding standards unsuitable or inefficient for compound images. For the compression of pure text images such as binary documents, there are already mature algorithms such as JBIG and JBIG2, but they are not good at handling natural images. Therefore, several schemes have been reported that aim to compress compound images efficiently. They can be categorized into layer-based and block-based approaches. The layer-based approaches (typically DjVu) are mainly proposed to efficiently compress compound images.
