DataCompression

This is an introduction to the theory of data compression. You may have been looking for practical information instead.

Data compression is any of various techniques to reduce the number of bits (the quanta of digital information) required to store information. Compression relies on sophisticated mathematics and on the detection of recurring patterns and sequences. Information theory (including the concept of entropy) and algorithmic theory are among the many branches of science that bear upon this. Different compression techniques are optimized for different types of data.
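To make the role of entropy a little more concrete, the sketch below estimates the Shannon entropy of a string from its symbol frequencies. For a coder that treats each symbol independently, this is roughly the average number of bits per symbol it cannot go below. The function name `shannon_entropy` and the sample strings are illustrative assumptions, not part of the original article.

```python
from collections import Counter
from math import log2


def shannon_entropy(text: str) -> float:
    """Estimate entropy in bits per symbol from observed symbol frequencies."""
    total = len(text)
    counts = Counter(text)
    # Each symbol contributes p * log2(1 / p), where p = count / total.
    return sum((n / total) * log2(total / n) for n in counts.values())


# Hypothetical sample inputs, chosen only for illustration.
print(shannon_entropy("AAAA"))  # one symbol only: nothing to encode per symbol
print(shannon_entropy("ABAB"))  # two equally likely symbols
print(shannon_entropy("ABCD"))  # four equally likely symbols
```

Highly repetitive data has low entropy and leaves more room for compression. Data whose symbols are all equally likely leaves very little.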

There are two broad categories of compression: lossy and lossless. Lossy compression discards some information permanently. Lossless compression stores the information more efficiently, so that the original can be reconstructed exactly.
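The difference between the two categories can be sketched in a few lines. Here the lossless half uses Python's standard `zlib` module, and the lossy half uses a deliberately crude stand-in, rounding each sample down to a multiple of a step size. The helper name `quantize`, the step size and the sample values are assumptions made for illustration only.

```python
import zlib


def quantize(samples, step):
    """Crude lossy step: round each sample down to a multiple of `step`."""
    return [(s // step) * step for s in samples]


# Lossless: the decompressed bytes are identical to the input.
data = b"ABABABAB" * 100
packed = zlib.compress(data)
restored = zlib.decompress(packed)
print(len(data), len(packed), restored == data)

# Lossy: distinct inputs collapse to the same value, so the
# original samples cannot be recovered afterwards.
samples = [3, 18, 47, 200, 201]
print(quantize(samples, 16))
```

After quantization, 200 and 201 both become 192. Nothing in the stored values says which was which, and that is what "discarded permanently" means in practice.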

Lossless compression can be easily demonstrated in the following hypothetical example:


[Figure: Run-length encoding]

In this highly simplified example, each run of repeated letters is replaced by one letter and a number indicating how many times that letter occurred. This stores the A's, B's and C's much more efficiently: A4 is much shorter than AAAA. Notably, this crude technique actually increases the storage required for the single "F" by one character. More sophisticated types of lossless compression mostly avoid this problem. Examples are the LZ family of techniques used by UNIX's Gzip and Windows' PKZIP, and the combined mathematical transformation and entropy coding algorithms used by the popular UNIX compressor Bzip2. (For more information on UNIX archive files see Support.FileArchiving)
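The run-length scheme described above can be sketched as follows. The input string `AAAABBBCCF` is an assumed stand-in for the article's illustration, and the function names are hypothetical. The sketch also assumes the input contains no digits, because digits are used to store the counts.

```python
import re
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of a repeated character with the character and its count."""
    return "".join(f"{ch}{len(list(run))}" for ch, run in groupby(text))


def rle_decode(encoded: str) -> str:
    """Expand each (character, count) pair back into a run."""
    pairs = re.findall(r"(\D)(\d+)", encoded)
    return "".join(ch * int(count) for ch, count in pairs)


sample = "AAAABBBCCF"  # assumed example input, not taken from the original figure
packed = rle_encode(sample)
print(packed)                        # A4B3C2F1
print(rle_decode(packed) == sample)  # True
print(rle_encode("F"))               # F1 (one character longer than the input)
```

The last line shows the weakness mentioned above: a character that occurs only once costs two characters after encoding.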

Some types of compression permanently discard information. These techniques are often applied to music, videos and images, as most of these contain details that can be removed without being noticed by the viewer or listener. The following image shows an extreme case in which quite a lot of information was discarded from a photograph. The original is provided above for comparison.


[Figure: Extreme JPEG compression]

The original image was also compressed with JPEG, but at a more reasonable quality setting. The pixelation that you see in the bottom photograph is an extreme effect that was induced for illustration purposes.


[Figure: Compression artifacts]

The JPEG compression format for images simplifies complex textures and would not normally be used to such an extreme extent. MP3 audio uses a similar technique, discarding portions of the waveform that cannot readily be perceived by the ear. Video compressors usually apply lossy compression to both the picture and the sound.

A compression scheme that discards only information a typical person cannot easily perceive is referred to as "perceptually lossless".

Lossy compression is best suited to media files that contain complex visual or auditory textures. The compression artifacts illustration shows a line drawing that was damaged by lossy JPEG compression. The JPEG scheme is built around subtle gradations of tone and copes poorly with sharp borders.
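The trouble with sharp borders can be illustrated with a simplified sketch. JPEG is built on the discrete cosine transform (DCT). The code below applies a one-dimensional DCT to eight samples, keeps only the lowest-frequency coefficients and reconstructs the signal. This is a rough stand-in: real JPEG works on 8x8 blocks and quantizes coefficients rather than simply dropping them. The sample values and function names are assumptions for illustration.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out


def rms_error_after_truncation(x, keep):
    """Zero all but the first `keep` coefficients, rebuild, and measure the error."""
    coeffs = [v if k < keep else 0.0 for k, v in enumerate(dct(x))]
    rebuilt = idct(coeffs)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, rebuilt)) / len(x))


ramp = [0, 32, 64, 96, 128, 160, 192, 224]  # smooth gradation
edge = [0, 0, 0, 0, 255, 255, 255, 255]     # sharp border

print(rms_error_after_truncation(ramp, 3))
print(rms_error_after_truncation(edge, 3))
```

With the same number of coefficients kept, the sharp edge is reconstructed with a much larger error than the smooth ramp. That error appears as the ringing and smearing seen around lines in heavily compressed drawings.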

This is an article from the Knowledge Base, a project of the Vistua Online Helpdesk to form a body of articles relating to common system topics. You are welcome to contribute to it.


Text last modified on July 05, 2010, at 06:49 PM