====== lzop vs compress vs gzip vs bzip2 vs lzma vs lzma2-xz benchmark, reloaded ======

I've had a couple of interesting comments on my [[compress_vs_gzip_vs_bzip2_vs_lzma_vs_lzma2_aka_xz_benchmark|last attempt]] to benchmark those algorithms.\\
So, here is a more complete benchmark, with hopefully more detailed results.

===== Benchmark protocol =====

We are benchmarking all the algorithms supported by recent <color red>tar</color> versions (1.22 was used):

^ program ^ extension ^ version ^ comment ^ supported compression levels ^
| [[http://www.lzop.org/|lzop]] | .lzo | 1.02rc1 | known to be very fast | 1 to 9, but 2 to 6 are equivalent, 3 by default |
| [[wp>Compress|compress]] | .Z | 4.2.4 | the legacy UNIX compression algorithm | not configurable |
| [[wp>Gzip|gzip]] | .gz (.tgz) | 1.3.12 | replaced compress in recent UNIX-like operating systems | 1 to 9, 6 by default |
| [[wp>Bzip2|bzip2]] | .bz2 (.tbz .tbz2) | 1.0.5 | known to have a better compression ratio than gzip, but much slower | 1 to 9, 9 by default |
| [[wp>Lzma|lzma]] | .lzma | 4.999.9beta | new algorithm aiming at high compression ratios | 0 to 9, 6 by default |
| [[wp>Xz|xz]] (aka lzma2) | .xz (.txz) | 4.999.9beta | slightly improved version of lzma, with some new features, for example integrity checking, as seen on the [[wpfr>xz|French Wikipedia page]] | 0 to 9, 6 by default |

Benchmark protocol at a glance:

  * I used the [[ftp://ftp.kernel.org/pub/linux/kernel/v2.4/linux-2.4.0.tar.bz2|Linux 2.4.0 kernel]] archive contents as data to compress. The uncompressed version takes **100 132 718 bytes** of disk space (or 95.5 MB).
  * Each algorithm has been tested with all supported compression levels
  * The resulting archive size has of course been measured
  * Compression and decompression tests have been run 3 times per algorithm per compression level
  * The RAM used has been measured during both compression and decompression
  * The time elapsed during compression and decompression has been measured
  * All those tests have been done in <color green>/dev/shm</color> (i.e. in memory) to avoid disk I/O overhead
  * I tried to use the multithreading features of LZMA/LZMA2/XZ, but they are not yet implemented, as reported by the <color red>man</color> page and as tested by myself

For reference, the following script has been used to automate the benchmark:

<code sh |h tar_algos_benchmark.sh |h>
#! /bin/sh
NBLOOP=3
COMPRESS_OBJECT=linux-2.4.0

# Poll the RSS of the process named $1 once per second until it exits,
# and print the last sample seen.
memstats() {
  (
    # lower the script's own priority
    renice 19 $$ >/dev/null 2>&1
    while : ; do
      ps --no-headers -o rss -C $1 || break
      sleep 1
    done | tail -n 1
  )
}

# bench <tar option> <archive extension> [level]
bench() {
  for i in $(seq 1 $NBLOOP) ; do
    # compression: tar runs in the background while memstats watches the compressor
    trap "rm -f out.$2" EXIT
    /usr/bin/time -f "DONE: comp $1-$3 ($i) time: %e" tar cf out.$2 $COMPRESS_OBJECT --$1 2>&1 >/dev/null &
    sleep 1
    mem=$(memstats $1)
    size=$(stat -c '%s' out.$2)
    echo "... mem: $mem size: $size"
    echo

    # decompression into a temporary directory
    mkdir tmp_extract_$$ || exit 1
    trap "rm -f out.$2 ; rm -Rf tmp_extract_$$" EXIT
    /usr/bin/time -f "DONE: decomp $1-$3 ($i) time: %e" tar xf out.$2 -C tmp_extract_$$ 2>&1 >/dev/null &
    sleep 1
    mem=$(memstats $1)
    echo "... mem: $mem"
    echo

    # cleanup
    rm -f out.$2
    rm -Rf tmp_extract_$$
    trap - EXIT
  done
}

for level in none ; do
  echo "=== COMPRESS ==="
  bench compress Z
done

for level in 1 3 7 8 9 ; do
  echo "=== LZOP -$level ==="
  export LZOP="-$level"
  bench lzop lzo $level
done

for level in 1 2 3 4 5 6 7 8 9 ; do
  echo "=== GZIP -$level ==="
  export GZIP="-$level"
  bench gzip gz $level
done

for level in 1 2 3 4 5 6 7 8 9 ; do
  echo "=== BZIP2 -$level ==="
  export BZIP2="-$level"
  bench bzip2 bz2 $level
done

for level in 0 1 2 3 4 5 6 7 8 9 ; do
  echo "=== LZMA -$level ==="
  export XZ_OPT="-$level"
  bench lzma lzma $level
done

for level in 0 1 2 3 4 5 6 7 8 9 ; do
  echo "=== XZ (LZMA2) -$level ==="
  export XZ_OPT="-$level"
  bench xz xz $level
done
</code>

===== Benchmark results =====

Here are the raw (and somewhat unreadable) results:

ctime: compression time, cmem: memory used during compression, dtime: decompression time, dmem: memory used during decompression

^ algo ^ size (MB) ^ ctime (s) ^ cmem (KB) ^ dtime (s) ^ dmem (KB) ^
^ compress | ''39.56'' | ''2.64'' | ''1 124'' | ''1.60'' | ''548'' |
^ lzop-1 | ''36.17'' | ''1.04'' | ''1 004'' | ''0.63'' | ''?'' |
^ lzop-3 | ''36.38'' | ''1.11'' | ''940'' | ''0.65'' | ''?'' |
^ lzop-7 | ''27.07'' | ''13.15'' | ''1 312'' | ''0.70'' | ''?'' |
^ lzop-8 | ''26.74'' | ''27.67'' | ''1 308'' | ''0.65'' | ''?'' |
^ lzop-9 | ''26.73'' | ''33.3'' | ''1 308'' | ''0.60'' | ''?'' |
^ gzip-1 | ''28.72'' | ''2.74'' | ''708'' | ''1.42'' | ''486'' |
^ gzip-2 | ''27.44'' | ''2.90'' | ''708'' | ''1.42'' | ''486'' |
^ gzip-3 | ''26.50'' | ''3.22'' | ''708'' | ''1.40'' | ''484'' |
^ gzip-4 | ''24.77'' | ''3.56'' | ''708'' | ''1.33'' | ''486'' |
^ gzip-5 | ''23.82'' | ''4.43'' | ''718'' | ''1.27'' | ''500'' |
^ gzip-6 | ''23.43'' | ''5.78'' | ''716'' | ''1.29'' | ''488'' |
^ gzip-7 | ''23.33'' | ''6.74'' | ''700'' | ''1.25'' | ''488'' |
^ gzip-8 | ''23.25'' | ''9.82'' | ''692'' | ''1.27'' | ''488'' |
^ gzip-9 | ''23.23'' | ''13.2'' | ''694'' | ''1.25'' | ''486'' |
^ bzip2-1 | ''21.81'' | ''17.5'' | ''1 554'' | ''4.62'' | ''898'' |
^ bzip2-2 | ''20.59'' | ''17.6'' | ''2 336'' | ''4.48'' | ''1 288'' |
^ bzip2-3 | ''20.02'' | ''17.8'' | ''3 120'' | ''4.43'' | ''1 700'' |
^ bzip2-4 | ''19.66'' | ''18.5'' | ''3 900'' | ''4.49'' | ''3 900'' |
^ bzip2-5 | ''19.42'' | ''20.0'' | ''4 688'' | ''4.56'' | ''2 468'' |
^ bzip2-6 | ''19.25'' | ''20.6'' | ''5 468'' | ''4.76'' | ''2 878'' |
^ bzip2-7 | ''19.07'' | ''21.9'' | ''6 256'' | ''5.07'' | ''3 250'' |
^ bzip2-8 | ''18.94'' | ''22.5'' | ''7 040'' | ''5.08'' | ''3 644'' |
^ bzip2-9 | ''18.89'' | ''22.6'' | ''7 820'' | ''5.38'' | ''4 040'' |
^ algo ^ size (MB) ^ ctime (s) ^ cmem (KB) ^ dtime (s) ^ dmem (KB) ^
^ lzma-0 | ''23.16'' | ''10.3'' | ''1 980'' | ''3.42'' | ''840'' |
^ lzma-1 | ''21.94'' | ''13.1'' | ''2 000'' | ''3.34'' | ''824'' |
^ lzma-2 | ''20.08'' | ''13.1'' | ''5 476'' | ''3.11'' | ''1 272'' |
^ lzma-3 | ''17.24'' | ''60.3'' | ''13 600'' | ''2.44'' | ''1 788'' |
^ lzma-4 | ''16.64'' | ''66.8'' | ''25 376'' | ''2.40'' | ''2 814'' |
^ lzma-5 | ''16.21'' | ''69.2'' | ''48 926'' | ''2.28'' | ''4 858'' |
^ lzma-6 | ''15.62'' | ''90.5'' | ''96 030'' | ''2.21'' | ''8 952'' |
^ lzma-7 | ''15.36'' | ''97.6'' | ''190 260'' | ''2.24'' | ''17 146'' |
^ lzma-8 | ''15.17'' | ''106'' | ''378 688'' | ''2.25'' | ''33 536'' |
^ lzma-9 | ''15.04'' | ''113'' | ''689 956'' | ''2.23'' | ''66 304'' |
^ xz-0 | ''23.16'' | ''10.7'' | ''2 088'' | ''3.63'' | ''864'' |
^ xz-1 | ''21.95'' | ''11.5'' | ''2 066'' | ''3.31'' | ''875'' |
^ xz-2 | ''20.08'' | ''13.2'' | ''5 556'' | ''2.96'' | ''1 300'' |
^ xz-3 | ''17.25'' | ''63.0'' | ''13 684'' | ''2.70'' | ''1 830'' |
^ xz-4 | ''16.64'' | ''65.6'' | ''25 450'' | ''2.60'' | ''2 836'' |
^ xz-5 | ''16.21'' | ''70.0'' | ''49 012'' | ''2.48'' | ''4 886'' |
^ xz-6 | ''15.62'' | ''90.5'' | ''96 112'' | ''2.50'' | ''9 000'' |
^ xz-7 | ''15.36'' | ''97.4'' | ''190 324'' | ''2.40'' | ''17 196'' |
^ xz-8 | ''15.17'' | ''110'' | ''378 740'' | ''2.44'' | ''35 556'' |
^ xz-9 | ''15.05'' | ''117'' | ''690 060'' | ''2.46'' | ''66 326'' |

===== Results analysis =====

==== The outsiders ====

The <color red>compress</color> algorithm is completely awful: it has the worst compression ratio. Other algorithms compress better, run faster, and use less RAM. There's not much more to say: forget this one.

The <color red>lzop</color> algorithm is indeed very fast: it can compress the whole kernel tree in about one second. Level 3 (which is the default) is really weird: it has a lower compression ratio **and** a lower compression speed than level 1! So, it really has no advantage over level 1. Levels 7, 8 and 9 are totally useless: very slow compression, and still an awful compression ratio. So, the only interesting level of lzop seems to be 1. Take it if you need blazing speed at the cost of a terrible compression ratio compared to the other algorithms (you'll also get low RAM usage at no additional cost).

==== LZMA and XZ (LZMA2) ====

The performance of <color red>lzma</color> and <color red>xz</color> is extremely close. Contrary to what one might expect, lzma2 doesn't outperform lzma ("lzma1"): there's no real difference between lzma and lzma2 in terms of compression ratio, compression/decompression speed, or RAM usage. This is because lzma2 has just a few modifications over lzma1, and most of them do not concern the compression algorithm itself; they just fix some practical issues lzma1 had (according to the xz man page). The ''.lzma'' format will most likely disappear in the near future in favor of the ''.xz'' format (which is already widely preferred over ''.lzma'').

==== Results ordered by compression ratio ====

In the following table, I've removed <color red>lzma</color> for brevity's sake (if you read the above paragraph, you know why).
ctime: compression time, cmem: memory used during compression, dtime: decompression time, dmem: memory used during decompression

^ algo ^ size (MB) ^ ctime (s) ^ cmem (KB) ^ dtime (s) ^ dmem (KB) ^
^ xz-9 | ''15.05'' | ''117'' | <color darkred>''690 060''</color> | ''2.46'' | <color darkred>''66 326''</color> |
^ xz-8 | ''15.17'' | ''110'' | <color darkred>''378 740''</color> | ''2.44'' | <color darkred>''35 556''</color> |
^ xz-7 | ''15.36'' | ''97.4'' | <color darkred>''190 324''</color> | ''2.40'' | <color darkred>''17 196''</color> |
^ xz-6 | ''15.62'' | ''90.5'' | <color darkred>''96 112''</color> | ''2.50'' | ''9 000'' |
^ xz-5 | ''16.21'' | ''70.0'' | <color darkred>''49 012''</color> | ''2.48'' | ''4 886'' |
^ xz-4 | ''16.64'' | ''65.6'' | <color darkred>''25 450''</color> | ''2.60'' | ''2 836'' |
^ xz-3 | ''17.25'' | <color darkred>''63.0''</color> | ''13 684'' | ''2.70'' | ''1 830'' |
^ bzip2-9 | ''18.89'' | <color darkred>''22.6''</color> | ''7 820'' | ''5.38'' | ''4 040'' |
^ bzip2-8 | ''18.94'' | ''22.5'' | ''7 040'' | ''5.08'' | ''3 644'' |
^ bzip2-7 | ''19.07'' | ''21.9'' | ''6 256'' | ''5.07'' | ''3 250'' |
^ bzip2-6 | ''19.25'' | ''20.6'' | ''5 468'' | ''4.76'' | ''2 878'' |
^ bzip2-5 | ''19.42'' | ''20.0'' | ''4 688'' | ''4.56'' | ''2 468'' |
^ bzip2-4 | ''19.66'' | ''18.5'' | ''3 900'' | ''4.49'' | ''3 900'' |
^ bzip2-3 | ''20.02'' | ''17.8'' | ''3 120'' | ''4.43'' | ''1 700'' |
^ xz-2 | ''20.08'' | ''13.2'' | ''5 556'' | ''2.96'' | ''1 300'' |
^ <color grey>bzip2-2</color> | <color grey>''20.59''</color> | <color grey>''17.6''</color> | <color grey>''2 336''</color> | <color grey>''4.48''</color> | <color grey>''1 288''</color> |
^ <color grey>bzip2-1</color> | <color grey>''21.81''</color> | <color grey>''17.5''</color> | <color grey>''1 554''</color> | <color grey>''4.62''</color> | <color grey>''898''</color> |
^ xz-1 | ''21.95'' | ''11.5'' | ''2 066'' | ''3.31'' | ''875'' |
^ xz-0 | ''23.16'' | ''10.7'' | ''2 088'' | ''3.63'' | ''864'' |
^ <color grey>gzip-9</color> | <color grey>''23.23''</color> | <color grey>''13.2''</color> | <color grey>''694''</color> | <color grey>''1.25''</color> | <color grey>''486''</color> |
^ gzip-8 | ''23.25'' | ''9.82'' | ''692'' | ''1.27'' | ''488'' |
^ gzip-7 | ''23.33'' | ''6.74'' | ''700'' | ''1.25'' | ''488'' |
^ gzip-6 | ''23.43'' | ''5.78'' | ''716'' | ''1.29'' | ''488'' |
^ gzip-5 | ''23.82'' | ''4.43'' | ''718'' | ''1.27'' | ''500'' |
^ gzip-4 | ''24.77'' | ''3.56'' | ''708'' | ''1.33'' | ''486'' |
^ gzip-3 | ''26.50'' | ''3.22'' | ''708'' | ''1.40'' | ''484'' |
^ <color grey>lzop-9</color> | <color grey>''26.73''</color> | <color grey>''33.3''</color> | <color grey>''1 308''</color> | <color grey>''0.60''</color> | <color grey>''?''</color> |
^ <color grey>lzop-8</color> | <color grey>''26.74''</color> | <color grey>''27.67''</color> | <color grey>''1 308''</color> | <color grey>''0.65''</color> | <color grey>''?''</color> |
^ <color grey>lzop-7</color> | <color grey>''27.07''</color> | <color grey>''13.15''</color> | <color grey>''1 312''</color> | <color grey>''0.70''</color> | <color grey>''?''</color> |
^ gzip-2 | ''27.44'' | ''2.90'' | ''708'' | ''1.42'' | ''486'' |
^ gzip-1 | <color darkred>''28.72''</color> | ''2.74'' | ''708'' | ''1.42'' | ''486'' |
^ lzop-1 | <color darkred>''36.17''</color> | ''1.04'' | ''1 004'' | ''0.63'' | ''?'' |
^ <color grey>lzop-3</color> | <color grey>''36.38''</color> | <color grey>''1.11''</color> | <color grey>''940''</color> | <color grey>''0.65''</color> | <color grey>''?''</color> |
^ <color grey>compress</color> | <color grey>''39.56''</color> | <color grey>''2.64''</color> | <color grey>''1 124''</color> | <color grey>''1.60''</color> | <color grey>''548''</color> |
^ algo ^ size (MB) ^ ctime (s) ^ cmem (KB) ^ dtime (s) ^ dmem (KB) ^

The lines in <color grey>grey</color> mean that the algorithm+level is suboptimal: it has a lower compression ratio **and** a higher compression time than a combination listed above it in the table. In short: these are combinations you shouldn't use.

Adjacent numbers in <color darkred>dark red</color> have a big gap between them; this is to ease readability and pinpoint the major magnitude transitions between the numbers.

==== Some highlights ====

As we have already seen, <color red>lzop</color> is the fastest algorithm, but even if you're looking for speed, you may want to take a look at <color red>gzip</color> and its lowest compression levels. It is also pretty fast, and achieves a **way** better compression ratio than lzop.

The highest level of <color red>gzip</color> (9) and the lower levels of <color red>bzip2</color> (1, 2, 3) are outperformed by the lower levels of <color red>xz</color> (0, 1, 2).

Level 0 of <color red>xz</color> should probably be avoided: its use is somewhat discouraged in the <color red>man</color> page, because its meaning might change in a future version and select a non-lzma2 algorithm to try to achieve a higher compression speed.

The higher levels of <color red>xz</color> (3 and above) should only be used if you want the best compression ratio and definitely don't care about the enormous compression time and the gigantic amount of RAM used. Levels 7 to 9 are particularly insane in this regard, while offering only a ridiculously tiny improvement in compression ratio over the mid-levels.

The <color red>bzip2</color> decompression time is particularly bad, whatever level is used. If you care about decompression time, better avoid bzip2 entirely, and use <color red>gzip</color> if you prefer speed or <color red>xz</color> if you prefer compression ratio.
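
The grey-row rule used in the ordered table can also be checked mechanically. What follows is only a minimal sketch, not the script used for this benchmark: ''flag_dominated'' is a hypothetical helper name, and it assumes its input has three whitespace-separated columns (name, size, compression time) already sorted by increasing size, like the table. A row is flagged as dominated when some row above it was compressed faster.

```sh
#!/bin/sh
# Hypothetical helper (not part of the benchmark script).
# Input: "name size ctime" rows, sorted by increasing archive size.
flag_dominated() {
    awk '
        # First row, or faster than every row seen so far: keep it.
        # The "+ 0" forces a numeric comparison (so that 9.82 < 10.7).
        NR == 1 || $3 + 0 < best { best = $3 + 0; print $1, "ok"; next }
        # Otherwise a smaller archive was produced faster by a row above.
        { print $1, "dominated" }
    '
}

# Sample rows copied from the ordered table: name, size, ctime (s).
flag_dominated <<'EOF'
bzip2-3 20.02 17.8
xz-2 20.08 13.2
bzip2-2 20.59 17.6
bzip2-1 21.81 17.5
xz-1 21.95 11.5
xz-0 23.16 10.7
gzip-9 23.23 13.2
gzip-8 23.25 9.82
EOF
```

On these sample rows it flags bzip2-2, bzip2-1 and gzip-9, which are the same rows greyed out in the table.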

 
Last modified: 02/04/2010 14:54 by speed47