My last attempt at benchmarking these algorithms drew a couple of interesting comments.
So here is a more complete benchmark, with hopefully more detailed results.

1) Benchmark protocol

We are benchmarking all the algorithms supported by recent tar versions (1.22 was used):

| program | extension | version | comment | supported compression levels |
|---|---|---|---|---|
| lzop | .lzop | 1.02rc1 | known to be very fast | 1 to 9, but 2 to 6 are equivalent, 3 by default |
| compress | .Z | 4.2.4 | the legacy UNIX compression algorithm | not configurable |
| gzip | .gz (.tgz) | 1.3.12 | replaced compress in recent UNIX-like operating systems | 1 to 9, 6 by default |
| bzip2 | .bzip2 (.tbz, .tbz2) | 1.0.5 | known to have a better compression ratio than gzip, but much slower | 1 to 9, 9 by default |
| lzma | .lzma | 4.999.9beta | new algorithm aiming at high compression ratios | 0 to 9, 6 by default |
| lzma2 | .xz (.txz) | 4.999.9beta | xz is a compression format, and uses by default the lzma2 algorithm, it has some new features over lzma, for example integrity checking, as seen on the French Wikipedia page | 0 to 9, 6 by default |

Benchmark protocol at a glance:

  • I used the contents of the Linux 2.4.0 kernel archive as the data to compress. Uncompressed, it takes 100 132 718 bytes of disk space (about 95.5 MiB).
  • Each algorithm has been tested with all its supported compression levels
  • The resulting archive size has of course been measured
  • Compression and decompression tests have been run 3 times per algorithm and per compression level
  • The RAM used has been measured during both compression and decompression
  • The time elapsed during compression and decompression has been measured
  • All those tests have been done in /dev/shm (i.e. in memory) to avoid disk I/O overhead
  • I tried to use the multithreading features of LZMA/LZMA2, but they are not implemented yet, as reported by the man page and confirmed by my own tests
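
As a quick sanity check of the size quoted in the first item, here is a minimal sketch of the byte-to-MiB conversion. The function name bytes_to_mib is only illustrative and is not part of the benchmark script.

```sh
#!/bin/sh
# bytes_to_mib BYTES: print BYTES expressed in MiB (1 MiB = 1048576 bytes),
# rounded to one decimal place. The name is illustrative only.
bytes_to_mib()
{
  awk -v bytes="$1" 'BEGIN { printf "%.1f\n", bytes / 1048576 }'
}

# Size of the uncompressed kernel tree quoted in the list above
bytes_to_mib 100132718
```

The call prints 95.5, which matches the figure given for the uncompressed data.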

For reference, the following script has been used to automate the benchmark:

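#! /bin/sh
# Number of runs per algorithm and compression level
NBLOOP=3
# Directory to archive: the extracted kernel tree
COMPRESS_OBJECT=linux-2.4.0

# memstats PROGRAM: sample the RSS (in KB) of the PROGRAM processes once per
# second until none is left, then print the last sample taken.
memstats()
{
  (
  # Lower the priority; note that $$ is the PID of the main script, even here
  renice 19 $$ >/dev/null 2>&1
  while : ; do
    ps --no-headers -o rss -C "$1" || break
    sleep 1
  done | tail -n 1
  )
}

# bench TAR_OPTION EXTENSION [LEVEL]: compress, then extract, NBLOOP times
bench()
{
  for i in $(seq 1 $NBLOOP) ; do
    trap "rm -f out.$2" EXIT
    # GNU time reports on stderr, sent here to the script's stdout, while
    # tar's own stdout is discarded. tar runs in the background so that
    # memstats can watch the compressor; a compressor that exits during the
    # initial "sleep 1" leaves no memory sample.
    /usr/bin/time -f "DONE: comp $1-$3 ($i) time: %e" tar cf "out.$2" "$COMPRESS_OBJECT" "--$1" 2>&1 >/dev/null & sleep 1
    mem=$(memstats "$1")
    wait    # make sure tar has finished writing before measuring the archive
    size=$(stat -c '%s' "out.$2")
    echo "... mem: $mem size: $size"
    echo
    mkdir tmp_extract_$$ || exit 1
    trap "rm -f out.$2 ; rm -Rf tmp_extract_$$" EXIT
    /usr/bin/time -f "DONE: decomp $1-$3 ($i) time: %e" tar xf "out.$2" -C tmp_extract_$$ 2>&1 >/dev/null & sleep 1
    mem=$(memstats "$1")
    wait    # let the extraction finish before cleaning up
    echo "... mem: $mem"
    echo
    rm -f "out.$2"
    rm -Rf tmp_extract_$$
    trap - EXIT
  done
}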
 
for level in none ; do
  echo "=== COMPRESS ==="
  bench compress Z
done
for level in 1 3 7 8 9 ; do
  echo "=== LZOP -$level ==="
  export LZOP="-$level"
  bench lzop lzo $level
done
for level in 1 2 3 4 5 6 7 8 9 ; do
  echo "=== GZIP -$level ==="
  export GZIP="-$level"
  bench gzip gz $level
done
for level in 1 2 3 4 5 6 7 8 9 ; do
  echo "=== BZIP2 -$level ==="
  export BZIP2="-$level"
  bench bzip2 bz2 $level
done
for level in 0 1 2 3 4 5 6 7 8 9 ; do
  echo "=== LZMA -$level ==="
  export XZ_OPT="-$level"
  bench lzma lzma $level
done
for level in 0 1 2 3 4 5 6 7 8 9 ; do
  echo "=== XZ (LZMA2) -$level ==="
  export XZ_OPT="-$level"
  bench xz xz $level
done
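
The script prints one "DONE:" line per run, so the three runs of each test still have to be combined into a single figure. The article does not say how this was done; the sketch below assumes a plain arithmetic mean. The function name average_times and the sample lines are made up for illustration.

```sh
#!/bin/sh
# average_times: read the benchmark output on stdin and print, for each
# comp/decomp + algorithm-level pair, the mean of the elapsed times.
# Assumption: runs are combined with a plain mean (not stated in the article).
average_times()
{
  awk '
    $1 == "DONE:" {
      key = $2 " " $3                        # e.g. "comp gzip-6"
      if (!(key in count)) order[++n] = key  # remember first-seen order
      sum[key] += $NF                        # last field: elapsed seconds
      count[key]++
    }
    END {
      for (i = 1; i <= n; i++)
        printf "%s %.2f\n", order[i], sum[order[i]] / count[order[i]]
    }'
}

# Made-up sample in the format produced by the script; the "... mem:" lines
# are ignored by the filter.
average_times <<'EOF'
DONE: comp gzip-6 (1) time: 5.80
... mem: 716 size: 24568000
DONE: comp gzip-6 (2) time: 5.76
... mem: 716 size: 24568000
DONE: comp gzip-6 (3) time: 5.78
... mem: 716 size: 24568000
EOF
```

A median would be a reasonable alternative if one of the runs were disturbed by other activity on the machine.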

2) Benchmark results

Here are the raw (and somewhat unreadable) results:

ctime: compression time, cmem: memory used during compression
dtime: decompression time, dmem: memory used during decompression

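| algo | size (MB) | ctime (s) | cmem (KB) | dtime (s) | dmem (KB) |
|---|---|---|---|---|---|
| compress | 39.56 | 2.64 | 1 124 | 1.60 | 548 |
| lzop-1 | 36.17 | 1.04 | 1 004 | 0.63 | ? |
| lzop-3 | 36.38 | 1.11 | 940 | 0.65 | ? |
| lzop-7 | 27.07 | 13.15 | 1 312 | 0.70 | ? |
| lzop-8 | 26.74 | 27.67 | 1 308 | 0.65 | ? |
| lzop-9 | 26.73 | 33.3 | 1 308 | 0.60 | ? |
| gzip-1 | 28.72 | 2.74 | 708 | 1.42 | 486 |
| gzip-2 | 27.44 | 2.90 | 708 | 1.42 | 486 |
| gzip-3 | 26.50 | 3.22 | 708 | 1.40 | 484 |
| gzip-4 | 24.77 | 3.56 | 708 | 1.33 | 486 |
| gzip-5 | 23.82 | 4.43 | 718 | 1.27 | 500 |
| gzip-6 | 23.43 | 5.78 | 716 | 1.29 | 488 |
| gzip-7 | 23.33 | 6.74 | 700 | 1.25 | 488 |
| gzip-8 | 23.25 | 9.82 | 692 | 1.27 | 488 |
| gzip-9 | 23.23 | 13.2 | 694 | 1.25 | 486 |
| bzip2-1 | 21.81 | 17.5 | 1 554 | 4.62 | 898 |
| bzip2-2 | 20.59 | 17.6 | 2 336 | 4.48 | 1 288 |
| bzip2-3 | 20.02 | 17.8 | 3 120 | 4.43 | 1 700 |
| bzip2-4 | 19.66 | 18.5 | 3 900 | 4.49 | 3 900 |
| bzip2-5 | 19.42 | 20.0 | 4 688 | 4.56 | 2 468 |
| bzip2-6 | 19.25 | 20.6 | 5 468 | 4.76 | 2 878 |
| bzip2-7 | 19.07 | 21.9 | 6 256 | 5.07 | 3 250 |
| bzip2-8 | 18.94 | 22.5 | 7 040 | 5.08 | 3 644 |
| bzip2-9 | 18.89 | 22.6 | 7 820 | 5.38 | 4 040 |
| lzma-0 | 23.16 | 10.3 | 1 980 | 3.42 | 840 |
| lzma-1 | 21.94 | 13.1 | 2 000 | 3.34 | 824 |
| lzma-2 | 20.08 | 13.1 | 5 476 | 3.11 | 1 272 |
| lzma-3 | 17.24 | 60.3 | 13 600 | 2.44 | 1 788 |
| lzma-4 | 16.64 | 66.8 | 25 376 | 2.40 | 2 814 |
| lzma-5 | 16.21 | 69.2 | 48 926 | 2.28 | 4 858 |
| lzma-6 | 15.62 | 90.5 | 96 030 | 2.21 | 8 952 |
| lzma-7 | 15.36 | 97.6 | 190 260 | 2.24 | 17 146 |
| lzma-8 | 15.17 | 106 | 378 688 | 2.25 | 33 536 |
| lzma-9 | 15.04 | 113 | 689 956 | 2.23 | 66 304 |
| xz-0 | 23.16 | 10.7 | 2 088 | 3.63 | 864 |
| xz-1 | 21.95 | 11.5 | 2 066 | 3.31 | 875 |
| xz-2 | 20.08 | 13.2 | 5 556 | 2.96 | 1 300 |
| xz-3 | 17.25 | 63.0 | 13 684 | 2.70 | 1 830 |
| xz-4 | 16.64 | 65.6 | 25 450 | 2.60 | 2 836 |
| xz-5 | 16.21 | 70.0 | 49 012 | 2.48 | 4 886 |
| xz-6 | 15.62 | 90.5 | 96 112 | 2.50 | 9 000 |
| xz-7 | 15.36 | 97.4 | 190 324 | 2.40 | 17 196 |
| xz-8 | 15.17 | 110 | 378 740 | 2.44 | 35 556 |
| xz-9 | 15.05 | 117 | 690 060 | 2.46 | 66 326 |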
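
To make these raw figures easier to compare, the sketch below derives two extra numbers from a few rows: the compressed size as a percentage of the original, and the compression throughput. It assumes the sizes in the table use the same unit as the 95.5 figure given for the uncompressed data; the function name add_ratio_speed is only illustrative.

```sh
#!/bin/sh
# add_ratio_speed: read "algo size ctime" rows on stdin and print, for each,
# the compressed size as a percentage of the original and the compression
# throughput in MB of input per second.
# Assumption: sizes share the unit of the 95.5 uncompressed figure.
add_ratio_speed()
{
  awk -v orig=95.5 '{ printf "%s %.1f%% %.1f MB/s\n", $1, 100 * $2 / orig, orig / $3 }'
}

# A few rows copied from the results table (algo, size, ctime)
add_ratio_speed <<'EOF'
lzop-1 36.17 1.04
gzip-6 23.43 5.78
bzip2-9 18.89 22.6
xz-6 15.62 90.5
EOF
```

For gzip-6, for instance, this gives a compressed size of 24.5% of the original at about 16.5 MB/s.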


3) Results analysis

The outsiders

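The compress algorithm is simply awful: it has the worst compression ratio of the whole benchmark. lzop-1 compresses slightly better, more than twice as fast and with a bit less RAM, while gzip-1 compresses far better in about the same time and with less RAM. There’s not much more to say: forget this one.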

The lzop algorithm is indeed very fast: it can compress the whole kernel tree in about one second. Level 3 (the default) is really weird: it has both a lower compression ratio and a lower compression speed than level 1, so it has no advantage over it. Levels 7, 8 and 9 are useless: compression becomes very slow (13 to 33 seconds) for a ratio that gzip reaches in about 3 seconds. So the only interesting level of lzop seems to be 1. Pick it if you need blazing speed at the cost of a poor compression ratio compared to the other algorithms (you also get low RAM usage at no additional cost).

Difference between XZ and LZMA2

Short answer: xz is a format that (currently) only uses the lzma2 compression algorithm.

Long answer: think of xz as a container for the compressed data generated by the lzma2 algorithm. The same paradigm exists for video files, for example: avi/mkv/mov/mp4/ogv are containers, and xvid/x264/theora are compression algorithms. The confusion arises because the xz format currently supports only the lzma2 algorithm (which will remain the default even if other algorithms are added some day). This confusion doesn’t happen with other formats/algorithms because, for example, gzip is both a compression algorithm and a format. To be exact, the gzip format can only encapsulate data generated by gzip… the compression algorithm. In this article I’ll use “xz” to mean “the lzma2 algorithm whose data is encapsulated in the xz format”. You’ll probably agree it’s way simpler :)
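
One practical consequence of a container format is that a file announces itself through its first bytes. As a minimal sketch (the function name container_format is made up for illustration), the signatures of xz, gzip and bzip2 can be checked with od:

```sh
#!/bin/sh
# container_format FILE: guess the container format from the first bytes of
# FILE. Illustrative helper, not part of the benchmark.
container_format()
{
  magic=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
  case "$magic" in
    fd377a585a00) echo xz ;;       # FD "7zXZ" 00
    1f8b*)        echo gzip ;;     # 1F 8B
    425a68*)      echo bzip2 ;;    # "BZh"
    *)            echo unknown ;;
  esac
}

# Demo on a scratch file that contains only the xz signature
sample=${TMPDIR:-/tmp}/container_demo.$$
printf '\3757zXZ\000' > "$sample"
container_format "$sample"
rm -f "$sample"
```

As far as I know, the legacy .lzma format has no such signature, which is one of the practical issues the xz container addresses.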

Performance of LZMA vs LZMA2 (XZ)

The performances of lzma and xz are extremely close. Lzma2 doesn’t outperform lzma (“lzma1”), as one might have expected: there’s no real difference between lzma and lzma2 in terms of compression ratio, compression/decompression speed, or RAM usage. This is because lzma2 brings just a few modifications over lzma1, and most of them do not concern the compression algorithm itself: they fix some practical issues lzma1 had (according to the xz man page). The “.lzma” format will most likely disappear in the near future in favor of the “.xz” format (which is already widely preferred over “.lzma”). And if you have read the above paragraph, yes, lzma1 was both a compression algorithm and a (messy) format. :)

Results ordered by compression ratio

In the following table, I’ve removed lzma for brevity’s sake (if you read the above paragraph, you know why).

ctime: compression time, cmem: memory used during compression
dtime: decompression time, dmem: memory used during decompression

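| algo | size (MB) | ctime (s) | cmem (KB) | dtime (s) | dmem (KB) |
|---|---|---|---|---|---|
| xz-9 | 15.05 | 117 | 690 060 | 2.46 | 66 326 |
| xz-8 | 15.17 | 110 | 378 740 | 2.44 | 35 556 |
| xz-7 | 15.36 | 97.4 | 190 324 | 2.40 | 17 196 |
| xz-6 | 15.62 | 90.5 | 96 112 | 2.50 | 9 000 |
| xz-5 | 16.21 | 70.0 | 49 012 | 2.48 | 4 886 |
| xz-4 | 16.64 | 65.6 | 25 450 | 2.60 | 2 836 |
| xz-3 | 17.25 | 63.0 | 13 684 | 2.70 | 1 830 |
| bzip2-9 | 18.89 | 22.6 | 7 820 | 5.38 | 4 040 |
| bzip2-8 | 18.94 | 22.5 | 7 040 | 5.08 | 3 644 |
| bzip2-7 | 19.07 | 21.9 | 6 256 | 5.07 | 3 250 |
| bzip2-6 | 19.25 | 20.6 | 5 468 | 4.76 | 2 878 |
| bzip2-5 | 19.42 | 20.0 | 4 688 | 4.56 | 2 468 |
| bzip2-4 | 19.66 | 18.5 | 3 900 | 4.49 | 3 900 |
| bzip2-3 | 20.02 | 17.8 | 3 120 | 4.43 | 1 700 |
| xz-2 | 20.08 | 13.2 | 5 556 | 2.96 | 1 300 |
| bzip2-2 | 20.59 | 17.6 | 2 336 | 4.48 | 1 288 |
| bzip2-1 | 21.81 | 17.5 | 1 554 | 4.62 | 898 |
| xz-1 | 21.95 | 11.5 | 2 066 | 3.31 | 875 |
| xz-0 | 23.16 | 10.7 | 2 088 | 3.63 | 864 |
| gzip-9 | 23.23 | 13.2 | 694 | 1.25 | 486 |
| gzip-8 | 23.25 | 9.82 | 692 | 1.27 | 488 |
| gzip-7 | 23.33 | 6.74 | 700 | 1.25 | 488 |
| gzip-6 | 23.43 | 5.78 | 716 | 1.29 | 488 |
| gzip-5 | 23.82 | 4.43 | 718 | 1.27 | 500 |
| gzip-4 | 24.77 | 3.56 | 708 | 1.33 | 486 |
| gzip-3 | 26.50 | 3.22 | 708 | 1.40 | 484 |
| lzop-9 | 26.73 | 33.3 | 1 308 | 0.60 | ? |
| lzop-8 | 26.74 | 27.67 | 1 308 | 0.65 | ? |
| lzop-7 | 27.07 | 13.15 | 1 312 | 0.70 | ? |
| gzip-2 | 27.44 | 2.90 | 708 | 1.42 | 486 |
| gzip-1 | 28.72 | 2.74 | 708 | 1.42 | 486 |
| lzop-1 | 36.17 | 1.04 | 1 004 | 0.63 | ? |
| lzop-3 | 36.38 | 1.11 | 940 | 0.65 | ? |
| compress | 39.56 | 2.64 | 1 124 | 1.60 | 548 |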


The lines in grey mark an algorithm+level combination that is suboptimal: it has both a lower compression ratio and a higher compression time than the combination in the row immediately above. In short: these are combinations you shouldn’t use.
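
The rule above can be applied mechanically. The sketch below is a literal reading of it: given rows sorted by increasing size, it flags every row whose compression time is higher than that of the row immediately above. The function name flag_suboptimal is made up for illustration, and the highlighting in the original table may have followed a slightly different rule.

```sh
#!/bin/sh
# flag_suboptimal: read "algo size ctime" rows sorted by increasing size and
# append "suboptimal" to every row whose compression time is higher than
# that of the row immediately above (which also compresses better).
flag_suboptimal()
{
  awk '
    NR > 1 && $3 + 0 > prev { print $0, "suboptimal" ; prev = $3 + 0 ; next }
    { print ; prev = $3 + 0 }'
}

# A slice of the table above (algo, size, ctime)
flag_suboptimal <<'EOF'
xz-2 20.08 13.2
bzip2-2 20.59 17.6
bzip2-1 21.81 17.5
xz-1 21.95 11.5
xz-0 23.16 10.7
gzip-9 23.23 13.2
gzip-8 23.25 9.82
EOF
```

On this slice, bzip2-2 and gzip-9 are flagged. bzip2-1 is not, because it is only compared with the row right above it, although xz-2 beats it on both criteria; comparing each row with the lowest time seen so far would catch such cases too.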

Where two adjacent numbers are in dark red, there is a big gap between them; this eases readability and pinpoints the major jumps in magnitude.

Some highlights

As we have already seen, lzop is the fastest algorithm. But even if you’re looking for pure speed, you might want to take a look at gzip and its lowest compression levels: they are also pretty fast, and achieve a much better compression ratio than lzop.

The highest level of gzip (9) and the lower levels of bzip2 (1, 2, 3) are outperformed by the lower levels of xz (0, 1, 2).

Level 0 of xz should probably be avoided: the man page somewhat discourages its use, because its meaning might change in a future version and select a non-lzma2 algorithm to try to achieve a higher compression speed.

The higher levels of xz (3 and above) should only be used if you want the best compression ratio and definitely don’t care about the enormous compression time and the gigantic amount of RAM used. Levels 7 to 9 are particularly insane in this regard, while offering only a ridiculously small gain in compression ratio over the mid-levels.

The bzip2 decompression time is particularly bad, whatever level is used. If you care about decompression time, avoid bzip2 entirely: use gzip if you prefer speed, or xz if you prefer compression ratio.