LZ4
Original author: Yann Collet
Developer: Yann Collet
Initial release: 24 April 2011
Stable release: 1.10.0[1] / 22 July 2024
Written in: C
Operating system: Cross-platform
Platform: Portable
Type: Data compression
License: Simplified BSD License
Website: lz4.org
Repository
LZ4 Frame Format
Magic number: 04 22 4D 18 (new), 02 21 4C 18 (legacy)[2]
Type of format: Data compression
Website: https://github.com/lz4/lz4/blob/master/doc/lz4_Frame_format.md

LZ4 is a lossless data compression algorithm optimized for fast compression and decompression. It belongs to the LZ77 family of byte-oriented compression schemes.

Features

The LZ4 algorithm provides a good trade-off between speed and compression ratio. Typically, it has a smaller (i.e., worse) compression ratio than the similar LZO algorithm, which in turn compresses worse than algorithms such as DEFLATE. However, LZ4 compression speed is similar to that of LZO and several times faster than that of DEFLATE, while its decompression speed is significantly faster than that of LZO.[3]

Design

LZ4 only uses a dictionary-matching stage (LZ77) and, unlike other common compression algorithms, does not combine it with an entropy coding stage (e.g. Huffman coding in DEFLATE).[4][5]

The LZ4 algorithm represents the data as a series of sequences. Each sequence begins with a one-byte token that is split into two 4-bit fields. The first field (the high four bits) gives the number of literal bytes to copy to the output. The second field (the low four bits) gives the number of bytes to copy from the already decoded output buffer, with 0 representing the minimum match length of 4 bytes. A value of 15 in either field indicates that the length is larger and that an extra byte follows whose value is added to the length. A value of 255 in such an extra byte indicates that yet another byte follows. Arbitrary lengths are therefore represented by a run of extra bytes with the value 255, ended by a byte smaller than 255. The literals come after the token and any extra bytes needed for the literal length. They are followed by a two-byte little-endian offset that indicates how far back in the output buffer to begin copying. The extra bytes (if any) of the match length come at the end of the sequence.[6][7]
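
The sequence layout described above can be illustrated with a small decoder. The following is a minimal sketch in Python, not the reference implementation: the function name `lz4_block_decompress` is hypothetical, it assumes a single raw block with no frame header, it takes the offset to be two bytes in little-endian order as in the LZ4 block format description, and it does only basic input validation.

```python
def lz4_block_decompress(src: bytes) -> bytes:
    """Decode one raw LZ4 block (no frame header). Illustrative sketch only."""
    out = bytearray()
    i = 0
    n = len(src)
    while i < n:
        token = src[i]
        i += 1

        # High nibble: literal count; 15 means extra length bytes follow.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                extra = src[i]
                i += 1
                lit_len += extra
                if extra != 255:  # a byte below 255 ends the length
                    break

        # Copy the literals straight to the output.
        if i + lit_len > n:
            raise ValueError("truncated literals")
        out += src[i:i + lit_len]
        i += lit_len

        # The final sequence of a block carries literals only.
        if i >= n:
            break

        # Offset back into the already decoded output
        # (assumed here: two bytes, little-endian).
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")

        # Low nibble: match length minus 4, extended the same way.
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                extra = src[i]
                i += 1
                match_len += extra
                if extra != 255:
                    break
        match_len += 4

        # Copy byte by byte so that overlapping matches
        # (offset smaller than length) repeat recent output.
        start = len(out) - offset
        for k in range(match_len):
            out.append(out[start + k])
    return bytes(out)
```

For example, `lz4_block_decompress(b"\x14a\x01\x00\x50bcdef")` returns `b"a" * 9 + b"bcdef"`: the first token `0x14` announces one literal and a match of 4 + 4 = 8 bytes at offset 1, and the second token `0x50` announces five trailing literals.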

Compression can be carried out in a stream or in blocks. Higher compression ratios can be achieved by investing more effort in finding the best matches. This results in both a smaller output and faster decompression.

LZ4 has two frame formats. The legacy format was very restrictive and relied on an external end-of-file signal, which proved to be a problem in the Linux initramfs, where a workaround was required to handle zero-padding.[8] The new format is more flexible and has its own end-of-frame marker. It resembles the Zstd frame format in design.[9]
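
The two formats can be told apart by their first four bytes. The sketch below is a hedged illustration that uses only the magic numbers listed for the LZ4 Frame Format (04 22 4D 18 for the new format, 02 21 4C 18 for the legacy one); the function name `lz4_frame_kind` and its return strings are hypothetical, and no further header parsing is attempted.

```python
# Magic numbers as they appear at the start of the data, in byte order.
NEW_MAGIC = bytes.fromhex("04224D18")
LEGACY_MAGIC = bytes.fromhex("02214C18")


def lz4_frame_kind(data: bytes) -> str:
    """Classify data by its leading magic number (hypothetical helper)."""
    head = data[:4]
    if head == NEW_MAGIC:
        return "new"
    if head == LEGACY_MAGIC:
        return "legacy"
    # Too short, or not an LZ4 frame at all (for example a raw block).
    return "unknown"
```

A caller might use such a check to decide how to look for the end of the data, since only the new format carries its own end-of-frame marker.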

Implementation

The reference implementation in C by Yann Collet is licensed under a BSD license. There are ports and bindings in various languages, including Java, C#, Rust, and Python.[10] The Apache Hadoop system uses this algorithm for fast compression. LZ4 was also implemented natively in Linux kernel 3.11.[11] The FreeBSD, Illumos, ZFS on Linux, and ZFS-OSX implementations of the ZFS filesystem support the LZ4 algorithm for on-the-fly compression.[12][13][14][15] Linux has supported LZ4 for SquashFS since 3.19-rc1.[16] LZ4 is also supported by the newer zstd command-line utility by Yann Collet, as well as by a 7-Zip fork called 7-Zip-zstd.[17]

References

  1. ^ "LZ4 v1.10.0 - Multicores edition". 22 July 2024. Retrieved 23 July 2024.
  2. ^ Collet, Yann. "LZ4 Frame Format Description". GitHub. Retrieved 7 October 2020.
  3. ^ Michael Larabel (28 January 2013). "Support For Compressing The Linux Kernel With LZ4". Phoronix. Retrieved 28 August 2015.
  4. ^ Collet, Yann (30 March 2019). "LZ4 Block Format Description". GitHub. Retrieved 9 July 2020. There is no entropy encoder back-end nor framing layer.
  5. ^ DEFLATE Compressed Data Format Specification version 1.3. IETF. doi:10.17487/RFC1951. RFC 1951. Retrieved 9 July 2020.
  6. ^ Yann Collet (26 May 2011). "RealTime Data Compression". Retrieved 28 August 2015.
  7. ^ ticki (25 October 2016). "How LZ4 works". Retrieved 29 June 2017.
  8. ^ "lz4/doc/lz4_Frame_format.md at dev · lz4/lz4". GitHub.
  9. ^ "LZ4 Frame Format Description".
  10. ^ Extremely Fast Compression algorithm http://www.lz4.org on GitHub
  11. ^ Jonathan Corbet (19 July 2013). "Kernel development". LWN.net. Retrieved 28 August 2015.
  12. ^ "FreeBSD 9.2-RELEASE Release Notes". FreeBSD. 13 November 2013. Retrieved 28 August 2015.
  13. ^ "LZ4 Compression". illumos. Archived from the original on 9 October 2018. Retrieved 28 August 2015.
  14. ^ Illumos #3035 LZ4 compression support in ZFS and GRUB on GitHub
  15. ^ "Features: lz4 compression". OpenZFS. Retrieved 28 August 2015.
  16. ^ Phillip Lougher (27 November 2014). "Squashfs: Add LZ4 compression configuration option". Retrieved 28 August 2015.
  17. ^ 7-zip-zstd