smalLZ4 - optimal LZ4 compression
posted by Stephan Brumme,
updated
You came here to grab just the code or a binary? Scroll down to the Download section and GIT repository.
A short overview of how smallz4 compares to the original LZ4 and a program called LZ4X (which follows similar ideas as smallz4) can be found here. And if you are interested in the inner workings, keep on reading and enjoy the ride ...
update August 2, 2018 (smallz4 v1.1):
I improved the compression ratio by better handling of block boundaries (reduced the test file enwik9.lz4 by 647 bytes). Unfortunately, there was a bug in the header of uncompressed blocks - it's fixed.
update August 3, 2018 (smallz4 v1.2):
Experimental support for dictionary compression
Optimal Parsing
In the context of compression, optimal parsing is a multi-pass algorithm to find the smallest output of compressible data. Whenever there are multiple choices of encoding a symbol (or a sequence of symbols), optimal parsing will choose the representation which leads to the overall smallest output.
That means it may prefer a locally sub-optimal encoding in order to achieve a globally optimal encoding.
Most compression algorithms strive for locally optimal encoding because it's computationally cheaper.
The most basic algorithm, the greedy algorithm, always selects the encoding that fits best "right now".
A significant improvement is called "lazy evaluation" and takes the next group of symbols into consideration and decides which encoding yields the smallest output for the whole group.
In a way, lazy evaluation can be seen as a local version of optimal parsing. It does the same job for a group of symbols instead of the whole file.
Compressing the string
abcde_bcdefgh_abcdefghxxxxxxx returns different file sizes:
$ echo "abcde_bcdefgh_abcdefghxxxxxxx" | lz4 -9 --no-frame-crc | wc -c
43
$ echo "abcde_bcdefgh_abcdefghxxxxxxx" | ./smallz4 | wc -c
41
red characters are literals, i.e. uncompressed data. Green pairs of numbers indicate distance and length of a match. And let's ignore xxxxxxx from now on (its only purpose is to hide the fact that the LZ4 format specification forbids matches at the very end of a file).
lz4's result in detail: abcde_(5,4)fgh_(14,5)fghxxxxxxx
It found two matches: the first replaced bcde by a reference to a sequence we saw 5 bytes ago. That reference "costs" 3 bytes while the four literals would have occupied 4 bytes. We saved a byte here.
The second match replaces abcde by a reference to the beginning of the file. Again, we need 3 bytes to represent the match while the replaced text was 5 bytes long. Two more bytes saved!
However, this compression is not optimal.
smallz4 can squeeze out another two bytes because it postpones the second match:
abcde_(5,4)fgh_a(9,7)xxxxxxx
smallz4 detects that matching abcde would save 2 bytes but matching bcdefgh saves 4 bytes.
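The two-byte difference can be verified with a little arithmetic. The helper below is purely illustrative (not part of smallz4): a simple LZ4 sequence costs 1 token byte, plus its literals, plus 2 bytes of match distance (the last sequence of a block ends with literals only). Since both parses share the same frame overhead, only the sequence sizes differ.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative helper, not from smallz4: the encoded size of one LZ4 sequence
// with fewer than 15 literals and a match shorter than 19 bytes is
// 1 token byte + the literals themselves + 2 bytes of match distance
// (the final sequence of a block has no match).
size_t sequenceSize(size_t numLiterals, bool endsWithMatch)
{
    return 1 + numLiterals + (endsWithMatch ? 2 : 0);
}

// echo appends a newline, so the input is 30 bytes.
// lz4 -9 parse:
//   "abcde_" + (5,4)  -> sequenceSize(6, true)   = 9
//   "fgh_"   + (14,5) -> sequenceSize(4, true)   = 7
//   "fghxxxxxxx\n"    -> sequenceSize(11, false) = 12   => 28 bytes
// smallz4 parse:
//   "abcde_" + (5,4)  -> sequenceSize(6, true)   = 9
//   "fgh_a"  + (9,7)  -> sequenceSize(5, true)   = 8
//   "xxxxxxx\n"       -> sequenceSize(8, false)  = 9    => 26 bytes
```

The two-byte difference between 28 and 26 is exactly the difference between the observed file sizes 43 and 41, because the frame overhead is identical for both.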
Optimal Parsing - Pass 1
For each byte of input data, the longest match will be found and stored in an array called matches (starting at about line 560). Additional data structures such as previousHash and previousExact only speed up the code. A simple brute-force routine will give you the same result - albeit slower.
| position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | (padding) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| character | a | b | c | d | e | _ | b | c | d | e | f | g | h | _ | a | b | c | d | e | f | g | h | xxxxxxx |
| match distance | | | | | | | 5 | (5) | (5) | (5) | | | | (8) | 14 | 9 | 9 | 9 | 9 | 9 | (9) | (9) | |
| match length | | | | | | | 4 | (3) | (2) | (1) | | | | (1) | 5 | 7 | 6 | 5 | 4 | (3) | (2) | (1) | |
My program discards these short matches (shown in parentheses); they aren't stored in matches. Match length 1 isn't shown in the table above; it's the default value for literals.
Each empty cell in the "match length" row is considered to be 1, with its corresponding match distance being undefined.
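The brute-force equivalent mentioned above fits in a few lines. This is an illustrative stand-in, not smallz4's findLongestMatch: for each earlier position it simply counts matching bytes and keeps the longest candidate. Overlapping matches are allowed, just as in LZ4.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Illustrative brute-force version of pass 1 (smallz4 uses hash chains instead):
// returns {distance, length} of the longest match for data[pos];
// a length below the minimum match length (4) means the byte stays a literal.
std::pair<size_t, size_t> bruteForceLongestMatch(const std::string& data, size_t pos)
{
    std::pair<size_t, size_t> best(0, 1); // {distance, length}, default: literal
    for (size_t back = 0; back < pos; back++)
    {
        // count identical bytes; matches may overlap their own source
        size_t length = 0;
        while (pos + length < data.size() && data[back + length] == data[pos + length])
            length++;
        if (length > best.second)
            best = std::make_pair(pos - back, length);
    }
    return best;
}
```

Called on the example string at zero-based position 15 (position 16 in the table, which counts from 1), it yields the distance-9, length-7 match for bcdefgh shown above.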
Optimal Parsing - Pass 2
In pass 2, all matches are analyzed in reverse. Beginning with the last match, we compute how big the compressed output will be - that's what I call cost. You may ask: why in reverse? Well, the idea ("dynamic programming") is to reduce the problem to a smaller one.
The simplest problem is a single byte. Obviously, it can't be LZ4 compressed.
When taking a few more steps we will arrive at a position where a match was found in the previous pass.
Now we have to decide whether this match is more efficient ("costs less") than outputting its literals.
It boils down to a decision whether cost(match) + cost(everything after the match) is smaller than cost(from here on).
Usually a match is "cheaper" than literals. The case becomes much more interesting if the longest possible match overlaps with a potential match found later in the text.
That's exactly what we saw in the example given above. Therefore we compute the cost for all potential match lengths from 4 (LZ4's minimum match length) up to the longest match found.
In conclusion, we look for the minimum of cost(match with length x) + cost(everything after x bytes).
The relevant code can be found in
estimateCosts.
Its only parameter matches is modified such that each locally optimal match is replaced by its globally optimal sibling. Often those two are identical but, as explained, a shorter match (which is locally sub-optimal) can sometimes yield a better output.
Reducing a match length below 4 converts the match into a literal.
Obviously, literals take 1 byte. Therefore, the cost of a literal will be 1 plus whatever follows:
cost[i] = cost[i + 1] + 1
The basic cost of a match is 1+2=3 bytes: the initial token consumes 1 byte plus 2 bytes to encode the match distance. However, we can skip over the next matchLength positions:
cost[i] = cost[i + matchLength] + 1 + 2
The optimal match turns out to be the one with the lowest cost[i] when we vary matchLength between 1 and its original value:
| position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | (padding) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| character | a | b | c | d | e | _ | b | c | d | e | f | g | h | _ | a | b | c | d | e | f | g | h | xxxxxxx |
| match distance | | | | | | | 5 | | | | | | | | 14 | 9 | 9 | 9 | 9 | | | | |
| match length | | | | | | | 4 | | | | | | | | 5 | 7 | 6 | 5 | 4 | | | | |
| cost | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 3 | 3 | 3 | 3 | 3 | 2 | 1 | |
| minimum cost | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 3 | 3 | 3 | 3 | 2 | 1 | |
| optimal length | | | | | | | 4 | | | | | | | | 1 | 7 | 6 | 5 | 4 | | | | |
estimateCosts will figure out that a match at position 15 is more expensive than a literal:
Match: cost[15] = cost[15 + matchLength] + 1 + 2 = cost[19] + 3 = 6 (see row "cost")
Literal: cost[15] = cost[15 + 1] + 1 = cost[16] + 1 = 4 (see row "minimum cost")
Therefore, match[15].length will be changed from 4 to 1 (its optimal length).
Note: there are a few additional adjustments to these formulas in the full source code.
The 15th consecutive literal actually costs 1+1=2 bytes because we have to insert an extra length byte (and after that, plus 1 for every 255th byte).
Very long matches take more bytes to encode the length (plus 1 after 19 bytes and plus 1 for every 255th byte thereafter).
Optimal Parsing - Pass 3
Finally, we walk through the matches from the beginning to the end and emit the corresponding LZ4 codes. selectBestMatches is implemented in a pretty straightforward way and can be summarized as follows:
i = 0
while (i < matches.size())
{
if (matches[i] is a literal)
{
add literal to an internal buffer
i++
}
else
{
emit token (which contains match length and number of buffered literals)
emit all buffered literals and clear that buffer
emit match distance
i += matches[i].length
}
}
// if literals buffer isn't empty, emit it, too
Dictionaries (v1.2+)
The first byte of any file can't be LZ4 compressed - because there is nothing that can be referenced. The LZ4 sliding window slowly fills with data as a file is processed, thus "newer data" can reference "older data" and compression ratio usually improves.
This can be a severely limiting factor for very small files - there is just too little history ("old data") to achieve a proper compression.
LZ4 officially introduced dictionaries in version 1.8.1. An LZ4 dictionary is a plain file with no special format at all. Its contents are just used as the initial sliding window.
For example: HTML5 code should start with <!doctype html>. A good LZ4 dictionary for HTML5 files should contain those bytes. It's important to note that both the encoder AND the decoder need to have the same dictionary.
The command-line parameter -D FILE loads the dictionary named FILE. As mentioned before, that parameter (and the same file) must be used for the encoder AND the decoder as well. A dictionary larger than 64k doesn't make any sense since LZ4's sliding window is 64k large.
Dictionaries are only relevant for the first 64k of a compressed file: once a dictionary is loaded into the sliding window and we slowly process incoming data,
old data is slowly shifted out of the sliding window. If your dictionary is 40k then the dictionary's first bytes will disappear from the sliding window after you have processed 64k-40k=24k of data.
Having such a simple file format, dictionaries can be generated in various ways:
- by hand
- just extracting the first bytes of a typical file
- running a simple statistical analysis on the most frequent phrases
However, when compressing multiple similar files you may observe large savings.
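The core idea can be illustrated with a self-contained sketch (illustrative names, not smallz4's internals): the dictionary is conceptually prepended to the input, so even the very first input byte can reference "old data".

```cpp
#include <cstddef>
#include <string>

// Illustrative only: treat the dictionary as data preceding the real input
// and find the longest match available for the input's very first byte.
// Without a dictionary this always returns 0 - the first byte of a file
// can never be compressed.
size_t longestMatchAtInputStart(const std::string& dictionary, const std::string& input)
{
    const std::string window = dictionary + input;   // dictionary = initial sliding window
    const size_t pos = window.size() - input.size(); // first byte of the real input
    size_t best = 0;
    for (size_t back = 0; back < pos; back++)
    {
        size_t length = 0;
        while (pos + length < window.size() && window[back + length] == window[pos + length])
            length++;
        if (length > best)
            best = length;
    }
    return best;
}
```

With the <!doctype html> dictionary from above, an HTML5 file's entire prefix becomes a single match into the dictionary; in the real format both sides would then have to load that same dictionary via -D.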
LZ4 compressor (C++)
Features:
- optimal parsing → smaller files than lz4hc
- output is fully LZ4 compatible
- flexible I/O interface
- limited memory consumption (max. 64 MByte), independent of input size
- header-only library: just include a single file without any macros, #ifdefs and compiler-specific stuff
- optional: user-defined compression levels (0 - uncompressed ... 9 - optimal parsing)
- performance: about 5 MByte per second for optimal parsing (Core i7 @ 3.4 GHz, enwik9 test file)
- a few workarounds for degenerated data, such as extremely long sequences of repeating characters
- falls back to uncompressed mode if compressed data actually takes more space than uncompressed data
- lots of code comments: you can actually understand what's going on ☺
- XXHash checksums - see here for a simple implementation
- multi-threaded compression
| compressor | settings | bytes | duration |
|---|---|---|---|
| lz4 1.8.2 | (default) | 509,454,838 bytes | 3 seconds |
| lz4 1.8.2 | -9 | 374,839,215 bytes | 39 seconds |
| lz4 1.8.2 | -9 --no-frame-crc -BD | 374,085,998 bytes | 39 seconds |
| LZ4X 1.30 | -9 | 372,068,640 bytes | 127 seconds |
| lz4 1.8.2 | -12 --no-frame-crc -BD | 371,680,440 bytes | 75 seconds |
| smallz4 1.1 | -9 | 371,680,428 bytes | 254 seconds |
Note: LZ4 improved its -12 algorithm significantly in version 1.8.1, and it produced smaller files than smallz4 1.0.
However, my updated smallz4 1.1 exceeds LZ4's compression ratio.
// //////////////////////////////////////////////////////////
// smallz4.h
// Copyright (c) 2016-2018 Stephan Brumme. All rights reserved.
// see https://create.stephan-brumme.com/smallz4/
//
// "MIT License":
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"),
// to deal in the Software without restriction, including without limitation
// the rights to use, copy, modify, merge, publish, distribute, sublicense,
// and/or sell copies of the Software, and to permit persons to whom the Software
// is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included
// in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
// INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
// PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
// HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
// OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
// SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#pragma once
#include <stdint.h> // uint16_t, uint32_t, ...
#include <cstdlib> // size_t
#include <vector>
/// LZ4 compression with optimal parsing
/** see smallz4.cpp for a basic I/O interface
you can easily replace it by an in-memory version
then all you have to do is:
#include "smallz4.h"
smallz4::lz4(GET_BYTES, SEND_BYTES);
// for more advanced stuff, you can call lz4 with four parameters (incl. max chain length and a dictionary)
**/
class smallz4
{
public:
// read several bytes, see getBytesFromIn() in smallz4.cpp for a basic implementation
typedef size_t (*GET_BYTES) ( void* data, size_t numBytes);
// write several bytes, see sendBytesToOut() in smallz4.cpp for a basic implementation
typedef void (*SEND_BYTES)(const void* data, size_t numBytes);
/// compress everything in input stream (accessed via getByte) and write to output stream (via send)
static void lz4(GET_BYTES getBytes, SEND_BYTES sendBytes,
unsigned int maxChainLength = MaxChainLength) // this function exists for compatibility reasons
{
lz4(getBytes, sendBytes, maxChainLength, std::vector<unsigned char>());
}
/// compress everything in input stream (accessed via getByte) and write to output stream (via send)
static void lz4(GET_BYTES getBytes, SEND_BYTES sendBytes,
unsigned int maxChainLength,
const std::vector<unsigned char>& dictionary) // new interface, supports a predefined dictionary
{
smallz4 obj(maxChainLength);
obj.compress(getBytes, sendBytes, dictionary);
}
/// version string
static const char* const getVersion()
{
return "1.2";
}
// compression level thresholds, made public because I display them in the help screen ...
enum
{
/// greedy mode for short chains (compression level <= 3) instead of optimal parsing / lazy evaluation
ShortChainsGreedy = 3,
/// lazy evaluation for medium-sized chains (compression level > 3 and <= 6)
ShortChainsLazy = 6
};
// ----- END OF PUBLIC INTERFACE -----
private:
// ----- constants and types -----
/// a block can be 4 MB
typedef uint32_t Length;
/// matches must start within the most recent 64k
typedef uint16_t Distance;
enum
{
/// each match's length must be >= 4
MinMatch = 4,
/// last match must not be closer than 12 bytes to the end
BlockEndNoMatch = 12,
/// last 5 bytes must be literals, no matching allowed
BlockEndLiterals = 5,
/// match finder's hash table size (2^HashBits entries, must be less than 32)
HashBits = 20,
/// input buffer size, can be any number but zero ;-)
BufferSize = 64*1024,
/// maximum match distance
MaxDistance = 65535,
/// marker for "no match"
NoPrevious = 0,
/// stop match finding after MaxChainLength steps (default is unlimited => optimal parsing)
MaxChainLength = NoPrevious,
/// refer to location of the previous match (implicit hash chain)
PreviousSize = 1 << 16,
/// maximum block size as defined in LZ4 spec: { 0,0,0,0,64*1024,256*1024,1024*1024,4*1024*1024 }
/// I only work with the biggest maximum block size (7)
MaxBlockSizeId = 7,
MaxBlockSize = 4*1024*1024
// note: xxhash header checksum is precalculated only for 7, too
};
// ----- one and only variable ... -----
/// how many matches are checked in findLongestMatch, lower values yield faster encoding at the cost of worse compression ratio
unsigned int maxChainLength;
// ----- code -----
/// match
struct Match
{
/// true, if long enough
inline bool isMatch() const
{
return length >= MinMatch;
}
/// length of match
Length length;
/// start of match
Distance distance;
};
/// create new compressor (only invoked by lz4)
explicit smallz4(unsigned int newMaxChainLength = MaxChainLength)
: maxChainLength(newMaxChainLength) // => no limit, but can be changed by setMaxChainLength
{
}
/// return true, if the four bytes at *a and *b match
inline static bool match4(const void* const a, const void* const b)
{
return *(const uint32_t*)a == *(const uint32_t*)b;
}
/// find longest match of data[pos] between data[begin] and data[end], use match chain stored in previous
Match findLongestMatch(const unsigned char* const data,
size_t pos, size_t begin, size_t end,
const Distance* const previous) const
{
Match result;
result.length = 1;
// compression level: look only at the first n entries of the match chain
int32_t stepsLeft = maxChainLength;
// pointer to position that is matched against everything in data
const unsigned char* const current = data + pos - begin;
// don't match beyond this point
const unsigned char* const stop = current + end - pos;
// get distance to previous match, abort if 0 => not existing
Distance distance = previous[pos % PreviousSize];
size_t totalDistance = 0;
while (distance != NoPrevious)
{
// too far back ?
totalDistance += distance;
if (totalDistance > MaxDistance)
break;
// prepare next position
distance = previous[(pos - totalDistance) % PreviousSize];
// stop searching on lower compression levels
if (stepsLeft-- <= 0)
break;
// let's introduce a new pointer atLeast that points to the first "new" byte of a potential longer match
const unsigned char* const atLeast = current + result.length + 1;
// the idea is to split the comparison algorithm into 2 phases
// (1) scan backward from atLeast to current, abort if mismatch
// (2) scan forward until a mismatch is found and store length/distance of this new best match
// current atLeast
// | |
// -<<<<<<<< phase 1 <<<<<<<<
// >>> phase 2 >>>
// impossible to find a longer match because not enough bytes left ?
if (atLeast > stop)
break;
// all bytes between current and atLeast shall be identical, compare 4 bytes at once
const unsigned char* compare = atLeast - 4;
bool ok = true;
while (compare > current)
{
// mismatch ?
if (!match4(compare, compare - totalDistance))
{
ok = false;
break;
}
// keep going ...
compare -= 4;
// note: - the first four bytes always match
// - in the last iteration, compare is either current + 1 or current + 2 or current + 3
// - therefore we compare a few bytes twice => but a check to skip these checks is more expensive
}
// mismatch ?
if (!ok)
continue;
// we have a new best match, now scan forward from the end
compare = atLeast;
// fast loop: check four bytes at once
while (compare + 4 <= stop && match4(compare, compare - totalDistance))
compare += 4;
// slow loop: check the last 1/2/3 bytes
while (compare < stop && *compare == *(compare - totalDistance))
compare++;
// store new best match
result.distance = Distance(totalDistance);
result.length = Length (compare - current);
}
return result;
}
/// create shortest output
/** data points to block's begin; we need it to extract literals **/
static std::vector<unsigned char> selectBestMatches(const std::vector<Match>& matches,
const unsigned char* const data)
{
// store encoded data
std::vector<unsigned char> result;
result.reserve(MaxBlockSize);
// indices of current literal run
size_t literalsFrom = 0;
size_t literalsTo = 0; // point beyond last literal of the current run
// walk through the whole block
for (size_t offset = 0; offset < matches.size(); ) // increment inside of loop
{
// get best cost-weighted match
Match match = matches[offset];
// if no match, then count literals instead
if (!match.isMatch())
{
// first literal
if (literalsFrom == literalsTo)
literalsFrom = literalsTo = offset;
// one more literal
literalsTo++;
// ... and definitely no match
match.length = 1;
}
offset += match.length;
const bool lastToken = (offset == matches.size());
// continue if simple literal
if (!match.isMatch() && !lastToken)
continue;
// emit token
// count literals
size_t numLiterals = literalsTo - literalsFrom;
// store literals' length
unsigned char token = (numLiterals < 15) ? (unsigned char)numLiterals : 15;
token <<= 4;
// store match length (4 is implied because it's the minimum match length)
size_t matchLength = match.length - 4;
if (!lastToken)
token |= (matchLength < 15) ? matchLength : 15;
result.push_back(token);
// >= 15 literals ? (extra bytes to store length)
if (numLiterals >= 15)
{
// 15 is already encoded in token
numLiterals -= 15;
// emit 255 until remainder is below 255
while (numLiterals >= 255)
{
result.push_back(255);
numLiterals -= 255;
}
// and the last byte (can be zero, too)
result.push_back((unsigned char)numLiterals);
}
// copy literals
if (literalsFrom != literalsTo)
{
result.insert(result.end(), data + literalsFrom, data + literalsTo);
literalsFrom = literalsTo = 0;
}
// last token doesn't have a match
if (lastToken)
break;
// distance stored in 16 bits / little endian
result.push_back( match.distance & 0xFF);
result.push_back((match.distance >> 8) & 0xFF);
// >= 15+4 bytes matched (4 is implied because it's the minimum match length)
if (matchLength >= 15)
{
// 15 is already encoded in token
matchLength -= 15;
// emit 255 until remainder is below 255
while (matchLength >= 255)
{
result.push_back(255);
matchLength -= 255;
}
// and the last byte (can be zero, too)
result.push_back((unsigned char)matchLength);
}
}
return result;
}
/// walk backwards through all matches and compute number of compressed bytes from current position to the end of the block
/** note: matches are modified (shortened length) if necessary **/
static void estimateCosts(std::vector<Match>& matches)
{
const size_t blockEnd = matches.size();
typedef uint32_t Cost;
// minimum cost from this position to the end of the current block
std::vector<Cost> cost(matches.size(), 0);
// "cost" represents the number of bytes needed
// backwards optimal parsing
size_t posLastMatch = matches.size();
for (int i = (int)matches.size() - (1 + BlockEndLiterals); i >= 0; i--) // ignore the last 5 bytes, they are always literals
{
// watch out for long literal strings that need extra bytes
const Length numLiterals = Length(posLastMatch - i);
// assume no match
Cost minCost = cost[i + 1] + 1;
// an extra byte for every 255 literals required to store length (first 14 bytes are "for free")
if (numLiterals >= 15 && (numLiterals - 15) % 255 == 0)
minCost++;
// if encoded as a literal
Length bestLength = 1;
// analyze longest match
Match match = matches[i];
// match must not cross block borders
if (match.isMatch() && i + match.length + BlockEndLiterals > blockEnd)
match.length = Length(blockEnd - (i + BlockEndLiterals));
// try all match lengths (first short ones)
for (Length length = MinMatch; length <= match.length; length++)
{
// token (1 byte) + offset (2 bytes)
Cost currentCost = cost[i + length] + 1 + 2;
// very long matches need extra bytes for encoding match length
if (length > 18)
currentCost += 1 + (length - 18) / 255;
// better choice ?
if (currentCost <= minCost)
{
// regarding the if-condition:
// "<" prefers literals and shorter matches
// "<=" prefers longer matches
// they should produce the same number of bytes (because of the same cost)
// ... but every now and then it doesn't !
// that's why: too many consecutive literals require an extra length byte
// (which we took into consideration a few lines above)
// but we only looked at literals beyond the current position
// if there are many literal in front of the current position
// then it may be better to emit a match with the same cost as the literals at the current position
// => it "breaks" the long chain of literals and removes the extra length byte
minCost = currentCost;
bestLength = length;
// performance-wise, a long match is usually faster during decoding than multiple short matches
// on the other hand, literals are faster than short matches as well (assuming same cost)
}
// workaround: very long self-referencing matches can slow down the program A LOT
if (match.distance == 1 && match.length > 18 + 255)
{
// assume that longest match is always the best match
bestLength = match.length;
minCost = cost[i + match.length] + 1 + 2 + 1 + (match.length - 18) / 255;
break;
}
}
// remember position of last match to detect number of consecutive literals
if (bestLength >= MinMatch)
posLastMatch = i;
// store lowest cost so far
cost[i] = minCost;
// and adjust best match
matches[i].length = bestLength;
if (bestLength == 1)
matches[i].distance = NoPrevious;
// note: if bestLength is smaller than the previous matches[i].length then there might be a closer match
// which could be more cache-friendly (=> faster decoding)
}
}
/// compress everything in input stream (accessed via getByte) and write to output stream (via send), improve compression with a predefined dictionary
void compress(GET_BYTES getBytes, SEND_BYTES sendBytes, const std::vector<unsigned char>& dictionary) const
{
// ==================== write header ====================
// magic bytes
const unsigned char magic[4] = { 0x04, 0x22, 0x4D, 0x18 };
sendBytes(magic, 4);
// flags
const unsigned char flags = 1 << 6;
sendBytes(&flags, 1);
// max blocksize
const unsigned char maxBlockSizeId = MaxBlockSizeId << 4;
sendBytes(&maxBlockSizeId, 1);
// header checksum (precomputed)
const unsigned char checksum = 0xDF;
sendBytes(&checksum, 1);
// ==================== declarations ====================
// read the file in chunks/blocks, data will contain only bytes which are relevant for the current block
std::vector<unsigned char> data;
// file position corresponding to data[0]
size_t dataZero = 0;
// last already read position
size_t numRead = 0;
// passthru data (but still wrap in LZ4 format)
const bool uncompressed = (maxChainLength == 0);
// last time we saw a hash
const uint32_t HashSize = 1 << HashBits;
const size_t NoLastHash = MaxDistance + 1;
std::vector<size_t> lastHash(HashSize, NoLastHash);
const uint32_t HashMultiplier = 22695477; // taken from https://en.wikipedia.org/wiki/Linear_congruential_generator
const uint8_t HashShift = 32 - HashBits;
// previous position which starts with the same bytes
std::vector<Distance> previousHash (PreviousSize, Distance(NoPrevious)); // long chains based on my simple hash
std::vector<Distance> previousExact(PreviousSize, Distance(NoPrevious)); // shorter chains based on exact matching of the first four bytes
// change buffer size as you like
std::vector<unsigned char> buffer(BufferSize);
// first and last offset of a block (next is end-of-block plus 1)
size_t lastBlock = 0;
size_t nextBlock = 0;
bool parseDictionary = !dictionary.empty();
while (true)
{
// ==================== start new block ====================
// first byte of the currently processed block (std::vector data may contain the last 64k of the previous block, too)
const unsigned char* dataBlock = NULL;
// prepend dictionary
if (parseDictionary)
{
// prepend exactly 64k
const size_t MaxDictionary = 65536;
if (dictionary.size() < MaxDictionary)
{
// add garbage data
size_t unused = 65536 - dictionary.size();
data.resize(unused, 0);
data.insert(data.end(), dictionary.begin(), dictionary.end());
}
else
// copy only the most recent 64k of the dictionary
data.insert(data.end(), dictionary.begin() + dictionary.size() - MaxDictionary, dictionary.end());
nextBlock = data.size();
numRead = data.size();
}
// read more bytes from input
while (numRead - nextBlock < MaxBlockSize)
{
// buffer can be significantly smaller than MaxBlockSize, that's the only reason for this while-block
size_t incoming = getBytes(&buffer[0], buffer.size());
if (incoming == 0)
break;
numRead += incoming;
data.insert(data.end(), buffer.begin(), buffer.begin() + incoming);
}
// no more data ? => WE'RE DONE !
if (nextBlock == numRead)
break;
// determine block borders
lastBlock = nextBlock;
nextBlock += MaxBlockSize;
// not beyond end-of-file
if (nextBlock > numRead)
nextBlock = numRead;
// first byte of the currently processed block (std::vector data may contain the last 64k of the previous block, too)
dataBlock = &data[lastBlock - dataZero];
const size_t blockSize = nextBlock - lastBlock;
// ==================== full match finder ====================
// greedy mode is much faster but produces larger output
const bool isGreedy = (maxChainLength <= ShortChainsGreedy);
// lazy evaluation: if there is a match, then try running match finder on next position, too, but not after that
const bool isLazy = !isGreedy && (maxChainLength <= ShortChainsLazy);
// skip match finding on the next x bytes in greedy mode
size_t skipMatches = 0;
// allow match finding on the next byte but skip afterwards (in lazy mode)
bool lazyEvaluation = false;
// the last literals of the previous block skipped matching, so they are missing from the hash chains
int lookback = (int)dataZero;
if (lookback > BlockEndNoMatch && !parseDictionary)
lookback = BlockEndNoMatch;
if (parseDictionary)
lookback = (int)dictionary.size();
// so let's go back a few bytes
lookback = -lookback;
std::vector<Match> matches(blockSize);
// find longest matches for each position
for (int i = lookback; i < (int)blockSize; i++)
{
// no matches at the end of the block (or matching disabled by command-line option -0 )
if (i + BlockEndNoMatch > (int)blockSize || uncompressed)
continue;
// detect self-matching
if (i > 0 && dataBlock[i] == dataBlock[i - 1])
{
Match prevMatch = matches[i - 1];
// predecessor had the same match ?
if (prevMatch.distance == 1 && prevMatch.length > 1024) // TODO: handle very long self-referencing matches
{
// just copy predecessor without further (expensive) optimizations
prevMatch.length--;
matches[i] = prevMatch;
continue;
}
}
// read next four bytes
uint32_t four = *(uint32_t*)(dataBlock + i);
// convert to a shorter hash
uint32_t hash = (four * HashMultiplier) >> HashShift;
// get last occurrence of these bits
size_t last = lastHash[hash];
// and store current position
lastHash[hash] = i + lastBlock;
int prevIndex = i % PreviousSize;
if (i < 0)
prevIndex = (i + MaxBlockSize) % PreviousSize;
// no predecessor or too far away ?
size_t distance = i + lastBlock - last;
if (last == NoLastHash || distance > MaxDistance)
{
previousHash [prevIndex] = NoPrevious;
previousExact[prevIndex] = NoPrevious;
continue;
}
// build hash chain, i.e. store distance to last match
previousHash[prevIndex] = (Distance)distance;
// skip pseudo-matches (hash collisions) and build a second chain where the first four bytes must match exactly
while (distance != NoPrevious)
{
uint32_t curFour = *(uint32_t*)(&data[last - dataZero]); // may be in the previous block, too
// actual match found, first 4 bytes are identical
if (curFour == four)
break;
// prevent accidentally hopping onto an old, wrong hash chain
uint32_t curHash = (curFour * HashMultiplier) >> HashShift;
if (curHash != hash)
{
distance = NoPrevious;
break;
}
// try next pseudo-match
Distance next = previousHash[last % PreviousSize];
// pointing to outdated hash chain entry ?
distance += next;
if (distance > MaxDistance)
{
previousHash[last % PreviousSize] = NoPrevious;
distance = NoPrevious;
break;
}
// closest match is out of range ?
last -= next;
if (next == NoPrevious || last < dataZero)
{
distance = NoPrevious;
break;
}
}
// no match at all ?
if (distance == NoPrevious)
{
previousExact[prevIndex] = NoPrevious;
continue;
}
// store distance to previous match
previousExact[prevIndex] = (Distance)distance;
// no matching if crossing block boundary, just update hash tables
if (i < 0)
continue;
// skip match finding if in greedy mode
if (skipMatches > 0)
{
skipMatches--;
if (!lazyEvaluation)
continue;
lazyEvaluation = false;
}
// and look for longest match
Match longest = findLongestMatch(&data[0], i + lastBlock, dataZero, nextBlock - BlockEndLiterals + 1, &previousExact[0]);
matches[i] = longest;
// no match finding needed for the next few bytes in greedy/lazy mode
if (longest.isMatch() && (isLazy || isGreedy))
{
lazyEvaluation = (skipMatches == 0);
skipMatches = longest.length;
}
}
// dictionary applies only to the first block
parseDictionary = false;
// ==================== estimate costs (number of compressed bytes) ====================
// not needed in greedy mode and/or very short blocks
if (matches.size() > BlockEndNoMatch && maxChainLength > ShortChainsGreedy)
estimateCosts(matches);
// ==================== select best matches ====================
std::vector<unsigned char> block;
if (!uncompressed)
block = selectBestMatches(matches, &data[lastBlock - dataZero]);
// ==================== output ====================
// automatically decide whether compressed or uncompressed
size_t uncompressedSize = nextBlock - lastBlock;
// did compression do harm ?
bool useCompression = block.size() < uncompressedSize && !uncompressed;
// block size
uint32_t numBytes = uint32_t(useCompression ? block.size() : uncompressedSize);
uint32_t numBytesTagged = numBytes | (useCompression ? 0 : 0x80000000);
unsigned char num1 = numBytesTagged & 0xFF; sendBytes(&num1, 1);
unsigned char num2 = (numBytesTagged >> 8) & 0xFF; sendBytes(&num2, 1);
unsigned char num3 = (numBytesTagged >> 16) & 0xFF; sendBytes(&num3, 1);
unsigned char num4 = (numBytesTagged >> 24) & 0xFF; sendBytes(&num4, 1);
if (useCompression)
sendBytes(&block[0], numBytes);
else // uncompressed ? => copy input data
sendBytes(&data[lastBlock - dataZero], numBytes);
// remove already processed data except for the last 64kb which could be used for intra-block matches
if (data.size() > MaxDistance)
{
size_t remove = data.size() - MaxDistance;
dataZero += remove;
data.erase(data.begin(), data.begin() + remove);
}
}
// add an empty block
uint32_t zero = 0;
sendBytes(&zero, 4);
}
};
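Before moving on to smallz4.cpp: the block framing used above stores each block's size as a 32-bit little-endian value, and the highest bit (0x80000000) flags an uncompressed block. Here is that tagging logic as a stand-alone sketch (the helper names are my own, not part of smallz4):

```c
#include <assert.h>
#include <stdint.h>

/* encode a block size: set the highest bit if the block is stored uncompressed
   (mirrors smallz4's numBytesTagged logic; function names are illustrative) */
static void writeBlockSize(unsigned char header[4], uint32_t numBytes, int isCompressed)
{
  uint32_t tagged = numBytes | (isCompressed ? 0 : 0x80000000);
  header[0] =  tagged        & 0xFF; /* little endian: lowest byte first */
  header[1] = (tagged >>  8) & 0xFF;
  header[2] = (tagged >> 16) & 0xFF;
  header[3] = (tagged >> 24) & 0xFF;
}

/* decode it again; returns the plain size and sets *isCompressed */
static uint32_t readBlockSize(const unsigned char header[4], int* isCompressed)
{
  uint32_t tagged =  (uint32_t)header[0]
                  | ((uint32_t)header[1] <<  8)
                  | ((uint32_t)header[2] << 16)
                  | ((uint32_t)header[3] << 24);
  *isCompressed = (tagged & 0x80000000) == 0;
  return tagged & 0x7FFFFFFF;
}
```

A round-trip through both helpers recovers the original size and the compressed/uncompressed flag.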
// //////////////////////////////////////////////////////////
// smallz4.cpp
// Copyright (c) 2016-2018 Stephan Brumme. All rights reserved.
// see https://create.stephan-brumme.com/smallz4/
//
// "MIT License":
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"),
// to deal in the Software without restriction, including without limitation
// the rights to use, copy, modify, merge, publish, distribute, sublicense,
// and/or sell copies of the Software, and to permit persons to whom the Software
// is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included
// in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
// INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
// PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
// HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
// OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
// SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
// suppress warnings when compiled by Visual C++
#define _CRT_SECURE_NO_WARNINGS
#include "smallz4.h"
#include <cstdio> // stdin/stdout/stderr, fopen, ...
#include <cstdlib> // exit
#ifdef _WIN32
#include <io.h> // isatty()
#else
#include <unistd.h> // isatty()
#define _fileno fileno
#define _isatty isatty
#endif
/// error handler
static void error(const char* msg, int code = 1)
{
fprintf(stderr, "ERROR: %s\n", msg);
exit(code);
}
// ==================== I/O INTERFACE ====================
/// input stream, usually stdin
FILE* in = 0;
/// read several bytes and store at "data", return number of actually read bytes (return only zero if end of data reached)
size_t getBytesFromIn(void* data, size_t numBytes)
{
if (data && numBytes > 0)
return fread(data, 1, numBytes, in);
return 0;
}
/// output stream, usually stdout
FILE* out = 0;
/// write a block of bytes
void sendBytesToOut(const void* data, size_t numBytes)
{
if (data && numBytes > 0)
fwrite(data, 1, numBytes, out);
}
// ==================== COMMAND-LINE HANDLING ====================
// show simple help
static void showHelp(const char* program)
{
printf("smalLZ4 %s: compressor with optimal parsing, fully compatible with LZ4 by Yann Collet (see https://lz4.org)\n"
"\n"
"Basic usage:\n"
" %s [flags] [input] [output]\n"
"\n"
"This program writes to STDOUT if output isn't specified\n"
"and reads from STDIN if input isn't specified, either.\n"
"\n"
"Examples:\n"
" %s < abc.txt > abc.txt.lz4 # use STDIN and STDOUT\n"
" %s abc.txt > abc.txt.lz4 # read from file and write to STDOUT\n"
" %s abc.txt abc.txt.lz4 # read from and write to file\n"
" cat abc.txt | %s - abc.txt.lz4 # read from STDIN and write to file\n"
" %s -6 abc.txt abc.txt.lz4 # compression level 6 (instead of default 9)\n"
" %s -f abc.txt abc.txt.lz4 # overwrite an existing file\n"
" %s -f7 abc.txt abc.txt.lz4 # compression level 7 and overwrite an existing file\n"
"\n"
"Flags:\n"
" -0, -1 ... -9 Set compression level, default: 9 (see below)\n"
" -h Display this help message\n"
" -f Overwrite an existing file\n"
" -D [FILE] Load dictionary\n"
"\n"
"Compression levels:\n"
" -0 No compression\n"
" -1 ... -%d Greedy search, check 1 to %d matches\n"
" -%d ... -8 Lazy matching with optimal parsing, check %d to 8 matches\n"
" -9 Optimal parsing, check all possible matches\n"
"\n"
"Written in 2016-2018 by Stephan Brumme https://create.stephan-brumme.com/smallz4/\n"
, smallz4::getVersion()
, program, program, program, program, program, program, program, program,
smallz4::ShortChainsGreedy, smallz4::ShortChainsGreedy,
smallz4::ShortChainsGreedy + 1, smallz4::ShortChainsGreedy + 1);
}
/// parse command-line
int main(int argc, const char* argv[])
{
// show help if no parameters and stdin isn't a pipe
if (argc == 1 && _isatty(_fileno(stdin)) != 0)
{
showHelp(argv[0]);
return 0;
}
unsigned int maxChainLength = 65536; // "unlimited" because search window contains only 2^16 bytes
// overwrite output ?
bool overwrite = false;
// preload dictionary from disk
const char* dictionary = NULL;
// parse flags
int nextArgument = 1;
bool skipArgument = false;
while (argc > nextArgument && argv[nextArgument][0] == '-')
{
int argPos = 1;
while (argv[nextArgument][argPos] != '\0')
{
switch (argv[nextArgument][argPos++])
{
// show help
case 'h':
showHelp(argv[0]);
return 0;
// force overwrite
case 'f':
overwrite = true;
break;
// use dictionary
case 'D':
if (nextArgument + 1 >= argc)
error("no dictionary filename found");
dictionary = argv[nextArgument + 1]; // TODO: any flag immediately after -D causes an error
skipArgument = true;
break;
// set compression level
case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8':
maxChainLength = argv[nextArgument][1] - '0'; // "0" => 0, "1" => 1, ..., "8" => 8
break;
// unlimited hash chain length
case '9':
// default maxChainLength is already "unlimited"
break;
default:
error("unknown flag");
}
}
nextArgument++;
if (skipArgument)
nextArgument++;
}
// default input/output streams
in = stdin; out = stdout;
// input file is given as first parameter or stdin if no parameter is given (or "-")
if (argc > nextArgument && argv[nextArgument][0] != '-')
{
in = fopen(argv[nextArgument], "rb");
if (!in)
error("file not found");
nextArgument++;
}
// output file is given as second parameter or stdout if no parameter is given (or "-")
if (argc == nextArgument + 1 && argv[nextArgument][0] != '-')
{
// check if file already exists
if (!overwrite && fopen(argv[nextArgument], "rb"))
error("output file already exists");
out = fopen(argv[nextArgument], "wb");
if (!out)
error("cannot create file");
}
// load dictionary
std::vector<unsigned char> preload;
if (dictionary != NULL)
{
// open dictionary
FILE* dict = fopen(dictionary, "rb");
if (!dict)
error("cannot open dictionary");
// get dictionary's filesize
fseek(dict, 0, SEEK_END);
size_t dictSize = ftell(dict);
// only the last 64k are relevant
size_t relevant = dictSize < 65536 ? 0 : dictSize - 65536;
fseek(dict, (long)relevant, SEEK_SET);
if (dictSize > 65536)
dictSize = 65536;
// read those bytes
preload.resize(dictSize);
fread(&preload[0], 1, dictSize, dict);
fclose(dict);
}
// and go !
smallz4::lz4(getBytesFromIn, sendBytesToOut, maxChainLength, preload);
return 0;
}
LZ4 decompressor (C99)
Features:
- minimal memory consumption: can be configured to use just 64kb (plus a few variables on the stack)
- a single file without any macros, #ifdefs and compiler-specific stuff - plain C
- flexible I/O interface
- support for fixed dictionaries
- about half as fast as the reference implementation (lz4.org)
- lots of code comments: you can actually understand what's going on ☺
- no XXHash checksum verification - see here for a basic implementation
- skippable frames aren't supported but basic legacy frames can be decoded
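The flexible I/O interface means the decompressor only ever sees two function pointers, so decompressing from and to memory just needs a pair of small adapters. A minimal sketch of that idea (the global buffers and the pump() stand-in are my own illustration, not part of smallz4cat):

```c
#include <stddef.h>
#include <string.h>

/* same callback shapes as smallz4cat's GET_BYTE / SEND_BYTES */
typedef unsigned char (*GET_BYTE)(void);
typedef void (*SEND_BYTES)(const unsigned char*, unsigned int);

/* in-memory input: a global buffer plus a read position (illustrative globals) */
static const unsigned char* inData;
static size_t inPos;
static unsigned char getByteFromMemory(void)
{
  return inData[inPos++]; /* a real adapter should detect running out of data */
}

/* in-memory output: append everything to a global buffer */
static unsigned char outData[256];
static size_t outPos;
static void sendBytesToMemory(const unsigned char* data, unsigned int numBytes)
{
  memcpy(outData + outPos, data, numBytes);
  outPos += numBytes;
}

/* stand-in for unlz4(): pump n bytes through the two callbacks */
static void pump(GET_BYTE getByte, SEND_BYTES sendBytes, unsigned int n)
{
  unsigned int i;
  for (i = 0; i < n; i++)
  {
    unsigned char byte = getByte();
    sendBytes(&byte, 1);
  }
}
```

In the real program you would pass such adapters to unlz4() instead of the file-based getByteFromIn()/sendBytesToOut() shown below.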
// //////////////////////////////////////////////////////////
// smallz4cat.c
// Copyright (c) 2016-2018 Stephan Brumme. All rights reserved.
// see https://create.stephan-brumme.com/smallz4/
//
// "MIT License":
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"),
// to deal in the Software without restriction, including without limitation
// the rights to use, copy, modify, merge, publish, distribute, sublicense,
// and/or sell copies of the Software, and to permit persons to whom the Software
// is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included
// in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
// INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
// PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
// HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
// OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
// SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
// This program is a shorter, more readable, albeit slower re-implementation of lz4cat ( https://github.com/lz4/lz4 )
// compile: gcc smallz4cat.c -O3 -o smallz4cat -Wall -pedantic -std=c99 -s
// The static 8k binary was compiled using Clang and dietlibc (see https://www.fefe.de/dietlibc/ )
// Limitations:
// - skippable frames are not implemented (and most likely never will be); basic legacy frames can be decoded
// - checksums are not verified (see https://create.stephan-brumme.com/xxhash/ for a simple implementation)
// Replace getByteFromIn() and sendBytesToOut() by your own code if you need in-memory LZ4 decompression.
// Corrupted data causes a call to error().
// suppress warnings when compiled by Visual C++
#define _CRT_SECURE_NO_WARNINGS
#include <stdint.h> // uint32_t
#include <stdio.h> // stdin/stdout/stderr, fopen, ...
#include <stdlib.h> // exit()
#include <string.h> // memcpy
/// error handler
void error(const char* msg)
{
// smaller static binary than fprintf(stderr, "ERROR: %s\n", msg);
fputs("ERROR: ", stderr);
fputs(msg, stderr);
fputs("\n", stderr);
exit(1);
}
// ==================== I/O INTERFACE ====================
// read one byte from input, see getByteFromIn() for a basic implementation
typedef unsigned char (*GET_BYTE) ();
// write several bytes, see sendBytesToOut() for a basic implementation
typedef void (*SEND_BYTES)(const unsigned char*, unsigned int);
/// input stream, usually stdin
static FILE* in = NULL;
/// read a single byte (with simple buffering)
static unsigned char getByteFromIn()
{
// modify the buffer size as you like ... for most use cases, bigger buffers aren't faster anymore - and even reducing it to 1 byte works !
#define READ_BUFFER_SIZE 4*1024
static unsigned char readBuffer[READ_BUFFER_SIZE];
static size_t pos = 0;
static size_t available = 0;
// refill buffer
if (pos == available)
{
pos = 0;
available = fread(readBuffer, 1, READ_BUFFER_SIZE, in);
if (available == 0)
error("out of data");
}
// return a byte
return readBuffer[pos++];
}
/// output stream, usually stdout
static FILE* out = NULL;
/// write a block of bytes
static void sendBytesToOut(const unsigned char* data, unsigned int numBytes)
{
if (data != NULL && numBytes > 0)
fwrite(data, 1, numBytes, out);
}
// ==================== LZ4 DECOMPRESSOR ====================
/// decompress everything in input stream (accessed via getByte) and write to output stream (via sendBytes)
void unlz4(GET_BYTE getByte, SEND_BYTES sendBytes, const char* dictionary)
{
// signature
unsigned char signature1 = getByte();
unsigned char signature2 = getByte();
unsigned char signature3 = getByte();
unsigned char signature4 = getByte();
uint32_t signature = (signature4 << 24) | (signature3 << 16) | (signature2 << 8) | signature1;
int isModern = (signature == 0x184D2204);
int isLegacy = (signature == 0x184C2102);
if (!isModern && !isLegacy)
error("invalid signature");
unsigned char hasBlockChecksum = 0;
unsigned char hasContentSize = 0;
unsigned char hasContentChecksum = 0;
unsigned char hasDictionaryID = 0;
if (isModern)
{
// flags
unsigned char flags = getByte();
hasBlockChecksum = flags & 16;
hasContentSize = flags & 8;
hasContentChecksum = flags & 4;
hasDictionaryID = flags & 1;
// only version 1 file format
unsigned char version = flags >> 6;
if (version != 1)
error("only LZ4 file format version 1 supported");
// ignore blocksize
getByte();
if (hasContentSize)
{
// ignore, skip 8 bytes
getByte(); getByte(); getByte(); getByte();
getByte(); getByte(); getByte(); getByte();
}
if (hasDictionaryID)
{
// ignore, skip 4 bytes
getByte(); getByte(); getByte(); getByte();
}
// ignore header checksum (xxhash32 of everything up to this point & 0xFF)
getByte();
}
// don't lower this value, backreferences can be 64kb far away
#define HISTORY_SIZE 64*1024
// contains the latest decoded data
unsigned char history[HISTORY_SIZE];
// next free position in history[]
unsigned int pos = 0;
// dictionary compression is a recently introduced feature, just move its contents to the buffer
if (dictionary != NULL)
{
// open dictionary
FILE* dict = fopen(dictionary, "rb");
if (!dict)
error("cannot open dictionary");
// get dictionary's filesize
fseek(dict, 0, SEEK_END);
size_t dictSize = ftell(dict);
// only the last 64k are relevant
size_t relevant = dictSize < 65536 ? 0 : dictSize - 65536;
fseek(dict, (long)relevant, SEEK_SET);
if (dictSize > 65536)
dictSize = 65536;
// read it and store it at the end of the buffer
fread(history + HISTORY_SIZE - dictSize, 1, dictSize, dict);
fclose(dict);
}
// parse all blocks until blockSize == 0
while (1)
{
// block size
uint32_t blockSize = getByte();
blockSize |= (uint32_t)getByte() << 8;
blockSize |= (uint32_t)getByte() << 16;
blockSize |= (uint32_t)getByte() << 24;
// highest bit set ?
unsigned char isCompressed = isLegacy || (blockSize & 0x80000000) == 0;
if (isModern)
blockSize &= 0x7FFFFFFF;
// stop after last block
if (blockSize == 0)
break;
if (isCompressed)
{
// decompress block
uint32_t blockOffset = 0;
while (blockOffset < blockSize)
{
// get a token
unsigned char token = getByte();
blockOffset++;
// determine number of literals
uint32_t numLiterals = (token >> 4) & 0x0F;
if (numLiterals == 15)
{
// number of literals length encoded in more than 1 byte
unsigned char current;
do
{
current = getByte();
numLiterals += current;
blockOffset++;
} while (current == 255);
}
blockOffset += numLiterals;
// copy all those literals
while (numLiterals-- > 0)
{
history[pos++] = getByte();
// flush output buffer
if (pos == HISTORY_SIZE)
{
sendBytes(history, HISTORY_SIZE);
pos = 0;
}
}
// last token has only literals
if (blockOffset == blockSize)
break;
// match distance is encoded by two bytes (little endian)
blockOffset += 2;
uint32_t delta = getByte();
delta |= (uint32_t)getByte() << 8;
// zero isn't allowed
if (delta == 0)
error("invalid offset");
// match length (must be >= 4, therefore length is stored minus 4)
uint32_t matchLength = 4 + (token & 0x0F);
if (matchLength == 4 + 0x0F)
{
unsigned char current;
do // match length encoded in more than 1 byte
{
current = getByte();
matchLength += current;
blockOffset++;
} while (current == 255);
}
// copy match
uint32_t reference = (pos >= delta) ? pos - delta : HISTORY_SIZE + pos - delta;
if (pos + matchLength < HISTORY_SIZE && reference + matchLength < HISTORY_SIZE)
{
// fast copy
if (pos >= reference + matchLength || reference >= pos + matchLength)
{
// non-overlapping
memcpy(history + pos, history + reference, matchLength);
pos += matchLength;
}
else
{
// overlapping
while (matchLength-- > 0)
history[pos++] = history[reference++];
}
}
else
{
// slower copy, have to take care of buffer limits
while (matchLength-- > 0)
{
// copy single byte
history[pos++] = history[reference++];
// cannot write anymore ? => wrap around
if (pos == HISTORY_SIZE)
{
// flush output buffer
sendBytes(history, HISTORY_SIZE);
pos = 0;
}
// cannot read anymore ? => wrap around
if (reference == HISTORY_SIZE)
reference = 0;
}
}
}
// all legacy blocks must be completely filled - except for the last one
if (isLegacy && blockSize < 8*1024*1024)
break;
}
else
{
// copy uncompressed data and add to history, too (if next block is compressed and some matches refer to this block)
while (blockSize-- > 0)
{
// copy a byte ...
history[pos++] = getByte();
// ... until buffer is full => send to output
if (pos == HISTORY_SIZE)
{
sendBytes(history, HISTORY_SIZE);
pos = 0;
}
}
}
if (hasBlockChecksum)
{
// ignore checksum, skip 4 bytes
getByte(); getByte(); getByte(); getByte();
}
}
if (hasContentChecksum)
{
// ignore checksum, skip 4 bytes
getByte(); getByte(); getByte(); getByte();
}
// flush output buffer
sendBytes(history, pos);
}
// ==================== COMMAND-LINE HANDLING ====================
/// parse command-line
int main(int argc, const char* argv[])
{
// default input/output streams
in = stdin; out = stdout;
const char* dictionary = NULL;
// first command-line parameter is our input filename / but ignore "-" which stands for STDIN
for (int parameter = 1; parameter < argc; parameter++)
{
const char* current = argv[parameter];
// dictionary
if (current[0] == '-' && current[1] == 'D')
{
if (parameter + 1 >= argc)
error("no dictionary filename found");
dictionary = argv[++parameter];
continue;
}
// filename ("-" is ignored because it stands for STDIN, the default behavior)
if (current[0] != '-' || current[1] != '\0')
{
// already have a filename ? at most one filename is allowed (except for the dictionary)
if (in != stdin)
error("can only decompress one file at a time");
// get handle
in = fopen(current, "rb");
if (!in)
error("file not found");
}
}
// and go !
unlz4(getByteFromIn, sendBytesToOut, dictionary);
return 0;
}
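As the decompressor loop shows, every compressed sequence starts with a token byte: the high nibble holds the literal count, the low nibble the match length minus 4, and a nibble value of 15 signals additional length bytes (each adding 0-255, terminated by the first byte below 255). That length decoding, isolated into a small sketch (function name is my own):

```c
#include <stdint.h>
#include <stddef.h>

/* decode an LZ4 length: "nibble" comes from the token; if it is 15,
   further bytes are added until one of them is below 255 */
static uint32_t decodeLength(uint32_t nibble, const unsigned char* extra, size_t* consumed)
{
  uint32_t length = nibble;
  *consumed = 0;
  if (nibble == 15)
  {
    unsigned char current;
    do
    {
      current = extra[(*consumed)++];
      length += current;
    } while (current == 255);
  }
  return length;
}
```

For literals the result is used directly; for matches the actual length is 4 + decodeLength(token & 0x0F, ...), since a match is never shorter than four bytes.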
Source Code
Git users: scroll down to the repository link.
Latest release: August 21, 2018, size: 11.3 kBytes, 379 lines
CRC32: 1749c0e2
MD5: 74863a9241a6482b8b44a579f8345261
SHA1: 4d743ed4653bc7e395eae20232370c98404bbf37
SHA256: 91462257c958e291ee97a9161ff933cc6b1d05149babe81033eed96b73a87faf
Latest release: August 3, 2018, size: 25.8 kBytes, 728 lines
CRC32: 8ea448a5
MD5: c201db7a03e034d452a80284f23bfcca
SHA1: db20ef6631116a61f465dd8428f64424f09ea7d7
SHA256: fa082bf8edeb42010ef1ddcc0d22862c3b727591091248c867306e1947e9d34e
Latest release: August 3, 2018, size: 7287 bytes, 230 lines
CRC32: 0050d2a2
MD5: 4a8588afa90a32282d3eafad4a72cc4f
SHA1: 6da89030896764200d0317fb09697139163ef08f
SHA256: 666fc1625c585f963f04aba9248aa5de3a980b5ff4d183e773ab57be7280e88d
Stay up-to-date: git clone https://create.stephan-brumme.com/smallz4/.git
GitHub mirror:https://github.com/stbrumme/smallz4
Windows executables (x64 only)
(compiled with Visual Studio 2017 Community, no dependencies except kernel32.dll)
Latest release: August 3, 2018, size: 110.0 kBytes
CRC32: 31e206f7
MD5: 09b511b1b101a8201a6e356764df3f35
SHA1: f7d37a4a1a0cc3215b109658e80293df273e6903
SHA256: 9e3d2592d23763b16c7bdd1c5dfb9ca7535d5c66194427d97c1ff7f6086f8c0e
previous versions:
smallz4cat-v1.1.exe (August 3, 2018)
smallz4cat-v1.0.exe (October 25, 2016)
smallz4cat-v0.4.exe (August 17, 2016)
Latest release: August 3, 2018, size: 126.0 kBytes
CRC32: fe002c1d
MD5: f05333da1aa7af76ccd64f88a7403d48
SHA1: 48c68a317b97fe0a03eb2861f17ec6a9ae2c1d27
SHA256: 4857813a661eb2b223e9f2dd3ea0a51a5fcc37553ac069730e8d32208049cb61
previous versions:
smallz4-v1.1.exe (August 3, 2018)
smallz4-v1.0.exe (November 2, 2016)
smallz4-v0.6.exe (September 9, 2016)
smallz4-v0.5.exe (September 1, 2016)
smallz4-v0.4.exe (August 17, 2016)
Linux executables (x32 and x64)
(static linking: g++ -O3 -s -Wall -pedantic -static; non-static builds are much smaller !)
Latest release: August 3, 2018, size: 796.4 kBytes
CRC32: 5f2ec51c
MD5: dfed842911e1eccce32d284a941ccf91
SHA1: 0eb64eaea8f401cecc7c7ce3efb98e5bfd87bf21
SHA256: c5eba7ab57b148e4bb03bed9bd3647a56d723fddde3a964efb508b9f1a8a6de3
previous versions:
smallz4-v1.1 (August 2, 2018)
smallz4-v1.0 (October 25, 2016)
smallz4-v0.6 (September 9, 2016)
Latest release: August 3, 2018, size: 651.5 kBytes
CRC32: 39544fee
MD5: 5d816cc8c2585ae4df291c085a23a1de
SHA1: 8fad1188e8374bbc328f9bdc44b687d863dc71b8
SHA256: db8ef76c2a3c4829009ff6bcea9e17ab182ebef4fd2f2167fb655c2fe7159b9e
previous versions:
smallz4-x32-v1.1 (August 2, 2018)
smallz4-x32-v1.0 (October 25, 2016)
smallz4-x32-v0.6 (September 9, 2016)
Changelog
- version 1.2
- latest and greatest
- August 3, 2018
- experimental support for dictionary compression
- version 1.1
- August 2, 2018
- bugfix: uncompressed blocks had a wrong flag in their headers
- improved compression at block boundaries
- version 1.0
- October 25, 2016
- MIT license, added Makefile
- code remained unchanged
- version 0.6
- September 9, 2016
- refactored compression part to be a header-only library
- version 0.5
- September 1, 2016
- faster match finder, about 20% speed-up (thanks to Stefan Atev for inspiration)
- version 0.4
- August 17, 2016
- fixed bug in cost estimation
- added lazy matching
- version 0.3
- August 12, 2016
- initial version
