flexiGIF - Lossless GIF/LZW optimization

posted September 15, 2018 by Stephan Brumme, updated October 15, 2018

Quick Overview

flexiGIF shrinks GIF files by optimizing their compression scheme (LZW algorithm).
No visual information is changed and the output is 100% pixel-wise identical to the original file - that's why it's called "lossless optimization".
And the results are still GIF files you can open/view with any standard tool.
Animated GIFs are supported as well.

Most files can be reduced by about 2%. I'm not aware of any open-source software outperforming flexiGIF.

The only downside: it may takes several seconds or even minutes for a medium-sized GIF image.
That's several magnitudes slower than other GIF encoders.
Your GIF decoder isn't affected at all - probably it even becomes faster because it has less input to process.

flexiGIF does NOT optimize palette, strip extensions and/or reduces color information.
There are many sophisticated tools out there, such as ImageMagick, that excel at that job.
flexiGIF is designed to be a tool used after you ran those image-processing programs.

Proposed toolchain:

create your GIF
run an image-optimizer, such as Gifsicle
let flexiGIF optimize the LZW bitstream

flexiGIF is a command-line tool and can be easily used with custom scripts, etc.
Keep in mind that flexiGIF's compression is very (!) slow, magnitudes slower than a standard GIF encoder.

Lossless compression

Unlike the algorithms behind JPEG or PNG, there is is no adjustable compression level in LZ78/LZW compressed data.
So I had to come up with something else ... and I choose not to play around with image specific optimizations.
Instead, I extract the LZW data stream from GIF files and recompress it.

flexiGIF combines two techniques invented by other people - the following ideas are not my original work (I wish they were ...).
To my knowledge, flexiGIF is the first free program to implement both.

How does it work ?

The LZW algorithm is more than 3 decades old - in 1984 Welch improved the LZ78 by Lempel and Ziv (obviously invented in 1978).
Up to now, almost all GIF/LZW encoders work the same way (loosely based on Wikipedia article of GIF):

fill a dictionary with all available symbols, typically that's 0..255
find the longest string S in the dictionary matching the current input
add S plus the next symbol to the dictionary, emit the dictionary index of S
if the dictionary is full, then reset the dictionary (see step 1)
go to step 2

Often it takes just a few lines of code to implement that algorithm.
At the moment, flexiGIF has >2,500 lines of code because it tweaks the algorithms in a few subtle ways ... read on !

Flexible Dictionary Reset

The GIF standard defines a maximum dictionary size of 2^12 = 4096 entries.
Then a so-called clear code should be inserted into the bitstream in order to reset the dictionary to its initial state.
Almost all GIF encoders execute step 4 when the dictionary contains 4096 entries, a few do it a little bit earlier (4093, e.g. IrfanView).

However, the GIF standard also allows to "skip" step 4: if the dictionary is full then you should (not "you must" !) reset the dictionary.
It's explicitly written on the very first page of the GIF standard ("deferred clear code"):
https://www.w3.org/Graphics/GIF/spec-gif89a.txt

What does it mean ? If the dictionary is full, then you are not allowed to execute step 3 (add strings).
But you can still reference old dictionary entries, it becomes static.
If they still are good matches for the current input, then they can provide a nice compression ratio.
Moreover, they most likely exceed the compression ratio of a recently resetted dictionary.
On the other hand, a full dictionary requires longer codes (GIF: 12 bits).

flexiGIF runs a brute-force search to find the best spot to reset the dictionary:

sometimes it makes sense to reset the dictionary after a few pixels (if pixels' colors change rapidly)
sometimes the old dictionary is great for several thousand pixels/bytes

Flexible Matching

In the aforementioned algorithm, step 2 reads: "find the longest string".
This isn't the whole truth, though. Almost all GIF encoders have so-called "greedy" match finders and actually look for the longest match.
But there are certain situations where the longest match is not the best match.
Let's consider a situation where the dictionary contains the strings a, b, aa, ab, aaa, abb.
If the input is aaabb, then a greedy matcher splits it into three groups (aaa)(b)(b).
A flexible, non-greedy matcher chooses (aa)(abb) saving one token.
It accepts a sub-optimal match for the first two bytes because in that case the next match will be much longer.

Since each match creates a new dictionary entry, too, flexible matching causes the LZW dictionary to contain duplicates:
if (aa) was chosen then (aa)+a = aaa must be added to the dictionary even though it's already there.
Quite often flexible matching saves a few bytes but unfortunately in a substantial number of cases the opposite is true:
the sub-optimal dictionary worsens the compression ratio.

The encoder's flexible matching is always slower than greedy matching. It doesn't make a difference during decoding, though.

Note: my flexible matcher is limited to two matches (one-step lookahead). It's not optimal parsing such as my smalLZ4 tool.

Previous Work

There are a few scientific papers about the described algorithms.
In my eyes the most understandable document was written by Nayuki ( https://www.nayuki.io/page/gif-optimizer-java ).
It comes with full Java source code but it's extremely memory-hungry and quite slow. Only flexible dictionary reset is implemented.

Flexible matching is part of pretty much every practical LZ77 compression program, such as GZIP.
It's often called one-step lookahead.

Code

Everything was written in C++ from scratch. No external libraries are required.
The code can be compiled with GCC, CLang and Visual C++.
I haven't tested flexiGIF on big-endian systems.

Command-line options

Usage: flexigif [options] INPUTFILE OUTPUTFILE

flexiGIF actually produces good results if you use no options at all (just the two filenames).

Most parameters have a short version (such as -h) and a long version (--help) which is more readable.
Multiple short versions can be merged if they have no numeric value, e.g. -f -s -v is the same as -fsv.

-h --help
Display help screen.

-p --prettygood
Uses a set of parameters that typically give the best results. Recommended for most inputs but may be REALLY slow.

-a=x --alignment=x
My brute-force search starts at every pixel which is a multiple of this value.
If -a=1 then every possible block splitting is analyzed.
Higher values speed up the program and reduce memory consumption (-a=10 is often 10x faster).

-d=x --dictionary=x
Maximum size of the LZW dictionary before a restart is forced.
Typically left unchanged unless you encounter compatibility issues with certain GIF decoders, then -d=4093 may help (see -c, too).

-t=x --maxtokens=x
In order to speed up the brute-force search, a maximum number of matches per block can be defined.
It's extremely rare to find GIF images with more than 20000 matches before resetting the dictionary (=default value).
Often -t=10000 is enough, too, and about twice as fast.

-c --compatible
Try to emit output that respects bugs/limitations of certain GIF decoders.
At the moment, -c is equivalent to -d=4093.

-l --deinterlace
Deinterlace GIF images.
Note: not implemented for animated GIFs yet.

-g --greedy
Enable greedy match search (which is default behavior anyway)

-n=x --nongreedy=x
Enable non-greedy match search where x is the minimum match length to be considered.
Default is -n=2 which means every match with at least two pixels/bytes are checked whether a non-greedy search gives better results.
Higher values cause less duplicates in the dictionary which sometimes improves compression ratio.

-m=x --minimprovement=x
Minimum number of bytes saved by a non-greedy match (requires parameter -n, too).
Default is -m=1

-i --info
Analyze internal structure of INPUTFILE. No need provide OUTPUTFILE because everything is written to STDOUT.
It's more or less a debugging switch and the format of the displayed information may change at any time.

-f --force
Overwrite OUTPUTFILE if it already exists.

-r --splitruns
Allow partial matching of long runs of the same byte (requires parameter -n).
Many GIF files have large areas with the same color.
Non-greedy search often leads to very bad dictionaries when those "long runs" of the same color are split.
Enabling this option is useful only for very few images.

-u=x --userdefined=x
flexiGIF doesn't look for the best dictionary reset position - instead, you provide them !
x is an ascendingly sorted list, e.g. -u=500,2000,9000
It's considered a debugging option.

-s --summary
When finished, compare filesize of INPUTFILE and OUTPUTFILE.

-v --verbose
Show debug messages. It's more or less a debugging switch and the format of the displayed information may change at any time.

-q --quiet
No output at all, except when an error occurred.

-Z
INPUTFILE and OUTPUTFILE are stored in .Z file format instead of GIF

-y --immediately
Avoid initial dictionary reset (clear code) and start immediately with compressed data.
This options typically saves a byte.
However, a few popular GIF decoders can't properly handle GIF without an initial dictionary reset.

Limitations

My .Z encoder and decoder support only dictionary restarts if the current LZW code size is 16 bits.

Code

(scroll down to download the source code - or click on the green bars to view the file in your browser)

The compression engine doesn't know anything about GIFs - it only optimizes the LZW bitstream:

show LzwEncoder.h


     // //////////////////////////////////////////////////////////
     // LzwEncoder.h
     // Copyright (c) 2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/
     
     #pragma once
     
     #include <vector>
     #include <array>
     
     using std::size_t;
     
     /// compress data with LZW algorithm and a user-defined block-splitting algorithm
     class LzwEncoder
     {
     public:
       /// uncompressed data
       typedef std::vector<unsigned char> RawData;
       /// encoded/compressed data
       typedef std::vector<bool> BitStream;
     
       /// set uncompressed data
       explicit LzwEncoder(const RawData& data, bool isGif = true);
     
       /// optimization parameters
       struct OptimizationSettings
       {
         /// bits per LZW code
         unsigned char minCodeSize;
         /// for compatibility: first LZW code sent to output is the clear code
         bool startWithClearCode;
         /// display internal stuff while compressing data
         bool verbose;
     
         /// if zero => greedy search else non-greedy search
         bool greedy;
         /// minimum match length considered for non-greedy search
         unsigned int  minNonGreedyMatch;
         /// minimum number of bytes saved when a non-greedy match is used
         unsigned int  minImprovement;
     
         /// maximum number of dictionary entries
         unsigned int  maxDictionary;
         /// maximum number of tokens per block (huge values severely affect compression speed)
         unsigned int  maxTokens;
         /// if true then non-greedy matching is allowed to analyze a string of identical bytes
         bool splitRuns;
     
         /// alignment
         unsigned int alignment;
     
         /// read-only access to m_best (gives small speed up for final pass when emitBitStream=true)
         bool readOnlyBest;
         /// don't recompute non-greedy infos when greedy search found no greedy matches
         bool avoidNonGreedyAgain;
       };
     
       /// optimize a single block and update results in m_best
       BitStream optimizePartial(unsigned int from, unsigned int maxLength, bool emitBitStream, bool isFinal, OptimizationSettings optimize);
       /// determine best block boundaries based on results of optimizePartial() and call merge()
       BitStream optimize(OptimizationSettings optimize);
     
       /// optimize if block boundaries are known
       BitStream merge(std::vector<unsigned int> restarts, OptimizationSettings optimize);
     
     private:
       /// add string at m_data[from...from+length] to the dictionary and return its code
       int          addCode  (unsigned int from, unsigned int length);
       /// return length of longest match, beginning at m_data[from], limited to maxLength
       unsigned int findMatch(unsigned int from, unsigned int maxLength);
     
       /// add bits to BitStream
       static void          add(BitStream& stream, unsigned int token, unsigned char numBits);
       /// get minimum number of bits to represent token
       static unsigned char getMinBits(unsigned int token);
     
       // ----- debugging code -----
       RawData debugDecode(int code) const;
       int     findCode   (unsigned int from, unsigned int maxLength) const;
     
       /// uncompressed data
       RawData m_data;
     
       // ----- temporary data structures -----
       /// gather statistics of a single compressed block of data
       struct BestBlock
       {
         /// current block: number of bytes (uncompressed input)
         unsigned int length;
         /// current block: number of bits of compressed output
         unsigned int bits;
         /// this plus all following blocks: total number of bits
         unsigned long long totalBits;
         /// current block: number of LZW codes
         unsigned int tokens;
         /// number of non-greedy matches
         unsigned int nongreedy;
         /// true if block's last match isn't greedy
         bool         partial;
     
         BestBlock()
         : length(0), bits(0), totalBits(0),
           tokens(0), nongreedy(0), partial(false)
         {}
       };
       /// all block of compressed data where m_best[x] is the optimum block (regarding all following blocks) which starts at input byte x
       std::vector<BestBlock> m_best;
     
       /// for each LZW code store the LZW codes of its children (which is the same as its parent plus one byte, -1 => undefined/no child)
       typedef std::vector<std::array<int, 256> > Dictionary;
       Dictionary    m_dictionary;
       /// number of valid entries in m_dictionary
       unsigned int  m_dictSize;
       /// maximum number of codes in the dictionary
       unsigned int  m_maxDictionary;
       /// GIF = 12, LZW = 16
       unsigned char m_maxCodeLength;
     
       /// true, if encode as GIF
       bool          m_isGif;
     };

show LzwEncoder.cpp


     // //////////////////////////////////////////////////////////
     // LzwEncoder.cpp
     // Copyright (c) 2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/
     
     // Nayuki implemented a similar encoder in Java two years earlier:
     // https://www.nayuki.io/page/gif-optimizer-java
     // (albeit much slower and without non-greedy parsing)
     
     #include "LzwEncoder.h"
     
     #define ALLOW_VERBOSE
     #ifdef  ALLOW_VERBOSE
     #include <iostream>
     #include <iomanip>
     #endif
     
     // local stuff
     namespace
     {
       /// indicates an invalid entry in the dictionary
       const int Unknown = -1;
     }
     
     /// set uncompressed data
     LzwEncoder::LzwEncoder(const RawData& data, bool isGif)
     : m_data(data),
       m_dictionary(),
       m_dictSize(0),
       m_maxCodeLength(isGif ? 12 : 16),
       m_maxDictionary(), // see a few lines below
       m_isGif(isGif)
     {
       m_maxDictionary = (1 << m_maxCodeLength) - 1;
     }
     
     
     /// add string at m_data[from...from+length] to the dictionary and return its code
     int LzwEncoder::addCode(unsigned int from, unsigned int length)
     {
       // first literal
       int code = m_data[from++];
     
       // walk through dictionary in order to find the code for the matching without the last byte
       for (unsigned int i = 1; i < length; i++)
       {
         unsigned char oneByte = m_data[from++];
         code = m_dictionary[code][oneByte];
       }
     
       // the new string is a known code plus a new byte (the last one)
       if (from < m_data.size())
       {
         unsigned char lastByte = m_data[from];
         // insert at the end of the dictionary
         if (m_dictSize < m_maxDictionary)
         {
           // don't overwrite (needed for non-greedy algorithm where a code isn't unique)
           if (m_dictionary[code][lastByte] == Unknown)
             m_dictionary[code][lastByte] = m_dictSize;
     
           m_dictSize++;
         }
       }
     
       return code;
     }
     
     
     /// return length of longest match, beginning at m_data[from], limited to maxLength
     unsigned int LzwEncoder::findMatch(unsigned int from, unsigned int maxLength)
     {
       // there is always a LZW code for the first byte
       int code = m_data[from++];
     
       // try to match second, third, ... byte, too
       for (unsigned int length = 1; length < maxLength; length++)
       {
         unsigned char oneByte = m_data[from++];
         code = m_dictionary[code][oneByte];
         // no continuation => return number of matching byte
         if (code == Unknown)
           return length;
       }
     
       // matched all the way (quite unlikely, yet possible)
       return maxLength;
     }
     
     
     // for debugging only, not used in release compilation
     LzwEncoder::RawData LzwEncoder::debugDecode(int code) const
     {
       RawData result;
       if (code == Unknown)
         return result;
       if (code >= (int)m_dictionary.size())
         return result; // actually an error ...
     
       // brute-force search ...
       int search = code;
       while (code > 255)
       {
         if (search > code)
           search = code;
     
         search--;
         if (search < 0)
           break;
     
         for (unsigned int next = 0; next <= 255; next++)
           if (m_dictionary[search][next] == code)
           {
             result.insert(result.begin(), char(next));
             code = search;
             break;
           }
       }
     
       result.insert(result.begin(), char(code));
       return result;
     }
     
     // for debugging only, not used in release compilation
     int LzwEncoder::findCode(unsigned int from, unsigned int maxLength) const
     {
       // there is always a LZW code for the first byte
       int code = m_data[from++];
     
       // try to match second, third, ... byte, too
       for (unsigned int length = 1; length < maxLength; length++)
       {
         unsigned char oneByte = m_data[from++];
         int prevCode = code;
         code = m_dictionary[code][oneByte];
         // no continuation => return number of matching byte
         if (code == Unknown)
           return prevCode;
       }
     
       // matched all the way (quite unlikely, yet possible)
       return code;
     }
     
     
     /// optimize a single block and update results in m_best
     LzwEncoder::BitStream LzwEncoder::optimizePartial(unsigned int from, unsigned int maxLength, bool emitBitStream, bool isFinal, OptimizationSettings optimize)
     {
       // allocate memory
       if (m_best.empty())
         m_best.resize(m_data.size() / optimize.alignment + 1 + 1);
     
       // length of current block
       unsigned int length = (unsigned int)m_data.size() - from;
       if (length > maxLength && maxLength != 0)
         length = maxLength;
     
       // LZW encoded data
       BitStream result;
       if (emitBitStream)
         result.reserve(length * m_maxCodeLength); // reserve a bit too much ... just to avoid any reallocations
     
       // index for m_best
       if (optimize.alignment == 0)
         optimize.alignment = 1;
       unsigned int fromAligned = from / optimize.alignment;
     #ifdef ALLOW_VERBOSE
       if (from % optimize.alignment != 0)
       {
         std::cerr << "optimizePartial is not allowed to start at a non-aligned offset (" << from << ", alignment=" << optimize.alignment << ")" << std::endl;
         throw "non-aligned block start is forbidden";
       }
     #endif
       // no recomputation in second pass of --prettygood mode if previous non-greedy search was unsuccessful
       if (optimize.greedy                    &&
           optimize.avoidNonGreedyAgain       &&
          !emitBitStream                      &&
           m_best[fromAligned].nongreedy == 0 &&
           m_best[fromAligned].length    >  0)
         return result;
     
       // special codes
       const unsigned int clear       = 1 << optimize.minCodeSize;
       const unsigned int endOfStream = clear + 1;
     
       // initialize dictionary
       std::array<int, 256> children;
       children.fill(Unknown);
       m_dictionary.resize(m_maxDictionary);
       for (unsigned int i = 0; i < m_maxDictionary; i++)
         m_dictionary[i] = children;
     
       if (m_isGif)
         m_dictSize = clear + 2;
       else
         m_dictSize = clear + 1; // .Z format: no endOfStream
     
       // initialize counters
       unsigned int  numBits   = 0;
       unsigned int  numTokens = 0;
       unsigned int  numNonGreedyMatches = 0;
     
       // total length of the current match, initially no match
       unsigned int  matchLength = 0;
       // bits per code
       unsigned char codeSize = getMinBits(m_dictSize);
     
       // process input
       unsigned int lastPos = from + length - 1;
       for (unsigned int i = from; i <= lastPos; i++)
       {
         // number of already processed bytes
         unsigned int numBytes = i - from + 1;
     
         // ----- match finding -----
     
         // need new match ?
         if (matchLength == 0)
         {
           // if blocks become too large then it's quite unlikely to find a better compression
           if (optimize.maxDictionary > 0 && m_dictSize >= optimize.maxDictionary)
             break; // some broken decoders can't handle a full dictionary, too
           if (optimize.maxTokens     > 0 && numTokens  >= optimize.maxTokens)
             break;
     
           // ----- the next 50+ lines to try optimize matchLength (greedy/non-greedy) -----
     
           // total number of bytes left in the current block
           unsigned int remaining = length + from - i;
           // find longest match (greedy), must not exceed number of available bytes
           matchLength = findMatch(i, remaining);
     
           // non-greedy lookahead
           bool tryNonGreedy = !optimize.greedy;
           if (matchLength == 1 || matchLength < optimize.minNonGreedyMatch)
             tryNonGreedy = false;
           // non-greedy search doesn't make sense when we're very close to the end
           if (i + matchLength + 4 >= m_data.size()) // the number 4 was chosen randomly ...
             tryNonGreedy = false;
     
           // flexible parsing
           if (tryNonGreedy)
           {
             // don't split long runs of the same pixels
             if (!optimize.splitRuns)// && matchLength >= 2)
             {
               // cheap check of first and last byte
               unsigned int lastMatchByte = matchLength - 1;
               bool allTheSame = (m_data[i] == m_data[i + lastMatchByte]);
     
               // and the rest of them until a mismatch is found
               for (unsigned int scan = 1; scan + 1 < lastMatchByte && allTheSame; scan++)
                 allTheSame = (m_data[i] == m_data[i + scan]);
               // all pixels are identical ?
               if (allTheSame)
                 tryNonGreedy = false;
             }
     
             // greedy matching after the current match
             unsigned int second  = findMatch(i + matchLength, remaining - matchLength);
             // sum of these two greedy matches
             unsigned int best    = matchLength + second;
             // look for an improvement
             unsigned int atLeast = best + optimize.minImprovement;
     
             // store length of currently best greedy/non-greedy match
             unsigned int choice = matchLength;
             // try all non-greedy lengths
             for (unsigned int shorter = matchLength - 1; shorter > 0; shorter--)
             {
               // greedy match of everything that follows
               unsigned int next = findMatch(i + shorter, remaining - shorter);
               unsigned int sum  = shorter + next;
               // longer ?
               if (sum >= atLeast && sum > best)
               {
                 best   = shorter + next;
                 choice = shorter;
               }
             }
     
             // found a better sequence than greedy match ?
             if (choice < matchLength)
             {
     #ifdef ALLOW_VERBOSE
               /*if (emitBitStream && optimize.verbose)
                 std::cout << "use non-greedy @ " << i << " \t before " << matchLength << "+" << second << "=" << matchLength+second
                                                       << "\tnow " << choice << "+" << (best - choice) << "=" << best
                                                       << " \t => +" << (best - (matchLength + second))
                                                       << std::endl;
               */
     #endif
               matchLength = choice;
               numNonGreedyMatches++;
             }
           } // end of flexible parsing
     
           // ----- LZW code generation -----
     
           // need one more bit per code ?
           if (m_dictSize < m_maxDictionary)
           {
             unsigned int threshold = m_dictSize - 1;
             // detect powerOfTwo (see https://bits.stephan-brumme.com/isPowerOfTwo.html )
             if ((threshold & (threshold - 1)) == 0 && codeSize < m_maxCodeLength) // but never use more than 12 (or 16) bits
             {
               codeSize++;
               // undo if first token (.Z format)
               if (!m_isGif && threshold == 256)
                 codeSize--;
             }
           }
     
           // update dictionary
           unsigned int code = addCode(i, matchLength);
           // append code to LZW stream
           if (emitBitStream)
             add(result, code, codeSize);
     
           // counters
           numBits += codeSize;
           numTokens++;
         }
     
         // "eat" next byte of the current match (until it's 0 and we look for the next match)
         matchLength--;
     
         // ----- cost evaluation -----
     
         // cost already determined ?
         if (optimize.readOnlyBest)
           continue;
     
         // don't update m_best if no block optimization started at the current slot
         bool isLastByte   = (i + 1 == m_data.size());
         // look at compression result of the remaining bytes
         unsigned int next =  i + 1;
         unsigned int nextAligned = next;
         if (optimize.alignment > 1)
           nextAligned = (next + optimize.alignment - 1) / optimize.alignment; // find current m_best index
         if (!isLastByte && m_best[nextAligned].totalBits == 0)
           continue;
     
         // save cost information only on aligned addresses (except for the last bytes)
         if (optimize.alignment > 1 && numBytes % optimize.alignment != 0)
         {
           //if (!isFinal)
             //continue;
           if (!isLastByte)
             continue;
         }
     
         // assuming the block would end here, a few extra bits are needed
         unsigned int add = codeSize; // clear / end-of-stream
         // increase code size just for the clear / end-of-stream code ?
         size_t threshold = m_dictSize - 1;
         if ((threshold & (threshold - 1)) == 0 && codeSize < m_maxCodeLength)
           add++;
     
         if (!m_isGif)
         {
           // TODO: only allow restart if clear code can be encoded in 16 bits
           if (!isLastByte && codeSize < 16)
             continue;
     
           // no endOfStream token in .Z file format
           if (isLastByte)
             add = 0;
     
           // fill last byte
           if (numBits % 8 != 0)
             add += 8 - (numBits % 8);
     
           // dictionary resets are followed by a bunch of zeros
           if (!isLastByte)
           {
             unsigned int tokensPlusClear = numTokens + 1;
             // compress' LZW must be aligned to 8 tokens
             unsigned int mod8 = tokensPlusClear & 7;
             unsigned int gap  = mod8 == 0 ? 0 : 8 - mod8;
             add += codeSize * gap;
           }
         }
     
         // true if the we are currently "inside" a match
         bool isPartial = (matchLength > 0);
     
         // compute final cost
         unsigned int       trueBits  = numBits  + add;
         unsigned long long totalBits = trueBits + m_best[nextAligned].totalBits;
     
         // better path ? (or no path found at all so far)
         BestBlock& best = m_best[fromAligned];
         if (best.totalBits == 0 || best.totalBits >= totalBits)
         {
           best.bits      = trueBits;
           best.totalBits = totalBits;
           best.length    = numBytes;
           best.tokens    = numTokens;
           best.partial   = isPartial;
           best.nongreedy = numNonGreedyMatches;
     
           // note: if there are multiple paths with the same cost, then longer blocks are preferred
           //       to change this, replace "best.totalBits >  totalBits"
           //                            by "best.totalBits >= totalBits"
           //       but keep in mind that a dictionary restart costs CPU cycles during decoding
         }
       }
     
       // done
       if (emitBitStream)
       {
         // end of block: emit either clear or endOfStream code
         codeSize = getMinBits(m_dictSize - 1);
         if (m_isGif)
         {
           add(result, isFinal ? endOfStream : clear, codeSize);
         }
         else
         {
           // only clear code, no endOfStream
           if (!isFinal)
           {
             add(result, clear, codeSize);
             numTokens++;
           }
     
           // fill current byte
           while (result.size() % 8 != 0)
           {
             result.push_back(0);
             numBits++;
           }
     
           // a block's number of token has to be a multiple of 8
           if (!isFinal)
           {
             if (codeSize != 16)
             {
     #ifdef ALLOW_VERBOSE
               std::cerr << "LZW code size is " << (int)codeSize << std::endl;
               throw "flexiGIF currently only supports restarts with code size = 16";
     #endif
               //throw "experimental code: currently only 16 bit clear codes supported";
               // some experimental values for numZeros if codeSize < 16:
               // 20/9 => 3
               // 21/9 => 2
               // 22/9 => 1
               // 30/9 => 2
               // 31/9 => 1
               // 50/9 => 3
             }
     
             // same formulas as in LzwDecoder
             unsigned int mod8 = numTokens & 7;
             unsigned int gap  = mod8 == 0 ? 0 : 8 - mod8;
             // add zeros
             unsigned int numZeros = codeSize * gap / 8;
     
     #ifdef ALLOW_VERBOSE
             //std::cout << "pad " << numZeros << " zeros" << std::endl;
     #endif
             for (unsigned int bit = 0; bit < 8*numZeros; bit++)
               result.push_back(0);
           }
         }
       }
     
       // error checking
       if (m_best[fromAligned].length == 0)
       {
     #ifdef ALLOW_VERBOSE
         //std::cerr <<  << std::endl;
         //throw "current block didn't yield any optimization infos";
     #endif
       }
     
       return result;
     }
     
     
     /// determine best block boundaries based on results of optimizePartial() and call merge()
     LzwEncoder::BitStream LzwEncoder::optimize(OptimizationSettings optimize)
     {
       // find shortest path
       unsigned int pos     = 0;
       unsigned int aligned = 0;
       std::vector<unsigned int> restarts;
       while (pos < m_data.size())
       {
         unsigned int length = m_best[aligned].length;
         // no continuation ?
         if (length == 0)
         {
     #ifdef ALLOW_VERBOSE
           if (optimize.verbose)
             std::cerr << "gap @ " << pos << " (aligned=" << aligned << ")" << std::endl;
     #endif
           throw "you need to choose a smaller alignment value or increase the token limit because there is a gap between two blocks";
         }
     
         pos    += length;
         aligned = pos / optimize.alignment;
         restarts.push_back(pos);
       }
     
       // LZW compress along the shortest path
       return merge(restarts, optimize);
     }
     
     
     /// optimize if block boundaries are known
     LzwEncoder::BitStream LzwEncoder::merge(std::vector<unsigned int> restarts, OptimizationSettings optimize)
     {
       // final result
       BitStream result;
       result.reserve(m_data.size() * 3); // basic heuristic to estimate memory consumption
     
       // optional: prepend clear code
       if (optimize.startWithClearCode && m_isGif)
       {
         // it's 2^minCodeSize which is a bunch of zeros followed by a one
         for (unsigned int i = 0; i < optimize.minCodeSize; i++)
           result.push_back(false);
         result.push_back(true);
       }
     
       // check number of restarts
       if (restarts.empty())
         return result;
       if (restarts.back() < m_data.size())
         restarts.push_back((unsigned int)m_data.size());
     
       // statistics
       std::vector<unsigned int> sizes;
       unsigned int totalSize = 0;
     
       // verbose mode displays garbage if using predefined block sizes
       bool verbose = optimize.verbose;
       if (m_best.empty() || m_best[0].bits == 0)
         verbose = false;
     
       unsigned int pos = 0;
       for (size_t i = 0; i < restarts.size(); i++)
       {
         // dummy entry for new block at pos 0
         if (restarts[i] == 0)
           continue;
     
         bool isFinal = (i == restarts.size() - 1);
         unsigned int length = restarts[i] - pos;
     
         // switch to faster settings
         if (!m_best.empty())
         {
           optimize.greedy = (m_best[pos / optimize.alignment].nongreedy == 0); // speed-up if non-greedy search was enabled but not successful in this block
           if (optimize.greedy)
             optimize.avoidNonGreedyAgain = true;
         }
         optimize.readOnlyBest = true;
     
         // compute current block
         BitStream block = optimizePartial(pos, length, true, isFinal, optimize);
         if (block.empty() && length > 0)
         {
           std::cerr << "internal error, block @ " << pos << " with length " << length << " is empty" << std::endl;
           throw "optimization failed due to an internal error";
         }
         // add to previous stuff
         result.insert(result.end(), block.begin(), block.end());
     
         unsigned int current = (unsigned int)block.size();
         sizes.push_back(current);
         totalSize += current;
     
     #ifdef ALLOW_VERBOSE
         if (verbose)
         {
           unsigned int aligned = (pos + optimize.alignment - 1) / optimize.alignment;
           BestBlock    block   = m_best[aligned];
           const char*  pixel   = m_isGif ? "pixel" : "byte";
           std::cout << "cost @ " << pos << " \t=> bits=" << sizes[i] << " \t" << pixel << "s=" << block.length << "\ttokens=" << block.tokens
                     << "\tbits/" << pixel << "=" << float(sizes[i]) / block.length
                     //<< "\testimated_bits=" << block.bits << " "
                     << (block.bits == sizes[i] ? "" : "?")
                     << "\tnon-greedy=" << block.nongreedy
                     << (block.partial       ? ", last match is partial" : "")
                     << std::endl;
         }
     #endif
     
         pos = restarts[i];
       }
     
     #ifdef ALLOW_VERBOSE
       if (optimize.verbose)
         std::cout << "finished, now " << result.size() << " LZW bits = " << (result.size() + 7) / 8 << " bytes" << std::endl;
     #endif
     
       return result;
     }
     
     
     /// add bits to BitStream
     void LzwEncoder::add(BitStream& stream, unsigned int token, unsigned char numBits)
     {
       while (numBits-- > 0)
       {
         stream.push_back((token & 1) != 0);
         token >>= 1;
       }
     }
     
     
     /// get minimum number of bits to represent token
     unsigned char LzwEncoder::getMinBits(unsigned int token)
     {
       unsigned char result = 0;
       do
       {
         token >>= 1;
         result++;
       } while (token != 0);
     
       return result;
     }

Actually, files needs to be decompressed first ...

show LzwDecoder.h


     // //////////////////////////////////////////////////////////
     // LzwDecoder.h
     // Copyright (c) 2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/
     
     #pragma once
     
     #include "BinaryInputBuffer.h"
     
     #include <vector>
     #include <string>
     
     /// decode a GIF file, support for writing the same file with optimized LZW bitstream
     /** errors throw an exception (const char*) **/
     class LzwDecoder
     {
     public:
       // -------------------- data types --------------------
       /// a continuous block of bytes
       typedef std::vector<unsigned char> Bytes;
     
       /// parse LZW bitstream
       explicit LzwDecoder(BinaryInputBuffer& input, bool isGif = true, unsigned char minCodeSize = 8, unsigned char maxCodeSize = 12, unsigned int expectedNumberOfBytes = 16*1024);
     
       /// return minimum LZW code size
       unsigned char getCodeSize() const;
       /// return decompressed data
       const Bytes&  getBytes()    const;
     
       /// for statistics only: true number of compressed bits
       unsigned int  getNumCompressedBits() const;
     
       /// show debug output
       static bool verbose;
     
     private:
       /// decompress data, first parameter is a hint to avoid memory reallocations, GIFs are limited to code size 12, .Z => 16
       void  decompress(unsigned int expectedNumberOfBytes, unsigned char minCodeSize, unsigned char maxCodeSize);
     
       /// read bits from current LZW block, start with next block if needed
       unsigned int getLzwBits(unsigned char numBits);
     
       /// tree node for an LZW code: its parent code and its last byte
       struct BackReference
       {
         int  length;   // match length
         int  previous; // code of parent (contains everything except for the last byte), if negative then there is no parent
         unsigned char last; // last byte of the match
     
         int   pos;      // position of a match in the output stream (often there are several candidates)
       };
       /// convert code to bytes and store in buffer
       void decode(Bytes& buffer, unsigned int code, const std::vector<BackReference>& lut) const;
     
       /// read file bit-by-bit
       BinaryInputBuffer& m_input;
       /// decoded file (pixels/bytes)
       Bytes         m_bytes;
     
       /// if true, then GIF's LZW data is grouped in blocks of 255 bytes each
       bool          m_isGif;
       /// minimum bits per LZW code
       unsigned char m_codeSize;
       /// LZW data is split in blocks, keep track of the remaining bits in the current block
       unsigned int  m_bitsLeftInCurrentBlock;
       /// total number of bits of original LZW compressed image (without block lengths)
       unsigned int  m_numBitsOriginalLZW;
     };

show LzwDecoder.cpp


     // //////////////////////////////////////////////////////////
     // LzwDecoder.cpp
     // Copyright (c) 2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/smallzwflexigif-lossless-gif-lzw-optimization/
     
     #include "LzwDecoder.h"
     
     #include <fstream>
     
     #define ALLOW_VERBOSE
     #ifdef  ALLOW_VERBOSE
     #include <iostream>
     #include <iomanip>
     #endif
     
     namespace
     {
       /// placeholder for "no parent code"
       const int          NoPrevious    = -1;
       /// GIFs have a valid end-of-stream token but not compress' LZW variant
       const unsigned int NoEndOfStream = 0xFFFFFFFF;
     }
     
     // statics
     bool LzwDecoder::verbose = false;
     
     
     /// parse LZW bitstream
     LzwDecoder::LzwDecoder(BinaryInputBuffer& input, bool isGif,
                            unsigned char minCodeSize, unsigned char maxCodeSize,
                            unsigned int expectedNumberOfBytes)
     : m_input(input),
       m_bytes(),
       m_isGif(isGif),
       m_codeSize(0),
       m_bitsLeftInCurrentBlock(0),
       m_numBitsOriginalLZW(0)
     {
       decompress(expectedNumberOfBytes, minCodeSize, maxCodeSize);
     }
     
     
     /// return minimum LZW code size
     unsigned char LzwDecoder::getCodeSize() const
     {
       return m_codeSize;
     }
     
     
     /// return decompressed data
     const LzwDecoder::Bytes& LzwDecoder::getBytes() const
     {
       return m_bytes;
     }
     
     
     /// for statistics only: true number of compressed bits
     unsigned int LzwDecoder::getNumCompressedBits() const
     {
       return m_numBitsOriginalLZW;
     }
     
     
     /// convert code to bytes and store in buffer
     void LzwDecoder::decode(Bytes& buffer, unsigned int code, const std::vector<BackReference>& lut) const
     {
       // single byte: exists in lookup table only
       if (lut[code].length == 1)
       {
         buffer.push_back(lut[code].last);
         return;
       }
     
       // old iterative decoder
       int length = lut[code].length;
       // resize buffer
       buffer.resize(buffer.size() + length);
       unsigned char* pos = &buffer[buffer.size() - 1];
       // copy bytes while walking backwards
       while (length-- > 0)
       {
         *pos-- = lut[code].last;
         code   = lut[code].previous;
       }
     }
     
     
     /// decompress data, first parameter is a hint to avoid memory reallocations, GIFs are limited to code size 12, .Z => 16
     void LzwDecoder::decompress(unsigned int expectedNumberOfBytes, unsigned char minCodeSize, unsigned char maxCodeSize)
     {
       // initial bits per token
       m_codeSize = minCodeSize;
     
     #ifdef ALLOW_VERBOSE
       if (verbose && m_isGif)
         std::cout << ", " << (int)minCodeSize << " bits" << std::endl;
       const char* pixel = m_isGif ? "pixel" : "byte";
     #endif
     
       // special codes
       const unsigned int clear       = 1 << minCodeSize;
       const unsigned int endOfStream = m_isGif ? clear + 1 : NoEndOfStream;
       const unsigned int maxColor    = clear - 1;
       const unsigned int MaxToken    = 1 << maxCodeSize;
     
       /// temporary look-up table for decompression
       std::vector<BackReference> lut(m_isGif ? clear + 2 : clear + 1);
       lut.reserve(MaxToken);
       // set initial contents
       for (unsigned int i = 0; i <= maxColor; i++)
       {
         lut[i].previous = NoPrevious;
         lut[i].last     = (unsigned char)i;
         lut[i].length   = 1;
         lut[i].pos      = NoPrevious;
       }
     
       unsigned char codeSize = minCodeSize + 1;
     
       m_bitsLeftInCurrentBlock = 0;
     
       // delete old data
       m_bytes.clear();
       m_bytes.reserve(expectedNumberOfBytes);
     
       // pass through first token
       int prevToken = NoPrevious;
       int token     = getLzwBits(codeSize);
       while (token == clear)
         token = getLzwBits(codeSize);
     
       if (token >= (int)lut.size())
       {
     #ifdef ALLOW_VERBOSE
         std::cerr << "found initial token " << token << " but only " << lut.size() << " dictionary entries" << std::endl;
     #endif
         throw "invalid token";
       }
     
       if (token != endOfStream)
         m_bytes.push_back((unsigned char)token);
     
       // counters
       unsigned int numTokensTotal = 1;
       unsigned int numTokensBlock = 1;
       unsigned int numBitsBlock   = codeSize;
       unsigned int numBitsTotal   = codeSize;
       unsigned int uptoLastBlock  = 0;
     
       while (token != endOfStream)
       {
         // one more bit per code ?
         unsigned int powerOfTwo = 1 << codeSize;
         if (lut.size() == powerOfTwo && codeSize < maxCodeSize)
           codeSize++;
     
         // quick hack: compress' LZW algorithm doesn't have an end-of-file code
         if (!m_isGif && codeSize > m_input.getNumBitsLeft())
           break; // abort if not enough bits left
     
         // next token
         prevToken = token;
         token = getLzwBits(codeSize);
         if (token > (int)lut.size())
         {
     #ifdef ALLOW_VERBOSE
           std::cerr << "found token " << token << " (" << (int)codeSize << " bits, "
                     << pixel << " " << m_bytes.size() << ") but only " << lut.size() << " dictionary entries" << std::endl;
     #endif
           throw "invalid token";
         }
     
         // counters
         numTokensTotal++;
         numTokensBlock++;
         numBitsBlock += codeSize;
         numBitsTotal += codeSize;
     
         // reset dictionary ?
         bool reset = false;
         while (token == clear)
         {
     #ifdef ALLOW_VERBOSE
           if (verbose)
           {
             std::cout << "restart @ "  << std::setw(7) << m_bytes.size()
                       << "    \tbits=" << std::setw(8) << numBitsTotal << " +" << numBitsBlock
                       << std::setprecision(3) << std::fixed
                       << "   \tbits/"  << pixel << "=" << numBitsBlock / float(m_bytes.size() - uptoLastBlock)
                       << "   \ttokens=+" << numTokensBlock << "/" << numTokensTotal
                       << "   \tdict="  << std::setw(4) << lut.size() << std::endl;
           }
     #endif
     
           // delete all codes with 2+ bytes
           if (m_isGif)
           {
             lut.resize(clear + 2);
           }
           else
           {
             // one entry less => no end-of-file token
             lut.resize(clear + 1);
     
             // bits leftover in the current byte must be ignored
             if (m_numBitsOriginalLZW % 8 != 0)
             {
               unsigned int skipBits = 8 - (m_numBitsOriginalLZW % 8);
               getLzwBits(skipBits);
             }
     
             // continue with next byte, remove "garbage" bytes, too
             // a block's number of token has to be a multiple of 8
             unsigned int mod8 = numTokensBlock & 7;
             unsigned int gap  = mod8 == 0 ? 0 : 8 - mod8;
             // the following does the same computation in one line (I like that weird bit magic ...)
             //unsigned int gap = (-(int)numTokensBlock) & 7; // from https://www.ioccc.org/2015/mills2/hint.html
     
             while (gap-- > 0)
               m_input.getBits(codeSize); // remember, a token takes 16 bits when approaching the LZW restart
     
             // by the way: the GZIP sources have this formula (it doesn't compute the offset directly, though, and accounts for cases where <=16 bits/token):
             //posbits = ((posbits-1) + ((n_bits<<3)-(posbits-1+(n_bits<<3))%(n_bits<<3)));
           }
     
           // reset code size
           codeSize = minCodeSize + 1;
     
           // fetch first token
           prevToken = NoPrevious;
           token = getLzwBits(codeSize);
     
           // counters
           numTokensTotal++;
           numTokensBlock = 1;
           numBitsBlock   = codeSize;
           numBitsTotal  += codeSize;
           uptoLastBlock  = (unsigned int)m_bytes.size();
     
           // copy first token to output
           if (token > (int)maxColor)
             throw "block starts with an undefined value/color";
           m_bytes.push_back((unsigned char)token);
     
           reset = true;
         }
     
         if (reset)
           continue;
     
         // done ?
         if (token == endOfStream)
           break;
     
         // new LZW code
         BackReference add;
         add.previous = prevToken;
         add.length   = lut[prevToken].length + 1;
         add.pos      = (unsigned int)m_bytes.size();
     
         // look up token in dictionary
         if (token >= (int)lut.size())
         {
           // broken stream ?
           if (token != lut.size())
             throw "unknown token found";
           if (lut.size() >= MaxToken)
             throw "dictionary too large";
     
           // unknown token:
           // output LAST + LAST[0]
           // add    LAST + LAST[0]
           decode(m_bytes, prevToken, lut);
           add.last = m_bytes[add.pos];
           m_bytes.push_back(add.last);
         }
         else
         {
           // known token:
           // output TOKEN
           // add    LAST + TOKEN[0]
           // add new chunk to table (but no more than 2^12)
           decode(m_bytes, token, lut);
           add.last = m_bytes[add.pos];
         }
     
         // add LZW code to the dictionary:
         // the if-condition is required in case the dictionary is full and there hasn't been a clear-token
         if (lut.size() < MaxToken)
           lut.push_back(add);
       }
     
     #ifdef ALLOW_VERBOSE
       if (verbose)
         std::cout << "finish  @ "  << std::setw(7) << m_bytes.size()
                   << "    \tbits=" << std::setw(8) << numBitsTotal << " +" << numBitsBlock
                   << std::setprecision(3) << std::fixed
                   << "   \tbits/"  << pixel << "=" << std::setw(4) << numBitsBlock / float(m_bytes.size() - uptoLastBlock)
                   << "   \ttokens=+" << numTokensBlock << "/" << numTokensTotal
                   << "   \tdict="  << std::setw(4) << lut.size() << std::endl;
     #endif
     
       // skip remaining bits
       unsigned int unusedBits = m_bitsLeftInCurrentBlock;
       // process files where a few bytes are wasted
       if (unusedBits >= 8)
       {
     #ifdef ALLOW_VERBOSE
         std::cerr << "too many bits left in current block (" << unusedBits << ")" << std::endl;
     #endif
         while (unusedBits > 8)
         {
           getLzwBits(8);
           unusedBits -= 8;
         }
       }
       // ... and now actually skip the unused bits
       token = getLzwBits(unusedBits); // TODO: isn't it always 0 for non-GIFs ?
     
       // GIF only: a zero-sized block must follow
       if (m_isGif)
       {
         if (m_input.getByte() != 0)
           throw "LZW data is not properly terminated";
       }
     
       // don't include the "garbage" bits in the final count
       m_numBitsOriginalLZW -= unusedBits;
     
       // okay, we're done !
     }
     
     
     /// read bits from current LZW block, start with next block if needed
     unsigned int LzwDecoder::getLzwBits(unsigned char numBits)
     {
       // degenerated case
       if (numBits == 0)
         return 0;
     
       m_numBitsOriginalLZW += numBits;
     
       // compress has a simple LZW format, just read the bits ...
       if (!m_isGif)
         return m_input.getBits(numBits);
     
       // ... but GIF encapsulates the data in blocks of at most 255, each block is preceded by a length byte
     
       // enough bits available in the current block ?
       if (numBits <= m_bitsLeftInCurrentBlock)
       {
         m_bitsLeftInCurrentBlock -= numBits;
         return m_input.getBits(numBits);
       }
     
       // cross block border
       unsigned int  low   = 0;
       unsigned char shift = 0;
       // fetch all bits from current block ("low" bits of the token)
       if (m_bitsLeftInCurrentBlock > 0)
       {
         low      = m_input.getBits(m_bitsLeftInCurrentBlock);
         shift    = m_bitsLeftInCurrentBlock;
         numBits -= m_bitsLeftInCurrentBlock;
       }
     
       // read size of new block
       m_bitsLeftInCurrentBlock = 8 * m_input.getByte();
       if (m_bitsLeftInCurrentBlock < numBits)
         throw "too few bits available in unlzw";
     
       // and get remaining bits ("high" bits)
       unsigned int high = m_input.getBits(numBits) << shift;
       m_bitsLeftInCurrentBlock -= numBits;
     
       return low | high;
     }

GIF-related stuff:

show GifImage.h


     // //////////////////////////////////////////////////////////
     // GifImage.h
     // Copyright (c) 2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/
     
     #pragma once
     
     #include "BinaryInputBuffer.h"
     
     #include <vector>
     #include <string>
     #include <utility> // for std::pair
     
     /// decode a GIF file, support for writing the same file with optimized LZW bitstream
     /** errors throw an exception (const char*) **/
     class GifImage
     {
     public:
       // -------------------- data types --------------------
       /// a continuous block of bytes
       typedef std::vector<unsigned char> Bytes;
     
       /// RGB color
       struct Color
       {
         unsigned char red;
         unsigned char green;
         unsigned char blue;
       };
     
       /// extension IDs
       enum ExtensionType
       {
         PlainText      = 0x01,
         GraphicControl = 0xF9,
         Comment        = 0xFE,
         Application    = 0xFF
       };
     
       /// a single frame
       struct Frame
       {
         /// simple copy of frame's header
         Bytes         rawHeader;
     
         /// extensions
         std::vector<std::pair<ExtensionType, Bytes> > extensions;
     
         /// each frame's bits per token
         unsigned char codeSize;
         /// pixels / indices
         Bytes         pixels;
     
         /// frame's upper left corner (relative to the global image)
         unsigned int  offsetLeft;
         unsigned int  offsetTop;
         /// frame size
         unsigned int  width;
         unsigned int  height;
     
         /// true if colors are sorted
         bool          isSorted;
         /// true if interlaced
         bool          isInterlaced;
         /// file position of interlaced flag (TODO: my code's broken for animated GIFs)
         unsigned int  posInterlaced;
         /// local color map, stored
         std::vector<Color> localColorMap;
     
         /// original LZW size (in bits)
         unsigned int  numLzwBits;
       };
     
       // -------------------- methods --------------------
       /// load file
       explicit GifImage(const std::string& filename);
     
       /// number of frames (or 1 if not animated)
       unsigned int  getNumFrames() const;
       /// return decompressed data (indices for local/global color map)
       const GifImage::Frame& getFrame(unsigned int frame = 0) const;
     
       /// color depth (bits per pixel)
       unsigned char getColorDepth() const;
     
       /// replace LZW data with optimized data and write to disk (bitDepth = 0 means "take value from m_colorDepth")
       unsigned int  writeOptimized(const std::string& filename, const std::vector<std::vector<bool> >& bits, unsigned char bitDepth = 0);
     
       /// convert from non-interlaced to interlaced (and vice versa)
       void setInterlacing(bool makeInterlaced);
     
       /// for debugging only: store image data in PPM format
       bool dumpPpm(const std::string& filename, unsigned int frame = 0) const;
       /// for debugging only: store indices
       bool dumpIndices(const std::string& filename, unsigned int frame = 0) const;
     
       /// show debug output
       static bool verbose;
     
     private:
       /// read signature GIF 87a/89a
       void parseSignature();
       /// global image parameters (constant for all frame)
       void parseGlobalDescriptor();
       /// GIF extensions (e.g. animation settings)
       void parseExtensions     (Frame& frame);
       /// local image parameters
       void parseLocalDescriptor(Frame& frame);
       /// final bytes of the image
       void parseTerminator();
     
       /// read 16 bits from m_input, little endian
       unsigned short getWord();
     
       /// the header will remain untouched
       Bytes         m_rawHeader;
       /// the last byte will remain untouched, too
       Bytes         m_rawTrailer; // contains just one byte, it's always 0x3B
     
       /// file version ("GIF87a" or "GIF89a")
       std::string   m_version;
       /// width  (in pixels)
       unsigned int  m_width;
       /// height (in pixels)
       unsigned int  m_height;
       /// bits per color
       unsigned char m_colorDepth;
       /// true if colors are sorted
       bool          m_isSorted;
       /// number of colors
       unsigned int  m_sizeGlobalColorMap;
       /// palette index of background color
       unsigned char m_backgroundColor;
       /// aspect ratio
       unsigned char m_aspectRatio;
     
       /// true, if animated
       bool          m_isAnimated;
     
       /// global color map
       std::vector<Color> m_globalColorMap;
     
       /// simple wrapper to read file bit-wise
       BinaryInputBuffer  m_input;
     
       /// decompressed frames (indices for local/global color map)
       std::vector<Frame> m_frames;
     };

show GifImage.cpp


     // //////////////////////////////////////////////////////////
     // GifImage.cpp
     // Copyright (c) 2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/smallzwflexigif-lossless-gif-lzw-optimization/
     
     // official GIF spec: https://www.w3.org/Graphics/GIF/spec-gif89a.txt
     
     #include "GifImage.h"
     #include "LzwDecoder.h"
     
     #include <fstream>
     
     #define ALLOW_VERBOSE
     #ifdef  ALLOW_VERBOSE
     #include <iostream>
     #include <iomanip>
     #endif
     
     // statics
     bool GifImage::verbose = false;
     
     /// load file
     GifImage::GifImage(const std::string& filename)
     : m_rawHeader(),
       m_rawTrailer(),
       m_version(),
       m_width(0),
       m_height(0),
       m_colorDepth(0),
       m_isSorted(false),
       m_sizeGlobalColorMap(0),
       m_backgroundColor(0),
       m_aspectRatio(0),
       m_isAnimated(false),
       m_globalColorMap(),
       m_input(filename),
       m_frames()
     {
       try
       {
         if (m_input.empty())
           throw "file not found or empty";
     
         // parse header
         parseSignature();
         parseGlobalDescriptor();
     
     #ifdef ALLOW_VERBOSE
         if (verbose)
           std::cout << "'" << filename << "' image size " << m_width << "x" << m_height << ", " << (1 << m_colorDepth) << " colors" << std::endl;
     #endif
     
         // copy global header
         size_t numBytesHeader = m_input.getNumBytesRead();
         std::ifstream rawFile(filename.c_str(), std::ios::in | std::ios::binary);
         m_rawHeader.resize(numBytesHeader);
         rawFile.read((char*)&m_rawHeader[0], numBytesHeader);
     
         unsigned int totalLzwBits = 0;
     
         // decompress LZW
         while (true)
         {
           unsigned int bytesReadSoFar = m_input.getNumBytesRead();
     
           // technically it's impossible to encounter the end-of-file marker before the first frame
           unsigned char marker = m_input.peekBits(8);
           if (marker == 0x3B)
             break;
     
     #ifdef ALLOW_VERBOSE
           if (verbose)
             std::cout << "decompress frame " << (m_frames.size() + 1) << ": ";
     #endif
     
           Frame frame;
     
           // parse frame header
           parseExtensions     (frame);
           parseLocalDescriptor(frame);
     
           // and copy frame header
           unsigned int frameHeaderSize = m_input.getNumBytesRead() - bytesReadSoFar;
           frame.rawHeader.resize(frameHeaderSize);
           rawFile.seekg(bytesReadSoFar);
           rawFile.read((char*)&frame.rawHeader[0], frameHeaderSize);
     
           // decode LZW stream
           unsigned char minCodeSize = m_input.getByte();
           unsigned char maxCodeSize = 12; // constant value according to spec
           LzwDecoder::verbose = GifImage::verbose;
           LzwDecoder lzw(m_input, true, minCodeSize, maxCodeSize, m_width * m_height);
           frame.pixels     = lzw.getBytes();
           frame.codeSize   = lzw.getCodeSize();
           totalLzwBits    += lzw.getNumCompressedBits();
           frame.numLzwBits = lzw.getNumCompressedBits();
     
           // yeah, finished another frame ...
           m_frames.push_back(frame);
         }
     
         // read/copy the last bytes of the file, too
         m_rawTrailer = { (unsigned char)m_input.peekBits(8) };
         parseTerminator();
     
         if (!m_input.empty() != 0)
           throw "there is still some data left ...";
     
     #ifdef ALLOW_VERBOSE
         if (verbose)
         {
           const char* frames = (m_frames.size() == 1) ? "frame" : "frames";
           std::cout << m_frames.size() << " " << frames << ", " << totalLzwBits << " bits, " << m_frames.front().pixels.size() << " pixels plus " << numBytesHeader << " header bytes" << std::endl;
         }
     #endif
       }
       catch (const char* e)
       {
     #ifdef ALLOW_VERBOSE
         std::cout << "ERROR: " << (e ? e : "(unknown exception)") << std::endl;
     #else
         throw;
     #endif
       }
     }
     
     
     /// number of frames (or 1 if not animated)
     unsigned int GifImage::getNumFrames() const
     {
       return (unsigned int)m_frames.size();
     }
     
     
     /// return decompressed data (indices for local/global color map)
     const GifImage::Frame& GifImage::getFrame(unsigned int frame) const
     {
       if (frame >= m_frames.size())
         throw "invalid frame number";
     
       return m_frames.at(frame);
     }
     
     
     /// color depth (bits per pixel)
     unsigned char GifImage::getColorDepth() const
     {
       return m_colorDepth;
     }
     
     
     /// replace LZW data with optimized data and write to disk (bitDepth = 0 means "take value from m_colorDepth")
     unsigned int GifImage::writeOptimized(const std::string& filename, const std::vector<std::vector<bool> >& bits, unsigned char bitDepth)
     {
       std::ofstream file(filename.c_str(), std::ios::out | std::ios::binary);
       // write original header
       file.write((char*)&m_rawHeader[0], m_rawHeader.size());
     
       // bits per pixel/code
       if (bitDepth == 0)
         bitDepth = m_colorDepth;
     
       for (unsigned int frame = 0; frame < bits.size(); frame++)
       {
         // frame header
         file.write((char*)&m_frames[frame].rawHeader[0], m_frames[frame].rawHeader.size());
     
         // minCodeSize
         file << m_frames[frame].codeSize;
     
         // iterate over bits
         const std::vector<bool>& current = bits[frame];
         unsigned int pos = 0;
         while (pos < current.size())
         {
           // each block contains at most 255 bytes (=255*8 bits)
           unsigned int bitsCurrentBlock = (unsigned int)current.size() - pos;
           const unsigned int MaxBitsPerBlock = 255*8;
           if (bitsCurrentBlock > MaxBitsPerBlock)
             bitsCurrentBlock = MaxBitsPerBlock;
     
           // size in bytes, round up
           unsigned char blockSize = (bitsCurrentBlock + 7) / 8;
           file << blockSize;
     
           // merge single bits, write to disk
           for (unsigned int i = 0; i < blockSize; i++)
           {
             // create one byte
             unsigned char oneByte = 0;
             for (unsigned char bit = 0; bit < 8 && pos < current.size(); bit++, pos++)
               if (current[pos])
                 oneByte |= 1 << bit;
     
             // and write to disk
             file << oneByte;
           }
         }
     
         // add an empty block after each image
         file << (unsigned char) 0;
       }
     
       // write terminator
       file.write((char*)&m_rawTrailer[0], m_rawTrailer.size());
     
       // and we're done
       unsigned int filesize = (unsigned int)file.tellp();
       file.close();
       return filesize;
     }
     
     
     /// for debugging only: store image data in PPM format
     bool GifImage::dumpPpm(const std::string& filename, unsigned int frame) const
     {
       const Frame& current = m_frames.at(frame);
       if (current.width  != m_width ||
           current.height != m_height)
         throw "PPM for partial frames not supported yet";
     
       // header
       std::ofstream file(filename.c_str(), std::ios::out | std::ios::binary);
       if (!file)
         return false;
     
       file << "P6\n"
            << m_width << " " << m_height << "\n"
            << "255\n";
     
       // color mapping for current frame
       std::vector<Color> colorMap = m_globalColorMap;
       for (unsigned int i = 0; i < current.localColorMap.size(); i++)
         colorMap[i] = current.localColorMap[i];
     
       // convert indices to RGB
       const Bytes& pixels = current.pixels;
       for (unsigned int i = 0; i < pixels.size(); i++)
       {
         Color color = colorMap[pixels[i]];
         file << color.red
              << color.green
              << color.blue;
       }
     
       return file.good();
     }
     
     
     /// for debugging only: store indices
     bool GifImage::dumpIndices(const std::string& filename, unsigned int frame) const
     {
       const Frame& current = m_frames.at(frame);
       if (current.width  != m_width ||
           current.height != m_height)
         throw "dumping indices of partial frames not supported yet";
     
       // current.pixels are a consecutive memory block, just dump to disk
       std::ofstream file(filename.c_str(), std::ios::out | std::ios::binary);
       file.write((const char*)&current.pixels[0], current.pixels.size());
       return file.good();
     }
     
     
     /// convert from non-interlaced to interlaced (and vice versa)
     void GifImage::setInterlacing(bool makeInterlaced)
     {
       if (m_frames.size() != 1)
         throw "code doesn't work yet for animated GIFs";
     
       unsigned int posInterlaced = m_frames.front().posInterlaced;
       if (posInterlaced == 0)
         throw "interlaced bit not found";
     
       if (m_isAnimated)
         throw "interlacing in animation not supported yet";
     
       // interlacing doesn't matter for a single line
       if (m_height <= 1)
         return;
     
       const unsigned char mask = 0x40;
       bool isInterlaced = (m_rawHeader[posInterlaced] & mask) != 0;
       // keep current interlacing mode ?
       if (isInterlaced == makeInterlaced)
         return;
     
       // line order:
       // A) every 8th row, beginning with 0th row
       // B) every 8th row, beginning with 4th row
       // C) every 4th row, beginning with 2nd row
       // D) every 2nd row, beginning with 1st row
     
       for (unsigned int frame = 0; frame < m_frames.size(); frame++)
       {
         Bytes& current = m_frames[frame].pixels;
     
         if (makeInterlaced)
         {
           // non-interlaced => interlaced
     
           // set flag
           m_rawHeader[posInterlaced] |= mask;
     
           // re-order lines
           Bytes interlaced;
           interlaced.reserve(current.size());
           for (unsigned int y = 0; y < m_height; y += 8)
             interlaced.insert(interlaced.end(), current.begin() + y * m_width, current.begin() + y * m_width + m_width);
           for (unsigned int y = 4; y < m_height; y += 8)
             interlaced.insert(interlaced.end(), current.begin() + y * m_width, current.begin() + y * m_width + m_width);
           for (unsigned int y = 2; y < m_height; y += 4)
             interlaced.insert(interlaced.end(), current.begin() + y * m_width, current.begin() + y * m_width + m_width);
           for (unsigned int y = 1; y < m_height; y += 2)
             interlaced.insert(interlaced.end(), current.begin() + y * m_width, current.begin() + y * m_width + m_width);
           current = interlaced;
         }
         else
         {
           // interlaced => non-interlaced
     
           // unset flag
           m_rawHeader[posInterlaced] &= ~mask;
     
           // re-order lines
           Bytes interlaced = current;
           current.clear();
     
           // offsets of the first line of each line
           unsigned int pass0 = 0;
           unsigned int pass1 = (m_height + 8 - 1) / 8;
           // same as
           //pass1 = pass0;
           //for (unsigned int y = 0; y < m_height; y += 8)
           //  pass1++;
     
           unsigned int pass2 = pass1 + (m_height + 8 - 5) / 8;
           //pass2 = pass1;
           //for (unsigned int y = 4; y < m_height; y += 8)
           //  pass2++;
     
           unsigned int pass3 = pass2 + (m_height + 4 - 1) / 4;
           //pass3 = pass2;
           //for (unsigned int y = 2; y < m_height; y += 4)
             //pass3++;
     
           for (unsigned int y = 0; y < m_height; y++)
             switch (y % 8)
             {
             case 0:
               current.insert(current.end(), interlaced.begin() + pass0 * m_width, interlaced.begin() + (pass0 + 1) * m_width);
               pass0++;
               break;
     
             case 4:
               current.insert(current.end(), interlaced.begin() + pass1 * m_width, interlaced.begin() + (pass1 + 1) * m_width);
               pass1++;
               break;
     
             case 2: case 6:
               current.insert(current.end(), interlaced.begin() + pass2 * m_width, interlaced.begin() + (pass2 + 1) * m_width);
               pass2++;
               break;
     
             case 1: case 3: case 5: case 7:
               current.insert(current.end(), interlaced.begin() + pass3 * m_width, interlaced.begin() + (pass3 + 1) * m_width);
               pass3++;
               break;
             }
         }
       }
     }
     
     
     /// read signature GIF 87a/89a
     void GifImage::parseSignature()
     {
       const unsigned int NumBytes = 6;
       m_version.resize(NumBytes);
       for (unsigned int i = 0; i < NumBytes; i++)
         m_version[i] = m_input.getByte();
     
       // always starts with "GIF"
       if (m_version[0] != 'G' ||
           m_version[1] != 'I' ||
           m_version[2] != 'F')
         throw "invalid file signature";
     
       // version is either "87a" or "89a"
       if ( m_version[3] != '8' ||
           (m_version[4] != '7' && m_version[4] != '9') ||
            m_version[5] != 'a')
         throw "invalid GIF version, only 87a and 89a supported";
     }
     
     
     /// global image parameters (constant for all frame)
     void GifImage::parseGlobalDescriptor()
     {
       // get size (16 bits stored little endian)
       m_width  = getWord();
       m_height = getWord();
     
       // bits per color => 8 if 256 colors
       m_colorDepth  = m_input.getBits(3) + 1;
       m_sizeGlobalColorMap = 1 << m_colorDepth;
     
       // unused: true if colors are sorted descendingly (by "importance")
       m_isSorted = m_input.getBool();
       m_input.removeBits(3); // skip 3 bite
     
       // has palette ?
       bool hasGlobalColorMap = m_input.getBool();
       if (!hasGlobalColorMap)
         m_sizeGlobalColorMap = 0;
     
       m_backgroundColor = m_input.getByte();
       m_aspectRatio     = m_input.getByte();
     
       // read global palette
       m_globalColorMap.resize(m_sizeGlobalColorMap);
       for (unsigned int i = 0; i < m_sizeGlobalColorMap; i++)
       {
         Color color;
         color.red   = m_input.getByte();
         color.green = m_input.getByte();
         color.blue  = m_input.getByte();
     
         m_globalColorMap[i] = color;
       }
     }
     
     
     /// GIF extensions (e.g. animation settings)
     void GifImage::parseExtensions(Frame& frame)
     {
       while (true)
       {
         // each extension starts with 0x21
         unsigned char marker = m_input.peekBits(8);
         if (marker != 0x21)
           return;
         m_input.removeBits(8);
     
         // get extension type
         ExtensionType identifier = (ExtensionType)m_input.getBits(8);
     
         if (identifier == GraphicControl)
         {
           //std::cout << "HACK: assume animated" << std::endl;
           m_isAnimated = true;
         }
     
         // read all its parts (usually just one part)
         Bytes data;
         while (true)
         {
           unsigned char length = m_input.getByte();
           // last part ?
           if (length == 0)
             break;
     
           // skip contents
           for (unsigned char i = 0; i < length; i++)
             data.push_back(m_input.getByte());
         };
     
         frame.extensions.push_back(std::make_pair(identifier, data));
       };
     }
     
     
     /// local image parameters
     void GifImage::parseLocalDescriptor(Frame& frame)
     {
       unsigned char identifier = m_input.getByte();
       if (identifier != 0x2C)
         throw "expected local descriptor, but not found";
     
       // frame dimensions
       frame.offsetLeft = getWord();
       frame.offsetTop  = getWord();
       frame.width      = getWord();
       frame.height     = getWord();
     
       // HACK: doesn't work for animations
       frame.posInterlaced = m_input.getNumBytesRead();
     
       // color map related stuff
       unsigned int sizeLocalColorMap = 1 << (m_input.getBits(3) + 1);
       m_input.removeBits(2);
       frame.isSorted        = m_input.getBool();
       frame.isInterlaced    = m_input.getBool();
       bool hasLocalColorMap = m_input.getBool();
       if (!hasLocalColorMap)
         sizeLocalColorMap = 0;
     
     #ifdef ALLOW_VERBOSE
       if (verbose)
       {
         std::cout << frame.width << "x" << frame.height << " located at " << frame.offsetLeft << "x" << frame.offsetTop;
         if (frame.isInterlaced)
           std::cout << ", interlaced";
         if (hasLocalColorMap)
           std::cout << ", local color map size=" << sizeLocalColorMap;
       }
     #endif
     
       // copy RGB colors
       frame.localColorMap.resize(sizeLocalColorMap);
       for (unsigned int i = 0; i < sizeLocalColorMap; i++)
       {
         Color color;
         color.red   = m_input.getByte();
         color.green = m_input.getByte();
         color.blue  = m_input.getByte();
     
         frame.localColorMap[i] = color;
       }
     }
     
     
     /// final bytes of the image
     void GifImage::parseTerminator()
     {
       unsigned char identifier = m_input.getByte();
       if (identifier != 0x3B)
         throw "invalid terminator";
     }
     
     
     /// read 16 bits, little endian
     unsigned short GifImage::getWord()
     {
       unsigned short low  = m_input.getByte();
       unsigned short high = m_input.getByte();
       return low + (high << 8);
     }

And the same for .Z file format:

show Compress.h


     // //////////////////////////////////////////////////////////
     // Compress.h
     // Copyright (c) 2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/
     
     #pragma once
     
     #include "BinaryInputBuffer.h"
     
     #include <vector>
     #include <string>
     
     /// decode a .Z compressed file (created by the old compress unix tool)
     /** errors throw an exception (const char*) **/
     class Compress
     {
     public:
       // -------------------- data types --------------------
       /// a continuous block of bytes
       typedef std::vector<unsigned char> Bytes;
     
       // -------------------- methods --------------------
       /// load file
       explicit Compress(const std::string& filename, bool loadAsUncompressedIfWrongMagicBytes = false);
     
       /// replace LZW data with optimized data and write to disk
       unsigned int writeOptimized(const std::string& filename, const std::vector<bool>& bits);
     
       /// get uncompressed contents
       const Bytes& getData() const;
     
       /// for debugging only: save uncompressed data
       bool dump(const std::string& filename) const;
     
       /// show debug output
       static bool verbose;
     
     private:
       /// first two bytes of every .Z file
       enum
       {
         MagicByte1 = 0x1F,
         MagicByte2 = 0x9D
       };
     
       /// settings of the original file (third byte of that file)
       unsigned char     m_settings;
     
       /// simple wrapper to read file bit-wise
       BinaryInputBuffer m_input;
       /// uncompressed bytes
       Bytes             m_data;
     };

show Compress.cpp


     // //////////////////////////////////////////////////////////
     // Compress.cpp
     // Copyright (c) 2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/smallzwflexigif-lossless-gif-lzw-optimization/
     
     // I couldn't find a proper spec of the .Z file format
     // my code is loosely based on GZIP's LZW implementation
     // https://research.cs.wisc.edu/wpis/examples/pcca/gzip/gzip.c
     // other resources:
     // https://wiki.wxwidgets.org/Development:_Z_File_Format
     // https://www.ioccc.org/2015/mills2/hint.html
     
     #include "Compress.h"
     #include "LzwDecoder.h"
     
     #include <fstream>
     
     #define ALLOW_VERBOSE
     #ifdef  ALLOW_VERBOSE
     #include <iostream>
     #include <iomanip>
     #endif
     
     // statics
     bool Compress::verbose = false;
     
     /// load file
     Compress::Compress(const std::string& filename, bool loadAsUncompressedIfWrongMagicBytes)
     : m_settings(0),
       m_input(filename),
       m_data()
     {
       try
       {
         if (m_input.empty())
           throw "file not found or empty";
     
         // check magic bytes
         bool isZ = (m_input.getByte() == MagicByte1);
         isZ     |= (m_input.getByte() == MagicByte2);
     
         // has proper magic bytes ?
         if (isZ)
         {
           // compression settings
           m_settings = m_input.getByte();
           // default format is "block mode", where the highest bit is set
           if ((m_settings & 0x80) == 0)
             throw "only .Z block mode supported";
           // unused bits, must be zero
           if ((m_settings & 0x60) != 0)
             throw "unknown .Z format flag found";
     
           // maximum bits per LZW code, almost always 16
           unsigned char maxBits = m_settings & 0x1F;
     
           // get filesize
           std::fstream in(filename.c_str(), std::ios::in | std::ios::binary);
           in.seekg(0, in.end);
           unsigned int filesize = (unsigned int)in.tellg();
           unsigned int expected = 3 * filesize; // crude heuristic for size of uncompressed data
           in.close();
     
           // and decompress !
           LzwDecoder::verbose = Compress::verbose;
           LzwDecoder lzw(m_input, false, 8, maxBits, expected);
           m_data = lzw.getBytes();
         }
         else
         {
           // should it have a .Z file ?
           if (!loadAsUncompressedIfWrongMagicBytes)
             throw "file is not a .Z compressed file (magic bytes don't match)";
     
           // just read from disk
           std::fstream in(filename.c_str(), std::ios::in | std::ios::binary);
           // get filesize
           in.seekg(0, in.end);
           size_t numBytes = (size_t)in.tellg();
           // allocate memory and read everything at once
           m_data.resize(numBytes);
           in.seekg(0, in.beg);
           in.read((char*)&m_data[0], numBytes);
         }
       }
       catch (const char* e)
       {
     #ifdef ALLOW_VERBOSE
         std::cout << "ERROR: " << (e ? e : "(unknown exception)") << std::endl;
     #endif
       }
     }
     
     
     /// replace LZW data with optimized data and write to disk
     unsigned int Compress::writeOptimized(const std::string& filename, const std::vector<bool>& bits)
     {
       std::ofstream file(filename.c_str(), std::ios::out | std::ios::binary);
       // write magic bytes
       file.put((char)MagicByte1);
       file.put((char)MagicByte2);
       // and settings
       file.put((char)m_settings);
     
       unsigned int pos = 0;
       // merge single bits, write to disk
       while (pos < bits.size())
       {
         // create one byte
         unsigned char oneByte = 0;
         for (unsigned char bit = 0; bit < 8; bit++, pos++)
           if (pos < bits.size() && bits[pos])
             oneByte |= 1 << bit;
     
         // and write to disk
         file << oneByte;
       }
     
       // and we're done
       unsigned int filesize = (unsigned int)file.tellp();
       file.close();
       return filesize;
     }
     
     
     /// get uncompressed contents
     const Compress::Bytes& Compress::getData() const
     {
       return m_data;
     }
     
     
     /// for debugging only: save uncompressed data
     bool Compress::dump(const std::string& filename) const
     {
       std::ofstream file(filename.c_str(), std::ios::out | std::ios::binary);
       file.write((char*)&m_data[0], m_data.size());
       return file.good();
     }

A simple helper class to read files bitwise:

show BinaryInputBuffer.h


     // //////////////////////////////////////////////////////////
     // BinaryInputBuffer.h
     // Copyright (c) 2011-2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/disclaimer.html
     //
     
     #pragma once
     
     #pragma warning (disable: 4530) // missing exception handling
     #include <string>
     #include <fstream>
     
     /// read a file bit-wise
     class BinaryInputBuffer
     {
     public:
       /// read from file
       explicit BinaryInputBuffer(const std::string& filename);
     
       /// get number of bytes read so far
       unsigned int   getNumBytesRead() const;
       /// get number of bits  still available
       unsigned int   getNumBitsLeft()  const;
       /// true, if no bits left
       bool           empty()           const;
     
       /// take a look at the next bits without advancing the file pointer
       unsigned int   peekBits  (unsigned char numBits); // at most 16 bits
       /// increment buffer pointers/offsets
       void           removeBits(unsigned char numBits); // at most 16 bits
       /// get the next bits and increment buffer pointers/offsets
       unsigned int   getBits   (unsigned char numBits); // at most 16 bits
       /// get 8 bits
       unsigned char  getByte();
       /// get a single bit
       bool           getBool();
     
     private:
       /// simple I/O buffering, faster than C lib's built-in functions
       unsigned char  getBufferedByte();
     
       /// bit/byte file stream
       std::ifstream m_stream;
       /// total number of bytes read so far
       unsigned int  m_read;
       /// total bits left (initially, it's 8*m_size)
       unsigned int  m_bitsLeft;
       /// store bits until next byte boundary
       unsigned int  m_bitBuffer;
       /// number of valid bits in _bitBuffer
       unsigned char m_bitBufferSize;
     
       /// default buffer size
       enum { CacheSize = 1024 };
     
       /// buffer
       unsigned char m_cache[CacheSize];
       /// position of next byte
       unsigned int  m_cacheOffset;
       /// position beyond last valid byte
       unsigned int  m_cacheSize;
     };

show BinaryInputBuffer.cpp


     // //////////////////////////////////////////////////////////
     // BinaryInputBuffer.cpp
     // Copyright (c) 2011-2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/disclaimer.html
     //
     
     #include "BinaryInputBuffer.h"
     
     #include <cassert>
     
     /// read from file
     BinaryInputBuffer::BinaryInputBuffer(const std::string& filename)
     : m_stream(),
       m_read(0),
       m_bitsLeft(0),
       m_bitBuffer(0),
       m_bitBufferSize(0),
       m_cache(),
       m_cacheOffset(0),
       m_cacheSize(0)
     {
       // open file
       m_stream.open(filename.c_str(), std::ios::in | std::ios::binary);
       if (!m_stream)
         return;
     
       // get file size
       m_stream.seekg(0, std::ios_base::end);
       unsigned int size = (unsigned int)m_stream.tellg();
       m_stream.seekg(0, std::ios_base::beg);
     
       // load first segment (or even the complete file)
       m_cacheSize = CacheSize;
       if (m_cacheSize > size)
         m_cacheSize = size;
       m_stream.read((char*)m_cache, m_cacheSize);
     
       // compute total number of bits
       m_bitsLeft = 8 * size;
     }
     
     
     /// get number of bytes read so far
     unsigned int BinaryInputBuffer::getNumBytesRead() const
     {
       return m_read;
     }
     
     
     /// get number of bits  still available
     unsigned int BinaryInputBuffer::getNumBitsLeft() const
     {
       return m_bitsLeft;
     }
     
     
     /// true, if no bits left
     bool BinaryInputBuffer::empty() const
     {
       return m_bitsLeft == 0;
     }
     
     
     /// take a look at the next bits without modifying the buffer
     unsigned int BinaryInputBuffer::peekBits(unsigned char numBits)
     {
       assert(numBits <= 16);
       assert(numBits <= m_bitsLeft);
     
       // move 8 bits from stream to buffer
       while (m_bitBufferSize < numBits)
       {
         // get one byte from stream and add to internal buffer
         unsigned int byte = getBufferedByte();
         m_bitBuffer       |= byte << m_bitBufferSize;
         m_bitBufferSize   += 8;
       }
     
       // return desired bits
       unsigned int bitMask = (1 << numBits) - 1;
       return m_bitBuffer & bitMask;
     }
     
     
     /// get the next bits and increment buffer pointers/offsets
     unsigned int BinaryInputBuffer::getBits(unsigned char numBits)
     {
       unsigned int result = peekBits(numBits);
       removeBits(numBits);
       return result;
     }
     
     
     /// get 8 bits
     unsigned char BinaryInputBuffer::getByte()
     {
       return (unsigned char)getBits(8);
     }
     
     
     /// get a single bit
     bool BinaryInputBuffer::getBool()
     {
       return getBits(1) == 1;
     }
     
     
     /// increment buffer pointers/offsets
     void BinaryInputBuffer::removeBits(unsigned char numBits)
     {
       // if more bits needs to be removed than are actually available in the buffer
       if (m_bitBufferSize < numBits)
         peekBits(numBits);
     
       // adjust buffers and counters
       m_bitBuffer    >>= numBits;
       m_bitBufferSize -= numBits;
       m_bitsLeft      -= numBits;
     }
     
     
     /// simple I/O buffering, faster than C lib's built-in functions
     unsigned char BinaryInputBuffer::getBufferedByte()
     {
       // (re-)fill buffer
       if (m_cacheOffset >= m_cacheSize)
       {
         // read as much as possible (it has to fit into the buffer, though)
         size_t numRead = (m_bitsLeft + 7) / 8;
         if (numRead > m_cacheSize)
           numRead = m_cacheSize;
     
         m_stream.read((char*)m_cache, numRead);
         m_cacheOffset = 0;
       }
     
       // count number of read bytes
       m_read++;
     
       // single byte
       return m_cache[m_cacheOffset++];
     }

And finally the main program which parses command-line parameters:

show flexiGIF.cpp


     // //////////////////////////////////////////////////////////
     // flexigif.cpp
     // Copyright (c) 2018 Stephan Brumme. All rights reserved.
     // see https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/
     
     #include "GifImage.h"
     #include "Compress.h"
     #include "LzwDecoder.h"
     #include "LzwEncoder.h"
     
     #include <vector>
     #include <iostream>
     #include <iomanip>
     #include <cstdlib>
     #include <ctime>
     
     namespace
     {
       // ----- constants -----
       /// program version
       const char* Version = "2018.11a";
       /// return codes
       enum ReturnCode
       {
         NoError                 =  0,
         GenericException        =  1,
         NotImplemented          =  2,
         ParameterOutOfRange     =  3,
         InvalidParameter        =  4,
         MissingParameter        =  5,
         UnknownParameter        =  6,
         ContradictingParameters =  7,
         MoreThanTwoFilenames    =  8,
         SameFile                =  9,
         DontOverwrite           = 10,
         NoFrameFound            = 11,
         OnlyForGifs             = 12,
         DebuggingMode           = 99
       };
     
       /// default values of the optimizer
       enum DefaultValues
       {
         GifMaxToken      =  20000,
         LzwMaxToken      = 100000,
         GifMaxDictionary =   4096,
         LzwMaxDictionary =  65536,
         GifMaxDictionaryCompatible = GifMaxDictionary - 3, // 4093
         Alignment        =      1,
         MinImprovement   =      1,
         MinNonGreedy     =      2
       };
     }
     
     
     /// terminate program, show help
     void help(const std::string& errorMsg = "", int errorCode = NoError, bool showHelp = true)
     {
       // error
       if (!errorMsg.empty())
         std::cout << "ERROR: " << errorMsg << std::endl;
     
       // default help
       if (showHelp)
         std::cout << std::endl
                   << "flexiGIF " << Version << ", written by Stephan Brumme" << std::endl
                   << "Usage: flexigif [options] INPUTFILE OUTPUTFILE" << std::endl
                   << std::endl
                   << "Options:" << std::endl
                   << " -p    --prettygood         try greedy search plus non-greedy with '-a=" << Alignment << " -n=" << MinNonGreedy << " -m=" << MinImprovement << " -d=" << GifMaxDictionary << "' => typically the best results" << std::endl
                   << " -a=x  --alignment=x        blocks starts at multiples of x (default is -a=1 => best compression but may be slow)" << std::endl
                   << " -d=x  --dictionary=x       maximum size of the LZW dictionary (default is -d=" << GifMaxDictionary << ", 0 means \"maximum\")" << std::endl
                   << " -t=x  --maxtokens=x        maximum number of tokens per block (default is -t=" << GifMaxToken << ")" << std::endl
                   << " -c    --compatible         create files that should be more compatible to faulty decoders" << std::endl
                   << " -l    --deinterlace        ensure that output is not interlaced" << std::endl
                   << " -g    --greedy             enable     greedy match search (default)" << std::endl
                   << " -n=x  --nongreedy=x        enable non-greedy match search, x is the minimum match length (default is -n=" << MinNonGreedy << ")" << std::endl
                   << " -m=x  --minimprovement=x   minimum number of bytes saved by a non-greedy match (requires parameter -n, default is -m=" << MinImprovement << ")" << std::endl
                   << " -i    --info               analyze internal structure of INPUTFILE" << std::endl
                   << " -f    --force              overwrite OUTPUTFILE if it already exists" << std::endl
                   << " -r    --splitruns          allow partial matching of long runs of the same byte (requires parameter -n)" << std::endl
                   << " -u=x  --userdefined=x      don't search but set custom block boundaries, x is an ascendingly sorted list, e.g. -u=500,2000,9000" << std::endl
                   << " -s    --summary            when finished, compare filesize of INPUTFILE and OUTPUTFILE" << std::endl
                   << " -v    --verbose            show debug messages" << std::endl
                   << " -q    --quiet              no output during compression" << std::endl
                   << " -Z                         INPUTFILE and OUTPUTFILE are stored in .Z file format instead of .gif" << std::endl
                   << " -b=x  --benchmark=x        benchmark GIF decoder, x stands for the number of iterations (default: x=100)" << std::endl
                   << " -y    --immediately        avoid initial clear code and start immediately with compressed data" << std::endl
                   //<< "      --ppm=x               store x-th frame in PPM format in OUTPUTFILE" << std::endl
                   //<< "      --indices=x           store x-th frame's indices in OUTPUTFILE" << std::endl
                   //<< "      --compress            INPUTFILE isn't compressed yet and OUTPUTFILE will be a .Z file" << std::endl
                   //<< "      --decompress          store decompressed contents of INPUTFILE, which must be a .Z file, in OUTPUTFILE" << std::endl
                   << " -h    --help               display this help screen" << std::endl
                   << std::endl
                   << "See https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/ for more infos" << std::endl;
     
       exit(errorCode);
     }
     
     
     /// let's go !
     int main(int argc, char** argv)
     {
       // no parameters at all ?
       if (argc == 1)
         help();
     
       // filenames
       std::string input;
       std::string output;
     
       // other parameters
       bool isGif       = true;  // if false then .Z file format
       bool inputInfo   = false; // just one filename: show some compression details of input file
       bool showSummary = false; // after finishing recompression: display how many bytes were saved
       bool overwrite   = false; // overwrite an existing OUTPUTFILE
       bool quiet       = false; // no console output
       bool verbose     = false; // lots of console output
       bool deinterlace = false; // deinterlace a GIF image
     
       bool smartGreedy = false;
       bool benchmark   = false; // decompress INPUTFILE several times and measure throughput in MB/sec
       unsigned int iterations = 10;  // benchmark only: repeat x times
       bool showDecompressed = false; // dump a frame in PPM format to OUTPUTFILE
       bool showIndices      = false; // dump a frame's indices to OUTPUTFILE
       unsigned int ppmFrame = 0;     // only relevant if showDecompressed is true
       bool compressZ   = false; // decompress INPUTFILE to OUTPUTFILE (.Z format only)
       bool decompressZ = false; // compress INPUTFILE to OUTPUTFILE (.Z format)
     
       std::vector<unsigned int> predefinedBlocks; // insert clear codes at these user-defined positions
     
       // optimizer
       LzwEncoder::OptimizationSettings optimize;
       optimize.alignment           = Alignment;
       optimize.verbose             = false;
       optimize.greedy              = true;
       optimize.minImprovement      = MinImprovement;
       optimize.minNonGreedyMatch   = MinNonGreedy;
       optimize.splitRuns           = false;
       optimize.maxDictionary       = 0;
       optimize.maxTokens           = GifMaxToken;
       optimize.startWithClearCode  = true;
       optimize.readOnlyBest        = false;
       optimize.avoidNonGreedyAgain = false;
     
       // parse parameters
       std::string current;
       size_t      currentPos = 0; // if several short arguments are merged into one, e.g. "-v -s -f" => "-vsf"
     
       // arguments 1 ... argc-1
       int parameter = 0;
       while (parameter < argc)
       {
         // long parameters cannot be merged (such as --force)
         if ((current.size() > 2 && current[0] == '-' && current[1] == '-') ||
             (currentPos + 1 >= current.size()))
           current.clear();
     
         // next parameter ?
         if (current.empty() || currentPos + 1 >= current.size())
         {
           parameter++;
           if (parameter == argc)
             break;
         }
     
         // read argument
         if (current.empty())
         {
           current = argv[parameter];
           if (current.size() >= 2 && current[0] == '-' && current[1] != '-')
             currentPos = 1;
           else
             currentPos = 0;
         }
         else
         {
           // several short parameters were merged
           currentPos++;
         }
     
         // if single-letter argument
         char currentShort = current[currentPos];
     
         // no option => must be a filename
         if (current[0] != '-')
         {
           // first comes the input file
           if (input.empty())
           {
             input = current;
             current.clear();
             continue;
           }
           // then the output file
           if (output.empty())
           {
             output = current;
             current.clear();
             continue;
           }
     
           // but at most two filenames ...
           help("more than two filenames specified", MoreThanTwoFilenames);
         }
     
         // help
         if (currentShort == 'h' || current == "--help")
           help();
     
         // info
         if (currentShort == 'i' || current == "--info")
         {
           inputInfo = true;
           continue;
         }
     
         // summary
         if (currentShort == 's' || current == "--summary")
         {
           if (quiet)
             help("flag -s (show summary) contradicts -q (quiet)", ContradictingParameters, false);
           showSummary = true;
           continue;
         }
     
         // overwrite
         if (currentShort == 'f' || current == "--force")
         {
           overwrite = true;
           continue;
         }
     
         // verbose
         if (currentShort == 'v' || current == "--verbose")
         {
           if (quiet)
             help("flag -v (verbose) contradicts -q (quiet)", ContradictingParameters, false);
     
           verbose             = true;
           optimize.verbose    = true;
           GifImage::verbose   = true;
           Compress::verbose   = true;
           LzwDecoder::verbose = true;
           continue;
         }
     
         // quiet
         if (currentShort == 'q' || current == "--quiet")
         {
           if (verbose)
             help("flag -q (quiet) contradicts -v (verbose)",      ContradictingParameters, false);
           if (showSummary)
             help("flag -q (quiet) contradicts -s (show summary)", ContradictingParameters, false);
     
           quiet = true;
           continue;
         }
     
         // greedy
         if (currentShort == 'g' || current == "--greedy")
         {
           optimize.greedy = true;
           continue;
         }
     
         // split runs
         if (currentShort == 'r' || current == "--splitruns")
         {
           optimize.splitRuns = true;
           continue;
         }
     
         // deinterlace
         if (currentShort == 'l' || current == "--deinterlace")
         {
           deinterlace = true;
           continue;
         }
     
         // good settings
         if (currentShort == 'p' || current == "--prettygood")
         {
           smartGreedy             = true;
           optimize.greedy         = false;
           optimize.minImprovement = MinImprovement;
           optimize.maxDictionary  = GifMaxDictionary;
           optimize.maxTokens      = GifMaxToken;
           optimize.avoidNonGreedyAgain = true;
           continue;
         }
     
         // enhance compatibility
         if (currentShort == 'c' || current == "--compatible")
         {
           optimize.maxDictionary = GifMaxDictionaryCompatible; // 4093
           optimize.greedy        = true;
           optimize.startWithClearCode = true;
           continue;
         }
     
         // avoid initial clear code (relevant for GIFs only)
         if (currentShort == 'y' || current == "--immediately")
         {
           optimize.startWithClearCode = false;
           continue;
         }
     
         // compress' .Z file format
         if (currentShort == 'Z')
         {
           isGif = false;
           continue;
         }
     
         // decompress .Z file, nothing else
         if (current == "--decompress")
         {
           decompressZ = true;
           isGif       = false; // implicit -Z flag
           continue;
         }
     
         // INPUTFILE isn't compressed (applies to .Z files only)
         if (current == "--compress")
         {
           compressZ = true;
           isGif     = false; // implicit -Z flag
           continue;
         }
     
         // adjustable parameters
         bool hasValue = false;
         int  value = 0;
         std::string strValue;
         size_t splitAt = current.find('=');
         if (splitAt != std::string::npos)
         {
           // get right-hand side and store number/string in variables value/strValue
           strValue = current.substr(splitAt + 1);
           value    = atoi(strValue.c_str());
           // keep only left-hand side of parameter
           current.resize(splitAt);
           // no other argument may be merged with this one
           currentPos = current.size();
     
           hasValue = !strValue.empty();
         }
     
         // alignment/blockstart
         if (current == "-a" || current == "--alignment")
         {
           if (value <= 0)
             help("parameter -a/--alignment cannot be zero", ParameterOutOfRange, false);
     
           optimize.alignment = (unsigned int)value;
           continue;
         }
     
         // maximum size of dictionary
         if (current == "-d" || current == "--dictionary")
         {
           if (value <= 0)
             help("parameter -d/--dictionary cannot be zero", ParameterOutOfRange, false);
     
           optimize.maxDictionary = (unsigned int)value;
           continue;
         }
     
         // maximum number of tokens per block
         if (current == "-t" || current == "--maxtokens")
         {
           if (value < 0)
             value = 0; // no limit
     
           optimize.maxTokens = (unsigned int)value;
           continue;
         }
     
         // non-greedy minimum improvement
         if (current == "-m" || current == "--minimprovement")
         {
           if (value <= 0)
             help("parameter -m/--minimprovement cannot be zero", ParameterOutOfRange, false);
     
           optimize.minImprovement = (unsigned int)value;
           continue;
         }
     
         // non-greedy match length
         if (currentShort == 'n' || current == "--nongreedy")
         {
           optimize.greedy = false;
           optimize.minNonGreedyMatch = hasValue ? (unsigned int)value : MinNonGreedy;
     
           if (value < 2)
             help("parameter -n/--nongreedy cannot be less than 2", ParameterOutOfRange, false);
     
           continue;
         }
     
         // predefined block boundaries
         if (current == "-u" || current == "--userdefined")
         {
           // syntax: a,b,c where a/b/c are decimal numbers
           predefinedBlocks.push_back(0);
           for (size_t i = 0; i < strValue.size(); i++)
           {
             char oneByte = strValue[i];
             // next number follows ?
             if (oneByte == ',')
             {
               predefinedBlocks.push_back(0);
               continue;
             }
     
             if (oneByte < '0' || oneByte > '9')
               help("invalid syntax for parameter -u/--userdefined: it must be a sorted list of numbers", InvalidParameter, false);
     
             // process digit
             predefinedBlocks.back() *= 10;
             predefinedBlocks.back() += oneByte - '0';
           }
     
           // check whether the list is in ascending order (duplicates are disallowed, too)
           for (size_t i = 1; i < predefinedBlocks.size(); i++)
             if (predefinedBlocks[i - 1] >= predefinedBlocks[i])
               help("invalid syntax for parameter -u/--userdefined: it must be a sorted list of numbers", InvalidParameter, false);
     
           continue;
         }
     
         // benchmark, user-defined number of iterations
         if (current == "-b" || current == "--benchmark")
         {
           benchmark  = true;
           iterations = hasValue ? value : 100; // default: decode 100x
           if (value < 1)
             help("parameter -b/--benchmark cannot be zero", ParameterOutOfRange, false);
           continue;
         }
     
         // PPM output of a GIF frame of LZW decompression of Z file (TODO: jsut debugging code, not in public interface yet)
         if (current == "--ppm")
         {
           showDecompressed = true;
           ppmFrame = hasValue ? value : 1; // first frame by default
           continue;
         }
     
         // PPM output of a GIF frame of LZW decompression of Z file (TODO: jsut debugging code, not in public interface yet)
         if (current == "--indices")
         {
           showIndices = true;
           ppmFrame = hasValue ? value : 1; // first frame by default
           continue;
         }
     
         // whoopsie ...
         help("unknown parameter " + current, 1);
       }
     
       try
       {
         // auto-detect .Z files
         if (input.size() > 2 &&
             input[input.size() - 2] == '.' &&
             input[input.size() - 1] == 'Z')
           isGif = false;
     
         if (!isGif)
         {
           // increase token limit
           if (optimize.maxTokens == GifMaxToken)
             optimize.maxTokens = LzwMaxToken;
         }
     
         // check parameter combinations
         if (optimize.splitRuns && optimize.greedy)
           help("parameter -r requires -n", MissingParameter);
     
         // only one parameter: a file name => automatically switch to "info mode"
         if (argc == 2 && !input.empty())
           inputInfo = true;
     
         // show GIF/LZW infos about input file
         if (inputInfo)
         {
           if ( input .empty())
             help("no filename provided", MissingParameter, false);
           if (!output.empty())
             help("too many filenames provided (accepting only one)", MoreThanTwoFilenames, false);
     
           // just load and parse
           if (isGif)
           {
             GifImage::verbose = true;
             GifImage info(input);
           }
           else
           {
             Compress::verbose = true;
             Compress info(input);
           }
           return NoError;
         }
     
         // benchmark
         if (benchmark)
         {
           if (input.empty())
             help("missing INPUTFILE", MissingParameter, false);
     
           std::cout << "benchmarking '" << input << "' ..." << std::endl
                     << "decoding file, " << iterations << " iterations" << std::endl;
     
           clock_t start = clock();
     
           unsigned int numDecodedFrames = 0;
           unsigned long long numPixels  = 0;
     
           for (unsigned int i = 0; i < iterations; i++)
           {
             if (isGif)
             {
               // parse file
               GifImage gif(input);
     
               // error during decoding
               if (gif.getNumFrames() == 0)
                 help("no frames found in " + input, NoFrameFound, false);
     
               // statistics
               numDecodedFrames += gif.getNumFrames();
               for (unsigned int frame = 0; frame < gif.getNumFrames(); frame++)
                 numPixels += gif.getFrame(frame).pixels.size();
     
               // disable verbose output for the 2..n iteration
               GifImage::verbose = false;
             }
             else
             {
               // parse file
               Compress lzw(input, compressZ);
               numDecodedFrames++;
               numPixels += lzw.getData().size();
               // disable verbose output for the 2..n iteration
               Compress::verbose = false;
             }
           }
     
           clock_t finish   = clock();
           float seconds    = (finish - start) / float(CLOCKS_PER_SEC);
           float perFile    = seconds / iterations;
           float perFrame   = seconds / numDecodedFrames;
           float throughput = numPixels / seconds;
           std::cout   << std::fixed
                       << "elapsed:    " << std::setw(8) << std::setprecision(6) << seconds  << " seconds" << std::endl
                       << "per file:   " << std::setw(8) << std::setprecision(6) << perFile  << " seconds" << std::endl;
           if (iterations != numDecodedFrames)
             std::cout << "per frame:  " << std::setw(8) << std::setprecision(6) << perFrame << " seconds" << std::endl;
           std::cout   << "throughput: " << std::setw(8) << std::setprecision(3) << throughput / 1000000 << " megapixel/second" << std::endl;
           return NoError;
         }
     
         if (input .empty())
           help("missing INPUTFILE",  MissingParameter);
         if (output.empty())
           help("missing OUTPUTFILE", MissingParameter);
     
         // same name ?
         if (input == output)
           help("INPUTFILE and OUTPUTFILE cannot be the same filename", SameFile, false);
     
         // don't overwrite by default
         if (!overwrite)
         {
           std::fstream checkOutput(output.c_str());
           if (checkOutput)
             help("OUTPUTFILE already exists, please use -f to overwrite an existing file", DontOverwrite, false);
         }
     
         // store a single frame in PPM format
         if (showDecompressed || showIndices)
         {
           GifImage gif(input);
     
           // use 0-index internally instead of 1-index
           ppmFrame--;
           if (ppmFrame >= gif.getNumFrames()) // note: if ppmFrame was 0 => -1 => 0xFFFFFF... => a very large number
             help("please specify a valid frame number", ParameterOutOfRange);
     
           // write PPM (or plain indices)
           bool ok = false;
           if (showDecompressed)
             ok = gif.dumpPpm(output, ppmFrame);
           else
             ok = gif.dumpIndices(output, ppmFrame);
     
           return ok ? NoError : DontOverwrite;
         }
     
         // decompress .Z file
         if (decompressZ)
         {
           Compress lzw(input, compressZ);
           if (lzw.dump(output))
             return NoError;
           else
             return DontOverwrite;
         }
     
         // -------------------- process input --------------------
     
         clock_t start = clock();
     
         if (!quiet)
           std::cout << "flexiGIF " << Version << ", written by Stephan Brumme" << std::endl;
         if (verbose)
         {
           std::cout << "used options:";
     
           std::cout << " -a=" << optimize.alignment;
           if (optimize.startWithClearCode)
             std::cout << " -c";
           if (optimize.maxDictionary > 0)
             std::cout << " -d=" << optimize.maxDictionary;
           if (overwrite)
             std::cout << " -f";
           if (deinterlace)
             std::cout << " -l";
           if (!optimize.greedy)
             std::cout << " -m=" << optimize.minImprovement;
           if (!optimize.greedy)
             std::cout << " -n=" << optimize.minNonGreedyMatch;
           if (smartGreedy)
             std::cout << " -p";
           if (quiet)
             std::cout << " -q";
           if (optimize.splitRuns && !optimize.greedy)
             std::cout << " -r";
           if (showSummary)
             std::cout << " -s";
           std::cout << " -t=" << optimize.maxTokens;
           if (verbose)
             std::cout << " -v";
           if (!isGif)
             std::cout << " -Z";
           if (compressZ)
             std::cout << " --compress";
           if (decompressZ)
             std::cout << " --decompress";
           if (showIndices)
             std::cout << " --indices=" << ppmFrame;
           if (showDecompressed)
             std::cout << " --ppm=" << ppmFrame;
     
           std::cout << std::endl;
         }
     
         if (verbose)
           std::cout << std::endl << "===== decompress '" << input << "' =====" << std::endl;
     
         if (isGif) // ---------- recompress GIF frames ----------
         {
           // load GIF
           GifImage gif(input);
     
           // error during decoding ?
           if (gif.getNumFrames() == 0)
             help("no frames found in " + input, NoFrameFound, false);
     
           // determine minCodeSize, often 8 (up to 256 colors)
           optimize.minCodeSize = 1;
           // look for largest byte in each frame
           unsigned char maxValue = 0;
           const unsigned char EightBits = 0x80;
           for (unsigned int frame = 0; frame < gif.getNumFrames() && maxValue < EightBits; frame++)
           {
             // get bytes / pixels
             const GifImage::Bytes& indices = gif.getFrame(frame).pixels;
             for (size_t i = 0; i < indices.size(); i++)
               // larger ?
               if (maxValue < indices[i])
               {
                 maxValue = indices[i];
                 // at least 8 bits ? => maximum
                 if (maxValue >= EightBits)
                   break;
               }
           }
           // compute number of bits
           while (maxValue >= (1 << optimize.minCodeSize))
             optimize.minCodeSize++;
           // codeSize = 1 is not allowed by spec, even b/w images have codeSize = 2
           if (optimize.minCodeSize == 1)
             optimize.minCodeSize = 2;
     
           // de-interlace non-animated GIFs
           if (deinterlace)
           {
             if (gif.getNumFrames() > 1)
               help("de-interlacing is not supported yet for animated GIFs", NotImplemented, false);
             gif.setInterlacing(false);
           }
     
           if (gif.getNumFrames() > 1 && !predefinedBlocks.empty())
             help("user-defined block boundaries are not allowed for animated GIFs", NotImplemented, false);
     
           // -------------------- generate output --------------------
     
           if (verbose)
             std::cout << std::endl << "===== compression in progress ... =====" << std::endl;
     
           // optimize all frames
           unsigned int numFrames = gif.getNumFrames();
           std::vector<std::vector<bool> > optimizedFrames;
           for (unsigned int frame = 0; frame < numFrames; frame++)
           {
             // get original LZW bytes
             const GifImage::Frame& current = gif.getFrame(frame);
             const std::vector<unsigned char>& indices = current.pixels;
             LzwEncoder encoded(indices, isGif);
             optimize.minCodeSize = current.codeSize;
     
             // store optimized LZW bytes
             LzwEncoder::BitStream optimized;
     
             // look for optimal block boundaries
             if (predefinedBlocks.empty())
             {
               unsigned int lastDisplay = 0;
               int pos = (int)indices.size() - 1;
               for (int i = pos; i >= 0; i--)
               {
                 // only if block start is aligned
                 if (i % optimize.alignment != 0)
                   continue;
     
                 // show progress
                 if (!quiet && (i / optimize.alignment) % 8 == 0 && clock() != lastDisplay)
                 {
                   unsigned int percentage = 100 - (100 * i / indices.size());
                   std::cout << "    \rframe " << frame+1 << "/" << numFrames << " (" << indices.size() << " pixels): "
                             << percentage << "% done";
     
                   // ETA
                   clock_t now     = clock();
                   float elapsed   = (now - start) / float(CLOCKS_PER_SEC);
                   float estimated = elapsed * 100 / (percentage + 0.000001f) - elapsed;
     
                   if (elapsed > 3 && numFrames == 1 && estimated >= 1)
                     std::cout << " (after " << (int)elapsed << "s, about " << (int)estimated << "s left)";
                   std::cout << std::flush;
     
                   lastDisplay = now;
                 }
     
                 // estimate cost
                 encoded.optimizePartial(i, 0, false, true, optimize);
     
                 // repeat estimation, this time with non-greedy search
                 if (smartGreedy && !optimize.greedy)
                 {
                   optimize.greedy = true;
                   encoded.optimizePartial(i, 0, false, true, optimize);
                   optimize.greedy = false;
                 }
               }
     
               if (!quiet)
                 std::cout << "                            " << std::endl;
     
               // final bitstream for current image
               optimized = encoded.optimize(optimize);
             }
             else
             {
               // remove invalid block boundaries (or should it be an ERROR ?)
               while (predefinedBlocks.back() > indices.size())
                 predefinedBlocks.pop_back();
     
               // to simplify code, include start and end of file as boundaries, too
               if (predefinedBlocks.empty() || predefinedBlocks.front() != 0)
                 predefinedBlocks.insert(predefinedBlocks.begin(), 0);
               if (predefinedBlocks.back() != indices.size())
                 predefinedBlocks.push_back(indices.size());
     
               // avoid certain optimizer settings that might cause incomplete images
               optimize.maxTokens     = 0;
               optimize.maxDictionary = 0;
     
               optimized = encoded.merge(predefinedBlocks, optimize);
             }
     
             optimizedFrames.push_back(optimized);
           }
     
           // write to disk
           gif.writeOptimized(output, optimizedFrames, optimize.minCodeSize);
         }
         else // ---------- .Z file format ----------
         {
           if (!predefinedBlocks.empty())
             help("predefined blocks not implemented yet for .Z files", NotImplemented);
     
           // disable GIF-only optimizations
           optimize.startWithClearCode = false;
     
           Compress lzw(input, compressZ);
     
           // get LZW bytes
           const std::vector<unsigned char>& bytes = lzw.getData();
           LzwEncoder encoded(bytes, isGif);
     
           // adjust optimizer:
           // always the full ASCII alphabet
           optimize.minCodeSize = 8;
           // dictionary limit is 2^16 instead of 2^12
           if (optimize.maxDictionary == GifMaxDictionary || optimize.maxDictionary == GifMaxDictionaryCompatible)
             optimize.maxDictionary = LzwMaxDictionary; // 0 is an option, too, ... it disables the limit check
     
           if (verbose)
             std::cout << std::endl << "===== compression in progress ... =====" << std::endl;
     
           // store optimized LZW bytes
           LzwEncoder::BitStream optimized;
     
           // look for optimal block boundaries
           unsigned int percentageDone = 0;
           int pos = (int)bytes.size() - 1;
           for (int i = pos; i >= 0; i--)
           {
             // only if block start is aligned
             if (i % optimize.alignment != 0)
               continue;
     
             // show progress
             float percentage = 100 - (100 * float(i) / bytes.size());
             if (percentage != percentageDone && !quiet)
             {
               // ETA
               clock_t now     = clock();
               float elapsed   = (now - start) / float(CLOCKS_PER_SEC);
               float estimated = elapsed * 100 / (percentage + 0.000001f) - elapsed;
     
               std::cout << "    \r" << (int)percentage << "% done";
               if (elapsed > 3 && estimated >= 1)
                 std::cout << " (after " << (int)elapsed << "s, about " << (int)estimated << "s left)";
               std::cout << std::flush;
     
               percentageDone = (unsigned int)percentage;
             }
     
             // estimate cost
             encoded.optimizePartial(i, 0, false, true, optimize);
           }
     
           if (!quiet)
             std::cout << "                            " << std::endl;
     
           // write to disk
           optimized = encoded.optimize(optimize);
           lzw.writeOptimized(output, optimized);
         }
     
         // -------------------- bonus output :-) --------------------
         if (showSummary)
         {
           // measure duration
           clock_t finish  = clock();
           float   seconds = (finish - start) / float(CLOCKS_PER_SEC);
     
           // get filesizes
           std::fstream in (input .c_str(), std::ios::in | std::ios::binary);
           std::fstream out(output.c_str(), std::ios::in | std::ios::binary);
           in .seekg(0, in .end);
           out.seekg(0, out.end);
           int before = (int)in .tellg();
           int now    = (int)out.tellg();
     
           if (verbose)
             std::cout << std::endl << "===== done ! =====" << std::endl;
     
           // smaller, larger ?
           int diff   = before - now;
           if (diff == 0)
             std::cout << "no optimization found for '" << input << "', same size as before (" << now << " bytes).";
           if (diff >  0)
             std::cout << "'" << output << "' is " <<  diff << " bytes smaller than '" << input << "' (" << now << " vs " << before << " bytes) => "
                       << "you saved " << std::fixed << std::setprecision(3) << (diff * 100.0f / before) << "%.";
           if (diff <  0)
           {
             std::cout << "'" << output << "' is " << -diff << " bytes larger than '" << input << "' (" << now << " vs " << before << " bytes).";
             if (optimize.alignment > 1 || optimize.greedy)
               std::cout << " Please use more aggressive optimization settings.";
           }
     
           std::cout << " Finished after " << std::fixed << std::setprecision(2) << seconds << " seconds." << std::endl;
         }
       }
       catch (const char* e)
       {
         std::cerr << "ERROR: " << (e ? e : "(no message)") << std::endl;
         return GenericException;
       }
       catch (std::exception& e)
       {
         std::cerr << "ERROR: " << (e.what() ? e.what() : "(no message)") << std::endl;
         return GenericException;
       }
       catch (...)
       {
         std::cerr << "ERROR: unknown exception caught" << std::endl;
         return GenericException;
       }
     
       return NoError;
     }

Or download these precompiled binaries (static builds, need no libraries/DLLs):

Windows (x32 and x64):
flexiGIF.2018.11a.exe (latest release, November 19, 2018) / 270.0 kb
flexiGIF.2018.10b.exe (older version, October 17, 2018) / 268.5 kb
flexiGIF.2018.10a.exe (older version, October 15, 2018) / 269.0 kb
flexiGIF.2018.09b.exe (older version, September 26, 2018) / 259.0 kb
flexiGIF.2018.09a.exe (older version, September 16, 2018) / 252.5 kb

Linux (x64 only):
flexiGIF.2018.11a (latest release, November 19, 2018) / 1,271.0 kb
flexiGIF.2018.10b (older version, October 16, 2018) / 1,270.5 kb
flexiGIF.2018.10a (older version, October 15, 2018) / 1,270.5 kb
flexiGIF.2018.09b (older version, September 26, 2018) / 1,143.4 kb
flexiGIF.2018.09a (older version, September 17, 2018) / 1,354.1 kb

Changelog

version 2018.11a
- November 19, 2018
- bugfix: final pass sometimes aborted without emitting any bits (regression from October release)
version 2018.10b
- October 16, 2018
- bugfix: default dictionary limit for .Z files was 2¹² when it should have been 2¹⁶
version 2018.10a
- October 15, 2018
- -p automatically chooses between greedy and non-greedy mode (GIFs now about 0.2% smaller)
- fixed bug in cost estimation which causes sub-optimal dictionary reset points
- added support for .Z format
- reduced memory consumption for -a=x if x > 1
version 2018.09b
- September 26, 2018
- source code made available
- fixed bug with animated GIFs where the minimum bit size varied between frames
version 2018.09a
- September 15, 2018
- initial version

Gallery

I downloaded the GIF images found on the Wikipedia page about the GIF file format.
The program Gifsicle is almost always recommended as the best GIF optimizer. flexiGIF beats it consistently:

image	program	bytes
	flexiGIF 2018.10a (parameters: `-p`)	55,799 bytes
	(original)	56,752 bytes (+953 bytes, +1.71%)

image	program	bytes
	flexiGIF 2018.10a (parameters: `-p`, (using Gifsicle's output))	339,011 bytes
	gifsicle 1.91	347,919 bytes (+8,908 bytes, +2.63%)
	flexiGIF 2018.09a	352,615 bytes (+13,604 bytes, +4.01%)
	(original)	363,021 bytes (+24,010 bytes, +7.08%)

image	program	bytes
	flexiGIF 2018.10a (parameters: none)	280 bytes
	gifsicle 1.91	291 bytes (+11 bytes, +3.93%)
	(original)	291 bytes (+11 bytes, +3.93%)

image	program	bytes
	flexiGIF 2018.10a (parameters: `-p -m=1`)	52,663 bytes
	gifsicle 1.91	54,268 bytes (+1,605 bytes, +3.05%)
	(original)	54,310 bytes (+1,647 bytes, +3.13%)

bonus image:

image	program	bytes
	flexiGIF 2018.10a (parameters: `-p -m=1`)	167,333 bytes
	gifsicle 1.91	170,958 bytes (+3,625 bytes, +2.17%)
	(original / Irfanview 4.51)	171,064 bytes (+3,731 bytes, +2.23%)

`.Z` file format

There is no official documentation of the .Z file format. I looked at the source code of gzip, ncompress and Chris Mills' 2015 IOCCC program in order to figure out how it works.
The latter is a decompressor written in just 256 byte of completely obfuscated C.

The .Z header is very simple: just the two magic bytes 0x1F and 0x9D followed by a "settings" byte:

the highest bit should be set

"block mode"
that means, a dictionary reset is allowed
my program supports only "block mode"

the lowest 4 bits encode the highest number of bits per LZW code

at most 16 (even though it's possible to encode up to 31 but that's not supported by any program
the shortest LZW code has 9 bits

all other bits should be zero

LZW compression works pretty much the same way like for GIFs with a few exceptions:

the initial code size is always 9 bits - independent of input data (GIF: images with few colors may have a smaller code size)
a dictionary may contain up to 2²⁶entries (GIF: 2¹²)
there is no dedicated end-of-file code, the first "free" LZW code is 257 (GIF: 258 for 8 bit images)
a clear code is followed by a gap (see below)

I don't have a full understanding of the padding after a dictionary reset, here is what I found out:

if the bits of the clear code don't end on a byte boundary you pad it with zeroed bits
the follows a number of zeroed bytes, such that:

gzip computes the next position as posbits = ((posbits-1) + ((n_bits<<,3)-(posbits-1+(n_bits<<,3))%(n_bits<<,3)))
if the clear code is 16 bits long, then this means: round up the number of tokens to the next multiple of 8 ...
... and insert as many zeroed tokens. Since the clear code consumed 16 bits, the following "zero tokens" consume 16 bits each, too
pseudocode: int numZeroBytes = 8 - (numTokens % 8); if (numZeroBytes == 8) numZeroBytes = 0; numZeroBytes *= 2;
note: all of this is only true for 16-bits clear codes !

Obviously there can't be a gallery for .Z files ... but at here are some numbers of the quite popular enwik9 dataset:

software	size	parameters	algorithm	comment
ncompress	443,200,353 bytes	(none)	Lempel-Ziv-Welch	quite fast: 19s
flexiGIF 2018.10b	412,450,465 bytes	`-a=2048 -t=250000`	Lempel-Ziv-Welch with flexible dictionary restarts	`-a=1 -t=600000` would be better but even slower
gzip	322,591,995 bytes	`-9`	Lempel-Ziv 77 with Huffman	second fastest: 51 seconds
cmix v16	116,912,035 bytes	(none)	context mixing with dictionary	slowest by far ...

Source Code

Git users: scroll down to the repository link

Download LzwEncoder.h

Latest release: October 11, 2018, size: 4377 bytes, 121 lines

CRC32: 8be33dc9
MD5: cf2d04852fa032fc30f44080597ace73
SHA1: 6b1d1c95e9a48ef3ba04a1e33e62dfd33f6556aa
SHA256:e0b18f27821a019e8147cd3d6faaea0435942acc99da5076b0284e88dd132110

Download LzwEncoder.cpp

Latest release: November 19, 2018, size: 18.8 kBytes, 626 lines

CRC32: 335f281b
MD5: d4cbaa014f7d4203959a8d93975fe770
SHA1: 96e90380bc41d077efe039eed633aedbbbeee57e
SHA256:78f94144a07a13a19f3d9205219b7c3b334824b19fddaeb1386c5c8d94e1e51f

Git users: scroll down to the repository link

Download LzwDecoder.h

Latest release: October 11, 2018, size: 2546 bytes, 68 lines

CRC32: bedda00f
MD5: e2cc1576e653d4ab0a1ad5ef9ff79ffe
SHA1: 9abe6ae05c9968d160428565a4a6e3539ac77fb8
SHA256:70a25c3872ce0748736cb3ef676a853a6124010803872df2dcc807d480fcf933

Download LzwDecoder.cpp

Latest release: October 11, 2018, size: 10.6 kBytes, 377 lines

CRC32: 8f14f6db
MD5: 46b8e9661054f43c22355c70254c27fd
SHA1: e0eb4a5a949c8a94dc7f3c1f66f986db71f7b399
SHA256:9177252d37f7dcefeb5e95d6f0c34671364d4a3ead2e4946a4872c3fb68f3f6c

Download GifImage.h

Latest release: November 19, 2018, size: 4255 bytes, 148 lines

CRC32: 872a0593
MD5: 544df5e523ee2d9bcff4d52960ba144a
SHA1: 93079254e0fa96328f8c8dfbcfa2c6a7a1b3804b
SHA256:897562b796342aecc5b3736b3d893986adf162292d3287aed4188b28b79bb9a9

Download GifImage.cpp

Latest release: November 19, 2018, size: 14.8 kBytes, 538 lines

CRC32: e9573c40
MD5: e0eb0277b07310e54934bbcafba9dd4c
SHA1: ad62344745727974aafca21dc47f236ad9618647
SHA256:1544fe9bfdf9b1d6065213da8b2f6f11120d9010d865784a3324256a75cc6559

Download Compress.h

Latest release: October 10, 2018, size: 1451 bytes, 53 lines

CRC32: 531c4542
MD5: e18b2de01f73ac452fe66e9f23b625ac
SHA1: dbdd400a8c249012f9fa1255e95ad4e59f34791b
SHA256:f0e0f892b2fe6f5fc07e6d58f1969dd8e6d67efcf42e0c7564f0eddb290e2bec

Download Compress.cpp

Latest release: October 10, 2018, size: 3870 bytes, 139 lines

CRC32: 6c8dfcde
MD5: 92d0a9d1f5a2e1c17b0347f49dac5633
SHA1: 308d806ff25f113a842af35d597d29840f44c996
SHA256:b96f454ff1116ab63b1b9869938274966dd500269f1ab9fc070436e993272942

Download BinaryInputBuffer.h

Latest release: October 2, 2018, size: 1868 bytes, 62 lines

CRC32: c5da8645
MD5: ecc08bd55f3f1477d6009e20d4159efc
SHA1: a404d9b73583dc48fe020c9a244e8b8a370535c9
SHA256:86dcb88c95b1d013dd22434b714d543b7be4881daac618ed4c47e3267e9ffc9c

Download BinaryInputBuffer.cpp

Latest release: October 2, 2018, size: 3171 bytes, 142 lines

CRC32: 1d06db66
MD5: d98e08e0c4b18fb5af55cec3e033d452
SHA1: faa67b439cb5112701ca3b40faf8d27720c848b9
SHA256:c0e9cfcd337e640ab1b1090120c5a94d876a7ecf4650a93b4ae5390678c173ab

Download flexiGIF.cpp

Latest release: November 19, 2018, size: 30.0 kBytes, 919 lines

CRC32: 36d234ea
MD5: b9bd5573f33a98612a4e65416e61f2d7
SHA1: 0c8bfa5019452655c033c806e6578e3cf5d2cd59
SHA256:8fa9f225149d14a581f850d3d1a9251f7594fdf2319e7817eade87ae751b58ed

Download Makefile

Latest release: October 15, 2018, size: 606 bytes, 27 lines

CRC32: c045e2f9
MD5: f9bd64ce955a8ee362b62d0aa8a51817
SHA1: 7da5fc420dce911c63557ae1e0f1013f90cf0270
SHA256:243e559fb179b9c90ff62af8aac472b79091811d81849f3832ed0d40dade9e1d

Stay up-to-date:git clone https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/git

If you encounter any bugs/problems or have ideas for improving future versions, please write me an email: create@stephan-brumme.com

Download readme.md

Latest release: October 15, 2018, size: 9.5 kBytes, 192 lines

CRC32: 8be9978a
MD5: f87f9cd74a08a2db277c2a6d1c78930b
SHA1: dee2fe0017528b4f8f50f45dfaa9117fd1e1154b
SHA256:68b2603797570617e267bd234721b7aaaf82dce687e7255bcdf3398ee817281c

Stay up-to-date:git clone https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/git

If you encounter any bugs/problems or have ideas for improving future versions, please write me an email: create@stephan-brumme.com


June 30, 2021: Length-Limited Prefix Codes April 13, 2020: smalLZ4 - optimal LZ4 compression June 17, 2019: toojpeg - a JPEG encoder in a single C++ file ... and 30 more !	RSS	www.stephan-brumme.com bits.stephan-brumme.com photos.stephan-brumme.com minime.stephan-brumme.com
© 2011-2024 Stephan Brumme products and company names mentioned may be trademarks of their respective owners. see my disclaimer. dude, this is the end ... of this page.