Zstandard - Fast real-time compression algorithm

Zstandard

Zstandard, or zstd for short, is a fast lossless compression algorithm targeting real-time compression scenarios at zlib-level and better compression ratios. It's backed by a very fast entropy stage, provided by the Huff0 and FSE library.

The project is provided as an open-source dual BSD and GPLv2 licensed C library, and a command line utility producing and decoding .zst, .gz, .xz and .lz4 files. Should your project require another programming language, a list of known ports and bindings is provided on the Zstandard homepage.

Development branch status: [CI build and fuzzing status badges]

Benchmarks

For reference, several fast compression algorithms were tested and compared on a server running Arch Linux (Linux version 5.5.11-arch1-1), with a Core i9-9900K CPU @ 5.0GHz, using lzbench, an open-source in-memory benchmark by @inikep compiled with gcc 9.3.0, on the Silesia compression corpus.

Compressor name        Ratio   Compression   Decompression
zstd 1.4.5 -1          2.884   500 MB/s      1660 MB/s
zlib 1.2.11 -1         2.743   90 MB/s       400 MB/s
brotli 1.0.7 -0        2.703   400 MB/s      450 MB/s
zstd 1.4.5 --fast=1    2.434   570 MB/s      2200 MB/s
zstd 1.4.5 --fast=3    2.312   640 MB/s      2300 MB/s
quicklz 1.5.0 -1       2.238   560 MB/s      710 MB/s
zstd 1.4.5 --fast=5    2.178   700 MB/s      2420 MB/s
lzo1x 2.10 -1          2.106   690 MB/s      820 MB/s
lz4 1.9.2              2.101   740 MB/s      4530 MB/s
zstd 1.4.5 --fast=7    2.096   750 MB/s      2480 MB/s
lzf 3.6 -1             2.077   410 MB/s      860 MB/s
snappy 1.1.8           2.073   560 MB/s      1790 MB/s

The negative compression levels, specified with --fast=#, offer faster compression and decompression speed in exchange for some loss in compression ratio compared to level 1, as seen in the table above.

Zstd can also offer stronger compression ratios at the cost of compression speed. The speed vs. compression trade-off is configurable in small increments. Decompression speed is preserved and remains roughly the same at all settings, a property shared by most LZ compression algorithms, such as zlib or lzma.
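
As an illustration, here is a minimal sketch of selecting that trade-off through the simple one-shot API in zstd.h (buffer handling deliberately simplified; negative levels are the API counterpart of the CLI's --fast settings):

    #include <stdlib.h>
    #include <zstd.h>

    /* Compress `src` at the requested level. Valid levels range from
     * ZSTD_minCLevel() (fastest, negative values) up to ZSTD_maxCLevel().
     * Returns a malloc'd buffer and stores its size in *dstSize, or NULL on error. */
    static void* compress_at_level(const void* src, size_t srcSize, int level, size_t* dstSize)
    {
        size_t const bound = ZSTD_compressBound(srcSize);   /* worst-case compressed size */
        void* const dst = malloc(bound);
        if (dst == NULL) return NULL;
        size_t const written = ZSTD_compress(dst, bound, src, srcSize, level);
        if (ZSTD_isError(written)) { free(dst); return NULL; }
        *dstSize = written;
        return dst;
    }

Decompression does not need to know which level was used; ZSTD_decompress() handles output produced at any setting.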

The following tests were run on a server running Linux Debian (Linux version 4.14.0-3-amd64) with a Core i7-6700K CPU @ 4.0GHz, using lzbench, an open-source in-memory benchmark by @inikep compiled with gcc 7.3.0, on the Silesia compression corpus.

[Charts: Compression Speed vs Ratio, and Decompression Speed]

A few other algorithms can produce higher compression ratios at slower speeds, falling outside of the graph. For a larger picture including slow modes, click on this link.

The case for Small Data compression

Previous charts provide results applicable to typical file and stream scenarios (several MB). Small data comes with different perspectives.

The smaller the amount of data to compress, the more difficult it is to compress. This problem is common to all compression algorithms, and the reason is that compression algorithms learn from past data how to compress future data. But at the beginning of a new data set, there is no "past" to build upon.

To solve this situation, Zstd offers a training mode, which can be used to tune the algorithm for a selected type of data. Training Zstandard is achieved by providing it with a few samples (one file per sample). The result of this training is stored in a file called "dictionary", which must be loaded before compression and decompression. Using this dictionary, the compression ratio achievable on small data improves dramatically.

The following example uses the github-users sample set, created from github public API. It consists of roughly 10K records weighing about 1KB each.

[Charts: Compression Ratio, Compression Speed, and Decompression Speed, with and without dictionary]

These compression gains are achieved while simultaneously providing faster compression and decompression speeds.

Training works if there is some correlation in a family of small data samples. The more data-specific a dictionary is, the more efficient it is (there is no universal dictionary). Hence, deploying one dictionary per type of data will provide the greatest benefits. Dictionary gains are mostly effective in the first few KB. Then, the compression algorithm will gradually use previously decoded content to better compress the rest of the file.

Dictionary compression How To:

  1. Create the dictionary

    zstd --train FullPathToTrainingSet/* -o dictionaryName

  2. Compress with dictionary

    zstd -D dictionaryName FILE

  3. Decompress with dictionary

    zstd -D dictionaryName --decompress FILE.zst
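
The same dictionary can also be used from the C API. A minimal sketch with the bulk-processing dictionary interface (dictBuffer/dictSize are assumed to hold the file produced by zstd --train; in real code the ZSTD_CDict would be created once and reused across many small inputs):

    #include <zstd.h>

    /* Compress one small record using a pre-digested dictionary.
     * Returns the compressed size, or an error code (check with ZSTD_isError()).
     * NULL checks on the created objects are omitted for brevity. */
    size_t compress_with_dict(void* dst, size_t dstCapacity,
                              const void* src, size_t srcSize,
                              const void* dictBuffer, size_t dictSize)
    {
        ZSTD_CDict* const cdict = ZSTD_createCDict(dictBuffer, dictSize, 3 /* compression level */);
        ZSTD_CCtx*  const cctx  = ZSTD_createCCtx();
        size_t const csize = ZSTD_compress_usingCDict(cctx, dst, dstCapacity, src, srcSize, cdict);
        ZSTD_freeCCtx(cctx);
        ZSTD_freeCDict(cdict);
        return csize;
    }

Decompression mirrors this with ZSTD_createDDict() and ZSTD_decompress_usingDDict().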

Build instructions

Makefile

If your system is compatible with standard make (or gmake), invoking make in the root directory will generate the zstd CLI in the root directory.

Other available options include:

  • make install : create and install zstd cli, library and man pages
  • make check : create and run zstd, and test its behavior on the local platform

cmake

A cmake project generator is provided within build/cmake. It can generate Makefiles or other build scripts to create the zstd binary, and the libzstd dynamic and static libraries.

By default, CMAKE_BUILD_TYPE is set to Release.

Meson

A Meson project is provided within build/meson. Follow build instructions in that directory.

You can also take a look at the .travis.yml file for an example of how Meson is used to build this project.

Note that the default build type is release.

VCPKG

You can build and install zstd with the vcpkg dependency manager:

git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg integrate install
./vcpkg install zstd

The zstd port in vcpkg is kept up to date by Microsoft team members and community contributors. If the version is out of date, please create an issue or pull request on the vcpkg repository.

Visual Studio (Windows)

Going into the build directory, you will find additional possibilities:

  • Projects for Visual Studio 2005, 2008 and 2010.
    • VS2010 project is compatible with VS2012, VS2013, VS2015 and VS2017.
  • Automated build scripts for the Visual compiler, by @KrzysFR, in build/VS_scripts, which will build the zstd cli and the libzstd library without any need to open a Visual Studio solution.

Buck

You can build the zstd binary via buck by executing: buck build programs:zstd from the root of the repo. The output binary will be in buck-out/gen/programs/.

Testing

You can run quick local smoke tests by executing the playTest.sh script from the src/tests directory. Two env variables, $ZSTD_BIN and $DATAGEN_BIN, are needed for the test script to locate the zstd and datagen binaries. For information on CI testing, please refer to TESTING.md.

Status

Zstandard is currently deployed within Facebook. It is used continuously to compress large amounts of data in multiple formats and use cases. Zstandard is considered safe for production environments.

License

Zstandard is dual-licensed under BSD and GPLv2.

Contributing

The dev branch is the one where all contributions are merged before reaching release. If you plan to propose a patch, please commit into the dev branch, or its own feature branch. Direct commits to release are not permitted. For more information, please read CONTRIBUTING.

Comments
  • Compressing individual documents with a dictionary produced from a sampled batch to get better compression ratio

    I'm very excited to see work being done on dictionary support in the API, because this is something that could greatly help me solve a pressing problem.

    Context

    We are working in the context of a Document Store, where we store a set of JSON-like documents that share the same schema. Each document can be created, read or updated individually in a random fashion. We would like to compress the documents on disk, but there is very little redundancy within each document, which yields a very poor compression ratio (maybe 10-20%). When compressing batches of tens or hundreds of documents, the compression ratio gets really good (10x, 50x or sometimes even more), because there is a lot of redundancy between documents, from:

    • the structure of the JSON itself which has a lot of ": ", or ": true, or {[[...]],[[...]]} symbols.
    • the names of the JSON fields: "Id", "Name", "Label", "SomeVeryLongFieldNameThatIsPresentOnlyOncePerDocument", etc..
    • frequent values like constants (true, "Red", "Administrator", ...), keywords, dates that start with 2015-12-14T.... for the next 24h, and even well-known or frequently used GUIDs that are shared by documents (Product Category, Tag Id, hugely popular nodes in graph databases, ...)

    In the past, I used femtozip (https://github.com/gtoubassi/femtozip) which is intended precisely for this use case. It includes a dictionary training step (by building a sample batch of documents), that is then used to compress and decompress single documents, with the same compression ratio as if it was a batch. Using real life data, compressing 1000 documents individually would give the same compression ratio as compressing all 1000 documents in a batch with gzip -5.

    The dictionary training part of femtozip can be very long: the more samples, the better the compression ratio would be in the end but you need tons of RAM to train it.

    Also, I realized that femtozip would sometimes offset the differences in size between different formats like JSON/BSON/JSONB/ProtoBuf and other binary formats, because it would pick up the "grammar" of the format (text or binary) in the dictionary, and only deal with the "meat" of the documents (guids, integers, doubles, natural text) when compressing. This means I can use a format like JSONB (used by Postgres) which is less compact, but is faster to decode at runtime than JSON text.

    Goal

    I would like to be able to do something similar with Zstandard. I don't really care about building the most efficient dictionary (though it could be nice), but I would at least like to exploit the fact that FSE builds a list of tokens sorted by frequency. Extracting this list of tokens may help in building a dictionary that contains the most common tokens in the training batch.

    The goal would be:

    • For each new or modified document D, compress it AS IF we were compressing SAMPLES[cur_gen] + D.json, and only storing the bits produced by the D.json part.
    • When reading document D, decompress it AS IF we had the complete compressed version of SAMPLES[D.gen] + D.compressed, and only keeping the last decoded bits that make up D.

    Since it would be impractical to change the compression code to be able to know which compressed bits are from D and which are from the batch, we could approximate this by computing a DICTIONARY[gen] that would be used to initialize the compressor and decompressor.

    Idea
    • Start by serializing an empty object into JSON (we would get the json structure and all the field names, but no values)
    • Use this as the initial "gen 0" dictionary for the first batch of documents (when starting with an empty database)
    • After N documents, sample k random documents and compress them to produce a "generation 1" dictionary.
    • Compress each new or updated document (individually) with this new dictionary
    • After another N documents, or if some heuristic shows that compression ratio starts declining, then start a new generation of dictionary.

    The Document Store would durably store each generation of dictionaries, and use them to decompress older entries. Periodically, it could recycle the entire store by recompressing everything with the most recent dictionary.

    Concrete example:

    Training set:

    • { "id": 123, "label": "Hello", "enabled": true, "uuid": "9ad51b87-d627-4e04-85c2-d6cb77415981" }
    • { "id": 126, "label": "Hell", "enabled": false, "uuid": "0c8e13a5-cdc8-4e1f-8e80-4fee025ee59c" }
    • { "id": 129, "label": "Help", "enabled": true, "uuid": "fe6db321-cddd-4e7f-b3d6-6b38365b3e2a" }

    Looking at it, we can extract the following repeating segments: { "id": 12.., "label": "Hel... ", "enabled": ... e, "uuid": " ... " }, which could be condensed into:

    • { "id": 12, "label": "Hel", "enabled": e, "uuid":"" } (53 bytes shared by all docs)

    The unique part of each document would be:

    • ...3...lo...tru...9ad51b87-d627-4e04-85c2-d6cb77415981 (42 bytes)
    • ...6...l...fals...0c8e13a5-cdc8-4e1f-8e80-4fee025ee59c (42 bytes)
    • ...9......tru...fe6db321-cddd-4e7f-b3d6-6b38365b3e2a (40 bytes)

    Zstd would only have to work on 42 bytes per doc, instead of 85 bytes. More realistic documents will have a lot more stuff in common than this example.

    What I've tested so far
    • create "gen0" dictionary with "hollow" JSON: { "id": , "foo": "", "bar": "", ....} produced by removing all values from the JSON document.
    • using ZSTD_compress_insertDictionary, compressing { "id": 123, "foo": "Hello", "bar": "World", ...} is indeed smaller than without dictionary.
    • looking at cctx->litStart, I can see a buffer with 123HelloWorld which is exactly the content specific to the document itself that got removed when producing the gen0 dict.

    Maybe one way to construct a better dictionary would be:

    • compress the batch of random and complete document (with values)
    • take the K first symbols ordered by descending frequency
    • create the dictionary by outputting symbol K-1, then K-2, up to 0 (I guess that if the most frequent symbol is at the end of the dictionary, offsets to it would be smaller?)
    • maybe one could ask for a target dictionary size, and K would be the number of symbols needed to fill the dictionary?
    What I'm not sure about
    • I don't know how having a dictionary would help with larger documents above 128KB or 256KB. Currently I'm only inserting the dictionary for the first block. Would I need to reuse the same dictionary for each 128KB block?
    • What is the best size for this dictionary? 16KB? 64KB? 128KB?
    • ZSTD_decompress_insertDictionary branches off into different implementations for lazy, greedy and so on. I'm not sure if all compression strategies can be used to produce such a dictionary?

    Again, I don't care about producing the ideal dictionary that produces the smallest result possible, only something that would give me a noticeably better compression ratio, while still being able to handle documents in isolation.
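
    For reference, the dictionary builder that now ships with the library covers much of this ground. A minimal sketch with the stable API (ZDICT_trainFromBuffer from zdict.h plus ZSTD_compress_usingDict), assuming the sample documents are concatenated back to back in samplesBuffer with their sizes listed in samplesSizes:

        #include <zstd.h>
        #include <zdict.h>

        /* Train a dictionary from a batch of sample documents, then compress one
         * document on its own using that dictionary. */
        size_t train_and_compress(void* dict, size_t dictCapacity,
                                  const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
                                  void* dst, size_t dstCapacity,
                                  const void* doc, size_t docSize)
        {
            size_t const dictSize = ZDICT_trainFromBuffer(dict, dictCapacity,
                                                          samplesBuffer, samplesSizes, nbSamples);
            if (ZDICT_isError(dictSize)) return dictSize;

            ZSTD_CCtx* const cctx = ZSTD_createCCtx();
            size_t const csize = ZSTD_compress_usingDict(cctx, dst, dstCapacity,
                                                         doc, docSize,
                                                         dict, dictSize, 3 /* level */);
            ZSTD_freeCCtx(cctx);
            return csize;   /* check with ZSTD_isError() */
        }

    Reading a document back then uses ZSTD_decompress_usingDict() with the same dictionary bytes, which matches the per-generation scheme sketched above.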

  • plans for packaging on different platforms (package manager integration)

    Are there any plans on getting zstd into the main package manager repositories of various (or all if possible) platforms? Is there a list for this already?

    A list of platforms includes (but is not limited to):

    • GNU/Linux
      • debian derived
        • [x] *buntu (aptitude)
          • package (xenial : 0.5.1-1 (outdated) ; yakkety : 0.8.0-1 (compatible) ; zesty : 1.1.2-1 (current) )
        • [x] Debian (aptitude)
      • Red Hat
        • [x] Fedora&Red Hat Enterprise Linux
      • SUSE
      • other
    • Unix&BSD
    • Windows
      • [x] Windows (some MSI thing and plain exes)
      • [x] MSYS2
      • [ ] Cygwin
  • Struggling with ZSTD_decompressBlock

    Hi,

    I'm having a problem when using the block-based methods.

    If I use 'ZSTD_compressContinue' with 'ZSTD_decompressContinue', then my code works fine:

    static Bool ZSTDCompress(File &src, File &dest, Int compression_level)
    {
       Bool ok=false;
       if(ZSTD_CCtx *ctx=ZSTD_createCCtx_advanced(ZSTDMem))
       {
          ZSTD_parameters params; Zero(params);
          params.cParams=ZSTD_getCParams(Mid(compression_level, 1, ZSTD_maxCLevel()), src.left(), 0);
          if(!ZSTD_isError(ZSTD_compressBegin_advanced(ctx, null, 0, params, src.left())))
          {
             // sizes for 'window_size', 'block_size', 's', 'd' were taken from "zstd" tutorial, "zbuff_compress.c" file, "ZBUFF_compressInit_advanced" function
           C Int window_size=1<<params.cParams.windowLog, block_size=Min(window_size, ZSTD_BLOCKSIZE_MAX);
             Memt<Byte> s, d; s.setNum(window_size+block_size); d.setNum(ZSTDSize(block_size)+1); Int s_pos=0;
             for(; !src.end(); )
             {
                Int read=Min(ZSTD_BLOCKSIZE_MAX, Min(s.elms(), src.left())); // ZSTD_BLOCKSIZE_MAX taken from 'ZBUFF_recommendedCInSize' (without this, 'ZSTD_compressContinue' may fail with 'dest' too small error)
                if(s_pos>s.elms()-read)s_pos=0; // if reading will exceed buffer size
                read=src.getReturnSize(&s[s_pos], read); if(read<=0)goto error;
                auto size=ZSTD_compressContinue(ctx, d.data(), d.elms(), &s[s_pos], read); if(ZSTD_isError(size))goto error;
                if(!dest.put(d.data(), size))goto error;
                s_pos+=read;
             }
             auto size=ZSTD_compressEnd(ctx, d.data(), d.elms()); if(ZSTD_isError(size))goto error;
             if(dest.put(d.data(), size))ok=true;
          }
       error:
          ZSTD_freeCCtx(ctx);
       }
       return ok;
    }
    static Bool ZSTDDecompress(File &src, File &dest, Long compressed_size, Long decompressed_size)
    {
       Bool ok=false;
       if(ZSTD_DCtx *ctx=ZSTD_createDCtx_advanced(ZSTDMem))
       {
          ZSTD_decompressBegin(ctx);
          Byte header[ZSTD_frameHeaderSize_max];
          Long pos=src.pos();
          Int read=src.getReturnSize(header, SIZE(header));
          src.pos(pos);
          ZSTD_frameParams frame; if(!ZSTD_getFrameParams(&frame, header, read))
          {
             Long start=dest.pos();
             // sizes for 'block_size', 's', 'd' were taken from "zstd" tutorial, "zbuff_decompress.c" file, "ZBUFF_decompressContinue" function
           C auto block_size=Min(frame.windowSize, ZSTD_BLOCKSIZE_MAX);
             Memt<Byte> s; s.setNum(block_size);
             for(;;)
             {
                auto size=ZSTD_nextSrcSizeToDecompress(ctx); if(!size){if(dest.pos()-start==decompressed_size)ok=true; break;} if(ZSTD_isError(size) || size>s.elms())break;
                if(!src.getFast(s.data(), size))break; // need exactly 'size' amount
                size=ZSTD_decompressContinue(ctx, dest.mem(), dest.left(), s.data(), size); if(ZSTD_isError(size))break;
                if(!MemWrote(dest, size))break;
             }
          }
          ZSTD_freeDCtx(ctx);
       }
       return ok;
    }
    

    But if I replace them with 'ZSTD_compressBlock' and 'ZSTD_decompressBlock' (including writing/reading the compressed buffer size before each buffer), then decompression fails:

    static Bool ZSTDCompressRaw(File &src, File &dest, Int compression_level)
    {
       Bool ok=false;
       if(ZSTD_CCtx *ctx=ZSTD_createCCtx_advanced(ZSTDMem))
       {
          ZSTD_parameters params; Zero(params);
          params.cParams=ZSTD_getCParams(Mid(compression_level, 1, ZSTD_maxCLevel()), src.left(), 0);
          if(!ZSTD_isError(ZSTD_compressBegin_advanced(ctx, null, 0, params, src.left())))
          {
             // sizes for 'window_size', 'block_size', 's', 'd' were taken from "zstd" tutorial, "zbuff_compress.c" file, "ZBUFF_compressInit_advanced" function
           C Int window_size=1<<params.cParams.windowLog, block_size=Min(window_size, ZSTD_BLOCKSIZE_MAX);
             Memt<Byte> s, d; s.setNum(window_size+block_size); d.setNum(ZSTDSize(block_size)+1); Int s_pos=0;
             dest.cmpUIntV(params.cParams.windowLog);
             for(; !src.end(); )
             {
                Int read=Min(ZSTD_BLOCKSIZE_MAX, Min(s.elms(), src.left())); // ZSTD_BLOCKSIZE_MAX taken from 'ZBUFF_recommendedCInSize' (without this, 'ZSTD_compressContinue' may fail with 'dest' too small error)
                if(s_pos>s.elms()-read)s_pos=0; // if reading will exceed buffer size
                read=src.getReturnSize(&s[s_pos], read); if(read<=0)goto error;
                auto size=ZSTD_compressBlock(ctx, d.data(), d.elms(), &s[s_pos], read); if(ZSTD_isError(size))goto error;
                if(  size>0) // compressed OK
                {
                   dest.cmpIntV(size-1);
                   if(!dest.put(d.data(), size))goto error;
                }else // failed to compress
                {
                   dest.cmpIntV(-read);
                   if(!dest.put(&s[s_pos], read))goto error;
                }
                s_pos+=read;
             }
             ok=true;
          }
       error:
          ZSTD_freeCCtx(ctx);
       }
       return ok;
    }
    static Bool ZSTDDecompressRaw(File &src, File &dest, Long compressed_size, Long decompressed_size)
    {
       Bool ok=false;
       if(ZSTD_DCtx *ctx=ZSTD_createDCtx_advanced(ZSTDMem))
       {
          ZSTD_decompressBegin(ctx);
          // sizes for 'block_size', 's', 'd' were taken from "zstd" tutorial, "zbuff_decompress.c" file, "ZBUFF_decompressContinue" function
        C auto window_size=1<<src.decUIntV(), block_size=Min(window_size, ZSTD_BLOCKSIZE_MAX);
          Memt<Byte> s; s.setNum(block_size);
          for(; !src.end(); )
          {
             Int chunk; src.decIntV(chunk);
             if( chunk<0) // un-compressed
             {
                if(!src.copy(dest, -chunk))goto error;
             }else
             {
                chunk++; if(chunk>s.elms())goto error;
                if(!src.getFast(s.data(), chunk))goto error; // need exactly 'chunk' amount
                auto size=ZSTD_decompressBlock(ctx, dest.mem(), dest.left(), s.data(), chunk); if(ZSTD_isError(size))Exit(ZSTD_getErrorName(size)); // here the error occurs
                if(!MemWrote(dest, size))goto error; // this does: dest.mem+=size; and dest.left-=size;
             }
          }
          ok=true;
       error:
          ZSTD_freeDCtx(ctx);
       }
       return ok;
    }
    

    The error occurs at the second call to ZSTD_decompressBlock.

    First call succeeds: chunk=96050, size=ZSTD_decompressBlock(ctx, dest.mem(), dest.left(), s.data(), chunk) returns size=131072

    Second call fails: chunk=94707, size=ZSTD_decompressBlock(ctx, dest.mem(), dest.left(), s.data(), chunk) returns size=18446744073709551605 ("Corrupted block detected")

    Am I missing something obvious here?

    When decompressing, the 'dest' File in this test is a contiguous memory buffer capable of storing the entire decompressed data. With each decompression call, I am advancing 'dest.mem' to the next decompressed chunk position.

    Thanks for any help
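
    For reference, the block-level API documentation asks the decoder to be informed of every block, including the ones stored uncompressed, via ZSTD_insertBlock(), so that its history stays in sync with what has already been written to the contiguous dst buffer. A minimal sketch of that documented contract (experimental API, and not a verified diagnosis of the failure above):

        #define ZSTD_STATIC_LINKING_ONLY   /* block-level API lives behind this gate */
        #include <string.h>
        #include <zstd.h>

        /* Decode one block into the contiguous output buffer. `dst` points at the
         * current write position; earlier decoded data must remain in place so that
         * later blocks can reference it. */
        size_t decode_one_block(ZSTD_DCtx* dctx,
                                void* dst, size_t dstCapacity,
                                const void* chunk, size_t chunkSize,
                                int storedRaw /* block was stored uncompressed */)
        {
            if (storedRaw) {
                memcpy(dst, chunk, chunkSize);
                /* register the raw block so the decoder's history stays in sync */
                return ZSTD_insertBlock(dctx, dst, chunkSize);
            }
            return ZSTD_decompressBlock(dctx, dst, dstCapacity, chunk, chunkSize);
        }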

  • Weird issues with using the streaming API vs `ZSTD_compress()`

    I'm trying out the streaming API to write the equivalent of .NET's GZipStream (source), and I'm seeing some strange things.

    I'm using 0.6.1 for the tests, though the API seems to be unchanged in 0.7 at the moment.

    A stream works by having an internal buffer of 128 KB (131,072 bytes exactly). Each call to Write(..) appends any number of bytes to the buffer (could be called with 1 byte, could be called with 1 GB). Every time the buffer is full, its content is compressed via ZSTD_compressContinue() on an empty destination buffer, and the result is copied into another stream down the line. When the producer is finished writing, it will Close the stream, which will compress any pending data in its internal buffer (so anywhere between 1 and 131,071 bytes), call zstd_compress_end, and flush the final bytes to the stream.

    Seen from zstd, the pattern looks like:

    • ZSTD_compressBegin()
    • ZSTD_compressContinue() 131,072 bytes
    • ZSTD_compressContinue() 131,072 bytes
    • ...
    • ZSTD_compressContinue() 123 bytes (last chunk will always be < 128KB)
    • ZSTD_compressEnd()
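
    Expressed against the streaming interface that zstd later stabilized (ZSTD_compressStream2, v1.4.0+), the same pattern looks roughly like this sketch (I/O and error handling trimmed; write_out() stands for an assumed downstream sink):

        #include <zstd.h>

        /* Feed one chunk; pass lastChunk != 0 for the final one so the frame gets closed. */
        static void stream_compress_chunk(ZSTD_CCtx* cctx,
                                          const void* chunk, size_t chunkSize, int lastChunk,
                                          void* outBuf, size_t outCapacity)
        {
            ZSTD_inBuffer in = { chunk, chunkSize, 0 };
            ZSTD_EndDirective const mode = lastChunk ? ZSTD_e_end : ZSTD_e_continue;
            int finished;
            do {
                ZSTD_outBuffer out = { outBuf, outCapacity, 0 };
                size_t const remaining = ZSTD_compressStream2(cctx, &out, &in, mode);
                /* check ZSTD_isError(remaining) in real code */
                /* write_out(outBuf, out.pos);  -- assumed helper that forwards compressed bytes */
                finished = lastChunk ? (remaining == 0) : (in.pos == in.size);
            } while (!finished);
        }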

    I'm comparing the final result, with calling ZSTD_compress() on the complete content of the input stream (ie: storing everything written into a memory buffer, and compress that in one step).

    Issue 1: ZSTD_compress() adds an extra empty frame at the start

    Looking at the compressed result, I see that usually a single call to ZSTD_compress() adds 6 bytes to the input.

    The left side is the compressed output of ZSTD_compress() on the whole file. The right side is the result of streaming with chunks of 128 KB on the same data:

    Left size: 23,350 bytes Right size: 23,344 bytes

    [image: hex comparison of the two outputs]

    The green part is identical between both files, only 7 bytes differ right after the header, and before the first compressed frame.

    Both results, when passed to ZSTD_decompress() return the same input text with no issues.

    Issue 2: N calls to ZSTD_compressContinue() produce N times the size of a single call to ZSTD_compress() on highly compressible data

    While testing with a text document, duplicated a bunch of times to get to about 300 KB (i.e. the same 2 or 3 KB of text repeated about 100 times), I'm getting something strange:

    • The result of calling zstd_compress on the whole 300KB returns a single 2.5 KB output.
    • The result of streaming using 3 calls to ZSTD_compressContinue() produces 7.5 KB output (3 times larger).

    Looking more closely: each call to ZSTD_compressContinue() returns 2.5 KB (first two calls with 128KB worth of text, third call with only 50 KB), which is too exact to be a coincidence.

    Since the dataset is the equivalent of "ABCABCABC..." a hundred times, I'm guessing that compressing 25%, 50% or 100% of it would produce the same output, which would look something like "repeat 'ABC' 100 times" vs "repeat 'ABC' 200 times".

    Only, when compressing 25% at a time, you get 4 times as many calls to ZSTD_compressContinue(), which will give you 4 times the output. Compressing 12.5% at a time would probably yield 8 times the output.

    When changing the internal buffer size from 128 KB down to 16 KB, I get a result of 45 KiB, which is about 6 times more than before.

    Fudging the input data to get a lower compression ratio makes this effect disappear progressively, until a point where the result of the streaming API is about the same as a single compression call (except for the weird extra 6 bytes from the previous issue).

  • Reduce size of dctx by reutilizing dst buffer

    WIP: this round of optimizations has gotten performance much closer to parity, though it has introduced a checksum error in the 270MB file test that I'm still tracking down. This, however, hasn't affected the smaller size tests; benchmarks indicate that in some cases we now see performance improvements on top of the memory reduction, due to the improved cache behavior. However, there are other cases, at low file sizes and high compressibility, where we are still about 1% behind parity.

    Benchmark

    old performance

    ./tests/fullbench -b2 -B<sample size> -P<compressibility>
    *** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Aug 19 2021) ***
    2#decompress speed, by sample size and compressibility:

    Sample size       -P0             -P10           -P50           -P90           -P100
    1000 bytes        4987.5 MB/s     612.0 MB/s     585.8 MB/s     2597.8 MB/s    2635.7 MB/s
    10000 bytes       36167.5 MB/s    1292.4 MB/s    1671.9 MB/s    3205.2 MB/s    6179.7 MB/s
    100000 bytes      51880.0 MB/s    1237.1 MB/s    2151.4 MB/s    3193.0 MB/s    7095.5 MB/s
    1000000 bytes     34106.0 MB/s    1309.3 MB/s    1973.5 MB/s    2637.8 MB/s    14852.5 MB/s

    new performance

    ./tests/fullbench -b2 -B<sample size> -P<compressibility>
    *** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Aug 19 2021) ***
    2#decompress speed, by sample size and compressibility:

    Sample size       -P0             -P10           -P50           -P90           -P100
    1000 bytes        4999.4 MB/s     609.1 MB/s     583.5 MB/s     2402.1 MB/s    2587.4 MB/s
    10000 bytes       37441.8 MB/s    1297.5 MB/s    1656.7 MB/s    3081.0 MB/s    6127.2 MB/s
    100000 bytes      52215.9 MB/s    1252.2 MB/s    2146.6 MB/s    3614.6 MB/s    7084.7 MB/s
    1000000 bytes     33857.1 MB/s    1288.9 MB/s    2095.4 MB/s    2786.2 MB/s    15258.4 MB/s

  • Patents?

    Does PATENTS from 4ded9e5 refer to specific patents that are actually used by zstd or is it a generic file added to github.com/facebook projects and zstd doesn't use any patented tech?

    I.e., is the "recipient of the software" the developer of a software using zstd (presumably) and/or also the user of a software using zstd?

    In the latter case, what does it imply in layman's terms? Would the patent license self-terminate if a company using a software using zstd did anything listed in (i)–(iii) in the second paragraph (line 14)?

    I feel that clearing this up would be pretty important before this can be used in other FOSS projects.

    Additional Grant of Patent Rights Version 2

    "Software" means the Zstandard software distributed by Facebook, Inc.

    Facebook, Inc. ("Facebook") hereby grants to each recipient of the Software ("you") a perpetual, worldwide, royalty-free, non-exclusive, irrevocable (subject to the termination provision below) license under any Necessary Claims, to make, have made, use, sell, offer to sell, import, and otherwise transfer the Software. For avoidance of doubt, no license is granted under Facebook’s rights in any patent claims that are infringed by (i) modifications to the Software made by you or any third party or (ii) the Software in combination with any software or other technology.

    The license granted hereunder will terminate, automatically and without notice, if you (or any of your subsidiaries, corporate affiliates or agents) initiate directly or indirectly, or take a direct financial interest in, any Patent Assertion: (i) against Facebook or any of its subsidiaries or corporate affiliates, (ii) against any party if such Patent Assertion arises in whole or in part from any software, technology, product or service of Facebook or any of its subsidiaries or corporate affiliates, or (iii) against any party relating to the Software. Notwithstanding the foregoing, if Facebook or any of its subsidiaries or corporate affiliates files a lawsuit alleging patent infringement against you in the first instance, and you respond by filing a patent infringement counterclaim in that lawsuit against that party that is unrelated to the Software, the license granted hereunder will not terminate under section (i) of this paragraph due to such counterclaim.

    A "Necessary Claim" is a claim of a patent owned by Facebook that is necessarily infringed by the Software standing alone.

    A "Patent Assertion" is any lawsuit or other action alleging direct, indirect, or contributory infringement or inducement to infringe any patent, including a cross-claim or counterclaim.

  • Adding --long support for --patch-from

    Patch From

    Zstandard is introducing a new command line option --patch-from=, which leverages our existing compressors, dictionaries and the long range match finder to deliver a high speed engine for producing and applying patches to files.

    Patch from increases the previous maximum limit for dictionaries from 32 MB to 2 GB. Additionally, it maintains fast speeds on lower compression levels without compromising patch size by using the long range match finder (now extended to find dictionary matches). By default, Zstandard uses a heuristic based on file size and internal compression parameters to determine when to activate long mode but it can also be manually specified as before.

    Patch from also works with multi-threading mode at a minimal compression ratio loss vs single threaded mode.

    Example usage:

    # create the patch
    zstd --patch-from=<oldfile> <newfile> -o <patchfile>
    
    # apply the patch
    zstd -d --patch-from=<oldfile> <patchfile> -o <newfile>
    

    Benchmarks: We compared zstd to bsdiff, a popular industry-grade diff engine. Our testing data were tarballs of different versions of source code from popular GitHub repositories. Specifically:

    repos = {
        # ~31mb (small file)
        "zstd": {"url": "https://github.com/facebook/zstd", "dict-branch": "refs/tags/v1.4.2", "src-branch": "refs/tags/v1.4.3"},
        # ~273mb (medium file)
        "wordpress": {"url": "https://github.com/WordPress/WordPress", "dict-branch": "refs/tags/5.3.1", "src-branch": "refs/tags/5.3.2"},
        # ~1.66gb (large file)
        "llvm": {"url": "https://github.com/llvm/llvm-project", "dict-branch": "refs/tags/llvmorg-9.0.0", "src-branch": "refs/tags/llvmorg-9.0.1"}
    }
    

    Patch from on level 19 (with chainLog=30 and targetLength=4kb) remains competitive with bsdiff when comparing patch sizes.

    And patch from greatly outperforms bsdiff in speed, even on its slowest setting of level 19, boasting an average speedup of ~7X. Patch from is >200X faster on level 1 and >100X faster (shown below) on level 3 vs bsdiff, while still delivering patch sizes less than 0.5% of the original file size.

    And of course, there is no change to the fast zstd decompression speed.

  • pzstd compression ratios vs zstd

    We've run some benchmarks on a number of our internal backups and noticed that while pzstd is beautifully fast, it seems to produce worse results:

    time zstd -11 X -o X.11
    X : 13.37%   (123965807088 => 16572036784 bytes, X.11) 
    
    real	49m32.875s
    user	42m6.636s
    sys	0m49.172s
    
    
    time pzstd -11 -p 1 X -o X.11.1thread
    X : 13.76%   (123965807088 => 17056707732 bytes, X.11.1thread)
    
    real	42m50.245s
    user	40m33.648s
    sys	0m44.436s
    
    
    time pzstd -11 -p 3 X -o X.11.3threads
    X : 13.76%   (123965807088 => 17056707732 bytes, X.11.3threads)
    
    real	21m53.584s  <- bottlenecked by the slow hdd
    user	58m14.732s
    sys	1m0.036s
    

    Is this part of the design or a bug? We also noticed that pzstd -p 1 ran faster than zstd.

  • Consider re-licensing to Apache License v2

    Hello,

    The Apache Software Foundation recently changed its policy regarding the "Facebook BSD+patents" license that applies to zstd and many other FB open-source projects, and now considers it unsuitable for inclusion in ASF projects. There is a discussion of this in the context of RocksDB on LEGAL-303, which was resolved when RocksDB was relicensed as dual ALv2 and GPLv2.

    Is the zstd community also open to relicensing with ALv2? This change would be helpful for Apache Hadoop (of which I'm a PMC member) since it would let us bundle zstd as part of our release artifacts. @omalley also expressed interest in this relicensing as an Apache ORC PMC member.

    Thanks in advance!

  • small files compression / dictionary issues

    Hi,

    I am struggling a bit with a compression ratio that remains very low, while I think it should be higher.

    I am compressing a huge number of small data chunks (1323 bytes), representing terrain elevation data for a terrain rendering system.

    The chunk is made of 441x u16 (elevation) + 441x u8 (alpha).

    Now, I understand small data are not great, therefore I tried the 'dictionary' approach. But whatever I do, I can't even reach a 2x compression ratio.

    Some weird things I have noticed:

    • using a 100 KB target dictionary along with a 10 MB samples buffer (each sample being 1323 bytes), I get a dictionary of 73 KB (not a big deal actually) and everything works as expected (but I am still left with my low compression ratio)
    • so I thought I would create a larger dictionary, and tried to train a 1MB dictionary using a 100MB sample buffer. However in that case, the dictionary training phase lasts forever (i.e. I interrupted it after 1 hour or so) => fail.

    Would you have some hints for me? Any idea how I could achieve a higher compression ratio? I need to be able to read/uncompress each tile randomly (so streaming is not an option).

    Basically, the source data are the SRTM data (roughly 15 GB zip compressed) - when put into small tiles and compressed using ZSTD, it compresses down to about 100 GB (dictionary doesn't even bring 5%).

    Thanks a lot! Greg

  • [0.7.0] No more VS2013 project makes it difficult to target a MS VC Runtime above 10.0

    I need to build the Windows library targeting the MS VC Runtime 12.0 (the one that came out with VS2013, which maps to MSVCR120.dll), but in the latest dev 0.7.0 branch, there are only two projects left: one for Visual Studio 2008 and one for Visual Studio 2010 (targeting MSVCR100.dll). The project for VS2013 is gone. I'm not sure how to use the CMake stuff, but after a quick look, it does not seem to have code that specifically deals with the version of the VC runtime in it.

    Usually, each developer will use version X of Visual Studio (2010, 2013, 2015, 15/vNext, ....), while targeting a version Y of the Microsoft VC Runtime (10.0, 11.0, 12.0, 14.0, ..., with Y <= X). I guess most devs may use a more recent version of VS than the version of the runtime they target, because retargeting all your dependencies to the latest VC runtime can be a lot of work, or even not possible at all. This means that the version of the VC runtime may not be the same as the version of the Visual Studio project, and will be different for each person.

    Currently, if you have VS 2015 Update 2, and open the VS2010 project in the dev branch, VS attempts to convert the project to the latest version possible (14.0, which will probably break because of some includes that are different). And if you don't perform the conversion, then the build will probably fail because you don't have the 10.0 SDK installed on your dev machine or CI build server (unless you have also installed VS2010 before). Plus, if you want a different version (12.0 for me), you then need to update each project (for all the projects, times two for Release/Debug, times two again for Win32/x64). And then you are left with a change in the .vcproj that will cause trouble when you update from git again (forcing you to maintain a custom branch).

    I can see that having to maintain one set of projects for each version of Visual Studio can be too much work, but what do you think would be the best way to be able to specify which version of the VC runtime to target when building? Maybe a setting for CMake that would then create a set of VS projects specific to you, and not checked in to the repo? (This would be ideal.)

  • Update linux kernel to latest zstd (from 1.4.10)

    Hi, people on reddit asked (https://old.reddit.com/r/kernel/comments/xp2o53/why_is_the_version_of_zstd_in_the_kernel_outdated/) about updating the linux port of zstd to a newer version. This was promised back when 1.4.10 was synced a year ago, and I'd be glad to see an update (or even a regular update in each kernel release) as btrfs is using zstd and any performance improvement is most welcome.

    There's automation support (contrib/linux-kernel), so generating the patch itself should not take too much time. As there are more things to do, like review and benchmarking, it's not the last step, but at least the preliminary version can be added to the linux-next tree. The final pull request sent to Linus may or may not happen depending on the testing results.

    IMHO neglecting the regular updates is more "expensive", like it was with the 1.4.10 update that took about a year and a lot of convincing. The linux-next tree is really convenient as it does not pose a huge risk for users and helps to catch bugs early.

  • Make CMake official? (Makefile build does not provide CMake config file)

    README.md says

    make is the officially maintained build system of this project.

    When using the Makefile, CMake config files like zstdConfig.cmake are not installed. This makes it awkward for projects using CMake to use zstd. E.g. llvm-project has

    # https://github.com/llvm/llvm-project/blob/main/llvm/cmake/config-ix.cmake
    if(LLVM_ENABLE_ZSTD)
      if(LLVM_ENABLE_ZSTD STREQUAL FORCE_ON)
        find_package(zstd REQUIRED)
        if(NOT zstd_FOUND)
          message(FATAL_ERROR "Failed to configure zstd, but LLVM_ENABLE_ZSTD is FORCE_ON")
        endif()
      elseif(NOT LLVM_USE_SANITIZER MATCHES "Memory.*")
        find_package(zstd QUIET)
      endif()
    endif()
    set(LLVM_ENABLE_ZSTD ${zstd_FOUND})
    
    # https://github.com/llvm/llvm-project/blob/main/llvm/lib/Support/CMakeLists.txt#L28
    if(LLVM_ENABLE_ZSTD)
      if(TARGET zstd::libzstd_shared AND NOT LLVM_USE_STATIC_ZSTD)
        set(zstd_target zstd::libzstd_shared)
      else()
        set(zstd_target zstd::libzstd_static)
      endif()
    endif()
    

    It could add pkg-config fallback but that is inconvenient, and logic like zstd::libzstd_shared does not have a good replacement.

    Related:

    • https://bugs.gentoo.org/872254
    • https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1020403

    The simplest solution is to make CMake official so downstream is motivated to switch to CMake.

  • Enable the OpenSSF Scorecard Github Action

    Hello, I'm working on behalf of Google and the Open Source Security Foundation to help essential open-source projects improve their supply-chain security. Given the relevance that Zstandard has for countless projects, the OpenSSF has identified it as one of the 100 most critical open source projects.

    Is your feature request related to a problem? Please describe. According to Open Source Security and Risk Analysis Report, 84% of all codebases have at least one vulnerability, with an average of 158 per codebase. The majority have been in the code for more than 2 years and have documented solutions available.

    Even in large tech companies, the tedious process of reviewing code for vulnerabilities falls down the priority list, and there is little insight into known vulnerabilities and solutions that companies can draw on.

    That’s where the OpenSSF tool called Scorecards is helping. Its focus is to understand the security posture of a project and assess the risks that the dependencies could introduce.

    Describe the solution you'd like Scorecards runs dozens of automated security checks to help maintainers better understand their project's supply-chain security posture. It is developed by the OpenSSF, in partnership with GitHub.

    To simplify maintainers' lives, the OpenSSF has also developed the Scorecard GitHub Action. It is very lightweight and runs on every change to the repository's main branch. The results of its checks are available on the project's security dashboard, and include suggestions on how to solve any issues (see examples in additional context). The Action does not run or interact with any workflows, but merely parses them to identify possible vulnerabilities. This Action has been adopted by 1800+ projects already.

    Zstandard already follows many of the Scorecard recommended best practices and criteria for greater security, such as not having any binary artifacts, CI tests, code review, fuzzing, etc. However, there are still some criteria that would need to be improved to achieve a good level of security. In these cases, the Scorecard GitHub Action can help with diagnosing issues and proposing solutions.

    Would you be interested in a PR which adds this Action? Optionally, it can also publish your results to the OpenSSF REST API, which allows a badge with the project's score to be added to its README.

    Additional context

    Code scanning dashboard with multiple alerts, including Code-Review and Token-Permissions

    Detail of a Token-Permissions alert, indicating the specific file and remediation steps

  • [Help Wanted] should `ZSTD_flushStream` behave differently from `ZSTD_endStream` when used for flushing purposes?

    Hi, I'm using 1.5.2, through JNI (https://github.com/luben/zstd-jni 1.5.2-3); both are the latest versions as of writing.

    I use streaming compression to stream some lines of JSON strings, delimited by line feeds. After some critical point is reached, I would like to flush the zstd buffer so that the receiver can decode them immediately. I added a switch to call either ZSTD_flushStream or ZSTD_endStream for flushing purposes. Am I correct that they should be equivalent if the receiver is looping on ZSTD_decompressStream (ignoring its return value, thus also ignoring frame boundaries)?

    More specifically, I found that when using ZSTD_endStream the receiver never hangs on receive, but when using ZSTD_flushStream the receiver almost always hangs due to insufficient bytes available (so.read(...) inside read0, code at bottom), which doesn't seem correct. This can be reproduced on my production system (I'm trying to minimize it into a minimal reproducer in the meantime), but it can't be reproduced on a smaller dataset (yet).

    Pointers and their offsets should already be updated and advanced properly according to the docs; both pointer arithmetic and error-checking code are omitted from the diagram. I'll post more code if the information provided here is not enough (and when I've successfully produced a minimal reproducer), but the same code works for a smaller dataset, and the only change is the eof flag: from using ZSTD_endStream to ZSTD_flushStream.

    Reference diagram: [image]

    code snippets: https://github.com/luben/zstd-jni/blob/905202f10e5355cf7ed7a558e909558bc6ebf184/src/main/native/jni_directbuffercompress_zstd.c https://github.com/luben/zstd-jni/blob/905202f10e5355cf7ed7a558e909558bc6ebf184/src/main/native/jni_directbufferdecompress_zstd.c

        private fun castZstd(n:Long)=if(isError(n))throw ZstdException(n)else n.toIntExact()
        fun decompress(dst:ByteBuffer,src:ByteBuffer):Int{
            if(stream<0)throw ClosedChannelException()
            val n=castZstd(decompressStream(stream,dst,dst.position(),dst.remaining(),src,src.position(),src.remaining()))
            src.position(src.position()+consumed)
            dst.position(dst.position()+produced)
            return n
        }
        suspend fun compress(m:ByteBuffer){ // input buffer
            if(stream<0)throw ClosedChannelException();while(m.hasRemaining()){
                if(!buffer.hasRemaining())flushBuffer()
                castZstd(compressDirectByteBuffer(stream,buffer,buffer.position(),buffer.remaining(),m,m.position(),m.remaining()))
                buffer.position(buffer.position()+produced)
                m.position(m.position()+consumed)
            }
        }
        suspend fun flush(eof:Boolean){ // the previously mentioned `flag`
            if(stream<0)throw ClosedChannelException();val fn=if(eof)::endStream else::flushStream;do{
                val n=castZstd(fn(stream,buffer,buffer.position(),buffer.remaining()))
                buffer.position(buffer.position()+produced)
                flushBuffer()
            }while(n!=0)
        }
        // socket outer loop, if `buffer` doesn't contain '\n', call `read0` one more time
        override suspend fun read0():Int{ // `produced` is how much data is produced by calling `decompress`
            do v=decompress(buffer,so.read(max(1,v)))while(produced==0)
            return produced
        }
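
    For comparison, in the underlying C API the difference is only which call drains the internal buffer: ZSTD_flushStream() emits everything buffered so far but leaves the current frame open, while ZSTD_endStream() additionally writes the frame epilogue (and checksum, if enabled). A receiver looping on ZSTD_decompressStream() should be able to decode up to the flush point in either case. A minimal sketch (error handling omitted):

        #include <zstd.h>

        /* Drain the compressor's internal buffer. Both calls return the number of
         * bytes still left to flush (0 when done), so loop until they return 0. */
        static void drain(ZSTD_CStream* cstream, int endOfStream,
                          void* outBuf, size_t outCapacity /* , downstream sink */)
        {
            size_t remaining;
            do {
                ZSTD_outBuffer out = { outBuf, outCapacity, 0 };
                remaining = endOfStream ? ZSTD_endStream(cstream, &out)     /* close the frame */
                                        : ZSTD_flushStream(cstream, &out);  /* keep frame open */
                /* check ZSTD_isError(remaining), then forward out.pos bytes downstream */
            } while (remaining != 0);
        }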
    
  • FATAL ERROR: zstd uncompress failed with error code 10

    Describe the bug: I am trying to compress a directory containing files of size ~2.5 GiB via mksquashfs /input/path/* /path/to/output.img -comp zstd -b 256K -noappend -Xcompression-level 22

    /path/to/output.img is inside a mounted s3fs directory (that's the possible cause of the issue, but the current architecture of the tool is the reason for doing that). However, the same command works for data less than ~1 GiB.

    To Reproduce Steps to reproduce the behavior:

    1. A directory with more than 2.5 GiB of data
    2. A mounted s3fs bucket
    3. The above mentioned command
    4. FATAL ERROR: zstd uncompress failed with error code 10

    Expected behavior: This should work the same way it works with data smaller than 1 GiB.

    Error Code FATAL ERROR: zstd uncompress failed with error code 10

    Desktop (please complete the following information):

    • OS: Ubuntu
    • Version 22.04

    Additional context: The error is on my side for sure; I need help debugging the error code, as I am not very familiar with the ZSTD code base.

    • I tried to get information about the exit code from the manual, but it is missing.

    I also tried to use ZSTD_getErrorName(10) to get some meaningful information, but it gave "No error detected".
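
    For what it's worth, the zstd library itself reports failures through the size_t return value of each call rather than through small integer codes, so passing an application-level code such as 10 to ZSTD_getErrorName() is not expected to yield a meaningful message. A minimal sketch of the documented check (a hypothetical wrapper, not squashfs code):

        #include <stdio.h>
        #include <zstd.h>

        /* zstd functions return either a byte count or an error value;
         * distinguish them with ZSTD_isError() before asking for a name. */
        static size_t checked_decompress(void* dst, size_t dstCapacity,
                                         const void* src, size_t srcSize)
        {
            size_t const r = ZSTD_decompress(dst, dstCapacity, src, srcSize);
            if (ZSTD_isError(r))
                fprintf(stderr, "zstd error: %s\n", ZSTD_getErrorName(r));
            return r;
        }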
