Parsing gigabytes of JSON per second



JSON is everywhere on the Internet. Servers spend a *lot* of time parsing it. We need a fresh approach. The simdjson library uses commonly available SIMD instructions and microparallel algorithms to parse JSON 2.5x faster than RapidJSON and 25x faster than JSON for Modern C++.
  • Fast: Over 2.5x faster than commonly used production-grade JSON parsers.
  • Record Breaking Features: Minify JSON at 6 GB/s, validate UTF-8 at 13 GB/s, NDJSON at 3.5 GB/s.
  • Easy: First-class, easy to use and carefully documented APIs.
  • Beyond DOM: Try the new On Demand API for twice the speed (>4GB/s).
  • Strict: Full JSON and UTF-8 validation, lossless parsing. Performance with no compromises.
  • Automatic: Selects a CPU-tailored parser at runtime. No configuration needed.
  • Reliable: From memory allocation to error handling, simdjson's design avoids surprises.
  • Peer Reviewed: Our research appears in venues like VLDB Journal, Software: Practice and Experience.
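The minification and UTF-8 validation features are usable without building a DOM. A minimal sketch, assuming the simdjson::minify and simdjson::validate_utf8 entry points described in our documentation (check the API docs for the exact signatures in your version):

```cpp
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <memory>
#include <string_view>
#include "simdjson.h"

int main() {
  const char *json = R"({ "key" :   "value" })";
  size_t length = std::strlen(json);

  // Minify into a buffer: the output never needs to be larger than the input.
  std::unique_ptr<char[]> minified(new char[length]);
  size_t minified_length{};
  auto error = simdjson::minify(json, length, minified.get(), minified_length);
  if (error) { std::cerr << "minify failed" << std::endl; return EXIT_FAILURE; }
  std::cout << std::string_view(minified.get(), minified_length) << std::endl;

  // Validate UTF-8 without parsing anything.
  std::cout << (simdjson::validate_utf8(json, length) ? "valid" : "invalid") << std::endl;
}
```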

This library is part of the Awesome Modern C++ list.


Quick Start

The simdjson library is easily consumable with a single .h and .cpp file.

  1. Prerequisites: g++ (version 7 or better) or clang++ (version 6 or better), and a 64-bit system with a command-line shell (e.g., Linux, macOS, FreeBSD). We also support programming environments like Visual Studio and Xcode, but different steps are needed.

  2. Pull simdjson.h and simdjson.cpp into a directory, along with the sample file twitter.json.

    wget https://raw.githubusercontent.com/simdjson/simdjson/master/singleheader/simdjson.h https://raw.githubusercontent.com/simdjson/simdjson/master/singleheader/simdjson.cpp https://raw.githubusercontent.com/simdjson/simdjson/master/jsonexamples/twitter.json
    
  3. Create quickstart.cpp:

    #include <iostream>
    #include "simdjson.h"
    int main(void) {
      simdjson::dom::parser parser;
      simdjson::dom::element tweets = parser.load("twitter.json");
      std::cout << tweets["search_metadata"]["count"] << " results." << std::endl;
    }
  4. c++ -o quickstart quickstart.cpp simdjson.cpp

  5. ./quickstart

    100 results.
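If you prefer to avoid exceptions, the same quick-start program can check error codes explicitly. A sketch, assuming the error-code interface covered in our Basics documentation:

```cpp
#include <cstdlib>
#include <iostream>
#include "simdjson.h"

int main(void) {
  simdjson::dom::parser parser;
  simdjson::dom::element tweets;
  // .get(out) stores the result or returns a nonzero error code, never throwing.
  auto error = parser.load("twitter.json").get(tweets);
  if (error) { std::cerr << "load failed" << std::endl; return EXIT_FAILURE; }
  std::cout << tweets["search_metadata"]["count"] << " results." << std::endl;
}
```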
    

On Demand

The new On Demand JSON parser is just as easy, but much faster due to just-in-time parsing. It is in alpha right now. More information can be found in the On Demand Guide.

  1. Do step 1 of the Quick Start.

  2. Create quickstart.cpp:

    #include <iostream>
    #include "simdjson.h"
    using namespace simdjson;
    int main(void) {
       ondemand::parser parser;
       padded_string json = padded_string::load("twitter.json");
       ondemand::document tweets = parser.iterate(json);
       std::cout << uint64_t(tweets["search_metadata"]["count"]) << " results." << std::endl;
    }
  3. c++ -march=native -o quickstart quickstart.cpp simdjson.cpp

  4. ./quickstart

    100 results.
    

You'll notice that the code here is very similar to the main Quick Start code (and indeed, it does the same thing). However, if you compare the performance, you should find On Demand much faster.

Documentation

Usage documentation is available:

  • Basics is an overview of how to use simdjson and its APIs.
  • Performance shows some more advanced scenarios and how to tune for them.
  • Implementation Selection describes runtime CPU detection and how you can work with it.
  • API contains the automatically generated API documentation.
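For instance, you can query which parser kernel was selected at runtime. A sketch, assuming the implementation-selection API described in that documentation (function names may vary between versions):

```cpp
#include <iostream>
#include "simdjson.h"

int main() {
  // The kernel chosen for this CPU at runtime (e.g., haswell, westmere, fallback).
  std::cout << "active implementation: "
            << simdjson::get_active_implementation()->name() << std::endl;

  // All kernels compiled into this build of the library.
  for (auto implementation : simdjson::get_available_implementations()) {
    std::cout << implementation->name() << ": "
              << implementation->description() << std::endl;
  }
}
```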

Performance results

The simdjson library uses about three-quarters fewer instructions than the state-of-the-art parser RapidJSON and about half as many as sajson. To our knowledge, simdjson is the first fully-validating JSON parser to run at gigabytes per second (GB/s) on commodity processors. It can parse millions of JSON documents per second on a single core.

The following figure represents parsing speed in GB/s for various files on an Intel Skylake processor (3.4 GHz) using the GNU GCC 9 compiler (with the -O3 flag). We compare against the best and fastest C++ libraries. The simdjson library offers full unicode (UTF-8) validation and exact number parsing. The RapidJSON library is tested in two modes: fast and exact number parsing. The sajson library offers fast (but not exact) number parsing and partial unicode validation. In this data set, the file sizes range from 65 KB (github_events) all the way to 3.3 GB (gsoc-2018). Many files consist mostly of numbers (canada, mesh.pretty, mesh, random and numbers): in such instances, we see lower JSON parsing speeds due to the high cost of number parsing. The simdjson library uses exact number parsing, which is particularly taxing.

On a Skylake processor, the parsing speeds (in GB/s) of various parsers on the twitter.json file are as follows, again using GNU GCC 9.1 (with the -O3 flag). The popular JSON for Modern C++ library is particularly slow: it obviously trades parsing speed for other desirable features.

| parser | GB/s |
| --- | --- |
| simdjson | 2.5 |
| RapidJSON UTF8-validation | 0.29 |
| RapidJSON UTF8-valid., exact numbers | 0.28 |
| RapidJSON insitu, UTF8-validation | 0.41 |
| RapidJSON insitu, UTF8-valid., exact | 0.39 |
| sajson (insitu, dynamic) | 0.62 |
| sajson (insitu, static) | 0.88 |
| dropbox | 0.13 |
| fastjson | 0.27 |
| gason | 0.59 |
| ultrajson | 0.34 |
| jsmn | 0.25 |
| cJSON | 0.31 |
| JSON for Modern C++ (nlohmann/json) | 0.11 |

The simdjson library offers high speed whether it processes tiny files (e.g., 300 bytes) or larger files (e.g., 3MB). The following plot presents parsing speed for synthetic files over various sizes generated with a script on a 3.4 GHz Skylake processor (GNU GCC 9, -O3).

All our experiments are reproducible.

You can go beyond 4 GB/s with our new On Demand API. For NDJSON files, we can exceed 3 GB/s with our multithreaded parsing functions.
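As an illustration, NDJSON (newline-delimited JSON) input can be consumed through the document-stream interface. A sketch, assuming dom::parser::load_many as documented (the file name and "type" field are hypothetical):

```cpp
#include <iostream>
#include "simdjson.h"

int main() {
  simdjson::dom::parser parser;
  // Each line of events.ndjson is an independent JSON document;
  // load_many streams through them, optionally using a worker thread.
  for (simdjson::dom::element doc : parser.load_many("events.ndjson")) {
    std::cout << doc["type"] << std::endl;
  }
}
```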

Real-world usage

If you are planning to use simdjson in a product, please work from one of our releases.

Bindings and Ports of simdjson

We distinguish between "bindings" (which just wrap the C++ code) and a port to another programming language (which reimplements everything).

About simdjson

The simdjson library takes advantage of modern microarchitectures, parallelizing with SIMD vector instructions, reducing branch misprediction, and reducing data dependency to take advantage of each CPU's multiple execution cores.

A description of the design and implementation of simdjson is in our research article:

We have an in-depth paper focused on the UTF-8 validation:

We also have an informal blog post providing some background and context.

For the video inclined,
simdjson at QCon San Francisco 2019
(it was the best voted talk, we're kinda proud of it).

Funding

The work is supported by the Natural Sciences and Engineering Research Council of Canada under grant number RGPIN-2017-03910.

Contributing to simdjson

Head over to CONTRIBUTING.md for information on contributing to simdjson, and HACKING.md for information on source, building, and architecture/design.

License

This code is made available under the Apache License 2.0.

Under Windows, we build some tools using the windows/dirent_portable.h file (which is outside our library code): it is under the liberal (business-friendly) MIT license.

For compilers that do not support C++17, we bundle the string-view library which is published under the Boost license (http://www.boost.org/LICENSE_1_0.txt). Like the Apache license, the Boost license is a permissive license allowing commercial redistribution.

Comments
  • On-Demand Parsing

    This introduces a DOM-like API that parses JSON with forward-only streaming--combining the ease of traditional DOM parsers with the performance of SAX. One major virtue of this approach is that we know what type the user wants a value to be before we parse it, so we eliminate the typical "type check" employed by DOM and SAX, instead parsing with a parser dedicated to that type.

    It is far faster than using the DOM (4.0 GB/s vs. 2.3 GB/s). It is also much easier than the SAX approach (a Tweet reader takes 15 lines versus 300 with SAX), as well as being slightly faster (results with the Clang 10 compiler, on a Skylake machine):

    | Benchmark | Generic DOM | SAX | On-Demand |
    |------------|---|---|---|
    | Tweets | 2.3 GB/s | 3.5 GB/s | 4.0 GB/s |
    | LargeRandom | 0.50 GB/s | 0.71 GB/s | 0.71 GB/s |

    Examples

    The benchmarks have some real, fairly simple examples with equivalent DOM and SAX implementations.

    points.json

    This parses a giant array of points: [ { "x": 1.0, "y": 1.2, "z": 1.3 }, ... ]

    // Assuming my_point is a simple struct such as: struct my_point { double x, y, z; };
    ondemand::parser parser;
    std::vector<my_point> container;
    for (ondemand::object p : parser.parse(json)) {
      container.emplace_back(my_point{p["x"], p["y"], p["z"]});
    }
    

    twitter.json

    This parses a list of Tweets (from the Twitter API) into C++ structs, serializing text, screen names, ids, and favorite/retweet counts.

    // Walk the document, parsing the tweets as we go
    std::vector<twitter::tweet> tweets;
    ondemand::parser parser;
    auto doc = parser.parse(json);
    for (ondemand::object tweet : doc["statuses"]) {
      tweets.emplace_back(twitter::tweet{
        tweet["created_at"],
        tweet["id"],
        tweet["text"],
        nullable_int(tweet["in_reply_to_status_id"]),
        read_user(tweet["user"]),
        tweet["retweet_count"],
        tweet["favorite_count"]
      });
    }
    

    With these auxiliary functions:

    simdjson_really_inline twitter::twitter_user read_user(ondemand::object && u) {
      return { u["id"], u["screen_name"] };
    }
    simdjson_really_inline uint64_t nullable_int(ondemand::value && value) {
      if (value.is_null()) { return 0; }
      return std::move(value);
    }
    

    Principles

    • Inline Iteration: Iterating arrays or objects is done through exactly the kind of for loop you'd expect. You can nest them, iterating an array of arrays or an array of objects with array values through nested for loops. Under the hood, the iterator checks for the "[", passes you the index of the value, and when you finish with a value, it checks for "," and passes the next value until it sees "]".
    • Forward-Only Iteration: To prevent reiteration of the same values and to keep the number of variables down (literally), only a single index is maintained and everything uses it (even if you have nested for loops). This means when you're going through an array of arrays, for example, that the inner array loop will advance the index to the next comma, and the array can just pick it up and look at it.
    • Inline, On-Demand Parsing: Parses exactly the type you want and nothing else. Because it's inline this means way fewer branches per value, and they're more predictable as well. For example, if you ask for an unsigned integer, we just start parsing digits. If there were no digits, we toss an error. With a generic parser you have to do a big switch statement checking whether it's a digit before you even start parsing, so it's both an extra branch, and a hard to predict one (because you are also checking other values).
    • Streaming Output: This is streaming in the sense of output, but not input. You still have to pass the whole file as input; it just doesn't parse anything besides the marks until you ask. This also means the parser memory has to grow as the file grows (because of structural indexes). Streaming input is a whole other problem, however.
    • Validate What You Use: It deliberately validates the values you use and the structure leading to it, but nothing else. The goal is a guarantee that the value you asked for is the correct one and is not malformed so that there is no confusion over whether you got the right value. But it leaves the possibility that the JSON as a whole is invalid. A full-validation mode is possible and planned, but I think this mode should be the default, personally, or at least pretty heavily advertised. Full-validation mode should really only be for debug.
    • Avoids Genericity Pitfalls I think it avoids the pitfalls of generating a generic DOM, which are that you don't know what to expect in the file so you can't tune the parser to expect it (and thus branch mispredictions abound). Even SAX falls into this, though definitely less than others: the core of SAX still has to have a giant switch statement in a loop, and that's just going to be inherently branchy.
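To make Inline Iteration and Forward-Only Iteration concrete, nested arrays read as ordinary nested loops over a single shared index. A hypothetical sketch against this PR's API, assuming a numeric input such as [[1.0, 2.0], [3.0]]:

```cpp
ondemand::parser parser;
double sum = 0;
for (ondemand::array inner : parser.parse(json)) {  // outer [ ... ]
  for (double value : inner) {                      // inner [ ... ]
    // The one shared index advances past each value and comma; when the inner
    // loop ends, the outer loop picks the index up right at the next "," or "]".
    sum += value;
  }
}
```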

    Impact on DOM parse (skylake haswell gcc10.2)

    As expected / hoped, this is entirely neutral with respect to our existing performance, all the way down to identical instruction count:

    | File | Blocks | master Cycles | PR Cycles | +Throughput | master Instr. | PR Instr. | -Instr. |
    | --- | --: | --: | --: | --: | --: | --: | --: |
    | gsoc-2018 | 51997 | 72.7 | 71.6 | 1% | 160.1 | 160.1 | 0% |
    | instruments | 3442 | 108.2 | 107.3 | 0% | 370.6 | 370.6 | 0% |
    | github_events | 1017 | 78.6 | 78.4 | 0% | 256.7 | 256.7 | 0% |
    | numbers | 2345 | 284.1 | 283.4 | 0% | 791.0 | 791.0 | 0% |
    | apache_builds | 1988 | 84.9 | 84.7 | 0% | 295.1 | 295.1 | 0% |
    | mesh | 11306 | 319.3 | 318.5 | 0% | 984.2 | 984.2 | 0% |
    | twitterescaped | 8787 | 188.1 | 187.8 | 0% | 493.0 | 493.0 | 0% |
    | marine_ik | 46616 | 318.4 | 318.1 | 0% | 895.7 | 895.7 | 0% |
    | update-center | 8330 | 113.2 | 113.2 | 0% | 326.5 | 326.5 | 0% |
    | mesh.pretty | 24646 | 189.0 | 188.9 | 0% | 571.3 | 571.3 | 0% |
    | twitter | 9867 | 92.0 | 92.0 | 0% | 281.6 | 281.6 | 0% |
    | citm_catalog | 26987 | 81.7 | 81.7 | 0% | 287.5 | 287.5 | 0% |
    | canada | 35172 | 311.2 | 311.4 | 0% | 946.7 | 946.7 | 0% |
    | semanticscholar-corpus | 134271 | 108.8 | 109.0 | 0% | 274.4 | 274.4 | 0% |
    | random | 7976 | 141.2 | 142.2 | 0% | 482.1 | 482.1 | 0% |

    Design

    The primary classes are:

    • ondemand::parser: The equivalent of dom::parser.
      • This handles allocation and parse calls, and keeps memory around between parses.
    • ondemand::document: Holds iteration state. Can be cast to array, object or scalar.
      • Forward-Only: This is a forward-only input iterator. You may only get the document's value once. Once you have retrieved an array, object, or scalar, subsequent attempts to get other values fail.
      • Iteration Owner: If you let go of the document object, iteration will fail. This is not checked, but the failure will be really obvious :) Moves are disallowed after iteration has started, because array/object/value all point to the document.
      • Locks the Parser: Only one iteration is allowed at a time. If you attempt to parse a new document before destroying the old one, you will get an error. document cannot be copied.
    • ondemand::array: Manages array iteration.
      • Forward-Only: Retrieving the same array element twice will fail. There is currently no check on whether it has handed out a value, it's just something you shouldn't do multiple times without ++. This is consistent with C++'s "input iterator" concept.
      • Child Blindness: Once you get an array element, it has no control over whether you do anything with it. For example, you could decide not to handle a value if it's an array or object. To control for this, when you ++ the array checks whether there is an unfinished array or object by checking if we're at the current depth. If so, it skips tokens until it's returned to the current depth.
      • Chainable: We allow you to pass an error into the iterator, which it will yield on its first iteration and then stop. This allows error chaining to make its way all the way to the loop: for (auto o : parser.parse(json)) works!
      • C++ Iterator: Because C++ breaks what could be a single next() call into !=, ++, and * calls, we have to break up the algorithm into parts and keep some state between them.
        • operator * Reads a value, advancing once. value takes up the slack from there, incrementing depth if there is a [ or { even if the user doesn't use the value. If there is an error to report, it decrements depth so that the loop will terminate, and returns it.
        • operator ++ Checks if we have a ] (decrementing depth) or , (setting error if no comma).
        • operator != lets the loop continue if current depth >= array depth.
      • Zero Overhead: It keeps state, but that state is all either constant and knowable at compile time, or only has an effect for a single iteration (the first or last). Our goal is to get the compiler to elide them all.
        • document: This member variable is in many objects, but always has the same value. It is likely the compiler will elide it.
        • at_start: This member variable is initially true. If it is true, we check for ] before the first != and then set it to false.
        • error: Whether this member variable is passed in initially or detected by ++, error has no effect unless it is nonzero, and when it is zero, the loop always terminates after the next iteration. We hope this will be elided, therefore, into a trailing control flow.
        • depth: This member variable is constant and knowable at compile time, because depth will have been incremented a constant number of times based on how many nested objects you have. Whether the compiler recognizes this is anybody's game, however :/
    • ondemand::object: Manages object iteration and object["key"].
      • Forward-Only: [] will never go backwards; you must do [] in order to get all the fields you want. Retrieving the same field twice with * will fail. There is currently no check on whether it has handed out a value, it's just something you shouldn't do multiple times without ++. This is consistent with C++'s "input iterator" concept.
      • Child Blindness: Once you get a field or value, the object has no control over whether you do anything with it. For example, you could decide not to handle a value if it's an array or object. To control for this, when you ++ or do a second [], the object checks whether there is an unfinished array or object by checking if we're at the current depth. If so, it skips tokens until it's returned to the current depth.
      • Chainable: We allow you to pass an error into the iterator, which it will yield on its first iteration and then stop. This allows error chaining to make its way all the way to the loop: for (auto o : parser.parse(json)) works!
      • C++ Iterator: Because C++ breaks what could be a single next() call into !=, ++, and * calls, we have to break up the algorithm into parts and keep some state between them.
        • operator * Reads key, : and a value, advancing exactly three times. Returns an error if key or : is invalid. value takes up the slack from there, incrementing depth if there is a [ or { even if the user doesn't use the value. If there is an error to report, it decrements depth so that the loop will terminate, and returns it.
        • operator ++ Checks if we have a } (decrementing depth) or , (setting error if no comma).
        • operator != lets the loop continue if current depth >= object depth.
      • Zero Overhead: It keeps state, but that state is all either constant and knowable at compile time, or only has an effect for a single iteration (the first or last). Our goal is to get the compiler to elide them all.
        • document: This member variable is in many objects, but always has the same value. It is likely the compiler will elide it.
        • at_start: This member variable is initially true. If it is true, we check for } before the first != and then set it to false thereafter. We expect this to be elided in favor of leading control flow.
        • error: Whether this member variable is passed in initially or detected by ++, error has no effect unless it is nonzero, and when it is zero, the loop always terminates after the next iteration. We hope this will be elided, therefore, into a trailing control flow.
        • depth: This member variable is constant and knowable at compile time, because depth will have been incremented a constant number of times based on how many nested objects you have. Whether the compiler recognizes this is anybody's game, however :/
    • ondemand::value: A transient object giving you the opportunity to convert a JSON value into an array, object, or scalar.
      • Forward-Only: This is transient: its value can only be retrieved once. Further retrievals will fail. It is an error to keep multiple value objects around at once (it is also hard to do, but possible).
      • Skippable: If you don't use the value (for example, if you have a field and don't care about the key), the destructor will check if it's { or [ and increment depth, to keep things consistent.
    • ondemand::raw_json_string: Represents the raw json string inside the buffer, terminated by ". This allows you to inspect and do comparisons against the raw, escaped json string without performance penalty.
      • The unescape() method on it will parse it (escapes and all) into a string buffer of your choice.
    • ondemand::token_iterator: Internal. Used to actually track structural iteration. document is the only way users will see this.
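As an example of raw_json_string in action, a hypothetical sketch that compares keys without unescaping (assuming a field-iteration API where key() returns a raw_json_string comparable against a string literal):

```cpp
for (ondemand::field field : object) {
  // Compare against the raw, still-escaped bytes in the buffer:
  // no allocation, no unescaping, for plain ASCII keys.
  if (field.key() == "screen_name") {
    std::cout << field.value() << std::endl;
  }
}
```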

    Concerns / Rough Edges

    • Compiler Flags: You have to compile all of simdjson.cpp/simdjson.h with the target flags. Otherwise, simdjson_result<> and other things in the include/ headers can't be inlined with your Haswell-specific code.
    • Heavy Reliance On Optimizers: The object and array structs have four member variables each, which I'm expecting the compiler to elide completely in normal cases. I think these are largely unavoidable given C++'s iterator design. Without that elision, register pressure will be intense and stuff will get shoved into memory. I made some assumptions about how optimizers should work, particularly that it can deduce the depth value since it's constant, and that it can elide variables like at_start into control flow since it only affects the header of the loop.
    • Unconsumed Values: Because the user drives the parse, we hand them array, object and value objects which they can then iterate or call get_string/etc. on. However, if they don't do this, or only partially iterate, then the structural_index and depth won't get updated. I added a destructor to value to check whether the value has been used, and check for start/end array if not. We also keep track of depth so that if child iterations are left partially iterated, we can skip everything until we get back to our own depth. All of this can be optimized away if the compiler is smart enough ... but I'm not convinced it will be :) We'll see.

    Performance

    clang does a lot better on the (relatively complex) twitter benchmark with ondemand than g++. I assume this has to do with its affinity for SSA optimizations:

    Haswell clang10.0 (Skylake)

    | Benchmark | Generic DOM | On-Demand | SAX |
    |------------|---|---|---|
    | PartialTweets | 2.3 GB/s | 4.0 GB/s | 3.5 GB/s |
    | LargeRandom | 0.50 GB/s | 0.71 GB/s | 0.71 GB/s |

    Haswell gcc10 (Skylake)

    GCC is more or less on par with clang:

    | Benchmark | DOM | On-Demand | SAX |
    |------------|---|---|---|
    | PartialTweets | 2.5 GB/s | 3.8 GB/s | 3.7 GB/s |
    | LargeRandom | 0.50 GB/s | 0.78 GB/s | 0.74 GB/s |

    Running the Benchmark

    You can see several examples in benchmark/bench_ondemand.cpp. To compile for your native platform, do this:

    rm -rf build
    mkdir build && cd build
    cmake -DCMAKE_CXX_FLAGS="-march=native" ..
    make bench_ondemand
    benchmark/bench_ondemand --benchmark_counters_tabular=true
    

    Raw Data: Haswell clang10.0 (Skylake)

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Benchmark                         Time             CPU   Iterations best_branch_miss best_bytes_per_sec best_cache_miss best_cache_ref best_cycles best_cycles_per_byte best_docs_per_sec best_frequency best_instructions best_instructions_per_byte best_instructions_per_cycle best_items_per_sec branch_miss      bytes bytes_per_second cache_miss  cache_ref     cycles cycles_per_byte docs_per_sec  frequency instructions instructions_per_byte instructions_per_cycle      items items_per_second
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    PartialTweets<OnDemand>      165179 ns       165149 ns         4237           1.538k             4.056G               0        58.152k    574.945k             0.910422          6.42265k       3.69267G          1.93943M                    3.07108                     3.37325           642.265k    1.71353k   631.515k       3.56129G/s   8.96861m   58.1731k   580.103k         0.91859   6.05512k/s  3.5126G/s     1.93943M               3.07108                3.34325        100       605.512k/s [best: throughput=  4.06 GB/s doc_throughput=  6422 docs/s instructions=     1939432 cycles=      574945 branch_miss=    1538 cache_miss=       0 cache_ref=     58152 items=       100 avg_time=    157210 ns]
    PartialTweets<Iter>          182774 ns       182773 ns         3828           2.825k           3.66457G               0         58.28k    636.406k              1.00774          5.80282k       3.69295G             1.84M                    2.91363                     2.89124           580.282k    3.08751k   631.515k       3.21789G/s    3.9185m    58.158k   644.682k         1.02085   5.47127k/s 3.52723G/s        1.84M               2.91363                2.85412        100       547.127k/s [best: throughput=  3.66 GB/s doc_throughput=  5802 docs/s instructions=     1840002 cycles=      636406 branch_miss=    2825 cache_miss=       0 cache_ref=     58280 items=       100 avg_time=    174683 ns]
    PartialTweets<Dom>           282433 ns       282382 ns         2480           3.149k           2.31979G               0        92.404k    1005.16k              1.59167          3.67338k       3.69235G           2.9721M                     4.7063                     2.95682           367.338k     3.2693k   631.515k        2.0828G/s   0.439516   92.8304k   1011.84k         1.60225    3.5413k/s 3.58324G/s      2.9721M                4.7063                2.93731        100        354.13k/s [best: throughput=  2.32 GB/s doc_throughput=  3673 docs/s instructions=     2972097 cycles=     1005165 branch_miss=    3149 cache_miss=       0 cache_ref=     92404 items=       100 avg_time=    274248 ns]
    Creating a source file spanning 44921 KB 
    LargeRandom<Dom>           91468995 ns     91467607 ns            8         968.537k           503.312M        10.8974M       15.4273M    337.109M              7.32865           10.9419        3.6886G          1041.23M                    22.6361                     3.08872           10.9419M    967.752k   45.9988M         479.6M/s   10.9539M   15.4278M   337.335M         7.33356    10.9328/s 3.68803G/s     1041.23M               22.6361                3.08665      1000k       10.9328M/s [best: throughput=  0.50 GB/s doc_throughput=    10 docs/s instructions=  1041233885 cycles=   337109080 branch_miss=  968537 cache_miss=10897426 cache_ref=  15427295 items=   1000000 avg_time=  91455371 ns]
    LargeRandomSum<Dom>        89605540 ns     89588793 ns            8         968.435k           514.102M        10.3415M       14.5719M     330.07M              7.17562           11.1764         3.689G          1022.23M                    22.2231                     3.09702           11.1764M    968.274k   45.9988M       489.658M/s   10.3269M   14.5726M   330.174M         7.17789    11.1621/s 3.68544G/s     1022.23M               22.2231                3.09604      1000k       11.1621M/s [best: throughput=  0.51 GB/s doc_throughput=    11 docs/s instructions=  1022233883 cycles=   330069936 branch_miss=  968435 cache_miss=10341543 cache_ref=  14571861 items=   1000000 avg_time=  89591929 ns]
    LargeRandom<OnDemand>      64622779 ns     64622223 ns           11         929.755k           712.493M        5.63348M       8.01073M     238.12M              5.17666           15.4894       3.68834G           648.69M                    14.1023                     2.72422           15.4894M    930.709k   45.9988M       678.835M/s   5.66211M   8.01242M   238.304M         5.18066    15.4746/s 3.68765G/s      648.69M               14.1023                2.72212      1000k       15.4746M/s [best: throughput=  0.71 GB/s doc_throughput=    15 docs/s instructions=   648690465 cycles=   238120090 branch_miss=  929755 cache_miss= 5633485 cache_ref=   8010731 items=   1000000 avg_time=  64609673 ns]
    LargeRandomSum<OnDemand>   64569064 ns     64568969 ns           11         951.323k           714.434M        4.98723M       7.12271M    237.524M              5.16371           15.5316       3.68913G           641.69M                    13.9502                     2.70158           15.5316M    958.366k   45.9988M       679.395M/s    5.0381M   7.12423M    238.12M         5.17667    15.4873/s 3.68785G/s      641.69M               13.9502                2.69481      1000k       15.4873M/s [best: throughput=  0.71 GB/s doc_throughput=    15 docs/s instructions=   641690192 cycles=   237524226 branch_miss=  951323 cache_miss= 4987231 cache_ref=   7122709 items=   1000000 avg_time=  64556393 ns]
    LargeRandom<Iter>          60862746 ns     60863442 ns           11         990.089k           757.035M        5.62286M       7.98759M    224.156M              4.87309           16.4577        3.6891G          581.692M                    12.6458                     2.59503           16.4577M    995.475k   45.9988M       720.759M/s   5.65907M   7.98859M   224.456M          4.8796    16.4302/s 3.68786G/s     581.692M               12.6458                2.59157      1000k       16.4302M/s [best: throughput=  0.76 GB/s doc_throughput=    16 docs/s instructions=   581691751 cycles=   224156097 branch_miss=  990089 cache_miss= 5622863 cache_ref=   7987593 items=   1000000 avg_time=  60850056 ns]
    LargeRandomSum<Iter>       59778441 ns     59777987 ns           12          981.57k           770.555M        5.01271M       7.15428M    220.194M              4.78696           16.7516       3.68861G          570.691M                    12.4067                     2.59177           16.7516M    986.227k   45.9988M       733.846M/s   5.05344M   7.15511M   220.459M         4.79271    16.7286/s 3.68796G/s     570.691M               12.4067                2.58865      1000k       16.7286M/s [best: throughput=  0.77 GB/s doc_throughput=    16 docs/s instructions=   570691393 cycles=   220194110 branch_miss=  981570 cache_miss= 5012710 cache_ref=   7154280 items=   1000000 avg_time=  59765975 ns]
    Creating a source file spanning 134087 KB 
    Kostya<Dom>                94265351 ns     94266428 ns            7          1045.8k           1.45789G        15.7114M       22.3239M    347.272M               2.5292           10.6179        3.6873G          975.883M                    7.10741                     2.81014           5.56684M    1045.46k   137.305M       1.35653G/s   15.7288M   22.2886M   347.637M         2.53186    10.6082/s 3.68782G/s     975.883M               7.10741                2.80719   524.288k       5.56177M/s [best: throughput=  1.46 GB/s doc_throughput=    10 docs/s instructions=   975882556 cycles=   347271978 branch_miss= 1045795 cache_miss=15711430 cache_ref=  22323923 items=    524288 avg_time=  94251475 ns]
    KostyaSum<Dom>             93482419 ns     93481371 ns            7         1048.42k           1.47166G        15.4012M       21.9002M    344.184M              2.50671           10.7182       3.68902G          970.115M                     7.0654                      2.8186            5.6194M    1049.48k   137.305M       1.36792G/s   15.4135M   21.7919M   344.773M           2.511    10.6973/s 3.68815G/s     970.115M                7.0654                2.81378   524.288k       5.60848M/s [best: throughput=  1.47 GB/s doc_throughput=    10 docs/s instructions=   970115386 cycles=   344183956 branch_miss= 1048419 cache_miss=15401206 cache_ref=  21900243 items=    524288 avg_time=  93468372 ns]
    Kostya<OnDemand>           59869079 ns     59857167 ns           12         468.968k            2.2974G        9.97175M       13.9606M    220.483M              1.60579           16.7321       3.68914G          635.858M                    4.63099                     2.88393           8.77243M    472.341k   137.305M       2.13634G/s   9.99057M   13.9227M   220.746M         1.60771    16.7064/s 3.68788G/s     635.858M               4.63099                2.88049   524.288k       8.75898M/s [best: throughput=  2.30 GB/s doc_throughput=    16 docs/s instructions=   635857684 cycles=   220483050 branch_miss=  468968 cache_miss= 9971751 cache_ref=  13960563 items=    524288 avg_time=  59856199 ns]
    KostyaSum<OnDemand>        60280325 ns     60279422 ns           12         469.529k           2.27999G        9.67866M       13.4947M    222.157M              1.61799           16.6053       3.68898G          630.615M                     4.5928                     2.83859           8.70594M    469.704k   137.305M       2.12137G/s   9.67636M   13.4264M   222.299M         1.61902    16.5894/s 3.68781G/s     630.615M                4.5928                2.83678   524.288k       8.69763M/s [best: throughput=  2.28 GB/s doc_throughput=    16 docs/s instructions=   630614946 cycles=   222157441 branch_miss=  469529 cache_miss= 9678657 cache_ref=  13494711 items=    524288 avg_time=  60267368 ns]
    Kostya<Iter>               61758017 ns     61757605 ns           11         497.377k           2.22614G        9.95741M       13.9293M    227.537M              1.65716           16.2131       3.68908G          606.497M                    4.41715                     2.66549           8.50035M    497.937k   137.305M        2.0706G/s   9.99912M   13.9207M   227.752M         1.65873    16.1923/s 3.68785G/s     606.497M               4.41715                2.66297   524.288k       8.48945M/s [best: throughput=  2.23 GB/s doc_throughput=    16 docs/s instructions=   606497405 cycles=   227536701 branch_miss=  497377 cache_miss= 9957411 cache_ref=  13929258 items=    524288 avg_time=  61745137 ns]
    KostyaSum<Iter>            59370790 ns     59359390 ns           12         464.597k           2.31499G        9.64345M       13.4522M    218.801M              1.59354           16.8602       3.68902G          597.061M                    4.34843                     2.72878           8.83958M    464.774k   137.305M       2.15425G/s   9.67782M   13.4532M    218.89M         1.59419    16.8465/s 3.68754G/s     597.061M               4.34843                2.72767   524.288k       8.83244M/s [best: throughput=  2.31 GB/s doc_throughput=    16 docs/s instructions=   597060518 cycles=   218801084 branch_miss=  464597 cache_miss= 9643448 cache_ref=  13452173 items=    524288 avg_time=  59357614 ns]
    
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Benchmark                    Time             CPU   Iterations best_branch_miss best_bytes_per_sec best_cache_miss best_cache_ref best_cycles best_cycles_per_byte best_docs_per_sec best_frequency best_instructions best_instructions_per_byte best_instructions_per_cycle best_items_per_sec branch_miss      bytes bytes_per_second cache_miss  cache_ref     cycles cycles_per_byte docs_per_sec  frequency instructions instructions_per_byte instructions_per_cycle      items items_per_second
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    PartialTweets<Sax>      191072 ns       191072 ns         3670           1.323k           3.47984G               0        59.009k    670.174k              1.06122          5.51031k       3.69287G          2.17912M                    3.45062                     3.25157           551.031k    1.47164k   631.515k       3.07814G/s   0.013624   58.9976k   674.469k         1.06802   5.23364k/s 3.52993G/s     2.17912M               3.45062                3.23086        100       523.364k/s [best: throughput=  3.48 GB/s doc_throughput=  5510 docs/s instructions=     2179117 cycles=      670174 branch_miss=    1323 cache_miss=       0 cache_ref=     59009 items=       100 avg_time=    182754 ns]
    Creating a source file spanning 44921 KB 
    LargeRandom<Dom>      91475275 ns     91476347 ns            8         933.026k           503.328M        11.1398M       15.6303M    337.144M              7.32942           10.9422        3.6891G          1040.23M                    22.6144                     3.08543           10.9422M     932.62k   45.9988M       479.554M/s   11.1208M   15.6317M   337.404M         7.33507    10.9318/s 3.68843G/s     1040.23M               22.6144                3.08305      1000k       10.9318M/s [best: throughput=  0.50 GB/s doc_throughput=    10 docs/s instructions=  1040233883 cycles=   337144298 branch_miss=  933026 cache_miss=11139804 cache_ref=  15630295 items=   1000000 avg_time=  91461980 ns]
    LargeRandomSum<Dom>   90329253 ns     90329264 ns            8         932.216k            509.83M        10.4788M       14.7649M    332.849M              7.23603           11.0836       3.68915G          1022.23M                    22.2231                     3.07117           11.0836M    932.897k   45.9988M       485.644M/s    10.522M   14.7658M   333.176M         7.24315    11.0706/s 3.68846G/s     1022.23M               22.2231                3.06815      1000k       11.0706M/s [best: throughput=  0.51 GB/s doc_throughput=    11 docs/s instructions=  1022233881 cycles=   332848515 branch_miss=  932216 cache_miss=10478836 cache_ref=  14764904 items=   1000000 avg_time=  90315728 ns]
    LargeRandom<Sax>      67111081 ns     67111854 ns           10         973.014k           686.397M         5.6914M       8.09484M    247.225M               5.3746           14.9221        3.6891G          675.692M                    14.6893                     2.73311           14.9221M    975.305k   45.9988M       653.653M/s    5.7521M    8.0964M   247.525M         5.38113    14.9005/s 3.68825G/s     675.692M               14.6893                2.72979      1000k       14.9005M/s [best: throughput=  0.69 GB/s doc_throughput=    14 docs/s instructions=   675691776 cycles=   247224897 branch_miss=  973014 cache_miss= 5691399 cache_ref=   8094842 items=   1000000 avg_time=  67098397 ns]
    

    Raw Data: Haswell kernel, gcc 10.2 (Skylake CPU)

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Benchmark                         Time             CPU   Iterations best_branch_miss best_bytes_per_sec best_cache_miss best_cache_ref best_cycles best_cycles_per_byte best_docs_per_sec best_frequency best_instructions best_instructions_per_byte best_instructions_per_cycle best_items_per_sec branch_miss      bytes bytes_per_second cache_miss  cache_ref     cycles cycles_per_byte docs_per_sec  frequency instructions instructions_per_byte instructions_per_cycle      items items_per_second
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    PartialTweets<OnDemand>      177528 ns       177530 ns         3941           1.657k           3.74809G               0        54.903k    622.194k              0.98524          5.93507k       3.69277G          2.11159M                    3.34369                     3.39378           593.507k    1.80355k   631.515k       3.31293G/s  0.0284192   54.9838k   626.235k        0.991639   5.63285k/s 3.52748G/s     2.11159M               3.34369                3.37188        100       563.285k/s [best: throughput=  3.75 GB/s doc_throughput=  5935 docs/s instructions=     2111588 cycles=      622194 branch_miss=    1657 cache_miss=       0 cache_ref=     54903 items=       100 avg_time=    169681 ns]
    PartialTweets<Iter>          384499 ns       384501 ns         1819           2.968k           1.68681G               0         55.31k    1.38216M              2.18865          2.67105k       3.69182G          4.40862M                    6.98102                     3.18965           267.105k    3.19399k   631.515k       1.52963G/s   0.239142   55.4276k   1.38969M         2.20056   2.60078k/s 3.61426G/s     4.40862M               6.98102                3.17239        100       260.078k/s [best: throughput=  1.69 GB/s doc_throughput=  2671 docs/s instructions=     4408621 cycles=     1382163 branch_miss=    2968 cache_miss=       0 cache_ref=     55310 items=       100 avg_time=    376595 ns]
    PartialTweets<Dom>           266111 ns       266112 ns         2630            3.53k           2.46696G               0        87.587k    945.164k              1.49666          3.90642k       3.69221G          2.91945M                    4.62293                     3.08883           390.642k    3.70404k   631.515k       2.21014G/s  0.0285171   87.3744k   952.087k         1.50762   3.75782k/s 3.57777G/s     2.91945M               4.62293                3.06637        100       375.782k/s [best: throughput=  2.47 GB/s doc_throughput=  3906 docs/s instructions=     2919449 cycles=      945164 branch_miss=    3530 cache_miss=       0 cache_ref=     87587 items=       100 avg_time=    257984 ns]
    Creating a source file spanning 44921 KB 
    LargeRandom<Dom>           91849988 ns     91849888 ns            8         889.651k           502.486M        10.9371M       15.2509M    337.691M              7.34129           10.9239        3.6889G          970.316M                    21.0944                     2.87339           10.9239M    889.185k   45.9988M       477.604M/s   11.0046M   15.2543M   338.763M         7.36461    10.8873/s 3.68822G/s     970.316M               21.0944                2.86429      1000k       10.8873M/s [best: throughput=  0.50 GB/s doc_throughput=    10 docs/s instructions=   970315574 cycles=   337690544 branch_miss=  889651 cache_miss=10937148 cache_ref=  15250891 items=   1000000 avg_time=  91836392 ns]
    LargeRandomSum<Dom>        92188755 ns     92188563 ns            8         889.635k           499.861M          10.36M       14.4073M    339.484M              7.38027           10.8668       3.68911G          974.316M                    21.1813                     2.86999           10.8668M    889.272k   45.9988M       475.849M/s   10.4213M   14.4099M   340.028M          7.3921    10.8473/s 3.68839G/s     974.316M               21.1813                 2.8654      1000k       10.8473M/s [best: throughput=  0.50 GB/s doc_throughput=    10 docs/s instructions=   974315578 cycles=   339483518 branch_miss=  889635 cache_miss=10360012 cache_ref=  14407265 items=   1000000 avg_time=  92175359 ns]
    LargeRandom<OnDemand>      58992605 ns     58991725 ns           12         869.377k           781.677M        5.62944M       7.89403M    217.093M              4.71954           16.9934       3.68916G          615.695M                     13.385                     2.83609           16.9934M    868.032k   45.9988M       743.627M/s   5.67331M   7.89681M    217.57M         4.72991    16.9515/s 3.68815G/s     615.695M                13.385                2.82987      1000k       16.9515M/s [best: throughput=  0.78 GB/s doc_throughput=    16 docs/s instructions=   615694894 cycles=   217093162 branch_miss=  869377 cache_miss= 5629445 cache_ref=   7894027 items=   1000000 avg_time=  58980167 ns]
    LargeRandomSum<OnDemand>   56594492 ns     56594225 ns           12         876.324k           813.963M        5.01997M       7.05425M    208.485M               4.5324           17.6953       3.68921G          606.695M                    13.1894                     2.91002           17.6953M    876.066k   45.9988M       775.129M/s     5.066M   7.05672M   208.735M         4.53784    17.6696/s 3.68828G/s     606.695M               13.1894                2.90653      1000k       17.6696M/s [best: throughput=  0.81 GB/s doc_throughput=    17 docs/s instructions=   606694893 cycles=   208485037 branch_miss=  876324 cache_miss= 5019967 cache_ref=   7054246 items=   1000000 avg_time=  56582402 ns]
    LargeRandom<Iter>          53364551 ns     53364201 ns           13         894.323k            863.44M        5.63805M       7.89683M    196.541M              4.27273           18.7709       3.68925G          570.695M                    12.4067                      2.9037           18.7709M    894.787k   45.9988M       822.046M/s   5.66445M   7.89822M    196.82M          4.2788    18.7392/s 3.68823G/s     570.695M               12.4067                2.89958      1000k       18.7392M/s [best: throughput=  0.86 GB/s doc_throughput=    18 docs/s instructions=   570694596 cycles=   196540518 branch_miss=  894323 cache_miss= 5638049 cache_ref=   7896828 items=   1000000 avg_time=  53352485 ns]
    LargeRandomSum<Iter>       54883627 ns     54883439 ns           13         871.251k           841.314M        5.02219M       7.05069M    201.706M              4.38502           18.2899       3.68918G          577.695M                    12.5589                     2.86405           18.2899M    871.778k   45.9988M       799.291M/s   5.06164M   7.05285M   202.423M         4.40061    18.2204/s 3.68823G/s     577.695M               12.5589                2.85391      1000k       18.2204M/s [best: throughput=  0.84 GB/s doc_throughput=    18 docs/s instructions=   577695426 cycles=   201705764 branch_miss=  871251 cache_miss= 5022193 cache_ref=   7050692 items=   1000000 avg_time=  54871279 ns]
    Creating a source file spanning 134087 KB 
    Kostya<Dom>                86984857 ns     86984354 ns            8         494.739k           1.58086G        15.8348M       22.1883M      320.4M              2.33349           11.5135       3.68893G          936.468M                    6.82035                     2.92281            6.0364M    494.617k   137.305M       1.47009G/s    15.849M   22.1757M   320.827M          2.3366    11.4963/s 3.68833G/s     936.468M               6.82035                2.91892   524.288k       6.02738M/s [best: throughput=  1.58 GB/s doc_throughput=    11 docs/s instructions=   936467833 cycles=   320400264 branch_miss=  494739 cache_miss=15834791 cache_ref=  22188266 items=    524288 avg_time=  86971545 ns]
    KostyaSum<Dom>             86970961 ns     86970134 ns            8         495.135k           1.58123G        15.5502M       21.6331M    320.352M              2.33314           11.5162       3.68924G          938.565M                    6.83562                     2.92979           6.03781M    494.743k   137.305M       1.47034G/s   15.6095M   21.6834M   320.783M         2.33628    11.4982/s 3.68842G/s     938.565M               6.83562                2.92586   524.288k       6.02837M/s [best: throughput=  1.58 GB/s doc_throughput=    11 docs/s instructions=   938564987 cycles=   320352061 branch_miss=  495135 cache_miss=15550182 cache_ref=  21633093 items=    524288 avg_time=  86957969 ns]
    Kostya<OnDemand>           60343057 ns     60343775 ns           12         456.213k           2.28197G        10.1305M       13.9803M    221.966M              1.61659           16.6197       3.68902G          647.782M                    4.71783                     2.91838           8.71353M    456.138k   137.305M       2.11911G/s   10.1682M    13.868M   222.551M         1.62085    16.5717/s 3.68805G/s     647.782M               4.71783                2.91072   524.288k       8.68835M/s [best: throughput=  2.28 GB/s doc_throughput=    16 docs/s instructions=   647782119 cycles=   221966325 branch_miss=  456213 cache_miss=10130481 cache_ref=  13980257 items=    524288 avg_time=  60330609 ns]
    KostyaSum<OnDemand>        58642231 ns     58641748 ns           12          453.15k            2.3471G        9.82263M       13.5381M    215.814M              1.57178            17.094       3.68913G          643.064M                    4.68347                     2.97972            8.9622M    453.464k   137.305M       2.18062G/s   9.84356M   13.5389M    216.28M         1.57518    17.0527/s 3.68816G/s     643.064M               4.68347                2.97329   524.288k       8.94052M/s [best: throughput=  2.35 GB/s doc_throughput=    17 docs/s instructions=   643063529 cycles=   215813521 branch_miss=  453150 cache_miss= 9822627 cache_ref=  13538124 items=    524288 avg_time=  58629573 ns]
    Kostya<Iter>               59348929 ns     59348585 ns           12          452.97k           2.32237G        10.0784M       13.9769M    218.108M              1.58849           16.9139       3.68906G          642.015M                    4.67583                     2.94357           8.86776M    453.045k   137.305M       2.15465G/s   10.1584M    13.924M    218.88M         1.59412    16.8496/s 3.68804G/s     642.015M               4.67583                2.93318   524.288k       8.83404M/s [best: throughput=  2.32 GB/s doc_throughput=    16 docs/s instructions=   642015174 cycles=   218107739 branch_miss=  452970 cache_miss=10078433 cache_ref=  13976859 items=    524288 avg_time=  59336271 ns]
    KostyaSum<Iter>           121340358 ns    121337200 ns            6         453.895k           1.13236G          9.935M       13.6518M    447.335M              3.25796           8.24704       3.68919G          1.31992G                    9.61305                     2.95063           4.32383M    454.037k   137.305M       1079.18M/s   9.95978M   13.6513M   447.577M         3.25973     8.2415/s 3.68871G/s     1.31992G               9.61305                2.94903   524.288k       4.32092M/s [best: throughput=  1.13 GB/s doc_throughput=     8 docs/s instructions=  1319919270 cycles=   447334605 branch_miss=  453895 cache_miss= 9934995 cache_ref=  13651799 items=    524288 avg_time= 121327101 ns]
    
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Benchmark                    Time             CPU   Iterations best_branch_miss best_bytes_per_sec best_cache_miss best_cache_ref best_cycles best_cycles_per_byte best_docs_per_sec best_frequency best_instructions best_instructions_per_byte best_instructions_per_cycle best_items_per_sec branch_miss      bytes bytes_per_second cache_miss  cache_ref     cycles cycles_per_byte docs_per_sec  frequency instructions instructions_per_byte instructions_per_cycle      items items_per_second
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    PartialTweets<Sax>      181416 ns       181385 ns         3854           1.384k           3.67147G               0        58.443k    635.165k              1.00578          5.81375k       3.69269G          2.07459M                     3.2851                     3.26623           581.375k    1.53157k   631.515k       3.24252G/s   0.044629   58.4857k   640.112k         1.01361   5.51314k/s 3.52903G/s     2.07459M                3.2851                3.24098        100       551.314k/s [best: throughput=  3.67 GB/s doc_throughput=  5813 docs/s instructions=     2074593 cycles=      635165 branch_miss=    1384 cache_miss=       0 cache_ref=     58443 items=       100 avg_time=    173490 ns]
    Creating a source file spanning 44921 KB 
    LargeRandom<Dom>      88806316 ns     88806323 ns            8         871.991k           518.893M        10.8318M       15.3949M    326.983M              7.10851           11.2806       3.68855G          970.316M                    21.0944                     2.96748           11.2806M    872.063k   45.9988M       493.972M/s   10.8779M   15.3961M   327.465M         7.11899    11.2605/s 3.68741G/s     970.316M               21.0944                2.96311      1000k       11.2605M/s [best: throughput=  0.52 GB/s doc_throughput=    11 docs/s instructions=   970315579 cycles=   326982612 branch_miss=  871991 cache_miss=10831777 cache_ref=  15394882 items=   1000000 avg_time=  88792975 ns]
    LargeRandomSum<Dom>   89824596 ns     89807350 ns            8          872.01k           513.081M        10.3216M       14.5612M    330.716M              7.18967           11.1542       3.68889G          974.316M                    21.1813                     2.94608           11.1542M    871.863k   45.9988M       488.466M/s   10.3039M   14.5623M   331.208M         7.20036    11.1349/s 3.68798G/s     974.316M               21.1813                 2.9417      1000k       11.1349M/s [best: throughput=  0.51 GB/s doc_throughput=    11 docs/s instructions=   974315579 cycles=   330716103 branch_miss=  872010 cache_miss=10321626 cache_ref=  14561194 items=   1000000 avg_time=  89811290 ns]
    LargeRandom<Sax>      62784008 ns     62783943 ns           11         913.123k           738.521M        5.61415M       8.01956M    229.784M              4.99545           16.0552       3.68924G          672.695M                    14.6242                      2.9275           16.0552M    918.166k   45.9988M       698.711M/s   5.65478M   8.02137M   231.532M         5.03344    15.9276/s 3.68776G/s     672.695M               14.6242                2.90541      1000k       15.9276M/s [best: throughput=  0.74 GB/s doc_throughput=    16 docs/s instructions=   672694521 cycles=   229784372 branch_miss=  913123 cache_miss= 5614148 cache_ref=   8019564 items=   1000000 avg_time=  62771549 ns]
    

    Loose Ends

    Things that likely won't be finished in this check-in, but might be considered for a full release (and also might not :)):

    • Bugs
      • Possible corruption with incomplete arrays / objects ("[ 1, 2, ")
      • Win32 / VS2019 failure (#1208)
    • Rough Edges
      • x["a"]["b"] unsupported (right now x["a"] would be released early and the element never gets fully skipped)
      • parser.load()
      • parser.iterate(buf, len)
      • get_c_str()
      • Out-of-order key lookup support
    • Features
      • Print / minify
      • document_stream
      • Validation of skipped values
      • Strict object support (don't ever skip keys, error immediately if the next key is not what you expect)
      • Nullable value support: .get_nullable_int64()
      • .is_false_or_null() is probably useful ...
      • SIMDJSON_ONDEMAND_SAFETY_RAILS tests
      • Compile-time safety tests (make sure bad things don't compile)
    • Performance
      • Add ondemand to competitions
      • Make more competitions
      • Performance optimization for recursion
      • Tuple support [ x, y, z ]? It would force-unroll loops, basically, possibly improving performance.
      • Ability to stop iterating when finished (right now it will inspect and skip all remaining elements)
    • Sanity checks:
      • Sanity review of the & and && versions of methods (I hate the error messages, but I hate letting people compile programs that will fail at runtime even more). Can we let people use things in more flexible ways (for example, ["a"]["b"] above)? Can we make the error messages better?
      • Sanity review of document vs. value. I don't like that they behave almost identically yet have different code. The root value parsing difference notwithstanding, the fact that document owns the iterator while value holds a reference makes the code fundamentally non-reusable. We should look into doing something about that.

    Next Steps

    • [X] Supported way to use it on multiple platforms ("single kernel mode"? Plug in to simdjson's architecture selection?)
    • [X] Parse numbers/booleans at the root correctly, without overrun
    • [X] Don't overrun when objects/arrays are unbalanced
    • [X] Thorough type tests similar to DOM API tests
    • [X] Error tests for common error cases (to make sure errors are actually raised and iteration stops)
    • [X] Last performance check to ensure we haven't dropped below 4.0GB/s in the final round of fixes
    • [X] Resolve compiler failures on other platforms

    Bringing ndjson(document_stream) to On Demand

    I have been trying to implement a simple document stream for On Demand. The main issue, as discussed by @lemire in this comment, is that On Demand does not know where the end of a single document lies in a document stream (when stage 1 covers multiple documents). To overcome this, I created a JSON iterator that traverses a document and moves on to the next document when needed. This JSON iterator is also used to create a document instance when the * operator is called.

    • [x] implement a threaded version so that stage 1 is processed independently
    • [x] add the rest of the tests so that coverage is as good as the DOM document_stream's
    • [x] add documentation and examples
    • [x] add benchmarking

    Fixes https://github.com/simdjson/simdjson/issues/1464


    MSVC simdjson is slower than g++ on Windows

    On the same machine and OS, WSL g++ 7.5-compiled simdjson parses at 2.6 GB/s while MSVC 2019-compiled simdjson parses at 1.0 GB/s. ClangCL parses at 1.4 GB/s, so there may be a link.exe issue in play. My machine is Kaby Lake R (AVX2 but not AVX-512).

    After investigation, these seem to be the major contributors:

    • [ ] 40%: @TrianglesPCT may be fixing some or all of the largest regression, caused by the generic SIMD code, by removing lambdas.
    • [ ] 10%: We need to understand why this did not fully recover the performance we had before this. Either one of them could be the culprit, but it's probably not anything in between.
    • [ ] 10%: We need to understand why we lost another 10% to the stage 1 structural scanner refactor.

    Data

    g++ 7.5.0 under WSL

    [email protected]:~/simdjson/build$ benchmark/parse ../jsonexamples/twitter.json
    number of iterations 200 
                                                         
    ../jsonexamples/twitter.json
    
         9867 blocks -     631515 bytes - 55263 structurals (  8.8 %)
    special blocks with: utf8      2284 ( 23.1 %) - escape       598 (  6.1 %) - 0 structurals      1287 ( 13.0 %) - 1+ structurals      8581 ( 87.0 %) - 8+ structurals      3272 ( 33.2 %) - 16+ structurals         0 (  0.0 %)
    special block flips: utf8      1104 ( 11.2 %) - escape       642 (  6.5 %) - 0 structurals       940 (  9.5 %) - 1+ structurals       940 (  9.5 %) - 8+ structurals      2593 ( 26.3 %) - 16+ structurals         0 (  0.0 %)
    
    All Stages
    |    Speed        :  24.3210 ns per block ( 70.04%) -   0.3800 ns per byte -   4.3429 ns per structural -    2.631 GB/s
    |- Stage 1
    |    Speed        :  11.5728 ns per block ( 33.33%) -   0.1808 ns per byte -   2.0665 ns per structural -    5.530 GB/s
    |- Stage 2
    |    Speed        :  12.6267 ns per block ( 36.36%) -   0.1973 ns per byte -   2.2547 ns per structural -    5.068 GB/s
    
    3181.7 documents parsed per second
    

    VS 2019 (cl.exe 19.25.28614)

    PS C:\Users\john\Source\simdjson\build> .\benchmark\Release\parse.exe ..\jsonexamples\twitter.json
    number of iterations 200 
    
    ..\jsonexamples\twitter.json
    
         9867 blocks -     631515 bytes - 55263 structurals (  8.8 %)
    special blocks with: utf8      2284 ( 23.1 %) - escape       598 (  6.1 %) - 0 structurals      1287 ( 13.0 %) - 1+ structurals      8581 ( 87.0 %) - 8+ structurals      3272 ( 33.2 %) - 16+ structurals         0 (  0.0 %)
    special block flips: utf8      1104 ( 11.2 %) - escape       642 (  6.5 %) - 0 structurals       940 (  9.5 %) - 1+ structurals       940 (  9.5 %) - 8+ structurals      2593 ( 26.3 %) - 16+ structurals         0 (  0.0 %)
    
    All Stages
    |    Speed        :  65.5249 ns per block ( 83.29%) -   1.0239 ns per byte -  11.7004 ns per structural -    0.977 GB/s
    |- Allocation
    |    Speed        :   2.8679 ns per block (  3.65%) -   0.0448 ns per byte -   0.5121 ns per structural -   22.315 GB/s
    |- Stage 1
    |    Speed        :  32.2862 ns per block ( 41.04%) -   0.5045 ns per byte -   5.7652 ns per structural -    1.982 GB/s
    |- Stage 2
    |    Speed        :  29.4285 ns per block ( 37.41%) -   0.4598 ns per byte -   5.2549 ns per structural -    2.175 GB/s
    
    1976.0 documents parsed per second
    

    VS 2019 (cl.exe 19.25.28614) with /arch:AVX2

    Compiling with /arch:AVX2 only gave a 10% improvement:

    PS C:\Users\john\Source\simdjson\build> .\benchmark\Release\parse.exe ..\jsonexamples\twitter.json
    number of iterations 200
    
    ..\jsonexamples\twitter.json
    
         9867 blocks -     631515 bytes - 55263 structurals (  8.8 %)
    special blocks with: utf8      2284 ( 23.1 %) - escape       598 (  6.1 %) - 0 structurals      1287 ( 13.0 %) - 1+ structurals      8581 ( 87.0 %) - 8+ structurals      3272 ( 33.2 %) - 16+ structurals         0 (  0.0 %)
    special block flips: utf8      1104 ( 11.2 %) - escape       642 (  6.5 %) - 0 structurals       940 (  9.5 %) - 1+ structurals       940 (  9.5 %) - 8+ structurals      2593 ( 26.3 %) - 16+ structurals         0 (  0.0 %)
    
    All Stages
    |    Speed        :  60.7013 ns per block ( 82.70%) -   0.9485 ns per byte -  10.8391 ns per structural -    1.054 GB/s
    |- Allocation
    |    Speed        :   2.4726 ns per block (  3.37%) -   0.0386 ns per byte -   0.4415 ns per structural -   25.882 GB/s
    |- Stage 1
    |    Speed        :  27.1889 ns per block ( 37.04%) -   0.4249 ns per byte -   4.8550 ns per structural -    2.354 GB/s
    |- Stage 2
    |    Speed        :  29.8135 ns per block ( 40.62%) -   0.4659 ns per byte -   5.3236 ns per structural -    2.147 GB/s
    
    2246.1 documents parsed per second
    
  • UTF-8 validation flag lookup algorithm

    UTF-8 validation flag lookup algorithm

    This lookup algorithm's primary feature is that it does most of the work with 3 lookup tables against the high nibbles of bytes 1 and 2, and the low nibble of byte 1.

    EDIT: @zwegner independently came up with a better variant that uses scalar masks to process continuation bytes (which probably makes better use of the execution cores by spreading the load across SIMD and scalar execution units). I have integrated it here. ARM still uses fastvalidate, because none of the other algorithms could match it there.

    UTF-8 Shootout

    While evaluating the algorithms, I ran a "UTF-8 shootout" to figure out what was going to be the fastest. What you see here represents the winner :)

    I put all of the algorithms in here in separate headers, which can be switched between by changing the #include. A brief "shootout" between the algorithms, running stage 1 against twitter.json with ./parse -tf -n 1000 jsonexamples/twitter.json (and ./parse -tfs -n 1000 jsonexamples/twitter.json for SSE) yields this on my Kaby Lake machine (run multiple times for each and pick the best number):

    twitter.json:

    |              | AVX2     | SSE4.2   |
    |--------------|----------|----------|
    | @zwegner     | 5.952074 | 3.364491 |
    | lookup       | 5.825784 | 3.400727 |
    | range        | 5.715068 | 3.263643 |
    | fastvalidate | 5.588628 | 3.208918 |

    random.json:

    |              | AVX2     | SSE4.2   |
    |--------------|----------|----------|
    | @zwegner     | 4.318748 | 2.632677 |
    | lookup       | 4.087078 | 2.387633 |
    | range        | 3.917698 | 2.295306 |
    | fastvalidate | 3.778505 | 2.174089 |

    gsoc-2018.json:

    |              | AVX2     | SSE4.2   |
    |--------------|----------|----------|
    | @zwegner     | 7.923407 | 4.403640 |
    | lookup       | 8.091007 | 4.583158 |
    | range        | 7.891465 | 4.483739 |
    | fastvalidate | 7.949907 | 4.563674 |

    This algorithm uses 4-bit table lookups to look up 8-bit "error flags" from each nibble, and treats a sequence as an error only if all nibbles in the sequence have the error flag set. It turns out UTF-8 has 8 three-nibble error sequences (byte 1 alongside the high nibble of byte 2) and 2 two-nibble error sequences (byte 1 by itself). It is also notable that, incredibly, no more than 8 distinct combinations of the first byte and later paired bytes are required to reliably detect all errors.

    It works by sequences of 4-bit table lookups &'d together, so that an error is flagged only if all nibbles carry the error flag. For example, to detect this overlong encoding of "a" (0x61):

    • Overlong encoding of "a": 11000001 10010001
    • Look up high nibble 1100, yielding ERROR_OVERLONG_2
    • Look up low nibble 0001, yielding ERROR_OVERLONG_2
    • When &'d together, the bits align and we have an error! If the low nibble had been 0010, we would have detected no error.

    The algorithm is simple:

    1. First byte errors: check for errors that only require the first byte to detect: the overlong 2-byte encoding (1100000_) and a subset of too-large encodings (anything >= 11110101). This is done with a pair of table lookups (high and low nibble) and a single &. Too-large encodings alone could be detected quickly with >, but when detecting both of these together, I couldn't find any way to beat 2 table lookups and an &, instruction-count-wise.
    2. Second byte errors: check for errors that require the first byte plus another (overlong 3- and 4-byte encodings, missing/extra continuations, some of the smaller too-large values, and surrogates). To accomplish this, we're essentially doing the same thing as step 1 (flag lookups &'d together), but with 3 nibbles each time.

    This made AVX around 1-1.5% faster on my machine (about the same as the range lookup algorithm), and astonishingly passed make test the first time I ran it! Submitting this is partly to see what ARM thinks of it :)

    I'm curious if anyone has ideas for further improvement. It feels like overuse of the AND-flag-lookup method, but that method takes so few instructions (4-5 per pair) that it's hard to come up with non-lookup-based methods that compare. I suspect 2-byte-long error detection of overlong/surrogate/too-large bytes is about as optimal as it can get, but perhaps there are clever ways to handle missing/extra continuation detection or the 1-byte-long error detection that save us a few instructions.

  • [WIP] Exact float parsing

    [WIP] Exact float parsing

    This is an attempt at providing exact (or more exact) float parsing at high speed.

    • [ ] in functions like compute_float_64 and the 128-bit counterpart, we need to ensure that the binary exponent is always greater than 0 and strictly smaller than 0x7FF.
    • [ ] we need to add more testing
    • [ ] we should compare the performance against abseil-cpp, ScanadvDouble, Boost Spirit and Andrei Alexandrescu's implementation (Folly)
    • [ ] the lookup tables could be made smaller, we should investigate
    • [ ] Make sure that the code compiles under most compilers including Visual Studio; this involves avoiding 128-bit integers.
    • [ ] Benchmark on ARM processors.

    Fixes https://github.com/lemire/simdjson/issues/242

    This replaces https://github.com/lemire/simdjson/pull/303

  • elimination of g++ -Weffc++ warnings

    elimination of g++ -Weffc++ warnings

    • document.h bugfixes
      • operator++
      • uninitialized members
      • document_stream load_many trailing whitespace
    • implementation.h
      • missing virtual destructor
      • whitespace (my editor, not me)
    • parsedjson_iterator.h
      • operator=
    • document_stream.h
      • trailing blank
      • uninitialized members
    • document.h
      • trailing whitespace
      • uninitialized members
      • operator++
    • parsedjson_iterator.h
      • uninitialized members
    • json_minifier.h
      • uninitialized members
    • json_scanner.h
      • uninitialized members
      • trailing space
    • json_structural_indexer.h
      • uninitialized members
      • trailing space
    • stage2_build_tape.h
      • uninitialized members
  • MSVC simdjson twice as slow as ClangCL

    MSVC simdjson twice as slow as ClangCL

    On Windows, simdjson compiled with MSVC parses twitter.json at almost half the speed of ClangCL-compiled simdjson, on the same machine, in the same command prompt.

    | Platform         | Overall (GB/s) | Stage 1 (GB/s) | Stage 2 (GB/s) |
    |------------------|----------------|----------------|----------------|
    | MSVC 19.25.28614 | 1.3051         | 2.3777         | 3.3898         |
    | ClangCL 9.0.0    | 2.2221         | 5.4161         | 4.6401         |

    Methodology:

    • MSVC: git clean -ffdx build && cmake -B build && cmake --build build --target parse --config Release && build\benchmark\Release\parse jsonexamples\twitter.json
    • ClangCL: git clean -ffdx build && cmake -B build -T ClangCL && cmake --build build --target parse --config Release && build\benchmark\Release\parse jsonexamples\twitter.json

    I validated that MSVC simdjson is using the haswell implementation, both by running json2json to print out the implementation, and by doing set SIMDJSON_FORCE_IMPLEMENTATION=haswell.

  • [WIP] Unroll the loop, do more work during pipeline stall

    [WIP] Unroll the loop, do more work during pipeline stall

    This patch improves simdjson performance on AVX2 by 3-4% in my tests. This comment describes the 4 changes made, and the reasons why. It also adds a -f option to parse.cpp that causes it to run only find_structural_bits; this was critical to isolating the different performance gains.

  • Faster float parsing

    Faster float parsing

    @lemire, here's a version that works for all floats in canada.json. There are a few edge cases to be worked out, and performance does take a hit (about a 20% increase in runtime), but I will be playing with using a mix of powers of 10 and 5, or just using both all-10 and all-5, to see if I can get more accurate mantissas (or at least, mantissas whose inaccuracy is not correlated). It does hit the fast path most of the time; it just gets hit hard by what I guess is a slow strtod on this Mac.

  • Change new usages to std::allocator to accept custom memory allocator

    Change new usages to std::allocator to accept custom memory allocator

    I've checked the performance using benchmark/perfdiff as well as benchmark/on_demand, and there seems to be no difference.

    The main issue with this PR is that std::allocator::allocate throws bad_alloc when it fails to allocate, instead of gracefully returning a nullptr like new (std::nothrow) does. Per a Stack Overflow post, forcing a bad_alloc with -fno-exceptions results in an abort() (also confirmed locally).

    Edit: I also wasn't sure whether to touch the malloc calls, so I chose not to.

    Closes issue #1017

  • Make `object["field"]` order-insensitive in On Demand

    Make `object["field"]` order-insensitive in On Demand

    This makes field lookup order-insensitive by default in On Demand: out-of-order lookups will succeed.

    This means this JSON can be properly processed with code that does object["x"].get_double() + object["y"].get_double():

    [
      { "x": 1, "y": 2 },
      { "y": 2, "x": 1 }
    ]
    

    The previous order-sensitive behavior can still be accessed with object.find_field("field").

    Design

    • object.find_field("field") does an order-sensitive search.
    • object.find_field_unordered("field") does an order-insensitive search.
    • object["field"] uses find_field_unordered().
    • When fields are in order, it behaves exactly the same as find_field(), starting after "x" when it's looking for "y".
    • When fields are out of order, the find_field() algorithm will reach the end without finding anything. At this point, find_field_unordered() cycles back to the beginning of the object and searches from there, stopping when it reaches the original (in order) starting point. If the field is still not found, it returns NO_SUCH_FIELD. This is what allows it to find "y" even though it's not after "x".

    Performance

    This may cause minor performance regressions relative to order-sensitive lookup (there is certainly no reason to believe it will improve performance). There is now an OnDemandUnordered version of the LargeRandom benchmark, which is roughly 2% worse than the ordered version, a real difference caused almost entirely by increased instruction count:

    | Benchmark                      | Throughput   | Instructions | Branch Misses |
    |--------------------------------|--------------|--------------|---------------|
    | LargeRandom<OnDemand>          | 634.066 MB/s | 614,119,494  | 972,480       |
    | LargeRandom<OnDemandUnordered> | 650.795 MB/s | 636,119,541  | 956,152       |

    All benchmarks were updated to use find_field() since they are intended to be order-sensitive. This also let me verify that there is, as expected, no performance difference between the previous [] and the new find_field().

  • Add get_uint32()/get_int32()

    Add get_uint32()/get_int32()

    Couldn't find any discussion on this. The user can of course check whether the value is in range, but this is such a common case that these would be useful additions alongside get_uint64() and get_int64().

  • RISC-V 64 - Vector Support

    RISC-V 64 - Vector Support

    Hi!

    I have recently been doing some tests using riscv_vector.h, but I still can't find any uses in real projects, only the examples cited in the references below. I don't know how significant the advantage would be on a new architecture, but it might be interesting to try, both in simulated environments and on real devices.


    References

  • Document and test std::ranges functionality

    Document and test std::ranges functionality

    For C++20 users, we support std::ranges, but it is undocumented.

    #include "simdjson.h"
    #include <iostream>
    #include <ranges>
    using namespace simdjson;
    int main(void) {
      auto cars_json = R"( [
      { "make": "Toyota", "model": "Camry",  "year": 2018, "tire_pressure": [ 40.1, 39.9, 37.7, 40.4 ] },
      { "make": "Kia",    "model": "Soul",   "year": 2012, "tire_pressure": [ 30.1, 31.0, 28.6, 28.7 ] },
      { "make": "Toyota", "model": "Tercel", "year": 1999, "tire_pressure": [ 29.8, 30.0, 30.2, 30.5 ] }
    ] )"_padded;
      dom::parser parser;
      auto justmodel = [](auto car) { return car["model"]; };
      for (auto car : parser.parse(cars_json).get_array() | std::views::transform(justmodel)) {
        std::cout << car << std::endl;
      }
    }
    

    Also it would be nice to extend std::ranges to ondemand.

  • Providing a list of security-related bugs

    Providing a list of security-related bugs

    When deciding whether or not to update my version of simdjson in ZippyJSON, one of the big factors is security. I'm not necessarily interested in new APIs or features, but I want to make sure that the version I use is as secure as possible. That said, I don't want to upgrade every time if I can help it (and also risk encountering new bugs). It would be great if there was a list somewhere of simdjson's security-related bugs, e.g. buffer overflows, the version they were fixed in (if fixed), and the version they were first introduced (ideally). It would help me decide when to upgrade my version of simdjson.

  • get_float() to retrieve a 4-byte float is missing

    get_float() to retrieve a 4-byte float is missing

    Hi folks!

    We need to retrieve a lot of float information from JSON files, but the only get() helper we found in your library is get_double(), which returns an 8-byte double. This is a serious performance issue for us because of the conversion to float we then need to do.

    Can you help us with this? Can you tell us whether your library has a function to get a 4-byte float?

    Thanks.

  • Check input is valid JSON without full parsing

    Check input is valid JSON without full parsing

    Discussed in https://github.com/simdjson/simdjson/discussions/1393

    Originally posted by daverigby January 14, 2021 Hi all,

    As per the subject, I'm wondering what the most efficient way is to use simdjson to simply check that an input is valid JSON, without actually parsing out the content. Essentially the same use case as http://www.json.org/JSON_checker/

    Ideally this would require state that isn't proportional to the size of the input document; for my use case I need to validate JSON files that are many megabytes in size without consuming a similar amount of memory.

    Thanks in advance.
