FlatBuffers: Memory Efficient Serialization Library

FlatBuffers is a cross-platform serialization library architected for maximum memory efficiency. It allows you to directly access serialized data without parsing/unpacking it first, while still having great forwards/backwards compatibility.

Go to our landing page to browse our documentation.

Supported operating systems

  • Windows
  • macOS
  • Linux
  • Android
  • And any others with a recent C++ compiler.

Supported programming languages

  • C++
  • C#
  • C
  • Dart
  • Go
  • Java
  • JavaScript
  • Lobster
  • Lua
  • PHP
  • Python
  • Rust
  • TypeScript

and more in progress...

Contribution

To contribute to this project, see CONTRIBUTING.

Licensing

FlatBuffers is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.


Comments
  • Feature RFC: Flatbuffers support for Optional types.

    Feature RFC: Flatbuffers support for Optional types.

    Note: This top comment is heavily edited. It was kept up to date with the current state until this issue was closed.

    Motivation

    Flatbuffers should allow users to choose to use optional types. There has been some interest in distinguishing between default values (which are not stored in the binary) and some notion of None which the user controls.

    Here are some links to previous interest in this idea:

    • #333
    • #3777
    • https://groups.google.com/forum/#!topic/flatbuffers/1hrNtBI0BQI
    • https://github.com/google/flatbuffers/issues/5875#issuecomment-619737916

    Currently, a user can control a field's presence in the binary by specifying "force_defaults" and checking "IsFieldPresent", which is a bit of a hack. This proposal defines proper FlatBuffers optional types, which should be a better way of doing this. Use of this feature is only advisable for new fields, since changing default values is in general backwards-incompatible.
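
    For illustration, here is a rough sketch of that workaround in C++, assuming a hypothetical Monster table with a mana: int = 150; field generated into monster_generated.h; this is the manual approximation that proper optional types would replace:

    #include "flatbuffers/flatbuffers.h"
    #include "monster_generated.h"  // assumed: generated from a schema with `mana: int = 150;`

    // Write the default value anyway, so readers can tell "set to 150" from "not set".
    void BuildWithForcedDefaults(flatbuffers::FlatBufferBuilder &fbb) {
      fbb.ForceDefaults(true);  // don't elide values that equal the default
      MonsterBuilder mb(fbb);
      mb.add_mana(150);         // stored even though it equals the default
      fbb.Finish(mb.Finish());
    }

    // Check whether the field's offset exists in the vtable, i.e. whether it was written.
    bool MonsterHasExplicitMana(const uint8_t *buf) {
      auto monster = GetMonster(buf);
      return flatbuffers::IsFieldPresent(monster, Monster::VT_MANA);
    }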

    How do we represent this in the schema file?

    We will specify it like so:

    table Monster { mana: int = null; }
    

    This visually implies that optional types are at odds with default values and is "consistent" since the value to the right of the equals sign is what we interpret non-presence to mean.

    Change to Schema Grammar:

    field_decl = ident : type [ = (scalar | null) ] metadata ;

    ~We can add a field tag, e.g. "optional" or "no_default", that triggers this behavior. Hopefully no one is using those tags. Maybe we can make it specifiable to flatc, an "--optional-field-keyword-tag" flag, just in case people are using it and can't stop.~

    How do we represent this in the binary?

    We are going with option (A).

    (A) Non-Presence means None

    Instead of omitting zero-like values, the generated code must store them. Non-presence for optional fields no longer means "whatever the default is"; now it means None. You can interpret it as "the default value is None". This also means we cannot both specify a non-null default and mark the field as optional.

    Pros:

    • This seems more intuitive.
    • It aligns with the "force_defaults" + "IsFieldPresent" hacky manual approximation of this feature.
    • If Nones are more common than zero-likes then this will have smaller binaries.

    Cons:

    • @aardappel thinks this is harder to implement

      "making presence an indicator would require we pass this special field status down to the field construction code to override the current val == default check, which means slowdown, codegen and runtime changes in all languages.. whereas my "least likely to be used default" trick requires no changes"

    ~(B) Some Sentinel value means None~

    In this scenario, zero-like values are still not stored. Instead, we choose some "sentinel" value which we interpret to be None (e.g. int can use int_min and float can use some kind of NaN).

    Pros:

    • @aardappel thinks this is easier to implement

      "it requires the schema parser to set default values for you, and no changes anywhere else"

    • If zero-likes are more common than None then this will have smaller binaries

    Cons:

    • Someone might want to use the sentinel value (Hyrum's law).
      • This can be mitigated by publishing the sentinels and letting users decide whether they need the sentinels.
    • This probably won't work for fields representing raw bits.

    How do we represent this in every language API?

    We'll need to change the type signature of all generated code (building/reading/mutating/object/etc) around the optional type to signal its optional-ness. I think we should use the language's local standard for optional types. Suggestions:

    • Python: Optional[T].
    • Rust: Option<T>.
    • C++17 has std::optional<T>, but it's not obvious what to use for earlier versions; T* would work. (See the sketch below.)
    • Java: Optional shows up in Java 1.8 and triggers autoboxing, so idk :/

    The exact generated-API for a language should be discussed in the PR implementing this feature in that language.
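
    As a concrete illustration (a hedged sketch, not the final API), this is roughly what C++17 usage could look like, assuming a hypothetical Monster table generated from table Monster { mana: int = null; }:

    #include "flatbuffers/flatbuffers.h"
    #include "monster_generated.h"  // hypothetical: generated from `table Monster { mana: int = null; }`

    void OptionalRoundTrip() {
      flatbuffers::FlatBufferBuilder fbb;
      MonsterBuilder mb(fbb);
      mb.add_mana(0);  // option (A): zero is stored rather than elided
      fbb.Finish(mb.Finish());

      auto monster = GetMonster(fbb.GetBufferPointer());
      // The accessor is assumed to return flatbuffers::Optional<int32_t>
      // (std::optional under C++17); non-presence means None, not a default.
      auto mana = monster->mana();
      if (mana.has_value()) { /* field was explicitly written, even if zero */ }
    }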

    Out of scope

    (I'll add links if you make issues for these feature requests)

    • Syntactic types
    • Default values for strings and tables

    TODO

    | task | owner | done |
    |---|---|---|
    | Change flatc to support schemas with optional types and cause an error if they're used in unsupported languages | @CasperN | #6026 ✅ |
    | Implement optional type API in C++ | @vglavnyy | #6155 ✅ |
    | Implement optional type API in Java | @paulovap | #6212 ✅ |
    | Implement optional type API in Rust | @CasperN | #6034 ✅ |
    | Implement optional type API in Swift | @mustiikhalil | #6038 ✅ |
    | Implement optional type API in Lobster | @aardappel | ✅ |
    | Implement optional type API in Kotlin | @paulovap | #6115 ✅ |
    | Implement optional type API in Python | @rw? | |
    | Implement optional type API in Go | @rw? | |
    | Implement optional type API in C | @mikkelfj | ✅ |
    | Implement optional type API in C# | @dbaileychess | #6217 ✅ |
    | Implement optional type API in TypeScript/JavaScript | @krojew | #6215 ✅ |
    | PHP, Dart, etc. | ? | |
    | Update documentation to advertise this feature | @cneo | #6270 ✅ |

    [edits]

    • added todo list
    • added points from the discussion to each section.
    • added out of scope section
    • Decision: go with (A) and use = null syntax in schema file (cross out alternatives)
    • Updated TODO list, finished parser, Rust in progress
    • Change to schema grammar, link to swift PR, Note at top
    • Added more languages to the TODO
    • Lobster support 🦞
    • Kotlin and C support
    • Java, C#, TS/JS support and docs, issue closed, no longer editing.
  • Fixed array length

    Fixed array length

    This change ports the pull request #3987 by daksenik and idoroshev to the most recent commit and adds support for generation of JSON schema, Java, C# and Python.

  • Support for Rust programming language

    Support for Rust programming language

    This pull request includes an implementation of the FlatBuffers runtime for the Rust programming language and Rust code generation via the flatc binary.

    The code is well documented and currently passes a full suite of tests. I do, however, still need to add a sample and do more benchmark testing. In the meantime I want to get this on the project's radar to discuss.

    Thanks!

  • [Swift] Swift implementation 🎉🎉

    [Swift] Swift implementation 🎉🎉

    Good afternoon,

    @mzaks and I hope that with this PR we can add an official implementation for Swift; it's heavily inspired by the C++ and C# implementations. Some elements are still missing, such as the code generator and documentation for the code base; however, we can implement those after getting an initial review of the Swift code base and confirming that it lives up to the standards.

    The FlatBuffer class uses Apple's underlying UnsafeMutableRawPointer, which allows us to write directly to memory and lifts a lot of weight off our shoulders, since it provides the storeBytes function, which takes a value and its byte representation and simply adds it to the buffer. The name UnsafeMutableRawPointer is a bit misleading, since it's actually safe to use as long as everything writing the bytes is well structured. And since we don't let the user touch the underlying buffer directly, but only through the builder, it is completely safe to use here.

    The FlatBuffersBuilder class uses the same underlying logic as C++ and C#, but unlike C# it relies on generics to add bytes and elements into the buffer. We took a different approach to adding structs into the buffer: since Swift uses storeBytes to store a struct directly into the buffer, and since FlatBuffers does not allow strings in structs, we used that to our advantage.

    struct Vec2: Writeable {
        var _x: Float32
        var _y: Float32
        var _z: Float32
        var _c: Color2
    
         init(x: Float32, y: Float32, z: Float32, color: Color2) { _c = color; _x = x; _y = y; _z = z }
    
         static func createVec2(_ bb: FlatBuffersBuilder, v: Vec2) -> Offset<UOffset> {
            return bb.create(struct: v)
        }
    }
    

    The struct above can be inserted directly into the buffer, and we will be implementing a reader struct that reads the Writeable struct, as shown below. This follows the same concept as the builder objects in C++. The readers were implemented to follow the C# way of fetching elements from the buffer.

    struct Vec2_Read: Readable {
        private var __p: Struct
        init(_ fb: FlatBuffer, o: Int32) { __p = Struct(bb: fb, position: o) }
        var c: Color2 { return Color2(rawValue: __p.readBuffer(of: Int32.self, at: 12)) ?? .red }
        var x: Float32 { return __p.readBuffer(of: Float32.self, at: 0)}
        var y: Float32 { return __p.readBuffer(of: Float32.self, at: 4)}
        var z: Float32 { return __p.readBuffer(of: Float32.self, at: 8)}
    }
    

    All the test cases are passing. They were created by the flatc code generator for both C++ and C#, and the basic Swift implementation was verified against those two, since the Swift code generator isn't implemented yet.

    What's missing:

    1. Code generator
    2. Documentation
    3. Benchmarking

    Merging this PR closes the following issue: Closes #5504. I will be rebasing the commit before merging, for sure.

  • Tracking issue: Rust buffer verification

    Tracking issue: Rust buffer verification

    This is a tracking issue to document design and implementation of a verifier system for Rust Flatbuffers. I'm thinking that we can clone the logic from the C++ codebase.

    The benefit is twofold:

    1. Verifiers let users check if data is valid, thereby providing a security check for unknown data.
    2. If a buffer is verified, we can justify using more unsafe pointer access in Rust, thereby removing bounds checking.

    Anyone have thoughts on this? @aardappel
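
    For reference, a minimal sketch of the C++ verifier usage whose logic would be cloned (the Monster schema and its generated GetMonster()/VerifyMonsterBuffer() helpers are assumed):

    #include "flatbuffers/flatbuffers.h"
    #include "monster_generated.h"  // assumed: provides GetMonster() and VerifyMonsterBuffer()

    bool SafeRead(const uint8_t *buf, size_t len) {
      // Walk the buffer once, checking that every offset and string stays in bounds.
      flatbuffers::Verifier verifier(buf, len);
      if (!VerifyMonsterBuffer(verifier)) return false;  // reject malformed or hostile data

      // After verification, accessors can reasonably skip per-access bounds checks.
      auto monster = GetMonster(buf);
      return monster != nullptr;
    }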

  • Non smart pointers --gen-object-api version

    Non smart pointers --gen-object-api version

    This is needed to allow users to assign one struct to another.

    I also added an implementation of operator= for unions, so now you can do: union1 = union2; or union1 = supportedStruct;

    It is also needed for the upcoming (near-future) Qt support.
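
    For illustration, a hedged sketch of the intended usage; the generated object-API names AnyUnion and MonsterT are hypothetical:

    void Example() {
      AnyUnion union1, union2;
      union2.Set(MonsterT());  // assumed existing object-API way to populate a union
      union1 = union2;         // with this PR: copy-assign one union to another
      union1 = MonsterT();     // with this PR: assign a supported member type directly
    }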

  • [TS/JS] New gen TS code gen

    [TS/JS] New gen TS code gen

    Modernize the TS/JS code generation by reworking it into a TypeScript-only code generator.

    TODO

    • [x] Split output into modules (files)
    • [x] Devise a sensible alias/prefix for flatbuffers imports to not collide with user symbols
    • [x] Track needed imports
    • [x] Alias imported symbols if same name from different namespaces
    • [x] Do not generate extra ;
    • [x] Fix wrong static method generation
    • [x] Drop flatbuffers namespace completely
    • [x] var to let or const
    • [x] Remove ns wrapping/prefixing
    • [x] Fix object API
    • [x] Remove obsolete generator options
    • [x] Fix and verify JavaScriptFlexBuffersTest.js
    • [x] Remove jsdoc type annotations

    Documentation

    • [ ] Decide what to do about Compiler.md, JavaScriptUsage.md, Tutorial.md and javascript_sample.sh

    TODO (undecided)

    • [ ] Resolve closer relative paths in imports
    • [ ] Avoid unused imports
    • [ ] Proper indentation
    • [ ] Const correctness
    • [ ] noImplicitAny compliant generated source
  • FlatBuffers 2.0 tracking issue

    FlatBuffers 2.0 tracking issue

    This issue is to keep track of the FlatBuffers 2.0 release, which is intended to be "soon", but also "when it's done".

    2.0, you ask? Wasn't the last one 1.12? Yes, because:

    • We're going to attempt to adhere as best we can to "Semantic Versioning" from now on. Past releases were simply 1.0 thru 1.12 with no regard for how many breaking changes were in a release. Generally, every release so far has been breaking for at least 1 language, but we have not expressed that. If we continue our habit of occasionally breaking APIs, the release after this one may thus have to be 3.0 instead of 2.1, but we'll see. Our version architect is @krojew
    • We actually do seem to have more breaking changes than usual, with especially Rust getting a big overhaul, but also Swift, Python and others. So 2.0 seems appropriate.

    What are we waiting for? We are waiting for some of these larger breaking changes to "settle down". This thread will link those issues, once all are merged we'll get the release going.

    Also worth mentioning: we are attempting to work towards a structure where FlatBuffers development is more "distributed", with clear individual maintainers for each language, who may at some point make language package releases and such independently of the main FlatBuffers releases. But this will take time.

  • [Python] (scalar) vector reading speedup via numpy

    [Python] (scalar) vector reading speedup via numpy

    Reading vectors with generated Python code is slow. There have been other efforts to speed this up but they appear to have stagnated.

    This PR attempts to make a minimal change that covers a use case I personally hit a lot (copying over large vectors that represent nested flatbuffers). There may be other small changes that could drastically speed up other use cases that are not included here since I am unaware of them :)

    What's the change?

    This PR adds support for accessing a scalar vector as a numpy array of the corresponding type. This is much faster than copying large vectors element-by-element.

    • Update the Table class in Python to have a GetVectorAsNumpy method which returns a zero-copy view (in numpy terminology) into a scalar vector cast as the correct type.
    • Update the python-code-generation code (idl_gen_python.cpp) to generate a method which wraps GetVectorAsNumpy in generated code. This method is named <field name>AsNumpy, which can be compared to the generated name of the method to get the length of a vector, <field name>Length. Attempting to use this method if numpy is not installed will result in an error, but otherwise numpy is optional.
    • Update appveyor CI to run Python tests.
    • Update python docs appropriately.

    See also

    • #4090 Byte data from flatbuffer vector into a Python NumPy array
      • This PR can probably close that issue?
    • #4144 Reading and Writing Binary blobs is incredibly slow with the default API!
      • This PR can probably close that issue?
    • #4152 Python: Add numpy array accessors for vectors [WIP]
      • The PR here and that PR seem to accomplish similar things, but I haven't read enough details to say whether or not the PR here completely supersedes that one.
    • #284 Python: Support Cython build
    • #304 Python: Speedup with cython extension (closed by author due to lack of time)
  • Port FlatBuffers to Python.

    Port FlatBuffers to Python.

    Implement code generation and runtime library for Python 2 and 3, derived from the Go implementation. Additionally, the test suite verifies:

    • the exact bytes in the Builder buffer during many scenarios,
    • vtable deduplication, and
    • table construction, via a fuzzer derived from the Go implementation.

  • TypeScript support

    TypeScript support

    Thank you for submitting a PR!

    Please make sure you include the names of the affected language(s) in your PR title. This helps us get the correct maintainers to look at your issue.

    If you make changes to any of the code generators, be sure to run cd tests && sh generate_code.sh (or equivalent .bat) and include the generated code changes in the PR. This allows us to better see the effect of the PR.

    If your PR includes C++ code, please adhere to the Google C++ Style Guide, and don't forget we try to support older compilers (e.g. VS2010, GCC 4.6.3), so only some C++11 support is available.

    Include other details as appropriate.

    Thanks!

  • Add LICENSE.txt to python

    Add LICENSE.txt to python

    This PR fixes #7628. It adds the LICENSE file to the Python setup.py, and for the next releases the LICENSE file will automatically be added to the distribution wheel.

  • Dart: Flex buffer crashed if read double round value

    Dart: Flex buffer crashed if read double round value

    Expected: print 1.0. Result: crash.

    void test() {
        var builder = flex.Builder()..addDouble(1.0);
     
        final buffer = builder.finish();
        final byteData = ByteData(buffer.lengthInBytes);
        byteData.buffer.asUint8List().setAll(0, buffer);
        final byteBuffer = byteData.buffer;
     
        final ref = flex.Reference.fromBuffer(byteBuffer);
        print(ref.doubleValue);
      }
    
  • [FR] Output Annotated Flatbuffers as a flatbuffer itself

    [FR] Output Annotated Flatbuffers as a flatbuffer itself

    For annotated FlatBuffers (flatc --annotate) we only support a text-based output. It would be better if it exported the data as a FlatBuffer itself, which other tools could read and parse/display as they desire.

    I think the text-based output is useful, but it can be implemented in terms of reading a flatbuffer.

  • [FR] flatc generates names of output files

    [FR] flatc generates names of output files

    Some build systems (👀 at you, Bazel) require knowing all the generated filenames up front so that their custom generation build rules work. See genrule's outs parameter, as an example. With flatc right now, a single schema can generate multiple files depending on language and configuration flags, and it's hard to know a priori what files are going to be produced.

    We sort of have something with our flatc -M option, which prints out a "MakeRules" file per language. But it's in a makefile format, which might not be the best representation; nor is it consistent across languages, or even implemented for every language. So I think one generic implementation that just prints the generated file names, one per line to stdout, would suffice.

    @aardappel for historical comment.

  • Big Endian CI

    Big Endian CI

    We should have at least one CI build that uses big endian to test that things don't break in that environment. We usually get external reports (e.g. #7671) when things break.
