High-performance specialized replacements for PHP's pack() and unpack() functions

ext-encoding

Profiling shows that PHP's pack() and unpack() functions are very slow, largely because they have to parse a format code on every call. This is a problem for PocketMine-MP, which uses these functions in hot paths to encode packets (see pmmp/BinaryUtils).

This extension implements specialized replacements for these functions, such as:

  • readShortLE()
  • readShortBE()
  • readIntLE()
  • readIntBE()

and various others.
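
As a rough illustration of why a specialized function can avoid the cost of format code parsing, the C++ sketch below reads fixed-width integers straight from a byte buffer at a caller-supplied offset. The names read_u16_le, read_u32_be and read_example are hypothetical and are not the extension's actual functions, which operate on PHP strings rather than std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not the extension's API): read an unsigned 16-bit
// little-endian value at `offset` and advance `offset` past it.
inline std::uint16_t read_u16_le(const std::string& buf, std::size_t& offset) {
    if (offset > buf.size() || buf.size() - offset < 2) {
        throw std::out_of_range("need 2 bytes to read a short");
    }
    const auto* p = reinterpret_cast<const unsigned char*>(buf.data()) + offset;
    offset += 2;
    // Least significant byte comes first in little-endian order.
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Hypothetical helper: read an unsigned 32-bit big-endian value at `offset`.
inline std::uint32_t read_u32_be(const std::string& buf, std::size_t& offset) {
    if (offset > buf.size() || buf.size() - offset < 4) {
        throw std::out_of_range("need 4 bytes to read an int");
    }
    const auto* p = reinterpret_cast<const unsigned char*>(buf.data()) + offset;
    offset += 4;
    // Most significant byte comes first in big-endian order.
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
           static_cast<std::uint32_t>(p[3]);
}

// Small usage example: one little-endian short followed by one big-endian int.
inline void read_example() {
    const std::string packet("\x34\x12\x00\x00\x01\x02", 6);
    std::size_t offset = 0;
    assert(read_u16_le(packet, offset) == 0x1234);
    assert(read_u32_be(packet, offset) == 0x0102);
    assert(offset == 6);
}
```

There is no format string to interpret here: the width and byte order are fixed by the choice of function, so each call is a bounds check plus a few shifts.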

In synthetic benchmarks, these functions are around 3x faster than unpack(). Once the overhead of the Binary class is taken into account, along with the possibility of implementing a native Binary class directly, the difference is closer to 7x.

TODO

  • Implement Binary as an extension class.
  • Implement BinaryStream as an extension class (will elide 2 PHP function calls in most cases, saving a bunch of time).
Owner
PMMP
Home of PocketMine-MP, a server software for Minecraft: Bedrock Edition written in PHP and C++
Comments
  • Use boost endian library for dealing with byte order conversions

    This takes away the hassle of dealing with intrinsics, since the library handles them on our behalf, which means essentially free performance.

    This is low priority right now due to other more effective ways of improving performance.

  • Reusable buffers

    Currently, around 2/3 of the overhead of the write*() functions comes from zend_string_init() and/or zend_string_alloc(), because each call uses emalloc() to allocate a tiny new string. In practice, these strings are discarded almost immediately: once they have been appended to a buffer, they are no longer useful. This is a significant waste of CPU time (although it accounts for less than 1/3 of the total time in benchmarks, due to overhead added by PHP code).

    This could be avoided by accepting a string by-reference to write bytes into instead of allocating new strings every time, although it would be better to have some dedicated type which we could reserve() bytes in (like a vector or similar).

    This would also enable having large pooled reusable buffers for encoding, which would further improve performance.
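
The idea above can be sketched in C++ as follows. This is only an illustration under stated assumptions: write_u32_le and reuse_example are hypothetical names, and std::string stands in for whatever dedicated buffer type the extension might eventually use. The point is that bytes are appended into a caller-owned buffer with reserved capacity, rather than returned as a freshly allocated string on every call.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not the extension's API): append a 32-bit value in
// little-endian order to an existing buffer instead of allocating a new string.
inline void write_u32_le(std::string& out, std::uint32_t value) {
    const char bytes[4] = {
        static_cast<char>(value & 0xFF),
        static_cast<char>((value >> 8) & 0xFF),
        static_cast<char>((value >> 16) & 0xFF),
        static_cast<char>((value >> 24) & 0xFF),
    };
    out.append(bytes, sizeof(bytes));
}

// Usage example: one buffer is reserved once and reused for several packets.
inline void reuse_example() {
    std::string buffer;
    buffer.reserve(256);  // assumed size; large enough for this example

    for (std::uint32_t packet = 0; packet < 3; ++packet) {
        buffer.clear();   // drops the contents so the storage can be reused
        write_u32_le(buffer, packet);
        write_u32_le(buffer, 0x01020304u);
        assert(buffer.size() == 8);
    }
}
```

With the capacity reserved up front, the appends in this example should not need to allocate, which is the cost the paragraph above attributes to the tiny temporary strings.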

  • Unit tests

    The following things should be tested:

    • [ ] Basic parity with appropriate pack() and unpack() codes with selected values
    • [ ] Symmetry of encoding and decoding
    • [x] read*() must correctly handle non-reference types by throwing an error
    • [x] read*() must correctly update reference integer offset parameter if given
    • [ ] read*() must correctly handle not being given an offset parameter
    • [x] read*() must read from the correct place when given an offset parameter
    • [x] writeUnsignedVarInt() must handle negative numbers properly (i.e. it must terminate)
    • [x] readUnsignedVarInt() must limit the number of bytes read and error appropriately
    • [x] read*() functions must error properly when not given enough bytes

    to be continued ...
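
To make the VarInt items in the checklist more concrete, here is a minimal C++ sketch of an unsigned VarInt codec together with the properties being tested. It assumes a 32-bit value encoded in at most 5 bytes, 7 bits per byte with the high bit as a continuation flag; the function names are hypothetical and the extension's real behavior and error types may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical writer. Taking the value as unsigned means the shift is
// logical, so the loop always terminates, even if the caller started with a
// negative number that was converted to uint32_t.
inline void write_unsigned_varint(std::string& out, std::uint32_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Hypothetical reader. Reads at most 5 bytes (assumed limit for 32 bits),
// throws if the input ends early or never terminates, and only updates
// `offset` on success.
inline std::uint32_t read_unsigned_varint(const std::string& buf, std::size_t& offset) {
    std::uint32_t result = 0;
    std::size_t pos = offset;
    for (unsigned shift = 0; shift < 35; shift += 7) {
        if (pos >= buf.size()) {
            throw std::out_of_range("not enough bytes for VarInt");
        }
        const auto byte = static_cast<unsigned char>(buf[pos++]);
        result |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            offset = pos;
            return result;
        }
    }
    throw std::runtime_error("VarInt did not terminate after 5 bytes");
}

// Symmetry check: decoding what was encoded yields the original value.
inline void varint_roundtrip_example() {
    std::string encoded;
    write_unsigned_varint(encoded, 300);
    std::size_t offset = 0;
    assert(read_unsigned_varint(encoded, offset) == 300);
    assert(offset == encoded.size());
}
```

A test suite along the lines of the checklist would exercise the same three cases: round-trip symmetry, termination for values that started out negative, and errors on truncated or overlong input.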
