Selective user space swap (kubernetes swap / kubeswap)

BigMaac 🍔 🍟 ( Big Malloc Access And Calloc )

because sometimes a happy meal is not big enough

BigMaac can be used in userspace (e.g. inside Kubernetes containers) to enable selective user space swap for large (user defined) memory allocations (typically your data objects).

BigMaac intercepts calls to memory management that would normally get mapped directly to the heap/RAM and instead returns memory allocated from an alternative source (SSD/Disk).

[Figure: schematic of memory allocation using BigMaac]

Slides: how BigMaac can be used to overcome OOM

The specific calls BigMaac intercepts are:

malloc(), calloc(), realloc() and free()

If any of these calls manages memory smaller than BIGMAAC_MIN_BIGMAAC_SIZE (an environment variable), BigMaac passes it straight through to the original memory management system. If a request exceeds this size, however, it is no longer a small fry but a bigmaac: instead of using RAM directly, it uses disk-backed storage through the magic of mmap().

If memory_request > BIGMAAC_MIN_BIGMAAC_SIZE -> it's a 🍔, so swap to disk if need be
  else the memory request is sent to the normal memory management system (RAM)
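The rule above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not BigMaac's actual code: disk_backed_alloc() and selective_alloc() are hypothetical names, the 1 MiB threshold is made up, and tmpfile() stands in for whatever file handling the library really does.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical threshold; BigMaac reads the real one from BIGMAAC_MIN_BIGMAAC_SIZE. */
static const size_t MIN_BIGMAAC_SIZE = 1 << 20; /* 1 MiB, illustrative only */

/* Map `size` bytes backed by a temporary file instead of plain heap memory. */
static void *disk_backed_alloc(size_t size) {
    FILE *f = tmpfile();                 /* temp file, deleted automatically */
    if (f == NULL) return NULL;
    if (ftruncate(fileno(f), (off_t)size) != 0) { fclose(f); return NULL; }
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(f), 0);
    fclose(f);                           /* the mapping keeps the file data alive */
    return p == MAP_FAILED ? NULL : p;
}

/* Route a request: big ones go to the file-backed mapping, the rest to malloc().
 * *on_disk is set to 1 when the request was diverted, 0 otherwise. */
static void *selective_alloc(size_t size, int *on_disk) {
    *on_disk = size > MIN_BIGMAAC_SIZE;
    return *on_disk ? disk_backed_alloc(size) : malloc(size);
}
```

A caller has to release diverted memory with munmap() rather than free(); the real library hides this by intercepting free() as well.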

For example, let's say you are on a system that has no swap available and only 2GB of RAM, but you would like to work with matrices of size 5GB. BigMaac lets you do that! When numpy/python/anything makes the request for 5GB of memory, that call is intercepted by BigMaac and diverted through mmap to disk-backed memory. This means the OS keeps as much of the 5GB matrix in RAM as it can and pages the rest from disk. If you happen to run the same code on a system with 10GB of RAM, it should leave the entire contents in RAM, hopefully with little slowdown.

BigMaac quick start

To use BigMaac,

make

And then,

LD_PRELOAD=./bigmaac.so your-executable with all the arguments

For example,

LD_PRELOAD=./bigmaac.so python test.py

To run the test cases (which generate checksums with and without the library), run make test

But I want fries with that!

Memory requests larger than BIGMAAC_MIN_BIGMAAC_SIZE are swapped using mmap, with each request backed by a separate file on the swap partition. For these to be valid (and efficient) mmap mappings, requests are aligned to a multiple of the page size (often 4096 bytes), which makes this method of swapping inefficient for smaller (small fries) allocations. To handle small fries (much smaller than the page size), a second level of paging is used.
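To see why page alignment hurts small requests, here is a minimal sketch of the rounding arithmetic. The helper names are hypothetical and are not taken from the BigMaac source.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round `size` up to the next multiple of `page` (page must be a power of two). */
static size_t round_up_to_page(size_t size, size_t page) {
    return (size + page - 1) & ~(page - 1);
}

/* Bytes of padding a request of `size` bytes picks up when page aligned. */
static size_t padding_for(size_t size, size_t page) {
    return round_up_to_page(size, page) - size;
}

/* Page size reported by the OS, falling back to the common 4096. */
static size_t system_page_size(void) {
    long p = sysconf(_SC_PAGESIZE);
    return p > 0 ? (size_t)p : 4096;
}
```

With 4096-byte pages, a 128-byte request would occupy a full page and waste 3968 bytes, which is the motivation for packing fries into one shared mapping.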

If memory_request > BIGMAAC_MIN_BIGMAAC_SIZE -> it's a 🍔, so swap to disk if need be (using mmap with a unique file and page-size-aligned memory)
Else if memory_request > BIGMAAC_MIN_FRY_SIZE -> it's a 🍟, so swap to disk if need be (using mmap with a single file for all fries)
  else the memory request is sent to the normal memory management system (RAM)
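The three-way decision can be written as a small pure function. This is a sketch only: route_t and classify_request() are hypothetical names, and the thresholds are passed in rather than read from the environment.

```c
#include <assert.h>
#include <stddef.h>

/* Where a request ends up (hypothetical names, for illustration). */
typedef enum { ROUTE_SYSTEM, ROUTE_FRY, ROUTE_BIGMAAC } route_t;

/* Apply the rule above: strictly greater than a threshold moves a request up a tier. */
static route_t classify_request(size_t size, size_t min_fry, size_t min_bigmaac) {
    if (size > min_bigmaac) return ROUTE_BIGMAAC; /* own file, page aligned */
    if (size > min_fry)     return ROUTE_FRY;     /* shared fries file */
    return ROUTE_SYSTEM;                          /* plain malloc()/RAM */
}
```

Checking the larger threshold first keeps the function correct as long as min_bigmaac is at least min_fry.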

BIGMAACS are sent to a preallocated 512GB virtual address space (sized by the env variable SIZE_BIGMAAC). Each BIGMAAC is page aligned and backed by its own file on the swap partition.

FRIES are sent to a preallocated 512GB virtual address space (sized by the env variable SIZE_FRIES), all of which is backed by a single file on the swap partition.
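One common way to get this layout is to reserve a large address range first and then map files over pieces of it with MAP_FIXED. The sketch below assumes that approach; reserve_region() and map_fd_at() are hypothetical helpers, and the PROT_NONE / MAP_NORESERVE flags are a choice made for this example rather than something confirmed from the BigMaac source.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve `size` bytes of address space without committing any memory. */
static void *reserve_region(size_t size) {
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Replace [addr, addr + size) inside a reserved region with a shared mapping
 * of file descriptor `fd`. Both addr and size must be page aligned, and the
 * file must already be at least `size` bytes long. Returns 0 on success. */
static int map_fd_at(void *addr, size_t size, int fd) {
    void *p = mmap(addr, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);
    return p == addr ? 0 : -1;
}
```

Reserving the range up front costs only address space, so a 512GB reservation is feasible on a 64-bit system even when RAM is much smaller.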

For example, if you would like to swap all memory allocations above 10MB using BigMaac and all memory allocations above 128 bytes using FRIES, you would run the following,

LD_PRELOAD=./bigmaac.so BIGMAAC_MIN_BIGMAAC_SIZE=10485760 BIGMAAC_MIN_FRY_SIZE=128 your-executable with all the arguments

The above command swaps all memory allocations larger than 128 bytes. Allocations larger than 10MB are each swapped to their own file on the swap partition; those between 128 bytes and 10MB share the single fries file. For allocations of 128 bytes or less, the system memory functions are called directly.
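Reading a byte threshold from the environment might look like the sketch below. size_from_env() is a hypothetical helper, and the fallback behaviour on bad input is an assumption of this example, not documented BigMaac behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Read a byte count from environment variable `name`. Returns `fallback`
 * when the variable is unset, empty, or not a plain decimal number. */
static size_t size_from_env(const char *name, size_t fallback) {
    const char *text = getenv(name);
    if (text == NULL || *text < '0' || *text > '9') return fallback;
    errno = 0;
    char *end = NULL;
    unsigned long long value = strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0') return fallback; /* overflow or trailing junk */
    return (size_t)value;
}
```

Note that 10485760 in the command above is simply 10 * 1024 * 1024, so the values are given in bytes.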

Choosing the swap partition

By default, /tmp/ is used for swapping memory to disk. If you would like to use a different swap partition, set the environment variable BIGMAAC_TEMPLATE,

LD_PRELOAD=./bigmaac.so BIGMAAC_TEMPLATE=/swap-partition/bigmaax.XXXXXXXX your-executable with all the arguments

Once a temporary file is opened, its name is immediately removed from disk and only the file descriptor remains open in the process wrapped by BIGMAAC. Once the process dies, the kernel removes the swap files. This guarantees that no swap files are left behind after the application is done.
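The open-then-unlink pattern described above can be sketched with mkstemp(). This is an illustration only: open_unlinked_tmpfile() and swap_template() are hypothetical names, and the DEFAULT_TEMPLATE value shown is an assumption (the text only says that /tmp/ is the default location).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Assumed default; mkstemp() requires the template to end in "XXXXXX". */
#define DEFAULT_TEMPLATE "/tmp/bigmaac.XXXXXXXX"

/* Template from BIGMAAC_TEMPLATE if set, otherwise the assumed default. */
static const char *swap_template(void) {
    const char *t = getenv("BIGMAAC_TEMPLATE");
    return t != NULL ? t : DEFAULT_TEMPLATE;
}

/* Create a temp file from `template_path`, unlink its name right away, and
 * return the still-open descriptor (or -1 on failure). */
static int open_unlinked_tmpfile(const char *template_path) {
    char *path = strdup(template_path); /* mkstemp() rewrites its argument */
    if (path == NULL) return -1;
    int fd = mkstemp(path);
    if (fd >= 0) unlink(path);          /* data lives on until fd and mappings go away */
    free(path);
    return fd;
}
```

Because the name is gone as soon as the file exists, even a crash or SIGKILL cannot leave a stale swap file behind.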

How efficient is this?

The main focus of BigMaac is to swap larger memory calls, things like large data matrices that don't always behave as random access and that vary from run to run. To avoid adding overhead to smaller memory calls, all BIGMAAC and FRIES allocations are kept in a contiguous 1TB part of the virtual address space (512GB for BIGMAAC, env SIZE_BIGMAAC, and 512GB for FRIES, env SIZE_FRIES). This allows a simple two-pointer comparison to determine whether a memory allocation is managed by BIGMAAC or by the system library, which should add very little overhead to calls that pass through.
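A sketch of that ownership check follows. The arena_t layout, with fries placed before bigmaacs, is an assumption made for this example; only the idea of comparing a pointer against the two ends of one contiguous range comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { OWNER_SYSTEM, OWNER_FRY, OWNER_BIGMAAC } owner_t;

/* One contiguous range [base, end), split at `split` (hypothetical layout). */
typedef struct { uintptr_t base, split, end; } arena_t;

/* Two comparisons decide whether a pointer belongs to the arena at all;
 * only pointers inside it need the extra check against the split. */
static owner_t owner_of(const arena_t *a, const void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    if (p < a->base || p >= a->end) return OWNER_SYSTEM;
    return p < a->split ? OWNER_FRY : OWNER_BIGMAAC;
}
```

An intercepted free() or realloc() could run this check first and forward anything classified as OWNER_SYSTEM to the original libc function.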

Comments
  • where is fries file allocated?

    I am simply reviewing your code and I can't seem to find where the fries file is allocated. The README says: FRIES are sent to a preallocated 512GB (env variable SIZE_FRIES) virtual address space, the total of which is backed by its own single file on the swap partition.

    I see the virtual address space being allocated: base_fries = mmap(NULL, size_total, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

    The only file-backed memory comes from mmap_tmpfile, which is only called when allocating a big maac?

    if (head==_head_bigmaacs) {
        mmap_tmpfile(heap_chunk->ptr,size);
    }
    
  • env BIGMAAC_TEMPLATE is read but never used

    const char * template=getenv("BIGMAAC_TEMPLATE");
    if (template==NULL) {
        template=DEFAULT_TEMPLATE;
    }
    
    static void mmap_tmpfile(void * ptr, size_t size) {
        char * filename=(char*)real_malloc(sizeof(char)*(strlen(DEFAULT_TEMPLATE)+1));
        if (filename==NULL) {
            fprintf(stderr,"Bigmaac: failed to allocate memory in remove_chunk\n");
            assert(filename!=NULL);
        }
        strcpy(filename,DEFAULT_TEMPLATE);
    
  • Support for ENOMEM

    There are various places where BigMaac could report ENOMEM and return NULL.

    Some applications are coded to gracefully error when there is not enough memory to perform calculations so BigMaac could set errno = ENOMEM; and return NULL.

    This would be useful when the env variables SIZE_BIGMAAC and SIZE_FRIES are set, as it would let users roughly mimic k8s resource limits for pods.

    Notable locations within the current codebase that could report ENOMEM: fprintf(stderr,"BigMaac : Failed to find available space\n");

    fprintf(stderr,"There is no free memory!\n");
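The behaviour requested in this comment could look roughly like the sketch below. It is not BigMaac code: budget_t and budget_alloc() are a hypothetical bump allocator standing in for the fixed-size address space, used only to show the errno = ENOMEM; return NULL; convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical fixed budget standing in for SIZE_BIGMAAC / SIZE_FRIES. */
typedef struct { char *base; size_t capacity; size_t used; } budget_t;

/* Hand out `size` bytes from the budget, or fail the way malloc() does:
 * set errno to ENOMEM and return NULL instead of printing and aborting. */
static void *budget_alloc(budget_t *b, size_t size) {
    if (size > b->capacity - b->used) {
        errno = ENOMEM;
        return NULL;
    }
    void *p = b->base + b->used;
    b->used += size;
    return p;
}
```

An application that already checks malloc() for NULL would then fail gracefully when the configured space runs out.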
