Single header asymmetric stackful cross-platform coroutine library in pure C.

minicoro

Minicoro is a single-file library for using asymmetric coroutines in C. The API is inspired by Lua coroutines but with C use in mind.

The project is being developed mainly to be a coroutine backend for the Nelua programming language.

The library's assembly implementation is inspired by Lua Coco by Mike Pall.

Features

  • Stackful asymmetric coroutines.
  • Supports nesting coroutines (resuming a coroutine from another coroutine).
  • Supports custom allocators.
  • Storage system to allow passing values between yield and resume.
  • Customizable stack size.
  • Coroutine API design inspired by Lua with C use in mind.
  • Yield across any C function.
  • Made to work in multithread applications.
  • Cross platform.
  • Minimal, self contained and no external dependencies.
  • Readable sources and documented.
  • Implemented via assembly, ucontext or fibers.
  • Lightweight and very efficient.
  • Works in most C89 compilers.
  • Error-checked API, returning proper error codes on misuse.
  • Supports running with Valgrind, ASan (AddressSanitizer) and TSan (ThreadSanitizer).

Supported Platforms

Most platforms are supported through different methods:

Platform     | Assembly Method | Fallback Method
Android      | ARM/ARM64       | N/A
Windows      | x86_64          | Windows fibers
Linux        | x86_64/i686     | ucontext
Mac OS X     | x86_64          | ucontext
Browser      | N/A             | Emscripten fibers
Raspberry Pi | ARM             | ucontext
RISC-V       | riscv64         | ucontext

The assembly method is used by default if supported by the compiler and CPU; otherwise the ucontext or fibers method is used as a fallback.

The assembly method is very efficient: it takes just a few cycles to create, resume, yield or destroy a coroutine.

Caveats

  • Don't use coroutines with C++ exceptions; this is not supported.
  • When using C++ RAII (i.e. destructors) you must resume the coroutine until it dies to properly execute all destructors.
  • To use in multithreaded applications, you must compile with a C compiler that supports the thread_local qualifier.
  • Some unsupported sanitizers for C may trigger false warnings when using coroutines.
  • The mco_coro object is not thread safe; you should lock each coroutine into a thread.
  • Take care not to cause stack overflows; otherwise your program may or may not crash, as the behavior is undefined.
  • On WebAssembly you must compile with emscripten flag -s ASYNCIFY=1.

Introduction

A coroutine represents an independent "green" thread of execution. Unlike threads in multithread systems, however, a coroutine only suspends its execution by explicitly calling a yield function.

You create a coroutine by calling mco_create. It takes an output pointer that receives the coroutine handle and a mco_desc structure with a description for the coroutine. The mco_create function only creates a new coroutine and returns a handle to it; it does not start the coroutine.

You execute a coroutine by calling mco_resume. The first time you resume a coroutine, it starts its execution by calling its body function. After the coroutine starts running, it runs until it terminates or yields.

A coroutine yields by calling mco_yield. When a coroutine yields, the corresponding resume returns immediately, even if the yield happens inside nested function calls (that is, not in the main function). The next time you resume the same coroutine, it continues its execution from the point where it yielded.

To associate a persistent value with the coroutine, you can optionally set user_data on its creation and later retrieve it with mco_get_user_data.

To pass values between resume and yield, you can optionally use the mco_push and mco_pop APIs. They are intended to pass temporary values using a LIFO (Last In, First Out) style buffer. The storage system can also be used to send and receive initial values on coroutine creation or before it finishes.

Usage

To use minicoro, do the following in one .c file:

#define MINICORO_IMPL
#include "minicoro.h"

You can do #include "minicoro.h" in other parts of the program just like any other header.

Minimal Example

The following simple example demonstrates how to use the library:

#define MINICORO_IMPL
#include "minicoro.h"
#include <stdio.h>
#include <assert.h>

// Coroutine entry function.
void coro_entry(mco_coro* co) {
  printf("coroutine 1\n");
  mco_yield(co);
  printf("coroutine 2\n");
}

int main() {
  // First initialize a `desc` object through `mco_desc_init`.
  mco_desc desc = mco_desc_init(coro_entry, 0);
  // Configure `desc` fields when needed (e.g. customize user_data or allocation functions).
  desc.user_data = NULL;
  // Call `mco_create` with the output coroutine pointer and `desc` pointer.
  mco_coro* co;
  mco_result res = mco_create(&co, &desc);
  assert(res == MCO_SUCCESS);
  // The coroutine should be now in suspended state.
  assert(mco_status(co) == MCO_SUSPENDED);
  // Call `mco_resume` to start for the first time, switching to its context.
  res = mco_resume(co); // Should print "coroutine 1".
  assert(res == MCO_SUCCESS);
  // We get back from coroutine context in suspended state (because it's unfinished).
  assert(mco_status(co) == MCO_SUSPENDED);
  // Call `mco_resume` to resume for a second time.
  res = mco_resume(co); // Should print "coroutine 2".
  assert(res == MCO_SUCCESS);
  // The coroutine finished and should be now dead.
  assert(mco_status(co) == MCO_DEAD);
  // Call `mco_destroy` to destroy the coroutine.
  res = mco_destroy(co);
  assert(res == MCO_SUCCESS);
  return 0;
}

NOTE: In case you don't want to use the minicoro allocator system, you should allocate a coroutine object yourself using mco_desc.coro_size and call mco_init. Later, to destroy it, call mco_uninit and deallocate it.

Yielding from anywhere

You can yield the currently running coroutine from anywhere without having to pass mco_coro pointers around. To do this, just use mco_yield(mco_running()).

Passing data between yield and resume

The library has a storage interface to assist in passing data between yield and resume. Its usage is straightforward: use mco_push to send data before a mco_resume or mco_yield, then later use mco_pop after a mco_resume or mco_yield to receive data. Take care not to mismatch a push and pop, otherwise these functions will return an error.

Error handling

The library returns error codes from most of its API in case of misuse or system error. The user is encouraged to handle them properly.

Library customization

The following can be defined to change the library behavior:

  • MCO_API - Public API qualifier. Default is extern.
  • MCO_MIN_STACK_SIZE - Minimum stack size when creating a coroutine. Default is 32768.
  • MCO_DEFAULT_STORAGE_SIZE - Size of coroutine storage buffer. Default is 1024.
  • MCO_DEFAULT_STACK_SIZE - Default stack size when creating a coroutine. Default is 57344.
  • MCO_MALLOC - Default allocation function. Default is malloc.
  • MCO_FREE - Default deallocation function. Default is free.
  • MCO_DEBUG - Enable debug mode, logging any runtime error to stdout. Defined automatically unless NDEBUG or MCO_NO_DEBUG is defined.
  • MCO_NO_DEBUG - Disable debug mode.
  • MCO_NO_MULTITHREAD - Disable multithread usage. Multithread is supported when thread_local is supported.
  • MCO_NO_DEFAULT_ALLOCATORS - Disable the default allocator using MCO_MALLOC and MCO_FREE.
  • MCO_ZERO_MEMORY - Zero memory of stack for new coroutines and when popping storage, intended for garbage-collected environments.
  • MCO_USE_ASM - Force use of assembly context switch implementation.
  • MCO_USE_UCONTEXT - Force use of ucontext context switch implementation.
  • MCO_USE_FIBERS - Force use of fibers context switch implementation.
  • MCO_USE_VALGRIND - Define this if you want to run with Valgrind, to avoid invalid memory access errors being reported.

Benchmarks

The coroutine library was benchmarked on x86_64, counting CPU cycles for context switches (triggered by resume or yield) and for initialization.

CPU Arch | OS      | Method   | Context switch | Initialize   | Uninitialize
x86_64   | Linux   | assembly | 9 cycles       | 31 cycles    | 14 cycles
x86_64   | Linux   | ucontext | 352 cycles     | 383 cycles   | 14 cycles
x86_64   | Windows | fibers   | 69 cycles      | 10564 cycles | 11167 cycles
x86_64   | Windows | assembly | 33 cycles      | 74 cycles    | 14 cycles

NOTE: Tested on Intel Core i7-8750H CPU @ 2.20GHz with pre-allocated coroutines.
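
To put the cycle counts in perspective, they can be converted to time with simple arithmetic. The sketch below assumes the CPU runs at its nominal 2.20 GHz (turbo boost or throttling would change the result), and `cycles_to_ns` is a hypothetical helper, not part of the library.

```c
/* Converts a cycle count to nanoseconds at a fixed clock frequency.
   One GHz is one cycle per nanosecond, so time = cycles / GHz. */
static double cycles_to_ns(double cycles, double ghz) {
  return cycles / ghz;
}
```

Under that assumption, `cycles_to_ns(9, 2.2)` gives roughly 4.1 ns for an assembly context switch on Linux, and `cycles_to_ns(352, 2.2)` gives 160 ns for the ucontext method.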

Cheatsheet

Here is a list of all library functions for quick reference:

/* Structure used to initialize a coroutine. */
typedef struct mco_desc {
  void (*func)(mco_coro* co); /* Entry point function for the coroutine. */
  void* user_data;            /* Coroutine user data, can be retrieved with `mco_get_user_data`. */
  /* Custom allocation interface. */
  void* (*malloc_cb)(size_t size, void* allocator_data); /* Custom allocation function. */
  void  (*free_cb)(void* ptr, void* allocator_data);     /* Custom deallocation function. */
  void* allocator_data;       /* User data pointer passed to `malloc`/`free` allocation functions. */
  size_t storage_size;        /* Coroutine storage size, to be used with the storage APIs. */
  /* These must be initialized only through `mco_desc_init`. */
  size_t coro_size;           /* Coroutine structure size. */
  size_t stack_size;          /* Coroutine stack size. */
} mco_desc;

/* Coroutine functions. */
mco_desc mco_desc_init(void (*func)(mco_coro* co), size_t stack_size);  /* Initialize description of a coroutine. When stack size is 0 then MCO_DEFAULT_STACK_SIZE is used. */
mco_result mco_init(mco_coro* co, mco_desc* desc);                      /* Initialize the coroutine. */
mco_result mco_uninit(mco_coro* co);                                    /* Uninitialize the coroutine, may fail if it's not dead or suspended. */
mco_result mco_create(mco_coro** out_co, mco_desc* desc);               /* Allocates and initializes a new coroutine. */
mco_result mco_destroy(mco_coro* co);                                   /* Uninitialize and deallocate the coroutine, may fail if it's not dead or suspended. */
mco_result mco_resume(mco_coro* co);                                    /* Starts or continues the execution of the coroutine. */
mco_result mco_yield(mco_coro* co);                                     /* Suspends the execution of a coroutine. */
mco_state mco_status(mco_coro* co);                                     /* Returns the status of the coroutine. */
void* mco_get_user_data(mco_coro* co);                                  /* Get coroutine user data supplied on coroutine creation. */

/* Storage interface functions, used to pass values between yield and resume. */
mco_result mco_push(mco_coro* co, const void* src, size_t len); /* Push bytes to the coroutine storage. Use to send values between yield and resume. */
mco_result mco_pop(mco_coro* co, void* dest, size_t len);       /* Pop bytes from the coroutine storage. Use to get values between yield and resume. */
mco_result mco_peek(mco_coro* co, void* dest, size_t len);      /* Like `mco_pop` but it does not consume the storage. */
size_t mco_get_bytes_stored(mco_coro* co);                      /* Get the available bytes that can be retrieved with a `mco_pop`. */
size_t mco_get_storage_size(mco_coro* co);                      /* Get the total storage size. */

/* Misc functions. */
mco_coro* mco_running(void);                        /* Returns the running coroutine for the current thread. */
const char* mco_result_description(mco_result res); /* Get the description of a result. */

Complete Example

The following is a more complete example, generating Fibonacci numbers:

#define MINICORO_IMPL
#include "minicoro.h"
#include <stdio.h>
#include <stdlib.h>

static void fail(const char* message, mco_result res) {
  printf("%s: %s\n", message, mco_result_description(res));
  exit(-1);
}

static void fibonacci_coro(mco_coro* co) {
  unsigned long m = 1;
  unsigned long n = 1;

  /* Retrieve max value. */
  unsigned long max;
  mco_result res = mco_pop(co, &max, sizeof(max));
  if(res != MCO_SUCCESS)
    fail("Failed to retrieve coroutine storage", res);

  while(1) {
    /* Yield the next Fibonacci number. */
    mco_push(co, &m, sizeof(m));
    res = mco_yield(co);
    if(res != MCO_SUCCESS)
      fail("Failed to yield coroutine", res);

    unsigned long tmp = m + n;
    m = n;
    n = tmp;
    if(m >= max)
      break;
  }

  /* Yield the last Fibonacci number. */
  mco_push(co, &m, sizeof(m));
}

int main() {
  /* Create the coroutine. */
  mco_coro* co;
  mco_desc desc = mco_desc_init(fibonacci_coro, 0);
  mco_result res = mco_create(&co, &desc);
  if(res != MCO_SUCCESS)
    fail("Failed to create coroutine", res);

  /* Set storage. */
  unsigned long max = 1000000000;
  mco_push(co, &max, sizeof(max));

  int counter = 1;
  while(mco_status(co) == MCO_SUSPENDED) {
    /* Resume the coroutine. */
    res = mco_resume(co);
    if(res != MCO_SUCCESS)
      fail("Failed to resume coroutine", res);

    /* Retrieve storage set in last coroutine yield. */
    unsigned long ret = 0;
    res = mco_pop(co, &ret, sizeof(ret));
    if(res != MCO_SUCCESS)
      fail("Failed to retrieve coroutine storage", res);
    printf("fib %d = %lu\n", counter, ret);
    counter = counter + 1;
  }

  /* Destroy the coroutine. */
  res = mco_destroy(co);
  if(res != MCO_SUCCESS)
    fail("Failed to destroy coroutine", res);
  return 0;
}

Updates

  • 19-Jan-2021: Fix compilation and issues on Mac OS X, release v0.1.1.
  • 19-Jan-2021: First release, v0.1.0.
  • 18-Jan-2021: Fix issues when using Clang on Linux.
  • 17-Jan-2021: Add support for RISC-V 64 bits.
  • 16-Jan-2021: Add support for Mac OS X x86_64, thanks @RandyGaul for testing, debugging and researching about it.
  • 15-Jan-2021: Make assembly method the default one on Windows x86_64. Redesigned the storage API, thanks @RandyGaul for the suggestion.
  • 14-Jan-2021: Add support for running with ASan (AddressSanitizer) and TSan (ThreadSanitizer).
  • 13-Jan-2021: Add support for ARM and WebAssembly. Add Public Domain and MIT No Attribution license.
  • 12-Jan-2021: Some API changes and improvements.
  • 11-Jan-2021: Support valgrind and add benchmarks.
  • 10-Jan-2021: Minor API improvements and document more.
  • 09-Jan-2021: Library created.

Donation

I'm a full-time open source developer. Any donation amount is appreciated and encourages me to keep supporting this and other open source projects.

Become a Patron

License

Your choice of either Public Domain or MIT No Attribution, see LICENSE file.

Owner
Eduardo Bart
Open source developer, creating Nelua programming language and other game development tools.
Comments
  • Trying to get minicoro working with a C++ wrapper (and smart pointers)

    Hi there,

    Have been experimenting with minicoro for a few hours, with a mixture of successes and failures. (I suspect smart pointers are likely to have some issues.)

    I built up a little C++ helper class and some macros to simplify some of the calls:

    struct CoroutineManager
    {
        mco_coro* m_pCo = nullptr;
        mco_desc desc;
    
        void Init(void (*func)(mco_coro*))
        {
            desc = mco_desc_init(func, 0);
            mco_result res = mco_create(&m_pCo, &desc);
            WASSERT(res == MCO_SUCCESS);
        }
    
        void PushParam(const void* src, size_t len)
        {
            mco_push(m_pCo, src, len);
        }
    
        ~CoroutineManager()
        {
            if (m_pCo != nullptr)
            {
                auto res = mco_destroy(m_pCo);
                WASSERT(res == MCO_SUCCESS);
            }
        }
    
        bool YieldNext(void* dest, size_t len)
        {
            if (mco_status(m_pCo) == MCO_SUSPENDED)
            {
                auto res = mco_resume(m_pCo);
                WASSERT(res == MCO_SUCCESS);
    
                res = mco_pop(m_pCo, dest, len);
                WASSERT(res == MCO_SUCCESS);
                return true;
            }
            else
            {
                return false;
            }
        }    
    };
    

    This seems to work (with some caveats), but if I move YieldNext(...) out of the header into the associated cpp file I created, it no longer functions properly. Are you able to tell me why this is?

  • Interrupting a long-running coroutine and moving it to another os-level thread

    Hi, I'm new to minicoro and try to understand its capabilities. The readme states:

    The mco_coro object is not thread safe, you should lock each coroutine into a thread.

    Does that mean that it's impossible to interrupt a long-running coroutine and move it to a different os-level thread?

  • Can this run in Switch?

    Probably a super easy question to answer, but I'm sort of a noob at this. Wikipedia says (https://en.wikipedia.org/wiki/ARM_Cortex-A57) the CPU is ARMv8-A 64-bit instruction set, does that mean it would fall under the ARM64 assembly implementation?

  • stack overflow with VS2008 / VS2019

    Hi,

    first of all thank you for this library. With the recent update (stack overflow check) I get a message about stack corruption. I've tested it against simple.c, the output I get is:

    coroutine 1
    coroutine stack overflow, try increasing the stack size
    coroutine 2
    Assertion failed: mco_status(co) == MCO_SUSPENDED, file simple.c, line 27

    I've compiled it with my ancient compiler, VC++ 2008, and with the latest one, VC++ 2019. The same issue occurs.

    Any help ?

    Please note that VC++ 2008 is a C89 compiler. Are you interested in a PR that simply moves variable definitions to the beginning of each function?

    Thank you, Alessio

  • C++ exceptions question

    What's the reason that C++ exceptions are not supported? What happens if an exception is thrown anyway? Would it be possible to add support for exceptions?

  • Makefile mt-example link fails due to -lpthread order wrong

    The following patch resolves it. This Stack Overflow post explains the reason in detail: https://stackoverflow.com/questions/11893996/why-does-the-order-of-l-option-in-gcc-matter

    diff --git a/tests/Makefile b/tests/Makefile
    index b7bc02e..3f8113f 100644
    --- a/tests/Makefile
    +++ b/tests/Makefile
    @@ -36,7 +36,7 @@ example: example.c ../minicoro.h Makefile
     	$(CC) $(EXTRA_CFLAGS) $(CFLAGS) example.c -o example
     
     mt-example: mt-example.c ../minicoro.h Makefile
    -	$(CC) $(EXTRA_CFLAGS) $(CFLAGS) -std=gnu11 -lpthread mt-example.c -o mt-example
    +	$(CC) $(EXTRA_CFLAGS) $(CFLAGS) -std=gnu11 mt-example.c -lpthread -o mt-example
     
     simple: simple.c ../minicoro.h Makefile
     	$(CC) $(EXTRA_CFLAGS) $(CFLAGS) -std=c99 simple.c -o simple
    
  • Mild red zone bug.

    Spotted a pretty mild bug related to red zones. The current code shouldn't cause any bugs. It just wastes a few bytes of stack space. https://github.com/edubart/minicoro/blob/main/minicoro.h#L742

    You subtract the 128 bytes from the size of the stack, and then use that size to find the high address of the stack. The red zone is the space with lower addresses than the current stack pointer, while your reserved space is above the start of the stack. Also, the red zone really only applies meaningfully to leaf functions so they can skip adjusting the stack pointer without getting stomped by interrupts. Any function that calls resume/yield is no longer a leaf function, so you don't have to worry about it in your implementation.

Mx - C++ coroutine await, yield, channels, i/o events (single header + link to boost)

mx C++11 coroutine await, yield, channels, i/o events (single header + link to boost). This was originally part of my c++ util library kit, but I'm se

Sep 21, 2019
Concurrent Programming Library (Coroutine) for C11

libconcurrent tiny asymmetric-coroutine library. Description asymmetric-coroutine bidirectional communication by yield_value/resume_value native conte

Sep 2, 2022
A golang-style C++ coroutine library and more.

CO is an elegant and efficient C++ base library that supports Linux, Windows and Mac platforms. It pursues minimalism and efficiency, and does not rely on third-party library such as boost.

Dec 4, 2022
A C++20 coroutine library based off asyncio

kuro A C++20 coroutine library, somewhat modelled on Python's asyncio Requirements Kuro requires a C++20 compliant compiler and a Linux OS. Tested on

Nov 9, 2022
C++20 Coroutine-Based Synchronous Parser Combinator Library

This library contains a monadic parser type and associated combinators that can be composed to create parsers using C++20 Coroutines.

Oct 13, 2022
Cppcoro - A library of C++ coroutine abstractions for the coroutines TS

CppCoro - A coroutine library for C++ The 'cppcoro' library provides a large set of general-purpose primitives for making use of the coroutines TS pro

Nov 29, 2022
A go-style coroutine library in C++11 and more.

cocoyaxi English | 简体中文 A go-style coroutine library in C++11 and more. 0. Introduction cocoyaxi (co for short), is an elegant and efficient cross-pla

Dec 2, 2022
C++14 coroutine-based task library for games

SquidTasks Squid::Tasks is a header-only C++14 coroutine-based task library for games. Full project and source code available at https://github.com/we

Nov 30, 2022
Powerful multi-threaded coroutine dispatcher and parallel execution engine

Quantum Library : A scalable C++ coroutine framework Quantum is a full-featured and powerful C++ framework build on top of the Boost coroutine library

Nov 28, 2022
Async GRPC with C++20 coroutine support

agrpc Build an elegant GRPC async interface with C++20 coroutine and libunifex (target for C++23 executor). Get started mkdir build && cd build conan

Nov 15, 2022
Elle - The Elle coroutine-based asynchronous C++ development framework.

Elle, the coroutine-based asynchronous C++ development framework Elle is a collection of libraries, written in modern C++ (C++14). It contains a rich

Nov 19, 2022
Fiber - A header only cross platform wrapper of fiber API.

Fiber Header only cross platform wrapper of fiber API A fiber is a particularly lightweight thread of execution. Which is useful for implementing coro

Jul 31, 2022
experimental cooperative threading library for gba in pure C

gba-co-thread Experimental cooperative threading library for Gameboy Advance in pure C. See co_thread.h and co_thread.c for the tiny threading library

Oct 25, 2022
C++20's jthread for C++11 and later in a single-file header-only library

jthread lite: C++20's jthread for C++11 and later A work in its infancy. Suggested by Peter Featherstone. Contents Example usage In a nutshell License

Nov 26, 2022
Coro - Single-header library facilities for C++2a Coroutines

coro This is a collection of single-header library facilities for C++2a Coroutines. coro/include/ co_future.h Provides co_future<T>, which is like std

Nov 15, 2022
A fast single-producer, single-consumer lock-free queue for C++

A single-producer, single-consumer lock-free queue for C++ This mini-repository has my very own implementation of a lock-free queue (that I designed f

Dec 3, 2022
A bounded single-producer single-consumer wait-free and lock-free queue written in C++11

SPSCQueue.h A single producer single consumer wait-free and lock-free fixed size queue written in C++11. Example SPSCQueue<int> q(2); auto t = std::th

Dec 1, 2022
ThreadPool - A fastest, exception-safety and pure C++17 thread pool.

Warnings Since commit 468129863ec65c0b4ede02e8581bea682351a6d2, I move ThreadPool to C++17. (To use std::apply.) In addition, the rule of passing para

Nov 10, 2022
ThreadPool - Lightweight, Generic, Pure C++11 ThreadPool

ThreadPool Lightweight, Generic, Pure C++11 ThreadPool Rational I needed a Thread Pool for something I was writing, and I didn't see any that I liked.

Nov 3, 2022