A header-only, dependency-free deep learning framework in C++14




Maintainers Wanted

The project may be abandoned since the maintainer(s) are just looking to move on. If anyone is interested in continuing the project, let us know so that we can discuss next steps.

Please visit: https://groups.google.com/forum/#!forum/tiny-dnn-dev



tiny-dnn is a C++14 implementation of deep learning. It is suitable for deep learning on limited computational resource, embedded systems and IoT devices.


Check out the documentation for more info.

Features

  • Reasonably fast, without GPU:
    • With TBB threading and SSE/AVX vectorization.
    • 98.8% accuracy on MNIST in 13 minutes training (@Core i7-3520M).
  • Portable & header-only:
    • Runs anywhere as long as you have a compiler which supports C++14.
    • Just include tiny_dnn.h and write your model in C++. There is nothing to install.
  • Easy to integrate with real applications:
    • No output to stdout/stderr.
    • A constant throughput (simple parallelization model, no garbage collection).
    • Works without throwing an exception.
    • Can import Caffe models (see the sketch after this list).
  • Simply implemented:
    • A good library for learning neural networks.
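
For the Caffe import feature, here is a rough sketch based on the caffe_converter example. The helper names (create_net_from_caffe_prototxt, reload_weight_from_caffe_protobinary) are my recollection of the converter API, which also requires protobuf; verify them against your tiny-dnn version:

#include "tiny_dnn/io/caffe/layer_factory.h"

// assumed converter helpers: build a network from a Caffe prototxt,
// then load the trained weights from the matching caffemodel binary
auto net = create_net_from_caffe_prototxt("lenet.prototxt");
reload_weight_from_caffe_protobinary("lenet.caffemodel", net.get());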

Comparison with other libraries

Please see the wiki page.

Supported networks

layer-types

  • core
    • fully connected
    • dropout
    • linear operation
    • zero padding
    • power
  • convolution
    • convolutional
    • average pooling
    • max pooling
    • deconvolutional
    • average unpooling
    • max unpooling
  • normalization
    • contrast normalization (only forward pass)
    • batch normalization
  • split/merge
    • concat
    • slice
    • elementwise-add

activation functions

  • tanh
  • asinh
  • sigmoid
  • softmax
  • softplus
  • softsign
  • rectified linear (relu)
  • leaky relu
  • identity
  • scaled tanh
  • exponential linear units (elu)
  • scaled exponential linear units (selu)

loss functions

  • cross-entropy
  • mean squared error
  • mean absolute error
  • mean absolute error with epsilon range
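
The loss function is selected at training time via the first template parameter of network::train. A minimal sketch, mirroring the train<mse, adagrad> call in the examples below but swapping in cross-entropy:

// assumes net, optimizer, train_images and train_labels as in the CNN example below
net.train<cross_entropy>(optimizer, train_images, train_labels, 30, 50);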

optimization algorithms

  • stochastic gradient descent (with/without L2 normalization)
  • momentum and Nesterov momentum
  • adagrad
  • rmsprop
  • adam
  • adamax
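
Optimizers are plain objects passed to train, with their hyper-parameters exposed as public fields. A minimal sketch, assuming the alpha learning-rate field used throughout tiny-dnn's examples:

adam optimizer;          // any of the optimizers listed above
optimizer.alpha *= 0.1;  // shrink the default learning rate before training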

Dependencies

Nothing. All you need is a C++14 compiler (gcc 4.9+, clang 3.6+ or VS 2015+).

Build

tiny-dnn is header-only, so there is nothing to build. If you want to run the sample programs or unit tests, you need to install CMake and type the following commands:

cmake . -DBUILD_EXAMPLES=ON
make

Then change to the examples directory and run the executables.

If you would like to use an IDE such as Visual Studio or Xcode, you can also use cmake to generate the corresponding project files:

cmake . -G "Xcode"            # for Xcode users
cmake . -G "NMake Makefiles"  # for Windows Visual Studio users

Then open the generated .sln file in Visual Studio and build it (on Windows/MSVC), or run make (on Linux/macOS/Windows-MinGW).

Some cmake options are available:

| Option | Description | Default | Additional requirements |
|---|---|---|---|
| USE_TBB | Use Intel TBB for parallelization | OFF [1] | Intel TBB |
| USE_OMP | Use OpenMP for parallelization | OFF [1] | OpenMP-capable compiler |
| USE_SSE | Use the Intel SSE instruction set | ON | Intel CPU with SSE support |
| USE_AVX | Use the Intel AVX instruction set | ON | Intel CPU with AVX support |
| USE_AVX2 | Build tiny-dnn with AVX2 support | OFF | Intel CPU with AVX2 support |
| USE_NNPACK | Use NNPACK for the convolution operation | OFF | NNPACK (acceleration package for neural networks on multi-core CPUs) |
| USE_OPENCL | Enable OpenCL support (experimental) | OFF | OpenCL (the open standard for parallel programming of heterogeneous systems) |
| USE_LIBDNN | Use Greentea LibDNN for convolution on the GPU via OpenCL (experimental) | OFF | LibDNN (a universal convolution implementation supporting CUDA and OpenCL) |
| USE_SERIALIZER | Enable model serialization | ON [2] | - |
| USE_DOUBLE | Use double-precision computations instead of single precision | OFF | - |
| USE_ASAN | Use AddressSanitizer | OFF | clang or gcc compiler |
| USE_IMAGE_API | Enable Image API support | ON | - |
| USE_GEMMLOWP | Enable gemmlowp support | OFF | - |
| BUILD_TESTS | Build unit tests | OFF [3] | - |
| BUILD_EXAMPLES | Build example projects | OFF | - |
| BUILD_DOCS | Build documentation | OFF | Doxygen |
| PROFILE | Build with profiling support | OFF | gprof |

[1] tiny-dnn uses the C++14 standard library for parallelization by default.

[2] If you don't use serialization, you can switch it off to speed up compilation.

[3] tiny-dnn uses Google Test as the default unit-test framework. No pre-installation is required; it is downloaded automatically during CMake configuration.

For example, type the following commands if you want to use Intel TBB and build tests:

cmake -DUSE_TBB=ON -DBUILD_TESTS=ON .

Customize configurations

You can edit include/config.h to customize default behavior.
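
For instance, here is a minimal sketch of the kind of compile-time switches involved; the macro names CNN_USE_DOUBLE and CNN_TASK_SIZE mirror the CMake options above but are assumptions that may differ between versions:

// in include/config.h, or defined before the first include of tiny_dnn.h
#define CNN_USE_DOUBLE    // compute in double precision instead of float
#define CNN_TASK_SIZE 8   // number of tasks used in batch gradient descent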

Examples

Construct convolutional neural networks

#include "tiny_dnn/tiny_dnn.h"
using namespace tiny_dnn;
using namespace tiny_dnn::activation;
using namespace tiny_dnn::layers;

void construct_cnn() {
    using namespace tiny_dnn;

    network<sequential> net;

    // add layers
    net << conv(32, 32, 5, 1, 6) << tanh()  // in:32x32x1, 5x5conv, 6fmaps
        << ave_pool(28, 28, 6, 2) << tanh() // in:28x28x6, 2x2pooling
        << fc(14 * 14 * 6, 120) << tanh()   // in:14x14x6, out:120
        << fc(120, 10);                     // in:120,     out:10

    assert(net.in_data_size() == 32 * 32);
    assert(net.out_data_size() == 10);

    // load MNIST dataset
    std::vector<label_t> train_labels;
    std::vector<vec_t> train_images;

    parse_mnist_labels("train-labels.idx1-ubyte", &train_labels);
    parse_mnist_images("train-images.idx3-ubyte", &train_images, -1.0, 1.0, 2, 2);

    // declare optimization algorithm
    adagrad optimizer;

    // train (50-epoch, 30-minibatch)
    net.train<mse, adagrad>(optimizer, train_images, train_labels, 30, 50);

    // save
    net.save("net");

    // load
    // network<sequential> net2;
    // net2.load("net");
}
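
Once trained, the network can be evaluated. A minimal sketch using the predict and test APIs from tiny-dnn's MNIST example (test returns a result object whose print_detail prints per-class statistics):

// score a single image (a vec_t preprocessed like the training data)
vec_t scores = net.predict(train_images[0]);

// evaluate classification accuracy on a labeled set
net.test(train_images, train_labels).print_detail(std::cout);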

Construct multi-layer perceptron (mlp)

#include "tiny_dnn/tiny_dnn.h"
using namespace tiny_dnn;
using namespace tiny_dnn::activation;
using namespace tiny_dnn::layers;

void construct_mlp() {
    network<sequential> net;

    net << fc(32 * 32, 300) << sigmoid() << fc(300, 10);

    assert(net.in_data_size() == 32 * 32);
    assert(net.out_data_size() == 10);
}

Another way to construct mlp

#include "tiny_dnn/tiny_dnn.h"
using namespace tiny_dnn;
using namespace tiny_dnn::activation;

void construct_mlp() {
    auto mynet = make_mlp<tanh>({ 32 * 32, 300, 10 });

    assert(mynet.in_data_size() == 32 * 32);
    assert(mynet.out_data_size() == 10);
}

For more samples, read examples/main.cpp or the MNIST example page.

Contributing

Since the deep learning community is rapidly growing, we'd love to get contributions from you to accelerate tiny-dnn's development! For a quick guide to contributing, take a look at the Contribution Documents.

References

[1] Y. Bengio, Practical Recommendations for Gradient-Based Training of Deep Architectures. arXiv:1206.5533v2, 2012

[2] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324, 1998.

License

The BSD 3-Clause License

Gitter rooms

We have Gitter rooms for discussing new features and Q&A. Feel free to join us!

  • developers: https://gitter.im/tiny-dnn/developers
  • users: https://gitter.im/tiny-dnn/users
Comments

  • use ThreadPool

    Hi, this PR changes the parallel_for function to use the ThreadPool library written by Jakob Progsch and Václav Zeman: https://github.com/progschj/ThreadPool

    This will improve execution speed on Linux.

  • quantization, bug fix in deconv and graph enet

    Task list:

    • [x] Basic functions and pass the tests on core quantization utilities.
    • [x] Quantized convolution layer and pass the tests for q_conv.
    • [x] Quantized deconvolution layer and pass the tests for q_deconv.
    • [x] Quantized fully connected layer and pass the tests for q_fully_connected.
    • [x] Quantized bias inside other kernels.
    • [x] Ensure acceptable accuracy on typical examples.
    • [x] Add low precision gemm from Google TF as a kernel for matmul.
    • [x] Resolve unnecessary quantize-dequantize procedure.
    • [x] Add backward propagation for quantization.

    Method:

    The method is the same as described in Pete's blog post on quantization, and my base code is adapted from the TensorFlow quantization module.

  • Add tests for dropout and different batch sizes, fix connected issues

    Reproducing the problem described in https://github.com/tiny-dnn/tiny-dnn/issues/540. No fix yet. As my tests show, the problem exists only with batch sizes > 10.

  • Abstract Convolutional Layer

    First commit for device abstraction:

    • The convolution class has been abstracted
    • Added a PlantUML diagram to show the "big picture" of the current implementation.
    • Removed test warnings

  • Clang tooling

    I've rerun clang-format on the current code base. All tests pass on my PC.

    The command to run the formatting is: find -iname *.h -o -iname *.cpp | xargs clang-format-4.0 -style=file -i

    I suppose we should add it as a git hook or something like that.

  • LibDNN integration

    @naibaf7 @nyanp @bhack @mtamburrano I open this ticket for LibDNN integration discussion.

    @naibaf7 What's the current status of LibDNN standalone? Recently, the initial backend architecture was merged. Please give it a shot and feel free to comment on the design or any consideration you think would be convenient for LibDNN or other optimizations. Thanks!

  • Move to an organization & renaming tiny-cnn

    We've decided to move tiny-cnn to an organization account to accelerate its development (discussed in #226).

    Since it is clear that we are expanding the scope of tiny-cnn from convolutional networks to general networks, the project name tiny-cnn now seems a bit inaccurate. I want to change the project name to a more appropriate one (if we agree) at the time we transfer the repository.

    In #226 we have these 3 proposals:

    • tiny-dnn (convolutional net -> deep net)
    • hornet (loose acronyms of Header Only Network)
    • tiny-cnn (of course we can keep its name)

    Whichever we take, the project name doesn't affect the library API except for its namespace, and hyperlinks, forks, and pull requests will be correctly redirected to the new repository.

    Please feel free to give me your feedback if you have suggestions for the naming! We want to decide on the name and move to the new account by around 7/25.

  • Tensor (reprise)

    Well, as usual, I made a mess with Git, and apparently I can't easily push to @edgarriba's original PR.

    Today I reworked @edgarriba's #400 a bit: I changed the interface slightly, added lazy allocation and lazy movement of memory, renamed the accessors to host_ptr and host_at (to clarify that they work on host memory), and implemented generic functions for binary and unary host operations, element-wise and scalar, so it's now easier to implement functions such as add, sub, mul, div, exp, sqrt, ...

    I commented out linspace, if someone has a strong opinion about that, we can still get it back in.

  • Add AVX implementation for Global Average Pooling layer.

    Since the global average pooling layer computes the average of all activations per channel, we load contiguous blocks of 8 floats and keep performing a vertical sum channel-wise. At the end, the net sum is accumulated by a horizontal sum. This is repeated for all channels of a layer, as in the sketch below.

    The current code falls back to the internal backend if NNPACK or another unsupported backend is chosen.
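
    A minimal standalone sketch of that vertical-sum / horizontal-sum scheme (illustrative only, not tiny-dnn's actual kernel; assumes the channel length is a multiple of 8):

    #include <immintrin.h>
    #include <cstddef>

    // average all n floats of one channel (n assumed to be a multiple of 8)
    float channel_average(const float *x, std::size_t n) {
      __m256 acc = _mm256_setzero_ps();
      for (std::size_t i = 0; i < n; i += 8)  // vertical sum, 8 lanes at a time
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(x + i));
      // horizontal sum of the 8 accumulated lanes
      __m128 s = _mm_add_ps(_mm256_castps256_ps128(acc),
                            _mm256_extractf128_ps(acc, 1));
      s = _mm_hadd_ps(s, s);
      s = _mm_hadd_ps(s, s);
      return _mm_cvtss_f32(s) / static_cast<float>(n);
    }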

  • Model Tensor structures

    I open this ticket to discuss modeling tensors as classes, as discussed in https://github.com/tiny-dnn/tiny-dnn/issues/235#issuecomment-239196739

    @pansk proposed having different structures depending on the nature of the data:

    In an email thread with @nyanp we were discussing having one structure that represents i/o tensors and another that represents parameters (weights, biases, convolution coefficients, ...). The first structure was supposed to "automatically" move between the CPU and the GPU when needed (e.g., when interleaving CPU and GPU layers, something which is probably still inefficient, but should be allowed for prototyping new, complex layers), while the second was conceived to be full-time resident on the GPU (for GPU backends) and was supposed to be downloaded/uploaded only for serialization/deserialization purposes (probably manually).

  • add Tensor class

    Add the Tensor structure:

    • data held by std::vector<> with 64-byte alignment
    • three different data accessors: t.ptr<float_t>(), t.at<float_t>() and t[idx]
    • basic reshape() and resize() routines
    • basic toDevice() and fromDevice() routines
    • implement element-wise add, sub, mul, div

  • Bad Function Call with deconv layer.

    Any time I try to use a deconv layer in a network, I'm hit with:

    terminate called after throwing an instance of 'std::bad_function_call'
      what():  bad_function_call
    Aborted (core dumped)

    I try the exact same setup with fully_connected_layers and it works just fine. Perhaps a bug?

  • mse in loss_function.h

    I did a simple regression (MLP) with my 2155 data sets. The training seemed to complete successfully; however, when I call get_loss with mse, "d" does not seem to be divided by the total number of data sets, which is 2155.

    input_data: 2155 lines x 6 input elements
    output_data: 2155 lines x 1 output element

    double loss = net.get_loss<tiny_dnn::mse>(input_data, target_data);
    std::cout << "mse=" << loss << std::endl;

    ---- loss_function.h ----

    class mse {
     public:
      static float_t f(const vec_t &y, const vec_t &t) {
        assert(y.size() == t.size());
        float_t d{0.0};

        for (size_t i = 0; i < y.size(); ++i)
          d += (y[i] - t[i]) * (y[i] - t[i]);
        // [Ichi]: this calculation is right. I confirmed ("predicted value" -
        // "target value")^2 with an Excel spreadsheet.

        return d / static_cast<float_t>(y.size());
        // [Ichi]: divided by one??? When I output y.size() with std::cout,
        // it is "1". I'm not a skillful C++ programmer. I might be mistaken.
      }
      ...

  • How to use this library?

    I am pretty new to using C++ for deep neural networks. Could someone help with how to install this library? I have downloaded the zip and extracted it, but I can't seem to include the tiny_dnn.h file.

    I am using the Dev-C++ editor. Could someone tell me how to add this particular library to the additional libraries of Dev-C++?

  • Bug in average pooling.

    Not used very often, so I can see this oversight.

    I had a 20x1 data sample, and I used a 3x1 average pooling layer with a 1x1 stride. The result should be an 18x1 output, but I instead received a pooling_size_mismatch error. In this case, average_pooling_layer wanted the width and height of the data to be a multiple of the pooling window, which it is not in my case.

    However, due to the stride, the size was fine.
