PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class.

PSTensor: A Customized Tensor Data Structure Compatible with PyTorch and TensorFlow

You may need this software in the following cases.

  1. Manage memory allocation yourself. The frameworks' built-in memory allocation can be frustrating: they rely on complicated caching-based allocators that can fragment memory.

  2. Unified, framework-agnostic memory management operations. For example, suppose you are developing a plugin for both PyTorch and TensorFlow that splits tensors across multiple GPUs before an operator and merges them afterward. Normally you have to write separate Python logic for PyTorch and for TensorFlow, and you cannot be sure the two versions are equally efficient. Alternatively, you can write the operators once in C++/CUDA and provide two sets of Python APIs, one for TensorFlow and one for PyTorch.

  3. Customized communication patterns. PyTorch's NCCL backend has historically supported only collective communication APIs, which makes GPU P2P communication hard to implement in PyTorch alone. With your own tensor class, you can implement it with the help of CUDA-level libraries.
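
To make the first case concrete, here is a minimal sketch of the kind of allocator you might write once you manage memory yourself: a bump (arena) allocator that carves aligned slices out of one preallocated buffer and frees them all at once, so no fragments accumulate. It is not PSTensor's API. The class name ArenaAllocator, its methods, and the 64-byte default alignment are hypothetical, and a host-side bytearray stands in for what would be a single large device allocation in C++/CUDA.

```python
class ArenaAllocator:
    """Hypothetical bump allocator over one preallocated buffer (illustration only)."""

    def __init__(self, capacity: int, alignment: int = 64) -> None:
        if alignment <= 0 or alignment & (alignment - 1):
            raise ValueError("alignment must be a positive power of two")
        # Assumption: a bytearray stands in for one large device allocation.
        self._buffer = bytearray(capacity)
        self._alignment = alignment
        self._offset = 0  # index of the next free byte

    def allocate(self, nbytes: int) -> memoryview:
        """Return a writable view of nbytes carved from the arena."""
        if nbytes < 0:
            raise ValueError("nbytes must be non-negative")
        # Round the current offset up to the next aligned boundary.
        start = (self._offset + self._alignment - 1) & ~(self._alignment - 1)
        end = start + nbytes
        if end > len(self._buffer):
            raise MemoryError("arena exhausted")
        self._offset = end
        return memoryview(self._buffer)[start:end]

    def reset(self) -> None:
        """Release every allocation at once; the underlying buffer is reused."""
        self._offset = 0

    @property
    def used(self) -> int:
        """Number of bytes consumed so far, including alignment padding."""
        return self._offset


arena = ArenaAllocator(capacity=1024)
first = arena.allocate(100)    # occupies bytes 0..99
second = arena.allocate(200)   # starts at the next 64-byte boundary (byte 128)
first[:3] = b"abc"             # views write directly into the shared buffer
arena.reset()                  # e.g. at the end of a training step
```

An arena like this trades flexibility for predictability: individual blocks cannot be freed, but the layout is fully determined by the order of requests, which suits workloads that repeat the same allocation pattern every step.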

Installation

mkdir build && cd build && cmake .. && make
pip install `find . -name "*whl"`

Usage

See the PyTorch example and the TensorFlow example for details. More features are a work in progress.

Owner
Jiarui Fang
Senior Software Engineer at Wechat @Tencent. Tsinghua University Ph.D.