Fast, GPU-based CSV parser

nvParse

Parsing CSV files with a GPU

Parsing delimiter-separated files is a common task in data processing. The usual way to extract the columns from a line of text is to use the strtok function:

char * p = strtok(line, "|");
while (p != NULL)
{
    printf ("%s\n",p);
    p = strtok (NULL, "|");
}

However, this method of parsing is limited by the CPU because

  • it doesn't take advantage of the multiple cores of modern CPUs.

  • it is constrained by the CPU's memory bandwidth.

This is how the same task can be done using a GPU:

auto break_cnt = thrust::count(d_readbuff.begin(), d_readbuff.end(), '\n');
thrust::device_vector<int> dev_pos(break_cnt);
thrust::copy_if(thrust::make_counting_iterator(0),
                thrust::make_counting_iterator(bytes_read-1),
                d_readbuff.begin(), dev_pos.begin(), _1 == '\n');

The first line counts the newline characters, and therefore the lines, in the buffer (assuming the file has been read into memory and copied to the GPU buffer d_readbuff). The second line creates a vector in GPU memory that will hold the positions of the newline characters. The last line compares each character in the buffer to the newline character and, where it matches, copies that character's position into the dev_pos vector; _1 is the placeholder from thrust::placeholders.

Now that we know where every line in the buffer ends, and therefore where the next one starts, we can launch a GPU procedure that parses the lines using several thousand GPU cores:
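To make the count-then-collect step concrete, here is a minimal host-side sketch of the same idea using only the C++ standard library. It is an analog for illustration, not the GPU code: the function name newline_positions is hypothetical, and unlike the snippet above (which stops at bytes_read-1) it scans every byte of the buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Host-side analog of the thrust::count + thrust::copy_if pair:
// returns the index of every '\n' in buf. (Hypothetical helper name.)
std::vector<int> newline_positions(const std::string& buf) {
    // Step 1: count the line breaks so the output can be sized up front.
    const auto break_cnt = std::count(buf.begin(), buf.end(), '\n');
    std::vector<int> pos;
    pos.reserve(static_cast<std::size_t>(break_cnt));
    // Step 2: keep each index whose character is a newline.
    for (std::size_t i = 0; i < buf.size(); ++i) {
        if (buf[i] == '\n') pos.push_back(static_cast<int>(i));
    }
    return pos;
}
```

On the GPU the two steps are separate calls because the output vector has to be allocated before copy_if can fill it in parallel.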

thrust::counting_iterator<unsigned int> begin(0);
parse_functor ff(...); // see test.cu for example parameters
thrust::for_each(begin, begin + break_cnt, ff);
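The real parse_functor lives in the repository and its parameters are not shown here. As a rough illustration of the kind of work each per-line invocation could do, the following host-side sketch extracts one column from one line, given the newline positions. The name field_of_line and its signature are assumptions made for this example only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-line worker: returns column `col` (0-based) of line `line`.
// `pos` holds the index of each '\n' in `buf`.
std::string field_of_line(const std::string& buf, const std::vector<int>& pos,
                          std::size_t line, std::size_t col, char delim = '|') {
    // A line starts right after the previous newline (or at 0 for the first line).
    const std::size_t start =
        (line == 0) ? 0 : static_cast<std::size_t>(pos[line - 1]) + 1;
    const std::size_t end = static_cast<std::size_t>(pos[line]);  // this line's '\n'
    std::size_t current = 0;            // column currently being scanned
    std::size_t field_begin = start;    // where that column starts
    for (std::size_t i = start; i <= end; ++i) {
        if (i == end || buf[i] == delim) {
            if (current == col) return buf.substr(field_begin, i - field_begin);
            ++current;
            field_begin = i + 1;
        }
    }
    return std::string();  // the line has fewer than col + 1 columns
}
```

Because each line depends only on its own start and end offsets, the lines can be processed independently, which is what lets thrust::for_each spread them across GPU threads.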

As a result, we get the needed columns in separate arrays in GPU memory. We can then copy them to host memory or convert them to binary values using the relevant GPU procedures:

gpu_atoll atoll_ff(...);
thrust::for_each(begin, begin + break_cnt, atoll_ff);
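The constructor arguments of gpu_atoll are elided above, so the sketch below does not reproduce it. It only illustrates, on the host, the per-field conversion such a functor might perform: turning a run of characters into a 64-bit integer. The name parse_ll is hypothetical, and overflow is not handled.

```cpp
#include <cstddef>

// Hypothetical stand-in for the per-field work of an atoll-style functor:
// converts `len` characters starting at `s` to a signed 64-bit integer.
long long parse_ll(const char* s, std::size_t len) {
    std::size_t i = 0;
    bool negative = false;
    // Optional leading sign.
    if (i < len && (s[i] == '-' || s[i] == '+')) {
        negative = (s[i] == '-');
        ++i;
    }
    long long value = 0;
    // Accumulate decimal digits; stop at the first non-digit (e.g. padding).
    for (; i < len && s[i] >= '0' && s[i] <= '9'; ++i) {
        value = value * 10 + (s[i] - '0');
    }
    return negative ? -value : value;
}
```

Taking an explicit length instead of relying on a terminating NUL suits fields that sit back to back in one large buffer.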

Benchmarks

Hardware: PC with one Intel i3-4130, 16GB of RAM, one 2TB hard drive and a GTX Titan

File: 750MB lineitem.tbl text file (6001215 lines)

Parsing 1 field using the CPU:

$ time cut -d "|" -f 6 lineitem.tbl > /dev/null

real 0m28.764s

Parsing 11 fields using a hand-written program with strtok (no threads, no memory-mapped file):

14.5s

Parsing 11 fields using the GPU:

$ time ./test

0.77s

The actual GPU parsing step takes just 0.25 seconds.

P.S. Thanks to Nicolas Guillemot for the suggestion on memory-mapping files.
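For a rough sense of scale, the timings above can be turned into throughput and speedup figures. With the 750MB file, the strtok program works out to about 52 MB/s, the full GPU run to about 974 MB/s (roughly 19 times faster), and the 0.25-second parsing step alone to about 3000 MB/s. The cut timing is not directly comparable, since it extracts 1 field rather than 11. The helper names below are made up for this calculation.

```cpp
// Throughput in MB/s for `megabytes` of input processed in `seconds`.
constexpr double throughput_mb_per_s(double megabytes, double seconds) {
    return megabytes / seconds;
}

// How many times faster the run taking `fast_seconds` is
// than the run taking `slow_seconds`.
constexpr double speedup(double slow_seconds, double fast_seconds) {
    return slow_seconds / fast_seconds;
}
```

For example, speedup(14.5, 0.77) compares the strtok program with the full GPU run, and throughput_mb_per_s(750.0, 0.25) gives the rate of the parsing step alone.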
