The PULP Ara is a 64-bit Vector Unit, compatible with the RISC-V Vector Extension Version 0.9, working as a coprocessor to CORE-V's CVA6 core

Ara

Ara is a vector unit working as a coprocessor for the CVA6 core. It supports the RISC-V Vector Extension, version 0.9.

Dependencies

Check DEPENDENCIES.md for a list of hardware and software dependencies of Ara.

Supported instructions

Check FUNCTIONALITIES.md to see which instructions are currently supported by Ara.

Get started

Make sure you clone this repository recursively to get all the necessary submodules:

git submodule update --init --recursive

If the repository path of any submodule changes, run the following command to change your submodule's pointer to the remote repository:

git submodule sync --recursive
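
To confirm that every submodule was actually checked out, one option is to inspect the output of git submodule status --recursive, where git prefixes uninitialized submodules with a '-'. The helper below is a minimal sketch; the name count_uninitialized is hypothetical and not part of Ara's Makefiles.

```shell
# Count submodules that are not initialized yet.
# Reads `git submodule status --recursive` output on stdin; git marks
# uninitialized submodules with a leading '-'.
count_uninitialized() {
  # grep -c exits with 1 when nothing matches, so mask that exit code.
  grep -c '^-' || true
}

# Usage, from the repository root:
#   git submodule status --recursive | count_uninitialized
```

A result of 0 suggests that all submodules are initialized; anything else means the update command above should be run again.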

Toolchain

Ara requires a RISC-V GCC toolchain capable of understanding the vector extension, version 0.9.x.

To build this toolchain, run the following command in the project's root directory.

# Build the GCC toolchain
make toolchain
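
After the build finishes, a quick sanity check is to confirm that the compiler binary exists. The sketch below assumes the toolchain lands in install/riscv-gcc/bin/riscv64-unknown-elf-gcc under the repository root; that path is an assumption taken from a user-reported build log, not from the Makefile, and check_toolchain is a hypothetical helper name.

```shell
# Report whether the toolchain compiler exists under a given Ara checkout.
# The relative path is an assumption (seen in a build log), so adjust it
# if your installation differs.
check_toolchain() {
  gcc_bin="$1/install/riscv-gcc/bin/riscv64-unknown-elf-gcc"
  if [ -x "$gcc_bin" ]; then
    echo "found: $gcc_bin"
  else
    echo "missing: $gcc_bin"
    return 1
  fi
}

# Usage, from the repository root:
#   check_toolchain "$(pwd)"
```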

Verilator

Ara requires an updated version of Verilator, for RTL simulations.

To build it, run the following command in the project's root directory.

# Build Verilator
make verilator

Configuration

Ara's parameters are centralized in the config folder, in the config.mk file. Please check config/README.md for more details.

Software

Build Applications

The apps folder contains example applications that run on Ara. Run the following commands to build an application, e.g., hello_world:

cd apps
make bin/hello_world

RISC-V Tests

The apps folder also contains the RISC-V tests repository, including a few unit tests for the vector instructions. Run the following command to build the unit tests:

cd apps
make riscv_tests

RTL Simulation

To simulate the Ara system with ModelSim, go to the hardware folder, which contains all the SystemVerilog files. Use the following commands to run your simulation:

# Go to the hardware folder
cd hardware
# Apply the patches (only need to run this once)
make apply-patches
# Only compile the hardware without running the simulation.
make build
# Run the simulation with the *hello_world* binary loaded
app=hello_world make sim
# Run the simulation with the *some_binary* binary. This allows specifying the full path to the binary
preload=/some_path/some_binary make sim
# Run the simulation without starting the gui
app=hello_world make simc
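
The choice between app= and preload= can be wrapped in a small helper. This is a minimal sketch, not part of Ara's Makefiles: the function name sim_cmd is hypothetical, and it assumes that any argument containing a slash is a path to a binary. It only prints the command, so nothing is simulated until the printed line is run.

```shell
# Print the make invocation for a simulation target.
# $1: application name (e.g. hello_world) or path to a binary
# $2: make target, defaults to sim
sim_cmd() {
  target="${2:-sim}"
  case "$1" in
    */*) echo "preload=$1 make $target" ;;  # looks like a path
    *)   echo "app=$1 make $target" ;;      # plain application name
  esac
}

# Usage, from the hardware folder:
#   sim_cmd hello_world simc
#   sim_cmd /some_path/some_binary
```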

It is also possible to simulate the unit tests compiled in the apps folder. Given the number of unit tests, we use Verilator. Use the following commands to install Verilator, verilate the design, and run the simulation:

# Go to the hardware folder
cd hardware
# Apply the patches (only need to run this once)
make apply-patches
# Verilate the design
make verilate
# Run the tests
make riscv_tests_simv

Alternatively, you can also use the riscv_tests target at Ara's top-level Makefile to both compile the RISC-V tests and run their simulation.

Publication

If you want to use Ara, you can cite us:

@Article{Ara2020,
  author = {Matheus Cavalcante and Fabian Schuiki and Florian Zaruba and Michael Schaffner and Luca Benini},
  journal= {IEEE Transactions on Very Large Scale Integration (VLSI) Systems},
  title  = {Ara: A 1-GHz+ Scalable and Energy-Efficient RISC-V Vector Processor With Multiprecision Floating-Point Support in 22-nm FD-SOI},
  year   = {2020},
  volume = {28},
  number = {2},
  pages  = {530-543},
  doi    = {10.1109/TVLSI.2019.2950087}
}
Comments
  • Kernels update

    Merge https://github.com/pulp-platform/ara/pull/81 and https://github.com/pulp-platform/ara/pull/101 before this one.

    Add baseline Jacobi2d, Dropout, Convolution3D benchmark

    The convolution is now defined by its data type and its dimensions. fconv3d, for example, processes double-precision floating-point data, using 3D filters with depth ch (channels): (i*i*ch) ∗ (f*f*ch) = (o*o)
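
    Read under the assumption of unit stride and no implicit padding (neither is stated here, and the index layout is illustrative only), the shape relation above would give an output side of o = i - f + 1, with each output element summing over all channels:

    ```latex
    O[r][c] \;=\; \sum_{k=0}^{ch-1} \sum_{u=0}^{f-1} \sum_{v=0}^{f-1}
        I[k][r+u][c+v] \cdot F[k][u][v],
    \qquad 0 \le r, c < o, \quad o = i - f + 1
    ```

    With ch = 1 the outer sum disappears and the expression reduces to the 2D case.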

    Even if fconv3d is parameterized on the number of channels and can also be used with ch = 1 becoming a fconv2d, the code for fconv2d is kept since it is more optimized for that particular case.

    fconv3d: F = {7}, optimized with an enhanced algorithm

    fconv2d: F = {3, 7}. F == 3 is optimized, F == 7 is optimized with an enhanced algorithm

    iconv2d: F = {3, 5, 7}. F == 3 is optimized, F == 7 is optimized with an enhanced algorithm. F == 5 is not optimized

    We will support and optimize the other filter sizes in the future.

    The roofline plots for the convolutions are produced with the following parameters: iconv2d with F = 3, fconv2d with F = 3, fconv3d with F = 7.

    Changelog

    Fixed

    • Generate data.S files before compiling the programs
    • Clean intermediate app object files with make clean
    • Add a fence before stopping the cycle counter, to let the last vector store complete

    Added

    • Add fconv3d kernel, optimized for 7x7 filters
    • Optimize fconv2d and iconv2d kernels for 3x3 filters
    • Add convolutions to the benchmark app, and print the related roofline plots
    • Add corner case test to vslidedown instruction

    Changed

    • Update README with instructions on how to compile convolutions
    • Refactor benchmark app
    • Double the testbench memory size
    • Update the python-requirements list

    Checklist

    • [x] Automated tests pass
    • [x] Changelog updated
    • [x] Code style guideline is observed

    Please check our contributing guidelines before opening a Pull Request.

  • "make verilator" fails when "CC=$(CLANG_CC) CXX=$(CLANG_CXX) CXXFLAGS=$(CLANG_CXXFLAGS) LDFLAGS=$(CLANG_LDFLAGS) \" is configured

    Failure output:

    In file included from ../V3Combine.cpp:27:
    ../V3DupFinder.h:50:5: error: constructor for 'V3DupFinder' must explicitly initialize the const member 'm_hasher'
        V3DupFinder(){};
        ^
    ../V3DupFinder.h:46:20: note: 'm_hasher' declared here
        const V3Hasher m_hasher;
                       ^
    1 error generated.
    ../Makefile_obj:297: recipe for target 'V3Combine.o' failed
    make[3]: *** [V3Combine.o] Error 1

    But if I delete "CC=$(CLANG_CC) CXX=$(CLANG_CXX) CXXFLAGS=$(CLANG_CXXFLAGS) LDFLAGS=$(CLANG_LDFLAGS) ", all versions of Verilator compile successfully!

    Best Wishes!

  • make bin/hello_world failed (library not found)

    Hi, the libgloss library cannot be found when compiling hello_world. Adding the toolchain install directory to the PATH does not help; the same problem persists.

    [email protected]:/share/zhuxuanlong/Vector_Work/ara/apps# make bin/hello_world
    chmod +x /share/zhuxuanlong/Vector_Work/ara/apps/common/script/align_sections.sh
    rm -f /share/zhuxuanlong/Vector_Work/ara/apps/common/link.ld && cp /share/zhuxuanlong/Vector_Work/ara/apps/common/arch.link.ld /share/zhuxuanlong/Vector_Work/ara/apps/common/link.ld
    /share/zhuxuanlong/Vector_Work/ara/apps/common/script/align_sections.sh 4 /share/zhuxuanlong/Vector_Work/ara/apps/common/link.ld
    /share/zhuxuanlong/Vector_Work/ara/install/riscv-llvm/bin/clang -march=rv64gcv0p10 -mabi=lp64d -menable-experimental-extensions -mno-relax -fuse-ld=lld -mcmodel=medany -I/share/zhuxuanlong/Vector_Work/ara/apps/common -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=4 -Wunused-variable -Wall -Wextra -Wno-unused-command-line-argument -c hello_world/main.c -o hello_world/main.c.o
    /share/zhuxuanlong/Vector_Work/ara/install/riscv-llvm/bin/clang -march=rv64gcv0p10 -mabi=lp64d -menable-experimental-extensions -mno-relax -fuse-ld=lld -mcmodel=medany -I/share/zhuxuanlong/Vector_Work/ara/apps/common -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=4 -Wunused-variable -Wall -Wextra -Wno-unused-command-line-argument -c common/crt0.S -o common/crt0-llvm.S.o
    /share/zhuxuanlong/Vector_Work/ara/install/riscv-llvm/bin/clang -march=rv64gcv0p10 -mabi=lp64d -menable-experimental-extensions -mno-relax -fuse-ld=lld -mcmodel=medany -I/share/zhuxuanlong/Vector_Work/ara/apps/common -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=4 -Wunused-variable -Wall -Wextra -Wno-unused-command-line-argument -c common/printf.c -o common/printf-llvm.c.o
    /share/zhuxuanlong/Vector_Work/ara/install/riscv-llvm/bin/clang -march=rv64gcv0p10 -mabi=lp64d -menable-experimental-extensions -mno-relax -fuse-ld=lld -mcmodel=medany -I/share/zhuxuanlong/Vector_Work/ara/apps/common -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=4 -Wunused-variable -Wall -Wextra -Wno-unused-command-line-argument -c common/string.c -o common/string-llvm.c.o
    /share/zhuxuanlong/Vector_Work/ara/install/riscv-llvm/bin/clang -march=rv64gcv0p10 -mabi=lp64d -menable-experimental-extensions -mno-relax -fuse-ld=lld -mcmodel=medany -I/share/zhuxuanlong/Vector_Work/ara/apps/common -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=4 -Wunused-variable -Wall -Wextra -Wno-unused-command-line-argument -c common/serial.c -o common/serial-llvm.c.o
    mkdir -p bin/
    /share/zhuxuanlong/Vector_Work/ara/install/riscv-llvm/bin/clang -Iinclude -march=rv64gcv0p10 -mabi=lp64d -menable-experimental-extensions -mno-relax -fuse-ld=lld -mcmodel=medany -I/share/zhuxuanlong/Vector_Work/ara/apps/common -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=4 -Wunused-variable -Wall -Wextra -Wno-unused-command-line-argument -o bin/hello_world hello_world/main.c.o common/crt0-llvm.S.o common/printf-llvm.c.o common/string-llvm.c.o common/serial-llvm.c.o -static -nostartfiles -lm -T/share/zhuxuanlong/Vector_Work/ara/apps/common/link.ld
    ld.lld: error: unable to find library -lgloss
    clang-13: error: ld command failed with exit code 1 (use -v to see invocation)
    make: *** [Makefile:59: bin/hello_world] Error 1
    rm hello_world/main.c.o common/string-llvm.c.o common/crt0-llvm.S.o common/printf-llvm.c.o common/serial-llvm.c.o

    Thank you.

  • Stuck at the compile flow `make riscv_tests_simv`

    Hi @mp-17 @suehtamacv, when I run make riscv_tests_simv as described in the README, my terminal hangs with no new output for a long time (a few hours).

    (base) ➜ hardware git:(main) ✗ make riscv_tests_simv
    build/verilator/Vara_tb_verilator -l ram,/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd,elf &> build/rv64uv-ara-vadd.trace

    I checked the build/rv64uv-ara-vadd.trace file several times. Its contents, listed below, also stay unchanged for a long time.

    Program header number 0 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' low is 80000000
    Program header number 0 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' high is 80004179
    Program header number 1 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' high is 80004877
    Program header number 2 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' high is 80004b17
    Program header number 3 in `/home/fantasysee/Projects/ara/apps/bin/rv64uv-ara-vadd' is not of type PT_LOAD; ignoring.
    Set `ram TOP.ara_tb_verilator.dut.i_ara_soc.i_dram 10 0x80000000 0x80000 write with offset: 0x0 write with size: 0x4b18
    Simulation of Ara
    =================
    
    
    Simulation running, end by pressing CTRL-c.
    
    

    Note that my QuestaSim version is Mentor Graphics QuestaSim 10.6c instead of Mentor Graphics QuestaSim 2020.1. I only created a soft link that pretends to be version 2020.1 and did not modify hardware/Makefile.

    Is this behavior normal? If so, could you please tell me roughly how long this process takes? If not, could you please help me check whether something is wrong with my environment?

    Thanks in advance!!!

  • Verilator Simulation Error

    When I run this command:

    ~/ara/hardware$make apply-patches 
    

    I face this error:

    Makefile:62: "Specified QuestaSim version (questa-2020.1) not found in PATH /home/hpc-user/xilinx/Vivado/2016.2/bin:/home/hpc-user/intelFPGA_pro/21.2/modelsim_ase/bin:/usr/bin/sbt:/home/hpc-user/riscv-gnu-toolchain/build/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
    cd deps/tech_cells_generic && git apply ../../patches/0001-tech-cells-generic-sram.patch
    error: patch failed: src/rtl/tc_sram.sv:124
    error: src/rtl/tc_sram.sv: patch does not apply
    make: *** [Makefile:101: apply-patches] Error 1
    

    I ignored this error, and I ran the next one:

    ~/ara/hardware$make verilate
    

    Again, I face this error:

    Makefile:62: "Specified QuestaSim version (questa-2020.1) not found in PATH /home/hpc-user/xilinx/Vivado/2016.2/bin:/home/hpc-user/intelFPGA_pro/21.2/modelsim_ase/bin:/usr/bin/sbt:/home/hpc-user/riscv-gnu-toolchain/build/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
    mkdir -p build
    rm -rf build/verilator; mkdir -p build/verilator
    ./bender script verilator -t rtl -t ara_test -t cva6_test -t verilator --define NR_LANES=4 --define VLEN=4096 --define RVV_ARIANE=1 > build/verilator/bender_script_default
    bash: ./bender: No such file or directory
    make: *** [Makefile:145: build/verilator/Vara_tb_verilator] Error 127
    make: *** Waiting for unfinished jobs....
    Successfully installed bender 0.21.0 in '/home/hpc-user/ara/hardware'.
    bender 0.21.0 available.
    

    The second time I run this command, I get:

    Makefile:62: "Specified QuestaSim version (questa-2020.1) not found in PATH /home/hpc-user/xilinx/Vivado/2016.2/bin:/home/hpc-user/intelFPGA_pro/21.2/modelsim_ase/bin:/usr/bin/sbt:/home/hpc-user/riscv-gnu-toolchain/build/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
    rm -rf build/verilator; mkdir -p build/verilator
    ./bender script verilator -t rtl -t ara_test -t cva6_test -t verilator --define NR_LANES=4 --define VLEN=4096 --define RVV_ARIANE=1 > build/verilator/bender_script_default
    /home/hpc-user/ara/install/verilator/bin/verilator -f build/verilator/bender_script_default           \
      -GNrLanes=4                                                         \
      -O3                                                                           \
      -Wno-BLKANDNBLK                                                               \
      -Wno-CASEINCOMPLETE                                                           \
      -Wno-CMPCONST                                                                 \
      -Wno-LATCH                                                                    \
      -Wno-LITENDIAN                                                                \
      -Wno-UNOPTFLAT                                                                \
      -Wno-UNPACKED                                                                 \
      -Wno-UNSIGNED                                                                 \
      -Wno-WIDTH                                                                    \
      -Wno-WIDTHCONCAT                                                              \
      --hierarchical                                                                \
      tb/verilator/waiver.vlt                                                       \
      --Mdir build/verilator                                                       \
      -Itb/dpi                                                                      \
      --compiler clang                                                              \
      -CFLAGS "-DTOPLEVEL_NAME=ara_tb_verilator"                                        \
      -CFLAGS "-DNR_LANES=4"                                              \
      -CFLAGS -I/home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp       \
      -CFLAGS -I/home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp \
      -CFLAGS -I/home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp \
      ""                                                             \
      -LDFLAGS "-lelf"                                                              \
      ""                                                              \
      --exe                                                                         \
      /home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp/*.cc            \
      /home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp/*.cc      \
      /home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp/*.cc      \
      /home/hpc-user/ara/hardware/tb/verilator/ara_tb.cpp                                           \
      --cc                                                                          \
      --top-module ara_tb_verilator &&                                                  \
    cd build/verilator && OBJCACHE='' make -j4 -f Vara_tb_verilator.mk
    %Error: Verilator internal fault, sorry. Suggest trying --debug --gdbbt
    %Error: Command Failed /home/hpc-user/ara/install/verilator/bin/verilator_bin -f build/verilator/bender_script_default -GNrLanes=4 -O3 -Wno-BLKANDNBLK -Wno-CASEINCOMPLETE -Wno-CMPCONST -Wno-LATCH -Wno-LITENDIAN -Wno-UNOPTFLAT -Wno-UNPACKED -Wno-UNSIGNED -Wno-WIDTH -Wno-WIDTHCONCAT --hierarchical tb/verilator/waiver.vlt --Mdir build/verilator -Itb/dpi --compiler clang -CFLAGS -DTOPLEVEL_NAME=ara_tb_verilator -CFLAGS -DNR_LANES=4 -CFLAGS -I/home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp -CFLAGS -I/home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp -CFLAGS -I/home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp  -LDFLAGS -lelf  --exe /home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp/dpi_memutil.cc /home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp/sv_scoped.cc /home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp/verilator_memutil.cc /home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp/verilated_toplevel.cc /home/hpc-user/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp/verilator_sim_ctrl.cc /home/hpc-user/ara/hardware/tb/verilator/ara_tb.cpp --cc --top-module ara_tb_verilator
    make: *** [Makefile:146: build/verilator/Vara_tb_verilator] Error 255
    

    The version of the Verilator is 4.210.

  • RTL simulation getting failed with Verilator

    I am trying to run the Makefile target make verilate, but I get the following errors with the default settings.

    $ make verilate 
    Makefile:43: "Specified QuestaSim version (questa-2020.1) not found in PATH 
    rm -rf build/verilator; mkdir -p build/verilator
    ./bender script verilator -t rtl -t ara_test -t cva6_test -t verilator --define NR_LANES=4 --define VLEN=4096 --define RVV_ARIANE=1 > build/verilator/bender_script
    /home/ara/install/verilator/bin/verilator -f build/verilator/bender_script                     \
      -GNrLanes=4                                                         \
      -O3                                                                           \
      -Wno-BLKANDNBLK                                                               \
      -Wno-CASEINCOMPLETE                                                           \
      -Wno-CMPCONST                                                                 \
      -Wno-LITENDIAN                                                                \
      -Wno-MODDUP                                                                   \
      -Wno-PINMISSING                                                               \
      -Wno-SYMRSVDWORD                                                              \
      -Wno-UNOPTFLAT                                                                \
      -Wno-UNPACKED                                                                 \
      -Wno-UNSIGNED                                                                 \
      -Wno-WIDTH                                                                    \
      -Wno-WIDTHCONCAT                                                              \
      --Mdir build/verilator --trace                                               \
      -Itb/dpi                                                                      \
      -CFLAGS "-std=c++11 -Wall -DTOPLEVEL_NAME=ara_tb_verilator"                       \
      -CFLAGS "-DNR_LANES=4"                                              \
      -CFLAGS -I/home/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp       \
      -CFLAGS -I/home/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp \
      -CFLAGS -I/home/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp \
      -LDFLAGS "-lelf"                                                              \
      --exe                                                                         \
      /home/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp/*.cc            \
      /home/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp/*.cc      \
      /home/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp/*.cc      \
      /home/ara/hardware/tb/verilator/ara_tb.cpp                                           \
      --cc                                                                          \
      --top-module ara_tb_verilator &&                                                  \
    cd build/verilator && OBJCACHE='' make -j4 -f Vara_tb_verilator.mk
    %Error: /home/ara/hardware/deps/tech_cells_generic/src/rtl/tc_sram.sv:93:38: Unsupported or unknown PLI call: $urandom
       93 |           "random": init_val[i][j] = $urandom();
          |                                      ^~~~~~~~
    %Error: /home/ara/hardware/tb/ara_tb.sv:30:28: syntax error, unexpected TIME NUMBER, expecting TYPE-IDENTIFIER
       30 |   localparam ClockPeriod = 1ns;
          |                            ^~~
    %Warning-STMTDLY: /home/ara/hardware/tb/ara_tb.sv:48:10: Unsupported: Ignoring delay on this delayed statement.
       48 |   always #(ClockPeriod/2) clk = !clk;
          |          ^
                      ... Use "/* verilator lint_off STMTDLY */" and lint_on around source to disable this message.
    %Warning-STMTDLY: /home/ara/hardware/tb/ara_tb.sv:56:7: Unsupported: Ignoring delay on this delayed statement.
       56 |       #(ClockPeriod);
          |       ^
    %Warning-STMTDLY: /home/ara/hardware/tb/ara_tb.sv:98:7: Unsupported: Ignoring delay on this delayed statement.
       98 |       #ClockPeriod;
          |       ^
    %Error: Exiting due to 2 error(s), 3 warning(s)
            ... See the manual and https://verilator.org for more assistance.
    Makefile:123: recipe for target 'build/verilator/Vara_tb_verilator' failed
    make: *** [build/verilator/Vara_tb_verilator] Error 1
    

    What could be the possible cause for that?

  • Error when compiling hello_world

    I'm trying to run make bin/hello_world, but I get the following errors:

    make bin/hello_world
    /home/workspace/pulp/ara_lenovo_unix/install/riscv-gcc/bin/riscv64-unknown-elf-gcc -mcmodel=medany -march=rv64gcv -mabi=lp64 -I/home/workspace/pulp/ara_lenovo_unix/apps/common -static -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=4 -Wunused-variable -Wall -Wextra -c hello_world/main.c -o hello_world/main.c.o
    /home/workspace/pulp/ara_lenovo_unix/install/riscv-gcc/bin/riscv64-unknown-elf-gcc -mcmodel=medany -march=rv64gcv -mabi=lp64 -I/home/workspace/pulp/ara_lenovo_unix/apps/common -static -std=gnu99 -O3 -ffast-math -fno-common -fno-builtin-printf -DNR_LANES=4 -Wunused-variable -Wall -Wextra -c common/crt0.S -o common/crt0.S.o
    common/encoding.h: Assembler messages:
    common/encoding.h:1: Error: unknown pseudo-op: `..'
    common/crt0.S:80: Error: illegal operands `li t0,PMP_NAPOT|PMP_R|PMP_W|PMP_X'
    common/crt0.S:92: Error: illegal operands `li t0,(1<<CAUSE_LOAD_PAGE_FAULT)|(1<<CAUSE_STORE_PAGE_FAULT)|(1<<CAUSE_FETCH_PAGE_FAULT)|(1<<CAUSE_MISALIGNED_FETCH)|(1<<CAUSE_USER_ECALL)|(1<<CAUSE_BREAKPOINT)'
    common/crt0.S:102: Error: illegal operands `li t0,(MSTATUS_FS&(MSTATUS_FS>>1))'
    common/crt0.S:106: Error: illegal operands `li t0,(MSTATUS_VS&(MSTATUS_VS>>1))'
    /home/workspace/pulp/ara_lenovo_unix/apps/common/runtime.mk:66: recipe for target 'common/crt0.S.o' failed
    make: *** [common/crt0.S.o] Error 1
    rm hello_world/main.c.o

    What could be the reason for this?

  • ara core hangs with 2 lanes and 4 lanes configuration

    We saw that the same program (binary) runs fine on the 8-lane and 16-lane configurations but hangs on the 2-lane and 4-lane configurations. In the waveform, the PC stops advancing when the hang occurs. The hang is not triggered by any particular vector instruction; it seems to be caused by a mix of scalar and vector instructions. We tested on the latest commits and saw the same result.

    Is this a known issue, and do you need a minimal sequence that reproduces it?

  • Hotfixes

    Priority PR - No dependencies on other PRs

    Description of the fixes in the commits.

    Changelog

    Fixed

    • AXI transactions on an opposite channel w.r.t. the channel currently in use are started only after the completion of the previous transactions.
    • Fix the number of elements to be requested for a vslidedown instruction.

    Changed

    • Cut a timing-critical path from Addrgen to Sequencer (1 cycle more to start an AXI transaction)
    • Cut a timing-critical path in the VSTU, relative to the calculation of the pointer to the VRF word received from the lanes

    Checklist

    • [ ] Automated tests pass
    • [x] Changelog updated
    • [x] Code style guideline is observed
  • make toolchain-gcc failed

    I followed the guidelines in the README, and when I reached the step make riscv_tests

    it seems that I need to run make toolchain-gcc first [screenshot: Feishu 20211113-122207]

    When I run make toolchain-gcc, another problem appears [screenshot: Feishu 20211113-130512]

    The log shows several similar errors like this: "cannot stat 'xxx.gmo': No such file or directory" [screenshot: Feishu 20211113-131020]

    How should I solve this problem?

  • Running RTL simulation with make verilate reports "No such file or directory"

    Hi, I am trying to run the Makefile, but there were some errors:

    ~/riscv/ara/hardware$ make verilate
    Makefile:43: "Specified QuestaSim version (questa-2020.1) not found in PATH /home/wu/riscv/gcc/riscv-unknown-elf-gcc/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
    rm -rf build/verilator; mkdir -p build/verilator
    ./bender script verilator -t rtl -t ara_test -t cva6_test -t verilator --define NR_LANES=4 --define VLEN=4096 --define RVV_ARIANE=1 > build/verilator/bender_script
    /home/wu/riscv/ara/install/verilator/bin/verilator -f build/verilator/bender_script
    -GNrLanes=4
    -O3
    -Wno-BLKANDNBLK
    -Wno-CASEINCOMPLETE
    -Wno-CMPCONST
    -Wno-LITENDIAN
    -Wno-MODDUP
    -Wno-PINMISSING
    -Wno-SYMRSVDWORD
    -Wno-UNOPTFLAT
    -Wno-UNPACKED
    -Wno-UNSIGNED
    -Wno-WIDTH
    -Wno-WIDTHCONCAT
    --Mdir build/verilator --trace
    -Itb/dpi
    -CFLAGS "-std=c++11 -Wall -DTOPLEVEL_NAME=ara_tb_verilator"
    -CFLAGS "-DNR_LANES=4"
    -CFLAGS -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp
    -CFLAGS -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp
    -CFLAGS -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp
    -LDFLAGS "-lelf"
    --exe
    /home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp/*.cc
    /home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp/*.cc
    /home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp/*.cc
    /home/wu/riscv/ara/hardware/tb/verilator/ara_tb.cpp
    --cc
    --top-module ara_tb_verilator &&
    cd build/verilator && OBJCACHE='' make -j4 -f Vara_tb_verilator.mk
    make[1]: Entering directory '/home/wu/riscv/ara/hardware/build/verilator'
    g++ -I. -MMD -I/home/wu/riscv/ara/install/verilator/share/verilator/include -I/home/wu/riscv/ara/install/verilator/share/verilator/include/vltstd -DVM_COVERAGE=0 -DVM_SC=0 -DVM_TRACE=1 -DVM_TRACE_FST=0 -faligned-new -fcf-protection=none -Wno-bool-operation -Wno-sign-compare -Wno-uninitialized -Wno-unused-but-set-variable -Wno-unused-parameter -Wno-unused-variable -Wno-shadow -std=c++11 -Wall -DTOPLEVEL_NAME=ara_tb_verilator -DNR_LANES=4 -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp -std=gnu++14 -Os -c -o ara_tb.o /home/wu/riscv/ara/hardware/tb/verilator/ara_tb.cpp
    g++ -I. -MMD -I/home/wu/riscv/ara/install/verilator/share/verilator/include -I/home/wu/riscv/ara/install/verilator/share/verilator/include/vltstd -DVM_COVERAGE=0 -DVM_SC=0 -DVM_TRACE=1 -DVM_TRACE_FST=0 -faligned-new -fcf-protection=none -Wno-bool-operation -Wno-sign-compare -Wno-uninitialized -Wno-unused-but-set-variable -Wno-unused-parameter -Wno-unused-variable -Wno-shadow -std=c++11 -Wall -DTOPLEVEL_NAME=ara_tb_verilator -DNR_LANES=4 -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp -std=gnu++14 -Os -c -o dpi_memutil.o /home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp/dpi_memutil.cc
    g++ -I. -MMD -I/home/wu/riscv/ara/install/verilator/share/verilator/include -I/home/wu/riscv/ara/install/verilator/share/verilator/include/vltstd -DVM_COVERAGE=0 -DVM_SC=0 -DVM_TRACE=1 -DVM_TRACE_FST=0 -faligned-new -fcf-protection=none -Wno-bool-operation -Wno-sign-compare -Wno-uninitialized -Wno-unused-but-set-variable -Wno-unused-parameter -Wno-unused-variable -Wno-shadow -std=c++11 -Wall -DTOPLEVEL_NAME=ara_tb_verilator -DNR_LANES=4 -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp -std=gnu++14 -Os -c -o sv_scoped.o /home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp/sv_scoped.cc
    g++ -I. -MMD -I/home/wu/riscv/ara/install/verilator/share/verilator/include -I/home/wu/riscv/ara/install/verilator/share/verilator/include/vltstd -DVM_COVERAGE=0 -DVM_SC=0 -DVM_TRACE=1 -DVM_TRACE_FST=0 -faligned-new -fcf-protection=none -Wno-bool-operation -Wno-sign-compare -Wno-uninitialized -Wno-unused-but-set-variable -Wno-unused-parameter -Wno-unused-variable -Wno-shadow -std=c++11 -Wall -DTOPLEVEL_NAME=ara_tb_verilator -DNR_LANES=4 -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp -I/home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_simutil_verilator/cpp -std=gnu++14 -Os -c -o verilator_memutil.o /home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_verilator/cpp/verilator_memutil.cc
    /home/wu/riscv/ara/hardware/tb/verilator/lowrisc_dv_verilator_memutil_dpi/cpp/dpi_memutil.cc:11:10: fatal error: libelf.h: No such file or directory
       11 | #include <libelf.h>
          |          ^~~~~~~~~~
    compilation terminated.
    make[1]: *** [Vara_tb_verilator.mk:75: dpi_memutil.o] Error 1
    make[1]: *** Waiting for unfinished jobs....
    make[1]: Leaving directory '/home/wu/riscv/ara/hardware/build/verilator'
    make: *** [Makefile:127: build/verilator/Vara_tb_verilator] Error 2

    thanks

  • Hotfix for misaligned memory ops with more than 255 beats

    Hotfix for addrgen.sv. The aligned_end_addr signal should consider the misalignment by adding AxiDataWidth/8 only at the very last burst.
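
To make the beat arithmetic concrete, here is a minimal Python model of a misaligned transfer split into bursts. It is a sketch under stated assumptions, not the logic in addrgen.sv: the function name `split_into_bursts`, the 128-bit default data width, and the 256-beat burst limit (the AXI4 INCR maximum) are illustrative choices, and the AXI 4 KiB boundary rule is ignored. The end address is rounded up to a beat boundary once for the whole transfer, so the extra beat caused by the misalignment shows up only in the last burst.

```python
def split_into_bursts(addr, length, axi_data_width=128, max_beats=256):
    """Split a byte transfer into (aligned_start_addr, beats) bursts.

    Hypothetical model: names and default values are assumptions made
    for illustration and are not taken from addrgen.sv.
    """
    bytes_per_beat = axi_data_width // 8

    # Align the start address down to a beat boundary.
    aligned_start = addr - (addr % bytes_per_beat)

    # Round the exclusive end address up to a beat boundary. This is done
    # once for the whole transfer, so only the final burst grows.
    end = addr + length
    aligned_end = -(-end // bytes_per_beat) * bytes_per_beat

    remaining = (aligned_end - aligned_start) // bytes_per_beat
    bursts = []
    start = aligned_start
    while remaining > 0:
        beats = min(remaining, max_beats)
        bursts.append((start, beats))
        start += beats * bytes_per_beat
        remaining -= beats
    return bursts


# 4096 bytes on a 16-byte bus: aligned vs. starting 8 bytes into a beat.
aligned = split_into_bursts(addr=0, length=4096)
misaligned = split_into_bursts(addr=8, length=4096)
```

In this model the aligned transfer fits in a single 256-beat burst, while the misaligned one needs 257 beats and therefore a second, one-beat burst.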

    Changelog

    Fixed

    • Fix misaligned memory operations with more than 255 beats (>= 256 beats)

    Checklist

    • [x] Automated tests pass
    • [x] Changelog updated
    • [x] Code style guideline is observed
  • [hardware] :arrow-up: CVA6 update: half$, FPU, bugs

    Update CVA6 and FPU.

    CVA6's L1 caches are now half their previous size. In the future, we will probably go for the same size, but with cache lines twice as wide and half as many of them. We fixed a de-synchronization bug in the FPU and added a classify feature that more closely resembles the one required by the vector specification. We also fixed CVA6's tracking of committed floating-point registers for the case in which Ara answers back with a scalar floating-point value.

    Changelog

    Fixed

    • Fixed de-synch bug in vector-FPU
    • CVA6 tracks writes to floating-point scalar registers by the accelerator

    Changed

    • Halve CVA6's L1 caches to ease backend timing closure
    • Remove CVA6's cache-patch

    Checklist

    • [x] Automated tests pass
    • [x] Changelog updated
    • [x] Code style guideline is observed
  • [hardware] Floating-Point Reductions

    This PR depends on https://github.com/pulp-platform/ara/pull/133

    This PR introduces FP-reduction support to Ara.

    Thanks to @xiaorui-yin for the great work and effort!

    Changelog

    Added

    • Support for vector single-width floating-point reduction instructions: vfredusum, vfredosum, vfredmin, vfredmax
    • Support for vector widening floating-point reduction instructions: vfwredusum, vfwredosum
    • fdotproduct benchmark to evaluate dot products with reductions
    • fredsum benchmark to evaluate fp-reductions
    • riscv-tests for vfredusum, vfredosum, vfredmin, vfredmax, vfwredusum, vfwredosum
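
The ordered and unordered sums listed above can give different results because floating-point addition is not associative: vfredosum must accumulate in element order, whereas vfredusum leaves the association to the implementation. The Python sketch below illustrates the effect with double-precision values. The function names are hypothetical, and the pairwise tree is only one possible association, not a description of how Ara reduces.

```python
def ordered_sum(values, init=0.0):
    """Accumulate strictly in element order, as an ordered reduction does."""
    acc = init
    for v in values:
        acc = acc + v
    return acc


def tree_sum(values):
    """Pairwise (tree) reduction: one association an unordered sum may use."""
    vals = list(values)
    if not vals:
        return 0.0
    while len(vals) > 1:
        if len(vals) % 2:
            vals.append(0.0)  # pad odd-length levels with the additive identity
        vals = [vals[i] + vals[i + 1] for i in range(0, len(vals), 2)]
    return vals[0]


# The small addends are absorbed differently depending on the association.
data = [1e16, 1.0, -1e16, 1.0]
in_order = ordered_sum(data)
pairwise = tree_sum(data)
```

Here the in-order sum keeps the final 1.0, while the pairwise sum loses both small addends to rounding, so tests for the unordered instructions usually need a tolerance or inputs that sum exactly.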

    Changed

    • Updated target -march to rv64gcv_zfh_zvfh0p1 to enable half-floats support

    Checklist

    • [x] Automated tests pass
    • [x] Changelog updated
    • [x] Code style guideline is observed
    • [x] No frequency degradation
  • [hardware] :bug: Fix reductions + Rework the VALU

    The previous mechanism for handling the commit during a reduction was confusing and led to bugs. Now, a reduction triggers its commit only after the inter-lane phase is over. Also, some recurring lines of code have been grouped into macros.

    Changelog

    Fixed

    • Description of changes

    Added

    • Description of changes

    Changed

    • Description of changes

    Checklist

    • [x] Automated tests pass
    • [x] Changelog updated
    • [x] Code style guideline is observed
    • [ ] No frequency degradation
  • Check whether we can access vs1 and vs2 in VMADC/VMSBC

    In ara_dispatcher, when a VMADC/VMSBC instruction is decoded, you check whether vs1 and vs2 are accessible using:

        unique case (ara_req_d.emul)
          LMUL_2: if ((insn.varith_type.rs2 & 5'b00001) == (insn.varith_type.rd & 5'b00001)) illegal_insn = 1'b1;
          LMUL_4: if ((insn.varith_type.rs2 & 5'b00011) == (insn.varith_type.rd & 5'b00011)) illegal_insn = 1'b1;
          LMUL_8: if ((insn.varith_type.rs2 & 5'b00111) == (insn.varith_type.rd & 5'b00111)) illegal_insn = 1'b1;
          default: if (insn.varith_type.rs2 == insn.varith_type.rd) illegal_insn = 1'b1;
        endcase

    I don't understand what this is meant to do. With LMUL_2, rd=v6 and vs2=v2 are flagged as an illegal instruction by this code, but this combination is legal in RVV. Can you explain it in detail? Thanks for your time!
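
The case raised in the question can be reproduced with a small Python model of the quoted comparison. This is a sketch for discussion only: `quoted_check` mirrors the masks in the snippet, while `group_overlap` is a hypothetical alternative that tests whether rd falls inside the vs2 register group. Neither function is taken from Ara, and the sketch makes no claim about which overlaps the vector specification permits.

```python
def quoted_check(rs2, rd, lmul):
    """Model of the quoted comparison: compare the low log2(lmul) bits."""
    if lmul in (2, 4, 8):
        mask = lmul - 1  # 5'b00001, 5'b00011, 5'b00111
        return (rs2 & mask) == (rd & mask)
    return rs2 == rd  # the 'default' branch


def group_overlap(rs2, rd, lmul):
    """Hypothetical alternative: is register rd inside vs2 .. vs2+lmul-1?"""
    return rs2 <= rd < rs2 + lmul


# LMUL = 2, vs2 = v2 (group v2-v3), rd = v6: the case from the question.
flagged_v6 = quoted_check(rs2=2, rd=6, lmul=2)
overlaps_v6 = group_overlap(rs2=2, rd=6, lmul=2)

# LMUL = 2, vs2 = v2, rd = v3: rd sits in the upper half of the group.
flagged_v3 = quoted_check(rs2=2, rd=3, lmul=2)
overlaps_v3 = group_overlap(rs2=2, rd=3, lmul=2)
```

In this model the quoted comparison flags rd=v6 even though v6 is outside the v2-v3 group, and it does not flag rd=v3 even though v3 is inside it, which is the mismatch the question points at.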

  • scale_vl in vslide/vstore

    When the new EEW is not equal to the old EEW, a RESHUFFLE is inserted into Ara, and scale_vl is used to scale the length, in elements, of the source register. But I notice that scale_vl is also asserted for vslide and vstore instructions. Why? I think the vl of these instructions doesn't change.
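
One way to read "scaling" here is as a conversion of the element count when the same bytes are viewed at a different element width. The Python sketch below shows that arithmetic only. The helper name `scaled_vl` and the round-up choice are assumptions, not Ara's implementation, and the sketch does not answer why the flag is set for vslide and vstore.

```python
def scaled_vl(vl, old_eew_bits, new_eew_bits):
    """Number of new_eew_bits-wide elements covering vl old_eew_bits-wide ones.

    Hypothetical helper: rounds up so that a partially covered element
    is still counted.
    """
    total_bits = vl * old_eew_bits
    return -(-total_bits // new_eew_bits)  # ceiling division


# 16 byte-wide elements occupy the same storage as 4 32-bit elements.
narrow_to_wide = scaled_vl(16, 8, 32)
# 4 64-bit elements occupy the same storage as 16 16-bit elements.
wide_to_narrow = scaled_vl(4, 64, 16)
# 5 bytes do not fill two 32-bit elements, but both are touched.
partial = scaled_vl(5, 8, 32)
```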
