Capstone disassembly/disassembler framework: Core + bindings.

Capstone Engine



We moved the original historical repo of Capstone from https://github.com/aquynh/capstone to an organization, where we can add more maintainers to the project, and push Capstone development forward.

Our new home is https://github.com/capstone-engine/capstone

Nov 8th, 2021.


Capstone is a disassembly framework with the target of becoming the ultimate disasm engine for binary analysis and reversing in the security community.

Created by Nguyen Anh Quynh, then developed and maintained by a small community, Capstone offers some unparalleled features:

  • Supports multiple hardware architectures: ARM, ARM64 (ARMv8), Ethereum VM, M68K, Mips, MOS65XX, PPC, Sparc, SystemZ, TMS320C64X, M680X, XCore and X86 (including X86_64).

  • Clean, simple, lightweight and intuitive architecture-neutral API.

  • Provides details on each disassembled instruction (called “decomposer” by others).

  • Provides semantics of the disassembled instruction, such as the list of implicit registers read & written.

  • Implemented in pure C language, with lightweight bindings for D, Clojure, F#, Common Lisp, Visual Basic, PHP, PowerShell, Emacs, Haskell, Perl, Python, Ruby, C#, NodeJS, Java, GO, C++, OCaml, Lua, Rust, Delphi, Free Pascal & Vala (ready either in main code, or provided externally by the community).

  • Native support for all popular platforms: Windows, Mac OSX, iOS, Android, Linux, *BSD, Solaris, etc.

  • Thread-safe by design.

  • Special support for embedding into firmware or OS kernel.

  • High performance & suitable for malware analysis (capable of handling various X86 malware tricks).

  • Distributed under the open source BSD license.

Further information is available at http://www.capstone-engine.org

Compile

See COMPILE.TXT file for how to compile and install Capstone.

Documentation

See docs/README for how to customize & program your own tools with Capstone.

Hack

See HACK.TXT file for the structure of the source code.

License

This project is released under the BSD license. If you redistribute the binary or source code of Capstone, please attach file LICENSE.TXT with your products.

Comments
  • Single Instruction Disassembly

    Capstone is the best disassembler I've found so far. I'm going to use it to develop a program that disassembles an ELF file (maybe even PE, but not soon) in control-flow order. For that I only need Capstone to disassemble a single instruction at a time, then decide which instruction to disassemble next. The current function requires too much memory (enough for 32 instructions) each time I call it, so it would be helpful to have a variant that allocates memory for just one instruction per call.

  • New APIs for better performance - by pre-allocating memory

    Here is a proposal to speed up the disassembling process by pre-allocating memory used for the output instructions.

    https://github.com/aquynh/capstone/wiki/New-APIs:-cs_disasm_alloc()-&-cs_disasm_buf()

    Please comment, thanks.

  • issues of M68k

    attention: @emoon & @nplanel

    I fixed some problems with the M68K code here: https://github.com/aquynh/capstone/commit/ac63d5b9951e0f94c117232a74874a3ff36a7eec. One notable issue: we should declare variables at the beginning of functions or blocks to keep C89 compilers happy (older MSVC compilers, for example).

    Another major issue: we cannot declare static variables like https://github.com/aquynh/capstone/blob/m68k/arch/M68K/M68KDisassembler.c#L63 or https://github.com/aquynh/capstone/blob/m68k/arch/M68K/M68Kdasm.c#L194. The problem shows up when two instances of the M68K engine run at the same time: they share the same static data, which leads to a mess.

    To solve this problem, please see how other archs pass variables around via function arguments (such as MCInst). In some special cases, if there is no better choice, it is possible to store these variables in struct MCInst (https://github.com/aquynh/capstone/blob/m68k/MCInst.h#L92) or cs_struct (https://github.com/aquynh/capstone/blob/m68k/cs_priv.h#L51)

  • RISCV support ISRV32/ISRV64

    This is based on PR#1198 and LLVM upstream commit b81d715c (Sat Feb 16 18:39:14 2019). I also referred to the SystemZ TableGen patches, and added the TableGen patch at capstone/contrib/update_riscv.

  • cs-next llvm update

    After upgrading r2 to capstone-next I discovered a bunch of regressions:

    • [x] ARM64_VESS is no longer there
    • [x] On X86: X86_INS_FADDP X86_INS_UD2B X86_INS_FADD
    • [x] 83 broken tests in the r2 testsuite https://api.travis-ci.org/v3/job/518226592/log.txt

    I will update this issue with more regressions if found. Could you confirm whether those missing enums were removed on purpose?

    Related PR https://github.com/radare/radare2/pull/13688

  • MIPS disassembler problems

    Compare the outputs of gnu disassembler and capstone:

    • No relative offsets supported? 0x00080430
    • Invalid instructions disassembled as correct? 0x00080474
    • Show offsets instead of ‘invalid’? 0x0008046c
    • Negative hexadecimal values? 0x0008044c
    [0x00080430]> e asm.arch=mips.gnu
    [0x00080430]> pd 20
       ;      [6] va=0x00080430 pa=0x00000430 sz=176 vsz=176 rwx=-r-x .text
           ,  ;-- section..text:
           ,=< 0x00080430    01001104     bal 0x00080438
           |   0x00080434    00000000     nop
           `-> 0x00080438    01001c3c     lui gp, 0x1
               0x0008043c    e88b9c27     addiu gp, gp, -29720
               0x00080440    21e09f03     addu gp, gp, ra
               0x00080444    2120a003     move a0, sp
               0x00080448    21280000     move a1, zero
               0x0008044c    1880868f     lw a2, -32744(gp)
               0x00080450    1c80878f     lw a3, -32740(gp)
               0x00080454    6804e724     addiu a3, a3, 1128
               0x00080458    e0ffbd27     addiu sp, sp, -32
               0x0008045c    2080998f     lw t9, -32736(gp)
               0x00080460    08002003     jr t9
               0x00080464    00000000     nop
               0x00080468    00100800     sll v0, t0, 0x0
               0x0008046c    08100800     sym.__INIT_ARRAY__
               0x00080470    10100800     sym.__FINI_ARRAY__
               0x00080474    18100800     sym.__CTOR_LIST__
               0x00080478    00000000     nop
               0x0008047c    00000000     nop
    [0x00080430]> e asm.arch=mips
    [0x00080430]> pd 20
       ;      [6] va=0x00080430 pa=0x00000430 sz=176 vsz=176 rwx=-r-x .text
               ;-- section..text:
               0x00080430    01001104     bal 8
               0x00080434    00000000     nop
               0x00080438    01001c3c     lui gp, 1
               0x0008043c    e88b9c27     addiu gp, gp, -0x7418
               0x00080440    21e09f03     addu gp, gp, ra
               0x00080444    2120a003     move a0, sp
               0x00080448    21280000     move a1, zero
               0x0008044c    1880868f     lw a2, -0x7fe8(gp)
               0x00080450    1c80878f     lw a3, -0x7fe4(gp)
               0x00080454    6804e724     addiu a3, a3, 0x468
               0x00080458    e0ffbd27     addiu sp, sp, -section_end..debug_aranges
               0x0008045c    2080998f     lw t9, -0x7fe0(gp)
               0x00080460    08002003     jr t9
               0x00080464    00000000     nop
               0x00080468    00100800     sll v0, t0, 0
               0x0008046c    08100800     invalid
               0x00080470    10100800     invalid
               0x00080474    18100800     mult ac2, zero, t0
               0x00080478    00000000     nop
               0x0008047c    00000000     nop
    [0x00080430]>
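The differences flagged for 0x00080430 and 0x0008044c can be checked by hand. The following is a minimal Python sketch, assuming the byte column in the listings is in little-endian memory order; the helper names are illustrative and are not part of Capstone or radare2.

```python
import struct

def decode_itype(raw: bytes):
    """Split a little-endian MIPS I-type word into (opcode, signed 16-bit immediate)."""
    (word,) = struct.unpack("<I", raw)
    imm = word & 0xFFFF
    if imm & 0x8000:          # sign-extend the 16-bit field
        imm -= 0x10000
    return word >> 26, imm

def branch_target(address: int, simm: int) -> int:
    """PC-relative branch target: address of the delay slot plus offset * 4."""
    return address + 4 + (simm << 2)

_, imm = decode_itype(bytes.fromhex("e88b9c27"))    # addiu gp, gp, imm
print(imm, hex(imm))                                 # -29720 -0x7418
_, off = decode_itype(bytes.fromhex("01001104"))    # bal
print(hex(branch_target(0x00080430, off)))           # 0x80438
print(branch_target(0, off))                         # 8
```

Under these assumptions, -29720 and -0x7418 are the same operand printed in different radixes, and `bal 8` is what the target formula yields for a base address of 0, so the relative-offset difference may come from the address handed to the disassembler rather than from the decoding itself.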
    
  • 16bit segment bounds error

    Using this test from radare2, Capstone returns the wrong result:

    NAME="16bit segment bounds - capstone"
    FILE=malloc://1024k
    CMDS='
    e asm.arch=x86.cs
    e asm.bits=16
    e anal.hasnext=0
    wx e9c300 @ f000:ffaa
    s f000:ffaa
    pi 1
    '
    EXPECT='jmp 0xf0070
    '
    run_test
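The expected value in this test follows from real-mode arithmetic: the 16-bit instruction pointer wraps inside the segment before the segment base is added. A minimal sketch of that calculation follows; the function name is illustrative and is not a Capstone or radare2 API.

```python
def jmp_rel16_target(segment: int, offset: int, rel16: int, insn_len: int = 3) -> int:
    """Linear target of a near `jmp rel16` (opcode E9) executed at segment:offset."""
    # IP is 16 bits wide, so the sum wraps around inside the 64 KiB segment.
    ip = (offset + insn_len + rel16) & 0xFFFF
    return (segment << 4) + ip

# e9 c3 00 at f000:ffaa -> rel16 = 0x00c3
print(hex(jmp_rel16_target(0xF000, 0xFFAA, 0x00C3)))  # 0xf0070
```

Without the 16-bit mask the same sum would land at 0x100070, past the end of the segment.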

  • mips32r2 bugs

    1. The mtc0 instruction is wrongly decoded (some examples below):

           encoding    capstone               objdump
           4080e000    mtc0 $zero, $gp, 0     mtc0 zero,c0_taglo
           4080e800    mtc0 $zero, $sp, 0     mtc0 zero,c0_taghi
           40886000    mtc0 $t0, $t4, 0       mtc0 t0,c0_status

    2. The synci instruction is not recognized by capstone:

           encoding    capstone                          objdump
           051f0000    .byte 0x00, 0x00, 0x1f, 0x05      synci 0(t0)

    3. Jump instructions (j, jal) decode the target address wrongly. These instructions are not absolute: the new address is computed by taking the upper 4 bits of the PC, concatenated with the 26-bit immediate value shifted left by two; the lower two bits are 00, so the resulting address remains word-aligned.

           address        encoding    capstone         objdump
           0x9404040c:    09010216    j 0x4040858      j 94040858
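The target computation described in item 3 can be sketched in a few lines of Python. The function name is illustrative, and the sketch assumes the encoding column shows the instruction word itself (most significant byte first), which is consistent with the j opcode in the example.

```python
def mips_jump_target(address: int, word: int) -> int:
    """Target of a MIPS32 j/jal: upper 4 bits of the delay-slot PC + (index << 2)."""
    index = word & 0x03FFFFFF                 # 26-bit instr_index field
    delay_slot = (address + 4) & 0xFFFFFFFF   # PC of the instruction after the jump
    return (delay_slot & 0xF0000000) | (index << 2)

word = 0x09010216
print(hex((word & 0x03FFFFFF) << 2))               # 0x4040858  (region bits dropped)
print(hex(mips_jump_target(0x9404040C, word)))     # 0x94040858 (matches objdump)
```

The first value is what results when the upper PC bits are left out; the second includes them.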

    As a side note: I couldn't find any documentation about Capstone internals that would let me fix those bugs myself without huge effort.

  • m68k: data race initializing global g_instruction_table in build_opcode_table()

    The initialization of the global array g_instruction_table in M68KDisassembler.c:build_opcode_table() is a data race.

    Declaration: https://github.com/aquynh/capstone/blob/de952a3e5a519b4a0d0dd2ef921a755891860ca6/arch/M68K/M68KDisassembler.c#L250-L251

    Initialization: https://github.com/aquynh/capstone/blob/de952a3e5a519b4a0d0dd2ef921a755891860ca6/arch/M68K/M68KDisassembler.c#L3787-L3827

    This caused problems while working on the Rust language bindings. The tests are run in parallel:

    • PR: https://github.com/capstone-rust/capstone-rs/pull/60
    • Travis CI failure: https://travis-ci.org/capstone-rust/capstone-rs/jobs/498529801#L623

    Possible solutions:

    • Introduce synchronization in build_opcode_table()
      • Would require adding a threading library dependency (like pthread)
    • Declare array with appropriate values statically
      • Most efficient (run time) method
      • Does not add extra dependencies or change API
      • Requires deeper knowledge of m68k code
      • May require writing a script to generate C declaration
    • Introduce capstone_init() function that must be called before any other Capstone functions
      • API breaking change
      • Less ergonomic for users
    • Move the global variable into the private cs_struct
      • Less efficient: wastes time initializing and memory
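For the "declare array with appropriate values statically" option, the generating script could be as small as the following sketch. It is only an illustration: the table name, element type and values are placeholders, not the real layout of g_instruction_table.

```python
def emit_c_table(name: str, ctype: str, values, per_line: int = 8) -> str:
    """Render a list of integers as a static const C array declaration."""
    lines = [f"static const {ctype} {name}[{len(values)}] = {{"]
    for i in range(0, len(values), per_line):
        chunk = ", ".join(f"0x{v:04x}" for v in values[i:i + per_line])
        lines.append(f"\t{chunk},")
    lines.append("};")
    return "\n".join(lines)

# Placeholder data standing in for whatever build_opcode_table() computes at run time.
demo_values = [(i * 37) & 0xFFFF for i in range(12)]
print(emit_c_table("demo_opcode_table", "uint16_t", demo_values))
```

A table emitted this way is initialized by the compiler, so nothing is written to it at run time and concurrent engine instances only ever read it.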

    Similar past issue: #1171

  • Eliminates run-time initialization of global variables (fixes race condition)

    Declare global arch arrays with contents.

    This eliminates the need for archs_enable() and eliminates the racy initialization.

    Fixes #1168.

    Progress:

    • [X] declare global arrays with values
  • Segmentation fault (2.1.2)

    I've compiled test binaries with specific options: http://pastebin.com/BxddKQSd

    All test* binaries crash with a "segmentation fault" error message. I still need to investigate what's going on here.

  • ARM: cs_regs_access Missing Some Read Registers

    Hello. Take the following ARM v7R instruction: e9 2d 10 00. It is correctly disassembled to stmdb sp!, {ip}. On Capstone 4.0.1, cs_regs_access correctly returns sp and ip as read registers. On Capstone 4.0.2, however, only sp is returned.
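To see why both registers count as reads, the encoding can be decoded by hand. The sketch below assumes the four bytes form the ARM-mode word 0xe92d1000 and uses an illustrative helper name; it does not call Capstone.

```python
def decode_ldm_stm(word: int):
    """Return (base register, write-back flag, register list) of an ARM-mode LDM/STM word."""
    names = [f"r{i}" for i in range(10)] + ["sl", "fp", "ip", "sp", "lr", "pc"]
    base = names[(word >> 16) & 0xF]              # Rn field, bits 19..16
    writeback = bool(word & (1 << 21))            # W bit
    regs = [names[i] for i in range(16) if word & (1 << i)]  # register_list, bits 15..0
    return base, writeback, regs

print(decode_ldm_stm(0xE92D1000))  # ('sp', True, ['ip'])
```

A store-multiple reads its base register and every register in the list, and with write-back also updates the base, which matches the 4.0.1 result described above.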

    This was reported as a Capstone.NET issue if you would like more information. I am pretty sure I have ruled it out as a binding issue though.

    Thanks in advance.

  • expose tablegen API

    I am writing a cryptanalysis library and need statistics on the distribution of opcodes in code. To do so, I need to generate every valid assembled <-> disassembled instruction pair into a BiBTreeMap in Rust.

    I need to implement bindings for tablegen in this library: https://github.com/capstone-rust/capstone-rs/ and I imagine the library author needs an exposed library first.

    Could you assign me to this issue please?

    The files I want to edit are here: https://github.com/capstone-engine/capstone/tree/next/suite/synctools/tablegen

  • Add support for VE architecture

    VE is the ISA of NEC Vector Engine cards. This architecture has a really nice orthogonal ISA. The assembler syntax is documented here. An LLVM compiler exists and parts have been merged upstream.

    I am attempting to add the arch in the branch at https://github.com/freemin7/capstone/tree/vector-engine and have chatted with people, who advised against the tablegen approach. Overall, though, I am lost as to where to start. Some ISAs do not take the tablegen approach, but their approaches vary, and I am not sure whether there is a good reason to prefer one over another.

  • Including new instructions from LLVM

    Newer architectures and new instructions are added to LLVM and LLVM-MC by the manufacturers. I'm finding that Capstone does not support some of these new instructions, even ones that are a few years old.

    What is the process for updating LLVM-MC and bringing the new instructions into Capstone?

  • ARM Thumb: disassembly for BL instruction resolves incorrect immediate value.

    capstone v4.0.2 installed from pip (Mac OSX 12.6 and Ubuntu 20.04)

    It seems like the ARM Thumb BL immediate values are being incorrectly decoded.

    Manually decoding the instruction b"\xff\xf7\xad\xff" ought to yield bl 0xffffff5a, however, capstone gives the following:

    >>> from capstone import *
    >>> from capstone.arm_const import *
    >>> cs = Cs(CS_ARCH_ARM, CS_MODE_THUMB)
    >>> insn = next(cs.disasm(b"\xff\xf7\xad\xff", 4))
    >>> insn
    <CsInsn 0x4 [fff7adff]: bl #0xffffff62>
    

    I have tried with some other bl instructions and the immediate values are also off by 8.
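One way to cross-check the numbers above is to decode the encoding by hand. The sketch below follows the Thumb BL (encoding T1) bit layout from the ARM architecture manual; the function name is illustrative and nothing here calls Capstone. It separates the raw signed offset from the branch target, which in Thumb state is computed relative to the instruction address plus 4.

```python
import struct

def thumb_bl_decode(code: bytes, address: int):
    """Return (signed offset, 32-bit branch target) for a Thumb BL at `address`."""
    hw1, hw2 = struct.unpack("<HH", code[:4])   # two little-endian halfwords
    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF
    i1 = 1 - (j1 ^ s)                           # I1 = NOT(J1 XOR S)
    i2 = 1 - (j2 ^ s)                           # I2 = NOT(J2 XOR S)
    offset = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    if s:                                       # sign-extend the 25-bit value
        offset -= 1 << 25
    target = (address + 4 + offset) & 0xFFFFFFFF
    return offset, target

offset, target = thumb_bl_decode(b"\xff\xf7\xad\xff", 4)
print(hex(offset & 0xFFFFFFFF), hex(target))    # 0xffffff5a 0xffffff62
```

With the start address of 4 used in the session above, the raw offset is 0xffffff5a as a 32-bit value and the target is 0xffffff62. The difference of 8 equals the start address (4) plus the Thumb PC offset (4), which suggests Capstone may be printing the resolved target rather than the raw immediate.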
