PeachPy is a Python framework for writing high-performance assembly kernels.
PeachPy aims to simplify writing optimized assembly kernels while preserving all optimization opportunities of traditional assembly. Some PeachPy features:
- Universal assembly syntax for Windows, Unix, and Golang assembly.
  - PeachPy can directly generate ELF, MS COFF and Mach-O object files and assembly listings for the Golang toolchain.
- Automatic adaptation of functions to different calling conventions and ABIs.
  - Functions for different platforms can be generated from the same assembly source.
  - Supports Microsoft x64 ABI, System V x86-64 ABI (Linux and OS X), Linux x32 ABI, Native Client x86-64 SFI ABI, Golang AMD64 ABI, Golang AMD64p32 ABI.
- Automatic register allocation.
  - PeachPy is flexible and lets you mix auto-allocated and hardcoded registers in the same code.
- Automation of routine tasks in assembly programming:
  - Function prolog and epilog are generated by PeachPy.
  - De-duplication of data constants (e.g. Constant.float32x4(1.0)).
  - Analysis of ISA extensions used in a function.
- Supports x86-64 instructions up to AVX2 and SHA.
  - Including 3dnow!+, XOP, FMA3, FMA4, TBM and BMI2.
  - Excluding x87 FPU and most system instructions.
  - Rigorously tested with auto-generated tests to produce the same opcodes as binutils.
- Python-based metaprogramming and code-generation.
  - Multiplexing of multiple instruction streams (helpful for software pipelining).
  - Compatible with Python 2 and Python 3, CPython and PyPy.
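To make the constant de-duplication feature concrete, here is a minimal pure-Python sketch of the idea. It does not use PeachPy and is not PeachPy's implementation; `ConstantPool`, its `float32x4` method, and the `const_N` label scheme are hypothetical names chosen for illustration. Identical byte payloads share one label, so a value requested many times only needs to be stored once.

```python
import struct


class ConstantPool:
    """Toy constant pool (hypothetical, not part of PeachPy)."""

    def __init__(self):
        # Maps the raw bytes of a constant to the label it was first given.
        self._labels = {}

    def float32x4(self, value):
        # Encode four copies of the value as little-endian 32-bit floats.
        payload = struct.pack("<4f", value, value, value, value)
        if payload not in self._labels:
            self._labels[payload] = "const_%d" % len(self._labels)
        return self._labels[payload]

    def __len__(self):
        return len(self._labels)


pool = ConstantPool()
first = pool.float32x4(1.0)
second = pool.float32x4(1.0)   # same bytes, so the same label is reused
other = pool.float32x4(2.0)    # different bytes, so a new label is created
print(first, second, other, len(pool))
```

Keying the pool on the encoded bytes rather than on the Python value means two constants are merged only when their in-memory representation is identical.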
PeachPy is actively developed, so there are presently no stable releases of the 0.2 branch. We recommend that you use the master version:
git clone https://github.com/Maratyszcza/PeachPy.git
cd PeachPy
pip install -r requirements.txt
python setup.py generate
pip install .
# These two lines are not needed for PeachPy, but will help you get autocompletion in good code editors
from peachpy import *
from peachpy.x86_64 import *
# Let's write a function float DotProduct(const float* x, const float* y)
# If you want maximum cross-platform compatibility, arguments must have names
x = Argument(ptr(const_float_), name="x")
# If name is not specified, it is auto-detected
y = Argument(ptr(const_float_))
# Everything inside the `with` statement is function body
with Function("DotProduct", (x, y), float_):
    # Request two 64-bit general-purpose registers. No need to specify exact names.
    reg_x, reg_y = GeneralPurposeRegister64(), GeneralPurposeRegister64()
    # This is a cross-platform way to load arguments. PeachPy will map it to something proper later.
    LOAD.ARGUMENT(reg_x, x)
    LOAD.ARGUMENT(reg_y, y)
    # Also request a virtual 128-bit SIMD register...
    xmm_x = XMMRegister()
    # ...and fill it with data
    MOVAPS(xmm_x, [reg_x])
    # It is fine to mix virtual and physical (xmm0-xmm15) registers in the same code
    MOVAPS(xmm2, [reg_y])
    # Execute dot product instruction, put result into xmm_x
    DPPS(xmm_x, xmm2, 0xF1)
    # This is a cross-platform way to return results. PeachPy will take care of ABI specifics.
    RETURN(xmm_x)
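The immediate 0xF1 passed to DPPS is easy to misread, so here is a small pure-Python reference model of the instruction's arithmetic. It is a sketch for checking results by hand, not PeachPy code, and the helper name `dpps_reference` is hypothetical. In the x86 definition of DPPS, the high four bits of the immediate select which lane products enter the sum, and the low four bits select which output lanes receive that sum (the other lanes are zeroed).

```python
def dpps_reference(a, b, imm8):
    """Model DPPS on two 4-element float vectors (hypothetical helper)."""
    # Bits 4-7 of imm8 choose which of the four products are summed.
    total = sum(a[i] * b[i] for i in range(4) if imm8 & (1 << (4 + i)))
    # Bits 0-3 of imm8 choose which output lanes get the sum; the rest are 0.
    return [total if imm8 & (1 << i) else 0.0 for i in range(4)]


x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 6.0, 7.0, 8.0]
# 0xF1: sum all four products, write the sum to lane 0 only.
result = dpps_reference(x, y, 0xF1)
print(result)
```

With 0xF1 the full four-element dot product ends up in the lowest lane. Note that this model uses Python's double-precision floats, whereas the instruction works in single precision, so results can differ in the last bits.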
Now you can compile this PeachPy code (example.py in the commands below) into a binary object file that you can link into a program:
# Use MS-COFF format with Microsoft ABI for Windows
python -m peachpy.x86_64 -mabi=ms -mimage-format=ms-coff -o example.obj example.py
# Use Mach-O format with SysV ABI for OS X
python -m peachpy.x86_64 -mabi=sysv -mimage-format=mach-o -o example.o example.py
# Use ELF format with SysV ABI for Linux x86-64
python -m peachpy.x86_64 -mabi=sysv -mimage-format=elf -o example.o example.py
# Use ELF format with x32 ABI for Linux x32 (x86-64 with 32-bit pointer)
python -m peachpy.x86_64 -mabi=x32 -mimage-format=elf -o example.o example.py
# Use ELF format with Native Client x86-64 ABI for Chromium x86-64
python -m peachpy.x86_64 -mabi=nacl -mimage-format=elf -o example.o example.py
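If you build for several targets, the commands above can be generated from one table. The following is a minimal sketch that only assembles the argument lists from the flags shown above; the `TARGETS` table, the `peachpy_command` helper, and the target names are assumptions made for illustration.

```python
# Target name -> (ABI flag value, image format, output file name).
# The flag values and file names come from the commands above;
# the target names are assumptions.
TARGETS = {
    "windows": ("ms", "ms-coff", "example.obj"),
    "osx": ("sysv", "mach-o", "example.o"),
    "linux": ("sysv", "elf", "example.o"),
    "linux-x32": ("x32", "elf", "example.o"),
    "nacl": ("nacl", "elf", "example.o"),
}


def peachpy_command(target, source="example.py"):
    """Return the argument list for one target (hypothetical helper)."""
    abi, image_format, output = TARGETS[target]
    return [
        "python", "-m", "peachpy.x86_64",
        "-mabi=" + abi,
        "-mimage-format=" + image_format,
        "-o", output,
        source,
    ]


for name in sorted(TARGETS):
    print(" ".join(peachpy_command(name)))
```

Each list can be passed to `subprocess.check_call` to run the build. Since several targets write to example.o, give them distinct output names if you build more than one in the same directory.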
What else? You can convert the program to Plan 9 assembly for use with the Go programming language:
# Use Go ABI (asm version) with -S flag to generate assembly for Go x86-64 targets
python -m peachpy.x86_64 -mabi=goasm -S -o example_amd64.s example.py
# Use Go-p32 ABI (asm version) with -S flag to generate assembly for Go x86-64 targets with 32-bit pointers
python -m peachpy.x86_64 -mabi=goasm-p32 -S -o example_amd64p32.s example.py
If Plan 9 assembly is too restrictive for your use-case, generate .syso objects which can be linked into Go programs:
# Use Go ABI (syso version) to generate .syso objects for Go x86-64 targets
# Any image format can be used (ELF/Mach-O/MS-COFF)
python -m peachpy.x86_64 -mabi=gosyso -mimage-format=elf -o example_amd64.syso example.py
# Use Go-p32 ABI (syso version) to generate .syso objects for Go x86-64 targets with 32-bit pointers
# Any image format can be used (ELF/Mach-O/MS-COFF)
python -m peachpy.x86_64 -mabi=gosyso-p32 -mimage-format=elf -o example_amd64p32.syso example.py
See examples for real-world scenarios of using PeachPy with make, nmake and go generate tools.
When the command-line tool does not provide sufficient flexibility, Python scripts can import PeachPy objects from the peachpy and peachpy.x86_64 modules and do arbitrary manipulations on output images, program structure, instructions, and bytecodes.
PeachPy links assembly and Python: it represents assembly instructions and syntax as Python classes, functions, and objects. But it also works the other way around: PeachPy can represent your assembly functions as callable Python functions!
from peachpy import *
from peachpy.x86_64 import *
x = Argument(int32_t)
y = Argument(int32_t)
with Function("DotProduct", (x, y), int32_t) as asm_function:
    reg_x = GeneralPurposeRegister32()
    reg_y = GeneralPurposeRegister32()
    LOAD.ARGUMENT(reg_x, x)
    LOAD.ARGUMENT(reg_y, y)
    ADD(reg_x, reg_y)
    RETURN(reg_x)
python_function = asm_function.finalize(abi.detect()).encode().load()
print(python_function(2, 2)) # -> prints "4"
PeachPy can be used to explore instruction length, opcodes, and alternative encodings:
from peachpy.x86_64 import *
ADD(eax, 5).encode() # -> bytearray(b'\x83\xc0\x05')
MOVAPS(xmm0, xmm1).encode_options() # -> [bytearray(b'\x0f(\xc1'), bytearray(b'\x0f)\xc8')]
VPSLLVD(ymm0, ymm1, [rsi + 8]).encode_length_options() # -> {6: bytearray(b'\xc4\xe2uGF\x08'),
# 7: bytearray(b'\xc4\xe2uGD&\x08'),
# 9: bytearray(b'\xc4\xe2uG\x86\x08\x00\x00\x00')}
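The byte strings above can be read by hand. As a sketch, the following pure-Python snippet (no PeachPy needed; `split_modrm` is a hypothetical helper) splits the ModR/M byte into its mod, reg and rm fields, following the standard x86 layout. It shows why MOVAPS has two register-to-register encodings: opcode 0F 28 puts the destination register in reg, while opcode 0F 29 puts it in rm.

```python
def split_modrm(byte):
    """Split an x86 ModR/M byte into (mod, reg, rm) fields (hypothetical helper)."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


# ADD eax, 5 -> 83 C0 05: opcode 0x83, ModR/M 0xC0, 8-bit immediate 5.
add_bytes = bytearray(b'\x83\xc0\x05')
print(split_modrm(add_bytes[1]), add_bytes[2])

# MOVAPS xmm0, xmm1 in its two forms; the printable bytes '(' and ')' are 0x28 and 0x29.
load_form = bytearray(b'\x0f(\xc1')    # 0F 28 /r: reg = xmm0 (destination), rm = xmm1
store_form = bytearray(b'\x0f)\xc8')   # 0F 29 /r: reg = xmm1 (source), rm = xmm0
print(split_modrm(load_form[2]), split_modrm(store_form[2]))

# The keys returned by encode_length_options() match the byte counts.
vpsllvd = [b'\xc4\xe2uGF\x08', b'\xc4\xe2uGD&\x08', b'\xc4\xe2uG\x86\x08\x00\x00\x00']
print([len(encoding) for encoding in vpsllvd])
```

In all three ModR/M bytes the mod field is 3, which marks a register-direct operand rather than a memory reference.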
- Nearly all instruction classes in PeachPy are generated from Opcodes Database.
- Instruction encodings in PeachPy are validated against binutils using auto-generated tests.
- PeachPy powers the Yeppp! performance library. All optimized kernels in Yeppp! are implemented in PeachPy.