ctypes-binding-generator

ctypes-binding-generator is a Python package that generates ctypes bindings from C source files. It runs under Python 2 and Python 3, and the bindings it generates are compatible with both. It requires libclang to parse source files.

ctypes-binding-generator provides a command-line program called cbind. You can use it to generate a ctypes binding for, say, stdio.h:

$ cbind -i /usr/include/stdio.h -o stdio.py -l libc.so.6 \
    -- -I/usr/local/lib/clang/3.4/include

Note that the include path /usr/local/lib/clang/3.4/include is needed for stddef.h and similar headers. You can then test the generated ctypes binding of stdio.h:

$ python -c 'import stdio; stdio.printf("hello world\n")'
hello world
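For comparison, here is a minimal sketch of the kind of declaration that such a binding saves you from writing by hand, using only the standard ctypes module. It is not cbind output; the choice of strlen and the way the C library is located are assumptions made for illustration, and the loading step assumes a POSIX system.

```python
import ctypes
import ctypes.util

# Locate and load the C library. On POSIX, find_library("c") may return None,
# in which case CDLL(None) exposes the symbols already loaded in the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the prototype by hand: size_t strlen(const char *s);
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

# With plain ctypes on Python 3, a char* argument must be bytes, not str.
length = strlen(b"hello world\n")
```

A generated binding carries the argtypes and restype declarations for every function in the header, so none of this has to be maintained manually.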

In fact, cbind by default uses a libclang binding that was itself generated by cbind. You can generate that binding with:

$ cbind -i /path/to/clang/include/clang-c/Index.h \
        -l libclang.so \
        -o cbind/min_cindex.py \
        --config demo/cindex.yaml \
        -- -I /usr/local/lib/clang/3.4/include

If you would like cbind to use the official libclang binding maintained by the Clang project, run cbind with the --cindex clang-cindex flag.

Configuration

In the example above we generated the libclang binding with cbind. The "raw" binding on its own would not be very useful, however, so we provided the configuration file demo/cindex.yaml, which guides cbind to generate an object-oriented interface on top of the raw binding. As the suffix suggests, the configuration file is in YAML format; it is essentially a YAML mapping. The supported top-level keys of the mapping are:

  • preamble
  • import
  • rename
  • enum
  • errcheck
  • method
  • mixin

We introduce each of them below.

The preamble top-level key maps to a string that will be inserted at the beginning of the output binding. It is generally used to import helper Python modules. Alternatively, preamble maps to a mapping that supports the following keys:

  • codes: A string of code that will be inserted.
  • library: A string naming the shared library.
  • use_custom_loader: (Optional) Boolean value; if true, the codes string is used as the library loader, and the default loader code is not inserted.

All other top-level keys map to a list of matchers and actions. Matchers are tried in order, and only the action of the first matcher that matches the syntax tree node is performed. An action is a key-value pair whose key is the same as the top-level key. For example, a top-level key "import" with one matcher and one action might look like this:

import:
    - name: ^clang_createIndex$
      import: True

A matcher is a mapping that specifies how to match a syntax tree node. It supports the following keys:

  • argtypes: A list of regular expressions matching function argument types.
  • name: A regular expression matching syntax tree node's name.
  • parent: A matcher matching syntax tree node's parent.
  • restype: A regular expression matching function return type.

Note that the type string being matched is the ctypes binding for that type, i.e., Python code rather than C code.
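The first-match rule described above can be sketched in a few lines of Python. This is an illustration of the semantics only, not cbind's implementation: the helper first_matching_action, the dict representation of a node, the two rules, and the use of re.search are all assumptions, and only the name and restype matcher keys are handled.

```python
import re

def first_matching_action(rules, action_key, node):
    """Return the action of the first rule whose matcher matches node, else None."""
    for rule in rules:
        name_ok = "name" not in rule or re.search(rule["name"], node["name"])
        restype_ok = ("restype" not in rule
                      or re.search(rule["restype"], node.get("restype", "")))
        if name_ok and restype_ok:
            # Later rules are never consulted once one rule has matched.
            return rule.get(action_key)
    return None

# Hypothetical rules: skip clang_getNull*, import every other clang_* function.
import_rules = [
    {"name": r"^clang_getNull", "import": False},
    {"name": r"^clang_", "import": True},
]

skipped = first_matching_action(import_rules, "import", {"name": "clang_getNullCursor"})
imported = first_matching_action(import_rules, "import", {"name": "clang_createIndex"})
unmatched = first_matching_action(import_rules, "import", {"name": "printf"})
```

Because the more specific rule comes first, clang_getNullCursor never reaches the broader clang_ rule, which is why rule order matters in the configuration file.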

The import top-level key determines which syntax tree nodes are imported into (added to) the output Python binding. The (optional) action is Boolean valued; if true, the matched syntax tree node is imported. If the import top-level key is not present, cbind imports only the syntax tree nodes of the input files.

The rename top-level key changes the names of output syntax tree nodes. In its simplest form the action value is a substitution string, whose accompanying regular expression is given by the "name" matcher key. For example, this matches node names containing "CX" and removes that prefix:

rename:
    - name: CX(\w+)
      rename: \1
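The effect of this rule can be reproduced with Python's re.sub, assuming cbind applies the name pattern and the substitution string the way re.sub does. The sample names are chosen for illustration.

```python
import re

names = ["CXCursor", "CXString", "clang_getCursor"]

# The "name" regex is the pattern; the "rename" string is the replacement.
renamed = [re.sub(r"CX(\w+)", r"\1", name) for name in names]
```

Names that do not contain "CX", such as clang_getCursor, pass through unchanged.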

The action value can also be a list of regular expressions and substitutions. For example, this matches node names containing "CXCursor_" or "CXLinkage_", inserts underscores between lowercase and uppercase letters, and then applies upper() to the matched string, effectively replacing CamelStyle with UNDERSCORE_STYLE:

rename:
    - name: CX(Cursor|Linkage)_
      rename:
        - pattern: '([a-z])([A-Z])'
          replace: \1_\2
        - pattern: CX(Cursor|Linkage)_(\w+)
          function: 'lambda match: match.group(2).upper()'
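To see what the two steps do to a concrete name, the sketch below applies them in order with re.sub. The helper apply_rename_steps and the list-of-dicts layout are assumptions for illustration; in the configuration file the function is written as a string, whereas here it is an ordinary lambda.

```python
import re

def apply_rename_steps(name, steps):
    """Apply each pattern/replacement step to name in sequence."""
    for step in steps:
        # A step carries either a callable ("function") or a string ("replace").
        replacement = step.get("function", step.get("replace"))
        name = re.sub(step["pattern"], replacement, name)
    return name

steps = [
    {"pattern": r"([a-z])([A-Z])", "replace": r"\1_\2"},
    {"pattern": r"CX(Cursor|Linkage)_(\w+)",
     "function": lambda match: match.group(2).upper()},
]

cursor_name = apply_rename_steps("CXCursor_UnexposedDecl", steps)
linkage_name = apply_rename_steps("CXLinkage_NoLinkage", steps)
```

The first step turns CXCursor_UnexposedDecl into CXCursor_Unexposed_Decl; the second drops the prefix and upper-cases the rest.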

The enum top-level key generates extra binding code around enum constant declarations. The action value is a Python format string; the supported keys are:

  • enum_name: The name of enum declaration.
  • enum_type: The integral type of enum values.
  • enum_field: The name of enum constants.
  • enum_value: The value of that constant.
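As a rough illustration of how the four keys could be filled in, the sketch below expands a template with str.format. Both the brace-style placeholder syntax and the template text are assumptions, not taken from cbind; if cbind uses %-style formatting the placeholders would be written differently. The values are likewise chosen for illustration.

```python
# Hypothetical template; brace-style placeholders are an assumption.
template = "{enum_name}.{enum_field} = {enum_type}({enum_value})"

line = template.format(
    enum_name="CXLinkageKind",       # name of the enum declaration
    enum_type="c_uint",              # integral type of the enum values
    enum_field="CXLinkage_Invalid",  # name of one enum constant
    enum_value=0,                    # value of that constant
)
```

The template would be expanded once per enum constant, producing one line of extra code for each.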

The errcheck top-level key attaches an errcheck function to ctypes functions. If the action value is empty, no errcheck function is attached. Because cbind applies matchers sequentially, you can use an empty action to exempt some ctypes functions from an errcheck function. For example, this attaches check_cursor as the errcheck of functions whose return type is Cursor, except for the clang_getNullCursor function:

errcheck:
    - name: clang_getNullCursor
      errcheck:
    - restype: Cursor
      errcheck: check_cursor
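For readers unfamiliar with the ctypes errcheck protocol, here is a minimal hand-written sketch of what attaching such a function amounts to. It uses the C library's strchr rather than libclang, and check_not_null is a hypothetical checker; neither comes from cbind. The loading step assumes a POSIX system.

```python
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))

def check_not_null(result, func, arguments):
    """errcheck hook: ctypes passes the converted result, the function, and the call arguments."""
    if result is None:  # a NULL char* is converted to None
        raise ValueError("call returned NULL")
    return result       # the returned value becomes the result of the call

# char *strchr(const char *s, int c);
strchr = libc.strchr
strchr.argtypes = [ctypes.c_char_p, ctypes.c_int]
strchr.restype = ctypes.c_char_p
strchr.errcheck = check_not_null

tail = strchr(b"hello", ord("l"))
```

Calling strchr(b"hello", ord("z")) now raises ValueError instead of silently returning None.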

The method top-level key matches ctypes functions and adds these functions to ctypes classes as methods. The action value is a "class.method"-style string.

The mixin top-level key inserts mix-in classes when subclassing ctypes Structure and Union, and when generating a subclass for a C enum. Note that the mix-in classes are placed first in the list of base classes so that they can override methods of the ctypes classes. For example, if Foo is a C struct, given the config below

mixin:
    - name: ^Foo$
      mixin: [FooMixin]

the output binding would look like this:

class Foo(FooMixin, Structure):
    pass
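The sketch below shows why the base-class order matters: a method defined on the mix-in takes precedence over the one inherited through Structure. The field x and the __repr__ override are hypothetical and only serve to make the example runnable.

```python
import ctypes

class FooMixin(object):
    # Found before anything inherited via Structure in the method resolution order.
    def __repr__(self):
        return "Foo(x=%d)" % self.x

class Foo(FooMixin, ctypes.Structure):
    _fields_ = [("x", ctypes.c_int)]

foo = Foo(3)
text = repr(foo)
```

Listing FooMixin after Structure instead would leave the default representation in place.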

Macros

Since macros are an important part of C headers, cbind can translate simple C macros to Python code. Complicated macros that cbind cannot understand have to be translated manually. Let's take the Linux input.h header as an example and write a small program that dumps input events, such as mouse movements.

To enable macro translation, pass the --enable-macro flag to cbind.

$ cbind -i /usr/include/linux/input.h -o demo/linux_input.py -v \
    --enable-macro \
    -- -I/usr/local/lib/clang/3.4/include
macro.py: Could not parse macro: #define EVIOCGID (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x02)) << 0) | ((((sizeof(struct input_id)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCGREP (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x03)) << 0) | ((((sizeof(unsigned int[2])))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSREP (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x03)) << 0) | ((((sizeof(unsigned int[2])))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCGKEYCODE (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x04)) << 0) | ((((sizeof(unsigned int[2])))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCGKEYCODE_V2 (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x04)) << 0) | ((((sizeof(struct input_keymap_entry)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSKEYCODE (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x04)) << 0) | ((((sizeof(unsigned int[2])))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSKEYCODE_V2 (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x04)) << 0) | ((((sizeof(struct input_keymap_entry)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCGABS(abs) (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x40 + (abs))) << 0) | ((((sizeof(struct input_absinfo)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSABS(abs) (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0xc0 + (abs))) << 0) | ((((sizeof(struct input_absinfo)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSFF (((1U) << (((0 +8)+8)+14)) | (('E') << (0 +8)) | ((0x80) << 0) | ((sizeof(struct ff_effect)) << ((0 +8)+8)))

Note that we pass the -v flag to cbind, which enables verbose output, so cbind reports the macros that it cannot understand. Not all of them are beyond cbind, however; it just needs some hints. cbind can translate constant integer expressions, thanks to Clang, but you have to tell it which macros are integer expressions with --macro-int.

$ cbind -i /usr/include/linux/input.h -o demo/linux_input.py -v \
    --enable-macro --macro-int EVIO \
    -- -I/usr/local/lib/clang/3.4/include
macro.py: Could not parse macro: #define EVIOCGABS(abs) (((2U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0x40 + (abs))) << 0) | ((((sizeof(struct input_absinfo)))) << ((0 +8)+8)))
macro.py: Could not parse macro: #define EVIOCSABS(abs) (((1U) << (((0 +8)+8)+14)) | ((('E')) << (0 +8)) | (((0xc0 + (abs))) << 0) | ((((sizeof(struct input_absinfo)))) << ((0 +8)+8)))

The remaining two macros have to be translated manually:

EVIOCGABS = lambda abs: (2 << 30) | (ord('E') << 8) | (0x40 + abs) | (sizeof(input_absinfo) << 16)
EVIOCSABS = lambda abs: (1 << 30) | (ord('E') << 8) | (0xc0 + abs) | (sizeof(input_absinfo) << 16)
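The shift amounts in these lambdas follow directly from the macro expansions printed above: (0 +8)+8)+14 is 30 for the direction, 0 +8 is 8 for the type character, (0 +8)+8 is 16 for the size, and the command number sits at bit 0. A small sketch makes that layout explicit; the helper name ioc and the size of 24 bytes used in the example call are assumptions for illustration, and in real use the size should come from sizeof(input_absinfo) in the generated binding.

```python
def ioc(direction, type_char, number, size):
    """Pack an ioctl request number using the bit layout from the expansions above."""
    return (direction << 30) | (ord(type_char) << 8) | number | (size << 16)

# Direction 2 appears in the EVIOCG* macros and 1 in the EVIOCS* macros.
# The size 24 is an assumed value for sizeof(input_absinfo).
eviocgabs_0 = ioc(2, 'E', 0x40 + 0, 24)
eviocsabs_0 = ioc(1, 'E', 0xc0 + 0, 24)
```

Printing the results with hex() is a quick way to compare a manual translation against values reported by other tools.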

Under the demo/ directory there is an evtest program that uses the linux_input.py binding we generated. It requires root permission to access the device file. Press Ctrl-C to stop evtest.

$ sudo demo/evtest /dev/input/event0
input driver version     : 1.0.1
input device ID          : bus 0x3 vendor 0x46d product 0xc05b version 0x111
input device name        : 'Logitech USB Optical Mouse'
supported events:
  event type 0 (Sync)
  event type 1 (Key)
    event code 272 (LeftBtn)
    event code 273 (RightBtn)
    event code 274 (MiddleBtn)
    event code 275 (SideBtn)
    event code 276 (ExtraBtn)
    event code 277 (ForwardBtn)
    event code 278 (BackBtn)
    event code 279 (TaskBtn)
  event type 2 (Relative)
    event code 0 (X)
    event code 1 (Y)
    event code 6 (HWheel)
    event code 8 (Wheel)
  event type 4 (Misc)
    event code 4 (ScanCode)
testing ... (interrupt to exit)
event: time 1374999609.141463, type 2 (Relative), code 0 (X), value 1
event: time 1374999609.141466, type 2 (Relative), code 1 (Y), value -1
event: time 1374999609.141472, -------------- Report Sync ------------
event: time 1374999609.149452, type 2 (Relative), code 0 (X), value 4
event: time 1374999609.149454, type 2 (Relative), code 1 (Y), value -1
event: time 1374999609.149459, -------------- Report Sync ------------
^C

You should see evtest print the driver and device info and the supported events, and then dump input events as they arrive.
