fast -- Flat Abstract Syntax Trees

Abstract syntax trees (ASTs) are hierarchical, recursive structures for representing source code. Parsing them typically requires a full traversal of the code, which costs O(n) operations.

But can we think differently?

Instead of parsing code structures every time, why not load them into memory as an efficient binary structure before any further analysis? Loading would then cost only O(1) operations, since the structure can be accessed in place without a parsing pass.

This project adopts FlatBuffers, which lays data out as a one-dimensional byte array, to represent ASTs as binary files, and demonstrates the improved efficiency and applicability to software development.
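To make the idea of a flat tree concrete, here is a minimal Python sketch. It is not the FlatBuffers wire format and not the FAST schema: the fixed-size record layout and the names `NODE`, `flatten`, `kind`, and `children` are assumptions made for illustration only. Each node becomes one record in a single byte buffer, and the tree shape is encoded as indices into that buffer, so a reader can jump to any node by computing an offset instead of re-parsing the source.

```python
import struct

# Hypothetical record layout (not the FAST schema): three little-endian
# 32-bit integers per node: (kind, first_child, next_sibling).
# An index of -1 means "none".
NODE = struct.Struct("<iii")


def flatten(tree):
    """Flatten a nested (kind, [children]) tuple into one bytes buffer."""
    records = []

    def visit(node):
        node_kind, node_children = node
        index = len(records)
        records.append([node_kind, -1, -1])
        previous = -1
        for child in node_children:
            child_index = visit(child)
            if previous == -1:
                records[index][1] = child_index      # link first child
            else:
                records[previous][2] = child_index   # link next sibling
            previous = child_index
        return index

    visit(tree)
    return b"".join(NODE.pack(*record) for record in records)


def kind(buffer, index):
    """Read the kind of node `index` directly from the buffer."""
    return NODE.unpack_from(buffer, index * NODE.size)[0]


def children(buffer, index):
    """Yield the child indices of node `index` without decoding other nodes."""
    _, child, _ = NODE.unpack_from(buffer, index * NODE.size)
    while child != -1:
        yield child
        _, _, child = NODE.unpack_from(buffer, child * NODE.size)


# A tiny tree: node kind 0 has children of kind 1 and 2; kind 2 has a child of kind 3.
example = (0, [(1, []), (2, [(3, [])])])
flat = flatten(example)
root_children = list(children(flat, 0))
```

Building the buffer still visits every node once, but that cost is paid a single time. Afterwards the buffer can be written to disk and read back as-is, and each lookup touches only the records it needs.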

Once installed, our tool manipulates source code 10x faster.

When to use fast?

Your software development projects may benefit directly from fast if you find the following activities slow:

Parsing => 10-100x faster

Program slicing => 2.5x faster and 1.5x smaller

Diff-Patching => 35x faster

Synchronisation => 1.5x smaller for slicing

Version History

Feature requests

TODOs

  • Create a Java wrapper for the tool (requested by @Chris2011)
  • Merge the Python grammar with the SrcML grammar to allow slicing Python
  • Generate Pickle AST from FAST representation using flatbuffers

0.0.8 TBD

0.0.7 (November 3, 2017)

  • Integrated with bi-tbcnn
  • Supported Solidity grammar

0.0.6 (October 5, 2017)

  • Created a Python 3 parser in C++ based on the official ANTLR4 grammar in Java and extended the FAST schema accordingly, merging the `python3` branch; the error handling feature is currently turned off.
  • Implemented a Docker image based on the alpine:edge image, which is much smaller than the ubuntu image
  • Generated Pickle AST from FAST representation (requested by @bdqnghi)

0.0.5 (August 25, 2017)

  • Integrated with biyacc
  • Created a Dockerfile to simplify the deployment
  • Implemented the normalisation concept from the meaningful changes tool (ASE'11) by migrating the txl-based implementation, see the -n option
  • Rewrote the interface to speed up gumtreediff (ASE'14) and treedifferencing (ASE'16)
  • Added colors to the output of diff results so that it is possible to integrate with git on the command line interface
  • Added -u option for the YUML extraction (see srcYUML)
  • Generated the patch from the diff records of GumTreeDiff integration with BiYacc
  • Reduced the size of FAST for slicing

0.0.4 (August 1, 2017)

  • Updated schema's Kinds as a union type, accommodating more ANTLR4 languages when needed (currently, Kind => srcml; SmaliKind => smali)
  • Removed the ANTLR3 branch to take full advantage of latest ANTLR4
  • Fixed some lexer errors in smaliLexer.g4 (now all code in the Instagram APK can be processed 10x faster)
  • Added apk2pb script to process an APK into a tarball of protobuf representations
  • Modified the Pairs schema to include hashes
  • Formed the `f-ast` team to maintain the project
  • Completed the slice-diff feature
  • Added JSON output for decoding FAST and pipe to jq for further querying
  • Added -w option to report the maximum width of the AST (i.e. the number of children of the tree nodes), and -W limit option to cap the width at the given limit
  • Added -i option to report the identifiers that appear as function/variable names or comment tokens and tokenize them using intt
  • Added -b option to convert bug reports into protobuf format

0.0.3 (July 6, 2017)

  • Generalised the code schema to support automated software engineering activities, e.g. slicing, diffing, cloning
  • Placed "tail" information after "child" in schema to remove shift-reduce errors in the application of BiYacc
  • Added support for ANTLR4 in C++, which unfortunately caused a conflict with the older antlr@2 dependency required by srcml; for a workaround, see the update to the installation guide
  • Converted srcSlicing CSV output into the supported protobuf schema

0.0.2 (June 21, 2017)

  • Added support for smali code through its ANTLR3 grammar in Java
  • Added srcSlice support to improve the speed of forward slicing by 2x
  • Added ANTLR3 libraries to improve GumTreeDiff speed

0.0.1 (April 11, 2017)

  • Initial public release: supports round-trip translation between srcML and protobuf/flatbuffers binary ASTs, improving the parsing speed by 10x

© 2017 F-AST team. FAST is released under BSD license, see license.txt for details.
