Prerequisites: Python 3.6 with libraries such as uiautomator, subprocess, numpy, scikit-learn, and bs4, plus firmware that supports dynamic taint analysis, such as TaintDroid.
AppInspector
- Set up a clean TaintDroid environment, with UiDroid_TaintNotify installed (to extract logs from TaintDroid).
- Run the exerciser (e.g. UIDroid) to automatically collect sensitive transmissions and the corresponding app-level contexts.
- Run pcap_tdroid_matcher to filter out app contexts (together with their pcaps) that do not generate any sensitive flows.
- cd into the filtered directory and manually label each app context as 'expected' or not, based on the sensitive information transmitted.
- With the labeled contexts in place, run "ContextProcessor.py" to build the ML models.
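
The filtering step above can be pictured with a short sketch. This is only an illustration under stated assumptions, not the pcap_tdroid_matcher implementation: the record layout (`id`, `pcap`, `taint_logs`), the function names, and the sample values are all hypothetical, and the real tool works on the files produced by the exerciser rather than on in-memory dicts.

```python
# Minimal sketch: keep only app contexts that produced a sensitive flow.
# The record fields and sample values below are hypothetical.

def has_sensitive_flow(context):
    """Return True if the context has at least one taint-log entry."""
    return len(context.get("taint_logs", [])) > 0


def filter_contexts(contexts):
    """Drop contexts (and, implicitly, their pcaps) with no sensitive flows."""
    return [c for c in contexts if has_sensitive_flow(c)]


contexts = [
    {"id": "app1/screen_0", "pcap": "app1_0.pcap",
     "taint_logs": ["IMEI -> ads.example.com"]},
    {"id": "app1/screen_1", "pcap": "app1_1.pcap", "taint_logs": []},
]

kept = filter_contexts(contexts)
print([c["id"] for c in kept])
```

Only the contexts that survive this filter need to be labeled by hand in the next step.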
TrafficAnalyzer
- Run analyzer.py to analyze the traffic data associated with illegal contexts, or any self-provided flows.
- Run predictor.py to apply the model trained by analyzer.py to unseen data.
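
The train-then-predict split between the two scripts can be sketched as follows. This is a hedged illustration, not the code in analyzer.py or predictor.py: it swaps in a small multinomial naive Bayes classifier over request tokens, written with the standard library only, whereas the project lists scikit-learn among its prerequisites. The function names, the labels 'expected' and 'unexpected', and the sample flows are assumptions made for the example.

```python
import math
import pickle
import re
from collections import Counter, defaultdict


def tokenize(flow_text):
    """Split a flow's text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", flow_text.lower())


def train(flows, labels):
    """Count tokens per class (the 'analyzer' side of the sketch)."""
    token_counts = defaultdict(Counter)
    for text, label in zip(flows, labels):
        token_counts[label].update(tokenize(text))
    vocab = {tok for counts in token_counts.values() for tok in counts}
    return {"token_counts": dict(token_counts),
            "class_counts": Counter(labels),
            "vocab": vocab}


def predict(model, flow_text):
    """Return the most likely label (the 'predictor' side of the sketch)."""
    total = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n in model["class_counts"].items():
        counts = model["token_counts"][label]
        denom = sum(counts.values()) + vocab_size  # add-one smoothing
        score = math.log(n / total)
        for tok in tokenize(flow_text):
            if tok in model["vocab"]:  # ignore tokens never seen in training
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled flows.
train_flows = [
    "GET /track?imei=123&lat=1.0 ads.example.com",
    "POST /collect?imei=456 tracker.example.net",
    "GET /weather?lat=2.0&lon=3.0 api.example.org",
    "GET /map?lat=4.0&lon=5.0 maps.example.org",
]
train_labels = ["unexpected", "unexpected", "expected", "expected"]

# Serialize and reload the model, as two separate scripts would have to.
model = pickle.loads(pickle.dumps(train(train_flows, train_labels)))
prediction = predict(model, "GET /collect?imei=789 ads.example.com")
print(prediction)
```

Persisting the trained model keeps training and prediction decoupled, so unseen traffic can be scored without access to the labeled data.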
If you are looking for the conference paper, please click here.
The BibTeX entry:
@inproceedings{flowintent,
title={FlowIntent: Detecting Privacy Leakage from User Intention to Network Traffic Mapping},
author={Fu, Hao and Zheng, Zizhan and Das, Aveek K and Pathak, Parth H and Hu, Pengfei and Mohapatra, Prasant},
booktitle={IEEE International Conference on Sensing, Communication, and Networking (SECON)},
year={2016}
}
The dataset is available here.