Web traffic generator that simulates a real user's browsing behaviour.
Starting from a list of web pages with associated timestamps, this tool drives the Firefox browser to visit them, simulating a real user's thinking time. The time intervals between two consecutive visits follow the same statistical distribution as the intervals found in the provided timestamps.
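One possible way to reproduce the distribution of the thinking time is to resample the gaps observed in the history. The following is only a minimal sketch, not the tool's actual code: the function names are hypothetical, and discarding gaps longer than the maximum interval (rather than clipping them) is an assumption.

```python
import random


def empirical_intervals(timestamps, max_interval=30.0):
    """Return the gaps between consecutive timestamps (in seconds).

    Assumption: gaps longer than max_interval are discarded, not clipped.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return [gap for gap in gaps if gap <= max_interval]


def sample_think_time(intervals, rng=random):
    """Draw one thinking time by picking one of the observed gaps at random.

    Resampling the observed values follows their empirical distribution
    without assuming any parametric model.
    """
    return rng.choice(intervals)
```

A generator built this way would call sample_think_time() before each visit and sleep for the returned number of seconds.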
At the end of the simulation, the tool provides the graphs of:
- The distribution of the thinking time
- The distribution of the total page load time
- The distribution of the single resource timings:
  - Blocked: time spent in a queue waiting for a network connection
  - DNS: time required to resolve a host name
  - Connect: time required to create a TCP connection
  - Send: time required to send the HTTP request to the server
  - Wait: time spent waiting for a response from the server
  - Receive: time required to read the entire response from the server (or cache)
  - SSL: time required for SSL/TLS negotiation. If present, this is also included in the connect time
Moreover, a file with more detailed data on each request is provided for further analysis. The file contains a list of JSON objects in HTTP Archive (HAR) format, containing information and performance statistics about the web page download.
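As an illustration of how the timings listed above can be read from a HAR object, here is a minimal sketch. It relies only on the standard HAR layout (log, entries, timings, with -1 marking a phase that does not apply); the function names are hypothetical and are not part of this tool.

```python
# Phases that add up to the total time of a request. "ssl" is left out
# of the sum because, when present, it is already included in "connect".
TIMING_PHASES = ("blocked", "dns", "connect", "send", "wait", "receive")


def phase_times(entry):
    """Return the timings of one HAR entry, skipping missing phases.

    In a HAR, a phase that does not apply is reported as -1 (or omitted).
    """
    timings = entry.get("timings", {})
    result = {}
    for phase in TIMING_PHASES + ("ssl",):
        value = timings.get(phase)
        if value is not None and value >= 0:
            result[phase] = value
    return result


def total_time(entry):
    """Sum the applicable phases of one entry, in milliseconds."""
    phases = phase_times(entry)
    return sum(value for name, value in phases.items() if name != "ssl")


def entries_of(har):
    """Return the list of request entries of one HAR object."""
    return har["log"]["entries"]
```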
In order to run the Web Traffic Generator you need python3 and a set of libraries. In a Debian-based environment you can install them with the following command:
sudo apt-get install python3 python3-pip libfreetype6-dev python3-cairocffi
Some python packages are needed as well. You can install them with:
sudo pip3 install numpy scipy matplotlib browsermob-proxy selenium
Then, you need BrowserMob Proxy, available here.
Please download the latest release and extract it into a folder called browsermob-proxy, located in the same folder that contains the Web Traffic Generator.
Alternatively, you can set the location of the BrowserMob Proxy executable in the environment variable BROWSERMOBPROXY_BIN; for example, if you extract it in the current folder, type:
export BROWSERMOBPROXY_BIN=./browsermob-proxy-2.1.1/bin/browsermob-proxy
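The lookup described above can be sketched as follows. This is only an illustration: the helper name is hypothetical, and the bin/browsermob-proxy sub-path of the default folder is an assumption taken from the layout of the example path, not from the tool's source.

```python
import os

# Assumed default: the release extracted in a folder called
# "browsermob-proxy", keeping the bin/ layout of the example above.
DEFAULT_BIN = os.path.join("browsermob-proxy", "bin", "browsermob-proxy")


def browsermob_path(environ=os.environ):
    """Return the BrowserMob Proxy executable to use.

    The BROWSERMOBPROXY_BIN environment variable wins when it is set and
    not empty; otherwise the default location is returned.
    """
    return environ.get("BROWSERMOBPROXY_BIN") or DEFAULT_BIN
```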
A Java Runtime Environment is also required. In case you don't have it already, you can install it with:
sudo apt-get install python3 default-jre
To run the Web Traffic Generator, use the following command line:
web_traffic_generator.py [-h] [--version]
[--max-interval <max_interval>]
[--timeout <timeout>] [--headers] [--no-sleep]
[--browsers <number>] [--limit-urls <number>]
[--no-https]
input_file output_file
Positional arguments:
input_file
history file.
output_folder
output folder name.
Optional arguments:
-h, --help
show this help message and exit.
--version
show program's version number and exit.
--max-interval <max_interval>
use statistical intervals with a maximum value of <max_interval> seconds. Default is 30 sec.
--timeout <timeout>
timeout in seconds after which a visit is declared failed. Default is 30 sec.
--headers
save the headers of HTTP requests and responses in the HAR structs (e.g., to find the referer field).
--no-sleep
do not sleep between requests.
--browsers <number>
number of browsers to open (to simulate multi-tabbing). Default is 3.
--limit-urls <number>
limit requests to <number> URLs.
--no-https
do not replay pages on https.
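To make the defaults listed above easier to check, the same interface can be mirrored with argparse. This is a hedged reconstruction from the help text, not the tool's real parser: the argument types (float for the intervals, int for the counts) and the version string are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented command line."""
    parser = argparse.ArgumentParser(prog="web_traffic_generator.py")
    # Placeholder version string: the real version is not stated here.
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version placeholder)")
    parser.add_argument("--max-interval", metavar="<max_interval>",
                        type=float, default=30.0)
    parser.add_argument("--timeout", metavar="<timeout>",
                        type=float, default=30.0)
    parser.add_argument("--headers", action="store_true")
    parser.add_argument("--no-sleep", action="store_true")
    parser.add_argument("--browsers", metavar="<number>",
                        type=int, default=3)
    # Assumption: no limit on the number of URLs unless the option is given.
    parser.add_argument("--limit-urls", metavar="<number>",
                        type=int, default=None)
    parser.add_argument("--no-https", action="store_true")
    parser.add_argument("input_file")
    parser.add_argument("output_folder")
    return parser
```

For instance, build_parser().parse_args(["history.txt", "results"]) would yield the documented defaults; both file names are made up for the example.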
This tool creates a folder with the graphs of the timing distributions. In the same folder, it also creates an output file with the list of output HARs of all the requested URLs. Some HARs may be missing due to failed downloads (very slow pages that caused a timeout, pages that required HTTP authentication, etc.).
One or more output HAR files can be post-processed using the provided parser. The HAR parser can provide the graphs of the aggregate distribution of timings gathered in multiple files (e.g. generated by multiple runs of the Web Traffic Generator).
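The aggregation over several output files can be pictured with the sketch below. It is not the provided parser: the function names are hypothetical, and it assumes that each output file holds a single JSON array of HAR objects, which may differ from the actual file layout.

```python
import json
from pathlib import Path


def har_files(path):
    """Return the files to parse: the file itself, or every file in a folder."""
    path = Path(path)
    if path.is_dir():
        return sorted(p for p in path.iterdir() if p.is_file())
    return [path]


def collect_timings(path, phase="wait"):
    """Gather one timing phase (in ms) from all entries of all files.

    Assumption: each file is a JSON array of HAR objects.
    Phases reported as -1 (not applicable) or missing are skipped.
    """
    values = []
    for har_file in har_files(path):
        with open(har_file, encoding="utf-8") as handle:
            hars = json.load(handle)
        for har in hars:
            for entry in har["log"]["entries"]:
                value = entry.get("timings", {}).get(phase)
                if value is not None and value >= 0:
                    values.append(value)
    return sorted(values)
```

The sorted values returned by collect_timings() can then be plotted as an aggregate distribution, whether they come from one run or from many.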
The HAR parser has the following command line:
HARparser.py [-h] [--version] [--no-https] input output_folder
Positional arguments:
input
HAR file, or folder with HAR files.
output_folder
output statistics folder name.
Optional arguments:
-h, --help
show this help message and exit.
--version
show program's version number and exit.
--no-https
do not plot requests on https.