netgroup-polito/WebTrafficGenerator

 
 

Web Traffic Generator

Web traffic generator that simulates a real user browsing behaviour.

1. Description of the software

Starting from a list of web pages with their associated timestamps, this tool drives the Firefox browser to visit them, simulating a real user's thinking time. The intervals between consecutive visits follow the same statistical distribution as the intervals found in the provided timestamps.
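The idea of reusing the observed intervals can be illustrated with a minimal sketch. It is not the tool's actual code: the function names are hypothetical, the timestamps are assumed to be in seconds, and intervals above the cap are simply discarded here, which is an assumption about how a maximum interval could be applied.

```python
import random

def thinking_times(timestamps, max_interval=30.0):
    """Return gaps between consecutive visits, dropping gaps above max_interval seconds."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # Assumption: gaps above the cap are discarded rather than clipped.
    return [gap for gap in gaps if 0 < gap <= max_interval]

def sample_thinking_time(intervals, rng=random):
    """Draw one thinking time from the empirical distribution of observed gaps."""
    return rng.choice(intervals)

history = [0.0, 4.0, 9.5, 60.0, 62.0]  # hypothetical timestamps, in seconds
intervals = thinking_times(history)
pause = sample_thinking_time(intervals)
```

Sampling uniformly from the observed gaps reproduces their empirical distribution without fitting any parametric model.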

At the end of the simulation, the tool provides the graphs of:

  • The distribution of the thinking time
  • The distribution of the total page load time
  • The distribution of the timings of each individual resource:
    • Blocked: time spent in a queue waiting for a network connection
    • DNS: time required to resolve a host name
    • Connect: time required to create a TCP connection
    • Send: time required to send the HTTP request to the server
    • Wait: time spent waiting for a response from the server
    • Receive: time required to read the entire response from the server (or cache)
    • SSL: time required for SSL/TLS negotiation. If present, this is also included in the connect time

Moreover, a file with more detailed data on each request is provided for further analysis. The file contains a list of JSON objects in HTTP Archive (HAR) format, with information and performance statistics about each web page download.
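As a rough sketch of what further analysis could look like, the function below groups the per-phase timings of one HAR object that has already been parsed into a Python dict. It relies only on the standard HAR 1.2 layout (`log.entries[].timings`, values in milliseconds, `-1` for a phase that does not apply). The function name and the sample data are hypothetical, and how the objects are delimited in the output file is not covered here.

```python
from collections import defaultdict

PHASES = ("blocked", "dns", "connect", "send", "wait", "receive", "ssl")

def collect_timings(har):
    """Group the per-phase timings (ms) of every entry in one parsed HAR object."""
    samples = defaultdict(list)
    for entry in har["log"]["entries"]:
        timings = entry.get("timings", {})
        for phase in PHASES:
            value = timings.get(phase, -1)
            # HAR 1.2 uses -1 for a phase that does not apply to the request.
            if value is not None and value >= 0:
                samples[phase].append(value)
    return dict(samples)

# Hypothetical HAR object with two entries.
example = {"log": {"entries": [
    {"timings": {"blocked": 1, "dns": 12, "connect": 30, "send": 0,
                 "wait": 85, "receive": 7, "ssl": 18}},
    {"timings": {"blocked": 0, "dns": -1, "connect": -1, "send": 1,
                 "wait": 40, "receive": 3}},
]}}
per_phase = collect_timings(example)
```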

2. Dependencies

In order to run the Web Traffic Generator you need python3 and a set of libraries. In a Debian-based environment you can install them with the following command:

sudo apt-get install python3 python3-pip libfreetype6-dev python3-cairocffi

Some Python packages are needed as well. You can install them with:

sudo pip3 install numpy scipy matplotlib browsermob-proxy selenium 

Then, you need BrowserMob Proxy, available here. Download the latest release and extract it into a folder called browsermob-proxy, located in the same folder as the Web Traffic Generator. Alternatively, you can set the location of the BrowserMob Proxy executable in the environment variable BROWSERMOBPROXY_BIN; for example, if you extract it in the current folder, type:

export BROWSERMOBPROXY_BIN=./browsermob-proxy-2.1.1/bin/browsermob-proxy
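The lookup order described above (environment variable first, local folder otherwise) could be sketched as follows. This is only an illustration: the function name is hypothetical, and the `bin/browsermob-proxy` layout inside the default folder is assumed from the export example.

```python
import os

def browsermob_bin(base_dir="."):
    """Return the BrowserMob Proxy executable path, preferring BROWSERMOBPROXY_BIN."""
    # Assumed default: <base_dir>/browsermob-proxy/bin/browsermob-proxy
    default = os.path.join(base_dir, "browsermob-proxy", "bin", "browsermob-proxy")
    return os.environ.get("BROWSERMOBPROXY_BIN", default)
```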

A Java Runtime Environment is also required. In case you don't have it already, you can install it with:

sudo apt-get install python3 default-jre

3. Usage

To run the Web Traffic Generator, use the following command line:

web_traffic_generator.py [-h] [--version]
                                [--max-interval <max_interval>]
                                [--timeout <timeout>] [--headers] [--no-sleep]
                                [--browsers <number>] [--limit-urls <number>]
                                [--no-https]
                                input_file output_folder

Positional arguments:

  • input_file history file.
  • output_folder output folder name.

Optional arguments:

  • -h, --help show this help message and exit
  • --version show program's version number and exit
  • --max-interval <max_interval> use statistical intervals with a maximum value of <max_interval> seconds. Default is 30 sec.
  • --timeout <timeout> timeout in seconds after which a visit is declared failed. Default is 30 sec.
  • --headers save the headers of HTTP requests and responses in the HAR structs (e.g., to find the Referer field).
  • --no-sleep do not sleep between requests.
  • --browsers <number> number of browsers to open (to simulate multi-tabbing). Default is 3.
  • --limit-urls <number> limit the number of URLs to request to <number>.
  • --no-https do not replay pages on https.
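For readers who want to see how the options and defaults listed above fit together, here is a hedged `argparse` sketch that mirrors them. It is not taken from the tool's source: the argument types are assumptions, `--version` is omitted, and the file names in the example call are hypothetical.

```python
import argparse

def build_parser():
    """Build a parser mirroring the documented options (types are assumed)."""
    parser = argparse.ArgumentParser(prog="web_traffic_generator.py")
    parser.add_argument("input_file", help="history file")
    parser.add_argument("output_folder", help="output folder name")
    parser.add_argument("--max-interval", type=float, default=30,
                        metavar="<max_interval>")
    parser.add_argument("--timeout", type=float, default=30, metavar="<timeout>")
    parser.add_argument("--headers", action="store_true")
    parser.add_argument("--no-sleep", action="store_true")
    parser.add_argument("--browsers", type=int, default=3, metavar="<number>")
    parser.add_argument("--limit-urls", type=int, default=None, metavar="<number>")
    parser.add_argument("--no-https", action="store_true")
    return parser

# Hypothetical invocation: two browsers, no pauses between requests.
args = build_parser().parse_args(
    ["history.txt", "out", "--browsers", "2", "--no-sleep"])
```

With this call, `args.browsers` is 2 and `args.no_sleep` is true, while `args.timeout` and `args.max_interval` keep their documented default of 30 seconds.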

4. Output format

This tool creates a folder with the graphs of the timing distributions. In the same folder, it also creates an output file with the list of output HARs of all the requested URLs. Some HARs may be missing because of failed downloads (very slow pages that caused a timeout, pages that required HTTP authentication, etc.).

5. HAR parser

One or more output HAR files can be post-processed using the provided parser. The HAR parser can provide the graphs of the aggregate distribution of timings gathered in multiple files (e.g. generated by multiple runs of the Web Traffic Generator).
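One common way to show an aggregate distribution is an empirical CDF computed over the samples merged from several runs. The sketch below illustrates that idea only. Whether the provided parser plots CDFs or another kind of graph is not stated here, and the function name and sample values are hypothetical.

```python
def empirical_cdf(samples):
    """Return (sorted values, cumulative probabilities) for a list of timings."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered, [(i + 1) / n for i in range(n)]

run_a = [120, 340, 95]  # hypothetical page load times (ms) from one run
run_b = [210, 80]       # hypothetical page load times (ms) from another run
values, probs = empirical_cdf(run_a + run_b)
```

The two lists can then be passed to any plotting library as the x and y coordinates of a step plot.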

The HAR parser has the following command line:

HARparser.py [-h] [--version] [--no-https] input output_folder

Positional arguments:

  • input HAR file, or folder with HAR files.
  • output_folder output statistics folder name.

Optional arguments:

  • -h, --help show this help message and exit
  • --version show program's version number and exit
  • --no-https do not plot requests on https.
