The P2P Testing Framework is a framework for the automated running of peer-to-peer clients. It supports different clients, different types of hosts and several ways of processing the output. Most importantly, it is easily extended to suit your needs. Campaign and scenario files describe how a test is to be run (which client, which hosts, how to connect, how to process data, etc.) in a simple declarative language. Python modules, subclasses of the core modules, are loaded to handle most of the specific parts, all glued together by the core of the framework.

This document contains the general documentation, including a version history, the design of the framework and the parameters to the core modules. The HOWTO document gives a more introductory description of how to use the framework and how to extend it, both with examples. For complete documentation of the framework, please run 'doxygen doxy' inside the Docs directory.

= Version History =
See CHANGES for details.
2.4.0
    - Support for desktop notifications
    - Multiple instances of the same execution
    - Multiple root hashes per file
    - Fixes in the way multiple files are handled
2.3.0
    - Multiplexed connections on host:das4
    - Multiple file support
    - Argument selectors are supported when specifying references to hosts, files or clients
2.2.0
    - Changed connection setups for stability
2.1.0
    - Important extensions to the framework, including much better host support and multiple parser support
    - New workload module type
    - Strongly improved performance
    - client:libtorrent module added
    - file:remote module added
    - parser:cpulog module added, along with client CPU logging
    - processor:savetimeout module added
    - Logs will be salvaged on unsuccessful runs
2.0.0
    - Complete port of the framework to python
    - Many changes in the internal structure; campaign and scenario files should still work

= Framework design =
The python framework consists of the core script (ControlScripts/run_campaign.py), the core modules (ControlScripts/core/) and the extension modules (ControlScripts/modules). The core script parses the settings on the command line as well as campaign and scenario files in order to create the CampaignRunner and ScenarioRunner objects. This includes loading all necessary core modules and extension modules. The ScenarioRunner object knows how to run a full scenario, which is basically just stepping through all stages and instructing the loaded objects on what to do for each stage.

== Core script ==
The core script initializes everything and glues the parts together. Its ScenarioRunner class is of interest to extension modules, since it contains all the objects in an execution. Those objects are managed using the addObject(...), getObjects(...), getObjectsDict(...) and resolveObjectName(...) methods.

The general flow of the framework is documented below in the Stages section.

== Core modules ==
The core modules are always available. Most of them are the parent classes for the extension modules and will be described below. The other core modules provide global services.

See the description of the extension modules below or the HOWTO document for more information on using parameters. The REFERENCE contains a full reference of parameters.

=== core.campaign.Campaign ===
A static class with global properties that hold for all campaigns being executed. Elements such as loggers and paths to important local directories are found here, but also a reference to the campaign currently being executed. An important service provided by the Campaign class is the ability to load modules. Using the loadModule(...) and loadCoreModule(...) functions all core and extension modules can be dynamically loaded.
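
For illustration, a minimal sketch of dynamic module loading; the exact signatures of loadModule(...) and loadCoreModule(...) are not given in this document, so the argument lists below are assumptions (see the doxygen documentation for the real ones):

    # Hypothetical sketch; the argument lists are assumptions.
    from core.campaign import Campaign

    sshHostClass = Campaign.loadModule( 'host', 'ssh' )   # assumed: (module type, subtype)
    parsing = Campaign.loadCoreModule( 'parsing' )        # assumed: (core module name)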

=== core.coreObject.coreObject ===
The parent class of all extension modules and hence the parent class of all extension module parent classes. This class provides a few basic functions, such as naming and cleanup.

=== core.debuglogger.debuglogger ===
An instance of this class is always available through core.campaign.Campaign.debuglogger. It is used for logging communication between the commanding host (the host running the framework) and the hosts doing the actual work. Host modules use this object for logging their communications.

=== core.execution ===
The core of the P2P Testing Framework revolves around executions of clients on hosts operating on files. The execution object contains that combination: host, client and file. It also knows whether the execution is a seeder or a leecher and has a unique number across the campaign and hence across the scenario.

Execution objects are declared along with the other objects and have the following parameters:
- host          The name of the host object on which the execution should run. Required
- client        The name of the client object which should be executed. Required
- file          The name of a file object which is to be transferred. Optional, may be specified multiple times
- parser        The name of the parser object which should parse the logs of this execution. Optional, can be specified multiple times. See the description of client extension modules for how parsers are selected.
- seeder        Set to anything but '' to mark this execution as a seeding execution. Optional, defaults to ''
- timeout       A non-negative floating point number that indicates a number of seconds to wait before actually starting the client after the scenario starts. Optional, defaults to 0
- keepSeeding   Set to anything but '' to keep this seeding execution running until it ends by itself; normally seeders are killed when all leechers have finished. Optional, defaults to ''
- multiply      Specify a positive integer number of copies to be created of this execution. Optional, defaults to 1.

The host and file parameters of an execution are the most important places for use of argument selectors.
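
For example, a seeding and a leeching execution could be declared as follows (object names are illustrative; see the HOWTO for the exact file syntax):

    [execution]
    host=seederHost
    client=swiftClient
    file=bigFile
    seeder=yes

    [execution]
    host=leecherHost
    client=swiftClient
    file=bigFile
    timeout=5.0
    multiply=4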

=== core.logger.logger ===
The generic scenario logger object. An instance is always available through core.campaign.Campaign.logger. It is used for logging anything that needs logging. Several convenience functions are provided to handle exceptions and tracebacks.

=== core.meta.meta ===
Contains a few static functions that allow creation of metadata, such as Merkle root hashes or torrent files.

=== core.parsing ===
This module provides several functions that make it easier to parse arguments in the scenario files. Often used are isPositiveInt(...) and isPositiveFloat(...).
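
For illustration, a hedged sketch of how an extension module might use these helpers when validating a parameter; the parseSetting(key, value) signature is an assumption based on the Stages section below:

    # Hypothetical validation code inside an extension module;
    # the (self, key, value) signature of parseSetting is an assumption.
    from core.parsing import isPositiveInt

    def parseSetting(self, key, value):
        if key == 'multiply':
            if not isPositiveInt( value ):
                raise Exception( "multiply should be a positive integer, got '{0}'".format( value ) )
            self.multiply = int( value )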

== Extension modules ==
The extension modules provide all the actual functionality. Described here are the parent classes to the extension modules and their parameters. For the parameters of the specific extension modules see the documentation of their classes.

To get a better idea of how to use the parameters in the configuration of the framework, please see the HOWTO document.

=== host ===
Host modules provide the connection to a host and the services to run commands on that host and to send or retrieve files. Each host module usually keeps track of a number of connections to the host, which can be used to send commands and to send or retrieve files.

Host objects also include information about traffic control that needs to be put on the host. Such traffic control is defined using parameters to the host object, but is implemented by the tc extension modules.

Parameters:
- name              The name of the host object. This name is used to refer to the host object throughout the scenario. Usually required (particular extension modules sometimes provide a default)
- remoteDirectory   The path to a directory on the remote host which can be used to store temporary files in during the scenario. Optional, a temporary directory will be created by default (in /tmp usually)

By default the following host modules are provided:
- host:local        Uses the local host, mainly for testing. If you wish to use the local host for serious scenarios consider using host:ssh to 127.0.0.1.
- host:ssh          Uses a host that can be approached via SSH. This is the preferred way of contacting hosts.
- host:das4         Special handler for those with access to the DAS4 system.

==== tc ====
TC modules provide traffic control to a host. They probably have special requirements on the host and your access to it, so be sure to always check that. Traffic control can be used to simulate a different networking environment, e.g. with lower speeds and lossy connections.

TC modules don't have parameters of their own, but use the parameters set on the host object they operate on. The following parameters are therefore set on the host object, but used by the TC module.

Parameters:
- tc                    The name of the TC module to load, without any prefixes. E.g. tc=netem to load the tc:netem module on the host. Optional, empty by default which disables TC
- tcInterface           The name of the interface on which traffic control is to be applied. This should be an existing networking interface on the remote host. Optional, defaults to eth0
- tcMaxDownSpeed        Maximum download speed to allow. To be specified in bits per second, possibly postfixed by kbit or mbit. E.g. tcMaxDownSpeed=10mbit for 10 mbit speeds. Optional, defaults to 0 meaning no restrictions
- tcMaxDownBurst        Maximum burst in the download speed. Not allowed if tcMaxDownSpeed is not set. To be specified in bits per second. Optional, defaults to equal to tcMaxDownSpeed
- tcMaxUpSpeed          Like tcMaxDownSpeed, but for upload speed.
- tcMaxUpBurst          Like tcMaxDownBurst, but for upload speed.
- tcLossChance          Chance to drop a packet. A floating point number between 0.0 and 100.0 inclusive, specifying the chance as a percentage. Optional, defaults to 0.0
- tcDuplicationChance   Chance that a packet will be duplicated. A floating point number between 0.0 and 100.0 inclusive, specifying the chance as a percentage. Optional, defaults to 0.0
- tcCorruptionChance    Chance that a packet will be corrupted. A floating point number between 0.0 and 100.0 inclusive, specifying the chance as a percentage. Optional, defaults to 0.0
- tcDelay               The delay to introduce on each packet in ms, given as a positive integer. Optional, defaults to 0
- tcJitter              The maximum deviation on the introduced delay, as set by tcDelay, in ms. Optional, defaults to 0

By default the following tc modules are provided:
- tc:netem              Uses the netem kernel module with the tc utility
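
Tying these parameters together, a host declaration with traffic control might look as follows (module-specific connection parameters, such as the address for host:ssh, are omitted; see the HOWTO for the exact file syntax):

    [host:ssh]
    name=slowHost
    tc=netem
    tcInterface=eth0
    tcMaxDownSpeed=10mbit
    tcDelay=50
    tcJitter=10
    tcLossChance=1.0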

=== file ===
File modules describe data to be transferred. Usually this will consist of one or more files. A file object also includes metadata for the file, such as Merkle root hashes or torrent files.

Parameters:
- name              The name of the file object. This name is used to refer to the file object throughout the scenario. Required.
- rootHash[xx]      A Merkle root hash of the file, consisting of 40 hexadecimal digits. Replace xx in the parameter name with the chunk size in kbytes upon which the root hash is based (that is, the size of the data from which
                    each leaf hash is calculated). Postfix this with L to get legacy root hashes, i.e. where the root is always the 63rd level in the tree. Examples of parameter names: rootHash[1]=... rootHash[8L]=...
                    Optional, may be specified multiple times but not for the same chunk size.
- metaFile          A file with metadata, such as a torrent file. This file will be made available to all client executions, both seeders and leechers. Should be a path to a file on the command machine. Optional.

By default the following file modules are provided:
- file:local        Specifies a local file or directory to use as data.
- file:remote       Specifies a remote file or directory to use as data.
- file:fakedata     Creates fake data on the remote host that is always the same, non-trivial, real, and of configurable size.

The file:none module exists, but is deprecated: just don't pass any file parameters to the execution.
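
As an example, a file object with a root hash and a torrent file could be declared like this (the hash and path are illustrative, and module-specific parameters, such as where file:local finds its data, are omitted):

    [file:local]
    name=bigFile
    rootHash[1]=0123456789abcdef0123456789abcdef01234567
    metaFile=/home/me/testdata/big.torrent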

=== client ===
Client modules run the client applications that are to be tested. To do this they use the services and information from all parts of the execution. The client module is responsible for everything regarding the client, from downloading and compiling, via running, to killing and retrieving logs. This burden is partly offloaded to the builder and source extension modules, and mostly present in the parent class.

Client objects also include information about where they are located and how they are to be built. Such information is defined using parameters to the client object, but is used by the source and builder extension modules.

On parser selection: each execution runs one or more parsers after all the raw logs have been retrieved. The first set of parsers in the following list is used:
1) All parsers specified in the execution object
2) All parsers specified in the client object
3) The declared parser object with the same name as the client module subtype (i.e. a parser object named 'swift' for [client:swift], no matter the type of the parser object or the name of the client object)
4) A new parser of the same subtype as the client (i.e. a new [parser:swift] for a [client:swift])
For steps 1 and 2 the actual parser objects are looked up by first looking at the declared parser objects: if the name is among the declared parser objects' names, that one is used. If no parser object is declared with that name, but the name is the same as a parser module subtype, that subtype will be loaded. So to run a parser:cpulog with no parameters on a given client, just specifying parser=cpulog with the client is enough, as in the example below.
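
For example, to run both the client's own parser and a CPU log parser on a swift client (assuming no parser objects named 'swift' or 'cpulog' are declared, so both names resolve to module subtypes):

    [client:swift]
    name=swiftClient
    profile=yes
    parser=swift
    parser=cpulog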

Parameters:
- name                  The name of the client object. This name is used to refer to the client object throughout the scenario. Optional, defaults to the name of the extension module used
- extraParameters       Extra parameters to be appended on the command line to the client. Client specific. Optional, defaults to ''
- parser                The name of the parser object to be used to parse logs from this client. Optional, defaults to a new parser with the same name as the name of the extension module used; may be specified multiple times
- profile               Set this to anything but "" to include external profiling code that will inspect CPU and memory usage every second, which will be captured in the raw cpu.log. Optional, defaults to ''
- logStart              Set this to anything but "" to log the starting time of the client, which will be captured in the raw starttime.log. Note that this uses the local clock of the remote host. Optional, defaults to ''

By default the following client modules are provided:
- client:http           Uses lighttpd and aria2 to provide HTTP(S) downloads
- client:opentracker    Allows running the opentracker BitTorrent tracker software, useful in combination with other BitTorrent clients
- client:utorrent       Uses the uTorrent binary clients with the webui
- client:swift          Uses the libswift command line client
- client:libtorrent     Uses the libtorrent mini command line client distributed via https://github.com/schaap/p2p-clients

==== source ====
source modules instruct the framework how to retrieve the source or binaries of the client.

source modules do not have parameters of their own, but use the parameters set on the client object they operate on. The following parameters are therefore set on the client object, but used by the source module.

Parameters:
- source            The name of the source module to load, e.g. source=local to use source:local. Optional, defaults to source:directory
- remoteClient      Set to anything but '' to signal that the sources are to be loaded, or found, on the remote host instead of the commanding host. Optional, defaults to ''
- location          The location of the sources. The contents of this parameter depends on the source module used. Required.

By default the following source modules are provided:
- source:directory  Assumes the sources or binaries to be present in the directory pointed to by location; if remoteClient is set this is a directory on the remote host, otherwise on the commanding host
- source:local      Assumes the sources or binaries to be present in the directory on the commanding host pointed to by location; if remoteClient is set this means the local sources are first uploaded before the builder starts
- source:git        The location is a valid git repository that can be cloned

==== builder ====
builder modules know how to compile the sources of a client.

builder modules do not have parameters of their own, but use the parameters set on the client object they operate on. The following parameters are therefore set on the client object, but used by the builder module.

Parameters:
- builder       The name of the builder module to load, e.g. builder=make to use builder:make. Optional, defaults to builder:none

By default the following builder modules are provided:
- builder:none      The client has already been built. Compilation is skipped.
- builder:make      Uses (GNU) make to build the client
- builder:scons     Calls the scons building program to build the client
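
Putting the source and builder parameters together, a client that is cloned from git and compiled with make might be declared like this (the repository URL is illustrative):

    [client:swift]
    name=myClient
    source=git
    location=git://github.com/libswift/libswift.git
    builder=make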

=== workload ===
workload generator modules can change the executions such that the arrival times of the clients simulate specific workloads.

Parameters:
- apply             Specifies the name of a client object to apply the workload to; this means every execution of that client (but see applyToSeeders) will be changed to be included in the generated workload. Optional and may be specified multiple times. If apply is never specified, then all clients that are part of a (non-seeding) execution are added as soon as all objects have been loaded.
- applyToSeeders    By default a workload generator will only change non-seeding executions. Set this to 'yes' to have it change seeding executions as well. Optional
- offset            Starting time of the simulated workload from the start of the scenario in seconds. Optional, floating point

Please note that workloads do not accept parameters with argument selectors.

By default the following workload generator modules are provided:
- workload:linear   Creates a division of the clients to arrive at a linear rate
- workload:poisson  Creates a division of the clients to arrive like a poisson process
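
A workload declaration could look like this (module-specific parameters, such as the rate of the poisson process, are omitted):

    [workload:poisson]
    apply=myClient
    offset=10.0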

=== parser ===
parser modules know how to parse the output of a client. They are specified per execution or per client. A parser module takes as input the raw logs as retrieved from the remote host and outputs parsed logs.

Parameters:
- name                  The name of the parser object. This name will be used to refer to the parser object throughout the scenario. Optional, defaults to the name of the extension module used

By default the following parser modules are provided:
- parser:none           A dummy implementation parsing nothing
- parser:http           A copy of parser:none for easier use with client:http
- parser:aria2          The parser for logs from aria2 as retrieved by client:http
- parser:lighttpd       The parser for logs from lighttpd as retrieved by client:http
- parser:opentracker    A copy of parser:none for easier use of client:opentracker
- parser:utorrent       The parser for logs from utorrent as retrieved by client:utorrent
- parser:swift          The parser for logs from swift as retrieved by client:swift
- parser:cpulog         A parser for CPU logs as generated by having the profile parameter set on a client
- parser:libtorrent     The parser for logs from libtorrent as retrieved by client:libtorrent

All parsers that are provided by default, except for parser:cpulog, parser:none and its clones, provide the same output format. It is not required to use this format: any format is fine as long as it's documented.

=== processor ===
processor modules can process raw and/or parsed logs into nicer datasets or visualizations or whatever.

processor modules do not have generic parameters: just declaring their object to be present is usually enough. Do look at the particular extension module you use for parameters it might need, though.

By default the following processor modules are provided:
- processor:savehostname        Creates a simple text file for each execution with the name of the host object the execution ran on.
- processor:saveisseeder        Creates a simple text file for each execution with "YES" in it if the execution was a seeder; "NO" is in it otherwise.
- processor:savetimeout         Creates a simple text file for each execution with the timeout in seconds (float) before the client was launched.
- processor:gnuplot             Runs a given gnuplot script for each parsed log in an attempt to create nice graphs.

For the processor:gnuplot two scripts are provided as well:
- TestSpecs/processors/simple_log_gnuplot       Generates a graph for the output of the provided parsers for client logs
- TestSpecs/processors/simple_cpu_gnuplot       Generates a graph for the output of parser:cpulog

=== viewer ===
viewer modules take all the data together and provide a nice view of the data.

viewer modules do not have generic parameters: just declaring their object to be present is usually enough. Do look at the particular extension module you use for parameters it might need, though.

By default the following viewer modules are provided:
- viewer:htmlcollection         Creates an HTML page that describes the whole scenario.

= Stages =
This section describes the workflow inside the testing framework for each scenario. The code below is pseudocode that matches what the ScenarioRunner does.

0) Read the combined scenario file
    a) With each declared object: create the object, call parseSetting(...) on it for each parameter and finally call checkSettings() on it
    b) With each host object: call .doPreprocessing() on it
    c) With each file object: call .doPreprocessing() on it
    d) With each object: call .resolveNames() on it
    e) With each execution:
        I) Add the client object in the execution to the execution's host's client set
        II) Add the file objects in the execution to the execution's host's files and,
            if needed, seedingFiles sets
1) Collect all hosts that are part of an execution in a set executionHosts
2) Call host.prepare() on each host in executionHosts; this prepares those hosts and sets up connections to them
3) Collect all hosts that are part of an execution in a set executionHosts, again (see the note at the end of this section)
4) With each workload generator
    a) Call workload.applyWorkload(), which changes the executions
5) On all client objects, call client.prepare(); this will prepare the client binaries,
   including up/downloading sources and compilation
6) With each host in executionHosts
    a) If the host requests TC
        I) Analyse the hosts and clients to see which kind of TC (inbound/outbound,
           port restricted/fully restricted) is needed
        II) Load the tc module for this host
        III) Check to see that the TC option needed can actually be loaded (calls tc.check(host) )
            0) If not, try to fall back to more restrictive TC
            0) If nothing is possible, even including fallbacks, fail the scenario
        IV) Save the way TC is to be done in the host
    b) Call client.prepareHost(host) for each client in host.clients,
       which contains all clients that will run on the host
7) With each host in executionHosts
    a) Call file.sendToHost(host) for each file in host.files,
       which contains all the files that will be seeded from or leeched to the host
    b) Call file.sendToSeedingHost(host) for each file in host.seedingFiles,
       which contains all the files that will be seeded from the host
8) With each execution
    a) Call execution.client.prepareExecution(execution)
9) With each host in executionHosts
    a) If the host requests TC
        I) Install the TC (calls tc.install(host) )
10) With each execution
    a) Prepare an execution specific connection to the host of the execution
11) With each execution, in parallel (with each other and with step 12)
    a) Wait the specified timeout
    b) Call execution.client.start(execution) to start the client on the host
12) While the timelimit has not been reached
    a) Sleep at most 5 seconds
    b) With each execution that is not a side service
        I) Call execution.client.isRunning(execution) to see if the client is still running
            0) If so, stop checking the other executions and continue with 12
13) With each execution, in parallel
    a) Call execution.client.isRunning(execution) to see if the client is still running
        I) If so, call execution.client.kill(execution) to have the client killed
14) With each host in executionHosts
    a) If the host requests TC
        I) Remove the TC (calls tc.remove(host) )
15) With each execution for which the client is not a side service, in parallel
    a) Call execution.client.retrieveLogs(execution) to retrieve the client logs for the execution
    b) Call execution.runParsers(...)
        I) If parsers were set for the execution
            0) Call execution.parser.parseLogs(...)
        II) Otherwise
            0) Retrieve a list plist of parsers by calling execution.client.loadDefaultParsers(execution)
            0) Call p.parseLogs(...) for each p in plist
16) With each host
    a) Create a new connection to the host to use for cleanup
17) With each execution
    a) Call execution.client.hasStarted(execution) and execution.client.isRunning(execution)
       to find out if the client is running
        I) If so, call execution.client.kill(execution) to kill the client
18) With each file
    a) Call file.cleanup()
19) With each host
    a) With each client in host.clients, which contains all the clients that will/have run on the host
        I) Call client.cleanupHost( host )
20) With each client
    a) Call client.cleanup()
21) With each host
    a) If the host requests TC
        I) Try to remove TC (calls tc.remove(host) )
    b) Call host.cleanup(), which also cleans up the connections, including the cleanup connection
22) With each processor
    a) Call processor.processLogs(...)
23) With each viewer
    a) Call viewer.createView(...)

Steps 16 through 21 are the cleanup, which is guarded against errors at each call and will always run. Note that it can also run at any moment in time, e.g. due to an Exception being raised, so from any step before 16 one can always jump straight into 16. In case of such a jump the scenario stops after step 21. A particular jump occurs after step 6a: if the run is just a test run, step 6b will not be executed and once step 6 is done the jump to cleanup is taken.
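
In Python terms this guarantee amounts to a try/finally construction with individually guarded cleanup calls; a minimal sketch of the idea (not the actual ScenarioRunner code, all names hypothetical):

    # Minimal sketch of the cleanup guarantee; all names are hypothetical.
    def runScenario(self):
        try:
            self.runSteps0Through15()
        finally:
            self.runCleanup()       # steps 16 through 21

    def runCleanup(self):
        # Each cleanup call is guarded separately, so one failing step
        # does not prevent the others from running.
        for f in self.files:
            try:
                f.cleanup()
            except Exception:
                pass    # the real framework logs the exception instead of ignoring it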

Step 15 exists in two different versions: at any point during steps 1 through 15 an error might occur; in that case a specially guarded version of 15 is run before cleanup is started. During normal execution, step 15 is not guarded.

Please note the peculiar double collection of executionHosts: once before and once after preparing the hosts. The framework explicitly allows the collection of objects to be changed by the preparation of the hosts, as long as any host that is added is guaranteed to have been prepared if it is also part of an execution. This is for example exploited by host:das4, which creates a host object for every node in its preparation phase.
