Companion repository to the paper Enhancing Siamese Neural Networks through Expert Knowledge for Predictive Maintenance.
Please note that the MS-SNNs approach is referred to here as CaseBasedSimilarity (CBS), and CNN2D + MAR is called cnn2dWithAddInput.
The implementation of some components is based on the one presented in NeuralWarp (GitHub).
- The subdirectory supplementary_resources of this repository contains additional information about the data sets used and the architecture of the CNN2D + MAR model.
- An overview of all relevant conducted experiments
- The detailed logs for each of those experiments
- The raw data recorded with this simulation factory model used to generate the training and evaluation data sets.
- The preprocessed data set we used for the evaluation.
- Clone the repository
- Download the preprocessed data set and move it to the data folder
- Navigate to the neural_network folder and start the training and test procedure via python TrainAndTest.py > Log.txt
Used python version: 3.6.X
Used packages: See requirements.txt
Component | Specification |
---|---|
CPU | 2x 40x Intel Xeon Gold 6138 @ 2.00GHz |
RAM | 12 x 64 GB Micron DDR4 |
GPU | 8 x NVIDIA Tesla V100 32 GB GPUs |
- All settings can be adjusted in the script Configuration.py, whereby some rarely changed variables are stored in the file config.json, which is read in during the initialization.
- The hyperparameters of the neural networks can be defined in the script Hyperparameters.py or imported from a file in configuration/hyperparameter_combinations/ (this can also be changed in the configuration).
- For training, first make the desired adjustments in the files mentioned above, then start the training by running Training.py.
- A trained model can be evaluated on the test data set via Inference.py. To do this, the folder that contains the model files must first be specified in the configuration.
- To execute the real-time data processing using RealTimeClassification.py, a Kafka server must first be configured and running. The topic names and their mappings to prefixes must also be set correctly.
- The data/ directory contains all required data. Central are the pre-processed training data in data/training_data/ and the trained models in data/trained_models/. A detailed description of what each directory contains is given in corresponding parts of the configuration file.
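
To illustrate how rarely changed settings might be read from config.json during initialization, here is a minimal sketch. The class `MinimalConfiguration`, the helper functions, and the layout of the JSON entries are assumptions made for illustration. Only the file name config.json and the entry names relevant_features and datasets are taken from this README, and the actual Configuration.py may be organized differently.

```python
import json
import os
import tempfile


class MinimalConfiguration:
    """Hypothetical stand-in for Configuration.py, not the repository's class."""

    def __init__(self, config_path):
        # Rarely changed values are read from the JSON file once, at initialization.
        with open(config_path, "r", encoding="utf-8") as f:
            data = json.load(f)
        # Entry names come from this README; their structure is assumed.
        self.relevant_features = data.get("relevant_features", {})
        self.datasets = data.get("datasets", [])


def write_example_config(path):
    # Assumed layout: features per stream, and one
    # (path, start timestamp, end timestamp) entry per data set.
    example = {
        "relevant_features": {"acceleration": ["x", "y", "z"]},
        "datasets": [
            ["data/example_run/", "2019-01-01 00:00:00", "2019-01-01 01:00:00"]
        ],
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(example, f)


def load_demo_config():
    # Write the example file to a temporary folder and load it again.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.json")
        write_example_config(path)
        return MinimalConfiguration(path)
```

Calling `load_demo_config()` returns an object whose `relevant_features` and `datasets` attributes mirror the example file.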
The following section gives an overview of the packages, directories and included Python scripts in this repository.
Python script | Purpose |
---|---|
ExampleCounter.py | Displays the example distribution in the training data and the case base. |
ExtractCases.py | Automatically determines the time intervals at which simulated wear is present on one of the motors and exports these to a text file. |
LightBarrierAnalysis.py | Used for manual determination of error case intervals for data sets with light barrier errors. |
PressureAnalysis.py | Used for manual determination of error case intervals for data sets with simulated pressure drops. |
CaseGrouping.py | Generates an overview of the features used for each error case and creates a grouping of cases based on this. |
The archive contains currently unused code fragments that could potentially be useful again, old configurations and such.
Python script | Purpose |
---|---|
BaselineTester.py | Provides the possibility to apply other methods for determining similarities of time series, e.g. DTW, to the data set. |
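
For readers unfamiliar with DTW (dynamic time warping), the following is a minimal sketch of the classic dynamic-programming formulation for two univariate series, plus a nearest-case lookup built on it. It is not the code from BaselineTester.py. The function names and the labels in the usage example are hypothetical, and the data in this repository is multivariate, so a real baseline would need a per-time-step distance over all features.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two sequences of numbers."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def nearest_case(query, cases):
    """Return the label of the case whose series is closest to the query.

    cases is a list of (label, series) pairs.
    """
    return min(cases, key=lambda case: dtw_distance(query, case[1]))[0]
```

Usage with made-up labels: `nearest_case([0, 0, 1, 2, 3], [("no_failure", [0, 0, 0, 0]), ("wear", [0, 1, 2, 3])])` selects the case with the smaller DTW distance to the query.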
Python script | Purpose |
---|---|
CaseBasedSimilarity.py | Contains the implementation of the case-based similarity measure (CBS). |
Inference.py | Evaluation of a CBS model based on the test data set. |
Training.py | Used for training a CBS model. |
Python script | Purpose |
---|---|
Configuration.py | The configuration file within which all adjustments can be made. |
Hyperparameters.py | Contains the class that stores the hyperparameters used by a single neural network. |
Python script | Purpose |
---|---|
CaseBaseExtraction.py | Provides extraction of a case base from the entire training data set. |
DataImport.py | This script executes the first part of the preprocessing. It consists of reading the unprocessed sensor data from Kafka topics in JSON format as a *.txt file (e.g., acceleration, BMX, txt, print) and then saving it as export_data.pkl in the same folder. This script also defines which attributes/features/streams are used via config.json with the entry "relevant_features". Which data is processed can also be set in config.json with the entry datasets (path, start, and end timestamp). |
DataframeCleaning.py | This script executes the second part of the preprocessing of the training data. It needs the export_data.pkl file generated in the first step. The cleanup procedure consists of the following steps: 1. Replace True/False with 1/0, 2. Fill NA for boolean and integer columns with values, 3. Interpolate NA values for real valued streams, 4. Drop first/last rows that contain NA for any of the streams. In the end, a new file, called cleaned_data.pkl, is generated. |
DatasetCreation.py | Third part of preprocessing. Conversion of the cleaned data frames of all partial data sets into the training data. |
DatasetPostProcessing.py | Additional, subsequent changes to a dataset are done by this script. |
RealTimeClassification.py | Contains the implementation of the real time data processing. |
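
The four cleanup steps listed for DataframeCleaning.py can be sketched with pandas as follows. This is an illustration under assumptions, not the repository's implementation: the function name, the column names in the usage note, and the choice of forward filling for boolean and integer columns are hypothetical, since the description above only says that these columns are filled "with values".

```python
import numpy as np
import pandas as pd


def clean_dataframe(df, bool_cols, int_cols, real_cols):
    """Apply the four cleanup steps to a copy of df (columns given as lists)."""
    df = df.copy()

    # 1. Replace True/False with 1/0 (missing entries stay NaN).
    for col in bool_cols:
        df[col] = df[col].map({True: 1, False: 0})

    # 2. Fill NA in boolean and integer columns.
    #    Assumption: carry the last known value forward.
    discrete_cols = bool_cols + int_cols
    df[discrete_cols] = df[discrete_cols].ffill()

    # 3. Interpolate NA in real-valued streams, only between valid values,
    #    so leading and trailing gaps remain for step 4.
    df[real_cols] = df[real_cols].interpolate(method="linear", limit_area="inside")

    # 4. Drop leading and trailing rows that still contain NA in any stream.
    valid_positions = np.flatnonzero(df.notna().all(axis=1).to_numpy())
    if len(valid_positions) == 0:
        return df.iloc[0:0]
    return df.iloc[valid_positions[0]:valid_positions[-1] + 1]


def demo_frame():
    # Small made-up frame with gaps in every column.
    return pd.DataFrame({
        "light_barrier": [np.nan, True, np.nan, False, True],
        "counter": [np.nan, 1, 2, np.nan, 4],
        "pressure": [np.nan, 1.0, np.nan, 3.0, np.nan],
    })
```

Calling `clean_dataframe(demo_frame(), ["light_barrier"], ["counter"], ["pressure"])` keeps only the middle rows, because the first row has no valid values and the last row still lacks a pressure reading.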
Python script | Purpose |
---|---|
FabricSimulation.py | Script to simulate the production process for easier development of real time evaluation. |
Used to store the outputs/logs of inference/test runs for future evaluation.
Python script | Purpose |
---|---|
BasicNeuralNetworks.py | Contains the implementation of all basic types of neural networks, e.g. CNN, FFNN. |
Dataset.py | Contains the class that stores the training data and metadata about it. Used by any script that uses the generated data set. |
Evaluator.py | Contains an evaluation procedure which is used by all test routines, i.e. SNNs, CBS and baseline testers. |
Inference.py | Provides the ability to test a trained model on the test data set. |
Optimizer.py | Contains the optimizer routine for updating the parameters during training. Used for optimizing SNNs as well as the CBS. |
SimpleSimilarityMeasure.py | Several simple similarity measures for calculating the similarity between the embedding vectors are implemented here. |
SNN.py | Includes all four variants of the siamese neural network (classic architecture or optimized variant, simple or FFNN similarity measure). |
TrainAndTest.py | Execution of a training followed by automatic evaluation of the model with best loss. |
Training.py | Used to execute the training process. |
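
To make the idea of a simple similarity measure between embedding vectors concrete, here is a small sketch of two common choices. Which measures SimpleSimilarityMeasure.py actually implements is not stated in this README, so both functions and their names are assumptions for illustration only.

```python
import math


def abs_mean_similarity(a, b):
    """exp(-mean |a_i - b_i|): 1.0 for identical vectors, approaching 0 as they diverge."""
    mean_abs = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return math.exp(-mean_abs)


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

In a siamese setup, both inputs pass through the same encoder and a function of this kind is applied to the two resulting embeddings. The FFNN variants mentioned above replace such a fixed function with a learned one.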
Due to the large number of different models and configuration options, not all components can be used together. The following table shows the current compatibility status of the different models with the execution variants. Please also note that the real-time classification is still under development and may not work currently.
Encoder | SNN Standard, Simple | SNN Standard, FFNN | SNN Fast, Simple | SNN Fast, FFNN | CBS Standard, Simple | CBS Standard, FFNN | CBS Fast, Simple | CBS Fast, FFNN |
---|---|---|---|---|---|---|---|---|
CNN | Working | Working | Working | Working | Working | Working | Working | Working |
RNN | Working | Working | Working | Working | Working | Working | Working | Working |
cnn2dwithaddinput | Working | Not working / Error | Not implemented yet | Not implemented yet | Not planned / necessary | Not planned / necessary | Not planned / necessary | Not planned / necessary |
cnn2d | Working | Working | Working | Working | Not working / Error | Not working / Error | Not working / Error | Not working / Error |