PredM

Companion repository to the paper Enhancing Siamese Neural Networks through Expert Knowledge for Predictive Maintenance.
Please note that the MS-SNNs approach is referred to here as CaseBasedSimilarity (CBS) and that CNN2D + MAR is called cnn2dWithAddInput. The implementation of some components is based on the one presented in NeuralWarp (GitHub).

Supplementary Resources

The subdirectory supplementary_resources of this repository contains additional information about the data sets used and the architecture of the CNN2D + MAR model.
An overview of all relevant conducted experiments
The detailed logs for each of those experiments
The raw data recorded with this simulation factory model used to generate the training and evaluation data sets.
The preprocessed data set we used for the evaluation.

Quick start guide: How to start the model?

Clone the repository
Download the preprocessed data set and move it to the data folder
Navigate to the neural_network folder and start the training and test procedure via python TrainAndTest.py > Log.txt

Requirements

Python version used: 3.6.X
Packages used: see requirements.txt

Hardware

CPU	2x 40x Intel Xeon Gold 6138 @ 2.00GHz
RAM	12 x 64 GB Micron DDR4
GPU	8 x NVIDIA Tesla V100 32 GB GPUs

General instructions for use

All settings can be adjusted in the script Configuration.py. Some rarely changed variables are stored in the file config.json, which is read during initialization.
The hyperparameters of the neural networks can be defined in the script Hyperparameters.py or imported from a file in configuration/hyperparameter_combinations/ (this can also be changed in the configuration).
For training, first make the desired adjustments in the files mentioned above, then start the training by running Training.py.
A trained model can be evaluated on the test data set via Inference.py. To do this, the folder that contains the model files must first be specified in the configuration.
To run the real-time data processing with RealTimeClassification.py, a Kafka server must first be configured and running. The topic names and their mappings to prefixes must also be set correctly.
The data/ directory contains all required data. The most important parts are the preprocessed training data in data/training_data/ and the trained models in data/trained_models/. A detailed description of each directory's contents is given in the corresponding parts of the configuration file.
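
The split between settings kept in the script and settings read from config.json could look roughly like the following. This is a minimal sketch, not the repository's Configuration.py: the class body, the attributes batch_size and models_folder, and the inner structure of the JSON entries are assumptions made for illustration. Only the entry names "relevant_features" and "datasets" are taken from this README.

```python
import json


class Configuration:
    """Hypothetical, reduced stand-in for the class in Configuration.py."""

    def __init__(self, config_json_path="config.json"):
        # Frequently changed settings are plain attributes in the script.
        # Both names below are illustrative assumptions.
        self.batch_size = 64
        self.models_folder = "data/trained_models/"

        # Rarely changed settings are read from config.json during initialization.
        self.load_config_json(config_json_path)

    def load_config_json(self, path):
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)

        # "relevant_features" and "datasets" are the entries named in this README;
        # the structure of their values is assumed here.
        self.relevant_features = data["relevant_features"]
        self.datasets = data["datasets"]
```

With such a layout, other scripts would only need to create one Configuration object and read its attributes, so a change in config.json takes effect everywhere without editing code.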

Software components

The following section gives an overview of the packages, directories and included Python scripts in this repository.

analytic_tools

Python script	Purpose
ExampleCounter.py	Displays the example distribution in the training data and the case base.
ExtractCases.py	Automatically determines the time intervals at which simulated wear is present on one of the motors and exports these to a text file.
LightBarrierAnalysis.py	Used for manual determination of error case intervals for data sets with light barrier errors.
PressureAnalysis.py	Used for manual determination of error case intervals for data sets with simulated pressure drops.
CaseGrouping.py	Is used to generate an overview of the features used for each error case and to create a grouping of cases based on this.

baseline

Python script	Purpose
BaselineTester.py	Provides the possibility to apply other methods for determining similarities of time series, e.g. DTW, to the data set.
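
As an illustration of the kind of baseline mentioned above, the following is a minimal sketch of dynamic time warping (DTW) for two one-dimensional sequences, using the absolute difference as local cost. It is not taken from BaselineTester.py, which may rely on a library and operate on multivariate time series; the function name dtw_distance is an assumption.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance between two 1-D sequences."""
    n, m = len(a), len(b)

    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A smaller distance means more similar series. For example, dtw_distance([1, 2, 3], [1, 2, 2, 3]) is 0 because the repeated value can be absorbed by warping.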

case_based_similarity

Python script	Purpose
CaseBasedSimilarity.py	Contains the implementation of the case-based similarity measure (CBS).
Inference.py	Evaluation of a CBS model based on the test data set.
Training.py	Used for training a CBS model.

configuration

Python script	Purpose
Configuration.py	The configuration file within which all adjustments can be made.
Hyperparameters.py	Contains the class that stores the hyperparameters used by a single neural network.

data_processing

Python script	Purpose
CaseBaseExtraction.py	Provides extraction of a case base from the entire training data set.
DataImport.py	This script executes the first part of the preprocessing. It consists of reading the unprocessed sensor data from Kafka topics in JSON format as a *.txt file (e.g., acceleration, BMX, txt, print) and then saving it as export_data.pkl in the same folder. This script also defines which attributes/features/streams are used via config.json with the entry "relevant_features". Which data is processed can also be set in config.json with the entry datasets (path, start, and end timestamp).
DataframeCleaning.py	This script executes the second part of the preprocessing of the training data. It needs the export_data.pkl file generated in the first step. The cleanup procedure consists of the following steps: 1. Replace True/False with 1/0, 2. Fill NA for boolean and integer columns with values, 3. Interpolate NA values for real valued streams, 4. Drop first/last rows that contain NA for any of the streams. In the end, a new file, called cleaned_data.pkl, is generated.
DatasetCreation.py	Third part of preprocessing. Conversion of the cleaned data frames of all partial data sets into the training data.
DatasetPostProcessing.py	Additional, subsequent changes to a dataset are done by this script.
RealTimeClassification.py	Contains the implementation of the real time data processing.
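
The four cleanup steps listed for DataframeCleaning.py can be sketched with pandas as follows. This is an illustration under stated assumptions, not the repository's code: the function name and the explicit column-group parameters are hypothetical, and because the README does not say which values are used to fill boolean and integer columns, the sketch assumes a forward fill of the last known value. It also assumes a sorted, unique index.

```python
import numpy as np
import pandas as pd


def clean_dataframe(df, bool_cols, int_cols, real_cols):
    """Sketch of the four cleanup steps; column groups are passed explicitly."""
    df = df.copy()

    # 1. Replace True/False with 1/0 while keeping missing entries as NaN.
    for col in bool_cols:
        df[col] = df[col].apply(lambda v: np.nan if pd.isna(v) else float(bool(v)))

    # 2. Fill NA in boolean and integer columns.
    #    Assumption: carry the last known value forward.
    discrete_cols = list(bool_cols) + list(int_cols)
    df[discrete_cols] = df[discrete_cols].ffill()

    # 3. Linearly interpolate NA values in real-valued streams.
    #    limit_area="inside" leaves leading/trailing gaps for step 4.
    df[real_cols] = df[real_cols].astype(float).interpolate(
        method="linear", limit_area="inside"
    )

    # 4. Drop first/last rows that still contain NA in any stream.
    complete = df.notna().all(axis=1)
    if not complete.any():
        return df.iloc[0:0]
    first = complete.idxmax()
    last = complete[::-1].idxmax()
    return df.loc[first:last]
```

The result could then be written with the DataFrame's to_pickle method, for example to cleaned_data.pkl as described above.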

fabric_simulation

Python script	Purpose
FabricSimulation.py	Script to simulate the production process for easier development of real time evaluation.

logs

Used to store the outputs/logs of inference/test runs for future evaluation.

neural_network

Python script	Purpose
BasicNeuralNetworks.py	Contains the implementation of all basic types of neural networks, e.g. CNN, FFNN.
Dataset.py	Contains the class that stores the training data and metadata about it. Used by every script that works with the generated data set.
Evaluator.py	Contains an evaluation procedure which is used by all test routines, i.e. SNNs, CBS and baseline testers.
Inference.py	Provides the ability to test a trained model on the test data set.
Optimizer.py	Contains the optimizer routine for updating the parameters during training. Used for optimizing SNNs as well as the CBS.
SimpleSimilarityMeasure.py	Implements several simple similarity measures for calculating the similarity between the embedding vectors.
SNN.py	Includes all four variants of the siamese neural network (classic architecture or optimized variant, simple or FFNN similarity measure).
TrainAndTest.py	Execution of a training followed by automatic evaluation of the model with the best loss.
Training.py	Used to execute the training process.
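
To illustrate what a simple similarity measure between two embedding vectors can look like, here is a minimal sketch in plain Python. The two measures shown (an exponential of the negative mean absolute difference, and cosine similarity) are common choices and are assumptions for illustration; this README does not state which measures SimpleSimilarityMeasure.py actually implements, and the function names are hypothetical.

```python
import math


def abs_mean_similarity(x, y):
    """Map the mean absolute difference into (0, 1]; identical vectors give 1.0."""
    if len(x) != len(y):
        raise ValueError("embedding vectors must have the same length")
    mean_abs_diff = sum(abs(a - b) for a, b in zip(x, y)) / len(x)
    return math.exp(-mean_abs_diff)


def cosine_similarity(x, y):
    """Cosine of the angle between x and y; 0.0 if either vector is all zeros."""
    if len(x) != len(y):
        raise ValueError("embedding vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)
```

In a siamese setup, both inputs pass through the same encoder and a measure of this kind is applied to the two resulting embeddings. The FFNN variant replaces the fixed formula with a learned network.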

Compatibility

Due to the large number of different models and configuration options, not all components can be used together. The following table shows the current compatibility status of the different models with the execution variants. Please also note that the real-time classification is still under development and may not work currently.

	SNN				CBS
	Standard		Fast		Standard		Fast
Encoder	Simple	FFNN	Simple	FFNN	Simple	FFNN	Simple	FFNN
CNN	Working	Working	Working	Working	Working	Working	Working	Working
RNN	Working	Working	Working	Working	Working	Working	Working	Working
cnn2dwithaddinput	Working	Not working / Error	Not implemented yet	Not implemented yet	Not planned / necessary	Not planned / necessary	Not planned / necessary	Not planned / necessary
cnn2d	Working	Working	Working	Working	Not working / Error	Not working / Error	Not working / Error	Not working / Error
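
If a script should check a combination before starting a run, the table above could be encoded as a lookup. The following is a hypothetical sketch: the names COMPATIBILITY and compatibility_status and the lowercase keys are assumptions, and nothing like this is known to exist in the repository. The status strings are copied from the table.

```python
from itertools import product

ARCHITECTURES = ("snn", "cbs")
VARIANTS = ("standard", "fast")
MEASURES = ("simple", "ffnn")


def _uniform(status):
    """Return the same status for every (architecture, variant, measure) combination."""
    return {combo: status for combo in product(ARCHITECTURES, VARIANTS, MEASURES)}


# Start from the dominant status per encoder, then record the exceptions.
COMPATIBILITY = {
    "cnn": _uniform("Working"),
    "rnn": _uniform("Working"),
    "cnn2dwithaddinput": _uniform("Not planned / necessary"),
    "cnn2d": _uniform("Working"),
}
COMPATIBILITY["cnn2dwithaddinput"].update({
    ("snn", "standard", "simple"): "Working",
    ("snn", "standard", "ffnn"): "Not working / Error",
    ("snn", "fast", "simple"): "Not implemented yet",
    ("snn", "fast", "ffnn"): "Not implemented yet",
})
COMPATIBILITY["cnn2d"].update(
    {combo: "Not working / Error" for combo in product(("cbs",), VARIANTS, MEASURES)}
)


def compatibility_status(encoder, architecture, variant, measure):
    """Look up the status string from the table for one combination."""
    key = (architecture.lower(), variant.lower(), measure.lower())
    return COMPATIBILITY[encoder.lower()][key]
```

A training script could then refuse to start unless compatibility_status(...) returns "Working" for the configured combination.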
