
Vaimal Machine Learning Examples

This page presents machine learning examples built with Vaimal. They illustrate how Vaimal handles a data set and what its reporting capabilities are.

Iris Classification

This is the well-known Iris data set that was downloaded from:  http://archive.ics.uci.edu/ml/datasets/Iris

The Iris data set is a classification problem with 3 classes. Three models were trained: a support vector machine (SVM), a multilayer perceptron neural network (MLP), and a probabilistic neural network (PNN). Because the data set is small (150 points), k-fold cross-validation was used to estimate classification accuracy. After cross-validation, each model was retrained on the full data set.
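The k-fold procedure described above can be sketched in plain Python. This is only an illustration, not Vaimal's implementation: the function names `k_fold_indices` and `average_accuracy`, the choice of 5 folds, and the shuffling seed are all assumptions, since the page does not state the number of folds used.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def average_accuracy(fold_accuracies):
    """Mean of the per-fold accuracies (the average k-fold accuracy)."""
    return sum(fold_accuracies) / len(fold_accuracies)

# 150 samples (the size of the Iris data set) split into 5 assumed folds.
splits = list(k_fold_indices(150, k=5))
```

Each sample lands in exactly one test fold, so every point contributes to the accuracy estimate once. The final model is then fit on all 150 points, as the text describes.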

Workbook

Iris

Summary

Model   Avg. k-fold accuracy
SVM     0.96
MLP     0.993
PNN     0.94

Class Distribution

  • Class 1: 33.33%
  • Class 2: 33.33%
  • Class 3: 33.33%

Data Set Citation

Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

Car Evaluation

This is the car evaluation data set that was downloaded from:  https://archive.ics.uci.edu/ml/datasets/Car+Evaluation

This data set is a classification problem with 4 classes.  Three basic models and one bagging ensemble were trained:

  • Linear support vector machine (SVM).
  • SVM with polynomial kernel.
  • SVM bagging ensemble with polynomial kernel, over-sampling.
  • Multilayer perceptron (MLP) neural network with 1 hidden neuron, under-sampling, and L2 regularization.
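Two ideas from the list above, over-sampling and combining the members of a bagging ensemble, can be sketched as follows. The helper names `over_sample` and `majority_vote` are hypothetical, and the page does not say how Vaimal balances classes or combines ensemble members, so random duplication and majority voting are assumptions chosen for illustration.

```python
import random
from collections import Counter, defaultdict

def over_sample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes match the largest."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw extra rows with replacement from the same class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

def majority_vote(predictions):
    """Combine one predicted label per ensemble member into a single label."""
    return Counter(predictions).most_common(1)[0][0]

# Made-up, imbalanced toy data (not the Car Evaluation data set).
rows = [[0], [1], [2], [3], [4], [5]]
labels = [1, 1, 1, 1, 2, 3]
balanced_rows, balanced_labels = over_sample(rows, labels)
```

Under-sampling works the other way around: rows of the larger classes are dropped until every class matches the smallest one.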

Workbook

Car Evaluation

Summary

Model                              Overall accuracy  Class 1 accuracy  Class 2 accuracy  Class 3 accuracy  Class 4 accuracy
SVM Linear                         0.849             0.945             0.845             0                 0
SVM Non-linear                     0.942             0.989             0.948             0.7               0.3
SVM Non-linear Bagging (5 models)  0.965             0.978             0.948             0.9               0.9
MLP 1 Neuron                       0.919             0.917             0.983             0.8               0.7
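The table shows that a high overall accuracy can coexist with an accuracy of 0 on the rare classes. The sketch below shows one way such figures could be computed. It assumes that per-class accuracy means the fraction of a class's cases that were predicted correctly, which the page does not define. The function names and the labels are made up for illustration.

```python
from collections import Counter

def overall_accuracy(actual, predicted):
    """Fraction of all cases that were predicted correctly."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def per_class_accuracy(actual, predicted):
    """Fraction of each class's cases that were predicted correctly."""
    totals = Counter(actual)
    correct = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {label: correct[label] / totals[label] for label in sorted(totals)}

# Made-up labels, not the Car Evaluation results.
actual    = [1, 1, 1, 1, 1, 1, 1, 2, 2, 3]
predicted = [1, 1, 1, 1, 1, 1, 1, 2, 1, 1]

overall = overall_accuracy(actual, predicted)
by_class = per_class_accuracy(actual, predicted)
```

In this toy example the model never predicts class 3, yet it still gets 8 of 10 cases right because class 1 dominates.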

Data Partition

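The data was split into training, validation, and test partitions using stratified sampling, so each partition keeps approximately the same class proportions as the full data set.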

  • Train (70%): 1210 cases
  • Validation (15%): 259 cases
  • Test (15%): 259 cases
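A stratified 70/15/15 partition like the one above can be sketched as follows. The function name `stratified_split`, the seed, and the rounding rule are assumptions, not Vaimal's actual procedure, and the labels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, validation, test) index lists with similar class mixes."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, validation, test = [], [], []
    for members in by_class.values():
        # Split each class separately so its share is preserved in every part.
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_val = round(fractions[1] * len(members))
        train += members[:n_train]
        validation += members[n_train:n_train + n_val]
        test += members[n_train + n_val:]  # whatever remains goes to test
    return train, validation, test

# Hypothetical labels with a 70/20/10 class mix.
labels = [1] * 70 + [2] * 20 + [3] * 10
train, validation, test = stratified_split(labels)
```

Splitting per class matters most for rare classes: with a purely random split, a class that makes up under 4% of the data could end up nearly absent from the validation or test partition.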

Class Distribution

  • Class 1 (unacceptable): 70.02%
  • Class 2 (acceptable): 22.22%
  • Class 3 (good): 3.99%
  • Class 4 (very good): 3.76%

Data Set Citation

Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

Airfoil Noise Regression

This is a data set of airfoil sound pressure level.  It was downloaded from:  http://archive.ics.uci.edu/ml/datasets/Airfoil+Self-Noise

The airfoil data set is a regression task with five input variables and one dependent variable. Generalized regression neural network (GRNN) and multilayer perceptron (MLP) neural network models were trained, and a voting ensemble was created to combine all of the basic models. The models were trained using holdout validation and evaluated on a separate testing set: data flagged as TRAINING was used to train the models, data flagged as VALIDATION was used to calculate the validation error, and data flagged as TESTING was used only for testing after training.
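For a regression task, a voting ensemble is commonly built by averaging the predictions of its members. The sketch below assumes an unweighted mean. The page does not state how Vaimal combines the models, and the function name and the prediction values are hypothetical.

```python
def voting_ensemble(predictions_per_model):
    """Average the predictions of several regression models, case by case."""
    n_models = len(predictions_per_model)
    return [sum(case) / n_models for case in zip(*predictions_per_model)]

# Hypothetical sound pressure level predictions (dB) from three base models.
grnn     = [120.0, 125.0, 130.0]
mlp      = [122.0, 124.0, 127.0]
mlp_deep = [121.0, 126.0, 130.0]

ensemble = voting_ensemble([grnn, mlp, mlp_deep])
```

Averaging tends to cancel errors that the individual models make in different directions.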

Workbook

Airfoil

Summary

Model            Accuracy, +/- 1%  Accuracy, +/- 2%  Accuracy, +/- 5%  Accuracy, +/- 10%  Accuracy, +/- 15%
GRNN             0.298             0.511             0.907             0.996              1
MLP              0.236             0.458             0.836             0.973              0.991
MLP Deep         0.28              0.507             0.884             0.978              0.996
Voting Ensemble  0.32              0.551             0.898             0.991              1
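One plausible reading of the "Accuracy, +/- x%" columns is the fraction of predictions that fall within x% of the actual value. The sketch below implements that reading. It is an assumption, since the page does not define the metric, and the function name and numbers are made up.

```python
def accuracy_within(actual, predicted, tolerance):
    """Fraction of predictions whose relative error is at most `tolerance`."""
    hits = sum(abs(p - a) <= tolerance * abs(a) for a, p in zip(actual, predicted))
    return hits / len(actual)

# Made-up values, not the airfoil results.
actual    = [100.0, 100.0, 100.0, 100.0]
predicted = [100.5, 101.9, 104.0, 112.0]

within_1_percent = accuracy_within(actual, predicted, 0.01)
within_5_percent = accuracy_within(actual, predicted, 0.05)
within_15_percent = accuracy_within(actual, predicted, 0.15)
```

Under this definition the value can only stay the same or grow as the tolerance widens, which matches the left-to-right trend in every row of the table.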

Data Partition

Data was randomly assigned to the training, validation, and testing sets. Input data was normalized to the range [0, 1].

  • Training (70%): 1053 cases
  • Validation (15%): 225 cases
  • Testing (15%): 225 cases
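Normalizing an input to [0, 1] is usually done with min-max scaling, sketched below for a single column. The function name and the sample values are hypothetical, and the page does not say which subset Vaimal takes the minimum and maximum from.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0 by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Made-up input column.
scaled = min_max_normalize([2.0, 4.0, 6.0, 10.0])
```

Scaling puts inputs with very different ranges on a comparable footing, which generally helps neural network training.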

Data Set Citation

Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.