Machine Learning in Excel
Vaimal is a machine learning add-in that allows you to train and deploy machine learning algorithms without programming. You can make predictions on new data using models that are trained on historical data. Create decision trees, support vector machines and neural networks all within Excel®.
It also includes more powerful ensemble methods that combine models for even better predictive performance. The easy-to-use interface lets you focus on your data instead of on the routine programming that common machine learning platforms require.
Preprocessing is fundamental to achieving successful machine learning. Vaimal has numerous preprocessing tools to simplify the task of getting your data ready for training and prediction.
Machine Learning Workflow
The workflow for creating and using machine learning models with Vaimal:
- Import Data: Place data in an Excel worksheet.
- Data Preprocessing: Handle missing data, normalize data, and encode categorical inputs. Vaimal has several utilities for preprocessing data.
- Select a Model and Design: Choose which model to use and set its design parameters.
- Train the Model: Train the model using training data with known outputs.
- Test the Model: Using data different from the training data, test the model’s ability to predict known outputs.
- Prediction: Use the model to make predictions for data with unknown outputs. A preprocessing template can be used to preprocess data for prediction in one step.
Data Check: Scans your data and alerts you to missing data, non-numeric data, and columns in which every value is the same. This is a good initial check to help find data that needs to be cleaned or transformed.
Normalize data: Feature scaling on [0,1], feature scaling on [-1,1], and standardization.
Categorical data encoding: 1 to n, 1 of c, 1 of c-1, and 1 vs all.
Missing data: Replace missing values with artificial or derived data. Delete rows with missing values or a specified value. Clear cells with non-numeric values or a specified value.
Variable Importance: Decision trees can be used to determine each input variable's relative importance. This can be used to eliminate unnecessary inputs.
When your model goes to production, templates allow you to preprocess new data in a single step. They also alert you to potential problems with missing or invalid data.
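The normalization and encoding options listed above have common textbook forms, which the following Python sketch illustrates outside of Excel. It is not Vaimal's implementation: the function names are hypothetical, the use of the population standard deviation is an assumption, and the reading of "1 to n" as integer codes, "1 of c" as one indicator column per category, and "1 of c-1" as the same with one column dropped is an interpretation of those labels.

```python
from statistics import mean, pstdev

def scale_01(values):
    """Feature scaling on [0, 1]. Assumes the column is not constant."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def scale_pm1(values):
    """Feature scaling on [-1, 1], built from the [0, 1] scaling."""
    return [2.0 * s - 1.0 for s in scale_01(values)]

def standardize(values):
    """Subtract the mean and divide by the standard deviation.
    Assumption: population standard deviation (divide by n)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def encode_1_to_n(labels):
    """Map each category to an integer code 1..n (sorted order)."""
    categories = sorted(set(labels))
    code = {c: i + 1 for i, c in enumerate(categories)}
    return [code[x] for x in labels]

def encode_1_of_c(labels, drop_last=False):
    """One indicator column per category; drop_last=True gives c-1 columns."""
    categories = sorted(set(labels))
    if drop_last:
        categories = categories[:-1]
    return [[1 if x == c else 0 for c in categories] for x in labels]
```

For example, `scale_01([2, 4, 6])` returns `[0.0, 0.5, 1.0]`, and `encode_1_to_n(["red", "blue", "red", "green"])` returns `[3, 1, 3, 2]` because the categories are numbered in sorted order.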
Support vector machines
Polynomial, Gaussian, and hyperbolic tangent non-linear kernels
Binary and multi-class classification
Neural networks with up to 10 hidden layers
ReLU, leaky ReLU, logistic, and hyperbolic tangent hidden layer activation functions
L1 and L2 regularization
Probabilistic neural networks
Generalized regression neural networks
Bagging ensembles, including feature bagging for decision trees.
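For readers unfamiliar with the kernels and activation functions named above, here is a minimal Python sketch of their usual textbook definitions. The parameter names (`degree`, `gamma`, `coef0`, `slope`) and their default values are assumptions for illustration; the source does not say how Vaimal parameterizes them.

```python
import math

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

# Non-linear kernels for support vector machines
def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, y) + coef0) ** degree

def gaussian_kernel(x, y, gamma=0.5):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def tanh_kernel(x, y, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)

# Hidden layer activation functions for neural networks
def relu(z):
    return max(0.0, z)

def leaky_relu(z, slope=0.01):
    return z if z > 0 else slope * z

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def hyperbolic_tangent(z):
    return math.tanh(z)
```

The Gaussian kernel of a point with itself is always 1, and the logistic function maps 0 to 0.5, which makes both easy to sanity-check.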
Three methods for handling imbalanced class distributions:
Over-sampling: For classes that don’t have the most instances, additional copies of data points are added to the training set.
Under-sampling: For classes that don’t have the fewest instances, some data points are removed from the training set.
Balance-sampling: The average of the instance counts of the largest and smallest classes is calculated. Each class is then over- or under-sampled to this average so that all classes have the same number of instances.
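The three sampling methods described above differ only in the target class size, as the following Python sketch shows. It is an illustration under assumptions, not Vaimal's code: the function name `resample` is hypothetical, over-sampling is assumed to grow every class to the size of the largest, under-sampling to shrink every class to the size of the smallest, the balance average is rounded down, and rows are picked at random with a fixed seed.

```python
import random
from collections import defaultdict

def resample(rows, labels, method, seed=0):
    """Return (rows, labels) with every class resampled to one target size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    sizes = [len(members) for members in by_class.values()]
    if method == "over":
        target = max(sizes)
    elif method == "under":
        target = min(sizes)
    elif method == "balance":
        # Average of the largest and smallest class sizes (rounded down).
        target = (max(sizes) + min(sizes)) // 2
    else:
        raise ValueError("method must be 'over', 'under', or 'balance'")

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target:
            # Under-sample: keep a random subset without replacement.
            chosen = rng.sample(members, target)
        else:
            # Over-sample: keep every row and add random copies.
            extra = rng.choices(members, k=target - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels
```

With class sizes of 8, 3, and 1, this sketch produces 8 instances per class for over-sampling, 1 per class for under-sampling, and 4 per class for balance-sampling.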
Leave one out cross-validation
Test after training automatically
Test existing model
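Leave one out cross-validation, the first testing option above, holds out each data point in turn, trains on the rest, and checks the prediction for the held-out point. The Python sketch below shows the idea with a simple nearest-neighbor classifier standing in for a real model; both function names are hypothetical, and nothing here reflects how Vaimal implements the procedure.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Stand-in model: return the label of the closest training point."""
    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(train_x[i], query))
    best = min(range(len(train_x)), key=squared_distance)
    return train_y[best]

def leave_one_out_accuracy(xs, ys, predict=nearest_neighbor_predict):
    """Fraction of points predicted correctly when each is held out once."""
    correct = 0
    for i in range(len(xs)):
        # Train on everything except point i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)
```

Because the model is refit once per data point, this method costs as many training runs as there are rows, which is why it is usually reserved for small data sets.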
Decision tree visualizer to draw trees.
Simulation with a Machine Learning Model: A function that supplies random variable inputs to a machine learning model and simulates the model's output. The output from a machine learning model can also be used as an input to an analytic model for simulation. This function is compatible with Simulation Master.
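The idea of simulating with a model can be sketched in Python as a small Monte Carlo loop: draw random inputs, pass them through the model, and collect the outputs. Everything in this sketch is an assumption for illustration, including the stand-in model, the input distributions, and the function names; it does not use Vaimal or Simulation Master.

```python
import random
import statistics

def stand_in_model(x1, x2):
    """Hypothetical placeholder for a trained model's prediction function."""
    return 3.0 * x1 + 0.5 * x2

def simulate(model, n_trials=1000, seed=0):
    """Feed random inputs to the model and return the list of outputs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_trials):
        x1 = rng.gauss(10.0, 2.0)    # assumed: normally distributed input
        x2 = rng.uniform(0.0, 4.0)   # assumed: uniformly distributed input
        outputs.append(model(x1, x2))
    return outputs

outputs = simulate(stand_in_model)
print("mean output:", statistics.mean(outputs))
print("std dev of output:", statistics.stdev(outputs))
```

The resulting list of outputs can be summarized, plotted as a histogram, or passed on as a random input to a further analytic model.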
Excel: 2007 to 2019 and Office 365, 32- and 64-bit versions.
Windows: Vista to Windows 10, 32- and 64-bit versions.
Excel for Mac is not supported at this time.
Excel is a registered trademark of Microsoft Corporation.