Optimization with simulation sounds ominous, confusing, and generally nasty. In this article, I will try to explain what all this means and how you can use it to solve some pretty interesting, and tough, problems.
With traditional optimization we change some value so that the output of a model reaches a desired goal, such as a minimum value for example. An example is Excel's solver which changes cell values to find an optimum. When one or more of the inputs to a model are random variables, the problem gets more complicated. Recalculating the model results in a new output value since one or more inputs are always changing. To get around this complication, we use optimization with simulation to solve the optimization problem. Instead of optimizing a single output value as is done in traditional optimization, we are optimizing an output statistic of a simulation.
The context of our discussion will be limited to spreadsheet models.
Before we get into how everything fits together, let's start with some definitions.
Model In our case, a model is one or more spreadsheets that describe a concept or physical thing, and its reaction to decision and random variables.
Optimization Changing decision variables to achieve a specified optimization goal subject to any constraints.
Simulation A simulation in our case is a Monte Carlo simulation. For a simulation, decision variables are held at fixed values for the duration of the simulation. Random variables are sampled from their underlying probability distributions each iteration.
Iteration Repeating steps in a process or algorithm. Optimizations are iterated to find a feasible solution. Simulations are iterated to find an output value each time.
Simulation Output The output of the model, the thing that we are calculating, is calculated repeatedly to create a set of output data, and simulation statistics.
Simulation Statistic A statistic of the set of simulation output values. This can be mean, variance, percentiles, etc. If the simulation is run for 10,000 iterations, there will be 10,000 data points from which to calculate statistics.
Objective Function In traditional optimization, the objective function is the thing that is minimized, maximized, or set to a target value.
Objective Measure In optimization with simulation, the objective measure is a simulation statistic that is minimized, maximized, or set to a target value.
Optimization Goal What we want to do to the objective measure. We can try to minimize or maximize the objective measure. We can also try to make the objective measure a specific value, also known as a target value.
Constraint A constraint is a limit imposed on decision variables that define the feasible solution space. There are two types of constraints:
Cell Constraint A constraint can be an equation or inequality comprised of some or all decision variables. The value of the constraint is calculated in a spreadsheet cell.
Statistic Constraint A constraint can also be an equation or inequality comprised of a simulation statistic.
Decision Variable Variables in the model that we can control. We decide what value to use, or the optimizer decides what value to use. A decision variable may have bounds on what values it can assume, also known as side constraints. For example, if we have mass as a decision variable, it cannot be less than 0.
Random Variable Variables in the model that are random in nature and follow some probability distribution.
Feasible Solution A solution that satisfies any constraints present. A feasible solution is not necessarily the optimum solution.
A Traditional Optimization Problem
We'll start with a basic problem as defined in traditional optimization. We have an objective function F, an inequality constraint G, and an equality constraint H. The optimization problem is defined as follows:
F(X1, X2) = 2*X1^4- X2^2 + 3
Subject to the constraints:
G(X1, X2): X1 + X2 ≤ 0
H(X1,X2): X1-X2 + 1 = 0
X1 and X2 bounds (side constraints):
-100 ≤ X1 ≤ 100
-100 ≤ X2 ≤ 100
In English, the above problem is we want to minimize the value of F. We use an optimization algorithm to find potential values of X1 and X2. Then we check if the constraints are violated.
If G is greater than 0, we can't use the solution and need to find another combination of X1 and X2 to satisfy G. If H is not equal to 0, we can't use the solution and need to find another combination of X1 and X2 to satisfy H.
If the constraints are not violated, we have a feasible solution (although not necessarily optimum). If one or more constraint is violated, the solution is not feasible.
Translating to an Optimization with Simulation Problem
Now let's translate the problem above in terms of an optimization problem that uses simulated values.
Objective function, F is the objective measure. We want to minimize the mean value of F. We simulate F each time the optimization algorithm calls for F, and the mean of the simulated values is returned.
G and H are cell constraints. In one cell, G is calculated, and in another cell H is calculated.
X1 and X2 are decision variables.
We could treat the constant term, 3 in F as a random variable.
In very general terms, an optimization algorithm is usually a numerical method, or a population based heuristic search algorithm. For our discussion, we will use a population based algorithm. The algorithm creates a population of candidate solutions, and searches in some fashion to find an optimum feasible solution.
The simulation portion of the algorithm is simply a Monte Carlo simulation. For a single simulation, all decision variables are not changed. For each iteration of the simulation, a value for each random variable is sampled, and the output of the model is calculated.
Once the simulation is complete, the statistic that is used for the objective measure is calculated and returned to the optimization algorithm.
Putting Them Together: Optimization with Simulation
As you may have guessed by now, optimization with simulation is basically a loop within a loop. A diagram of the process is shown below.
The optimization algorithm starts by creating an initial population of candidate solutions. A candidate solution is one combination of the decision variable values.
Then the simulation loop is entered. Each member of the population is simulated and the objective measure of each member is calculated. The simulation loop is exited.
For each member, the optimization algorithm checks for constraint violations, and the best solution is chosen from the population.
If the stop criteria is not met, the optimization algorithm adjusts the decision variables for each member and starts the simulation loop again.
If the stop criteria is met, the optimization loop is stopped. Stop criteria might be a maximum iteration limit, a maximum time limit, or manually stopped by the user.
Example: Portfolio Asset Weighting
Let's tie all of this together using an example. Consider the following model of an investment portfolio with five assets. The model simulates the portfolio return over one time period. Because we are modeling one time period, we will assume each asset's return is normally distributed.
We have a risk appetite that allows for a negative portfolio return 5% of the time.
The goal is to determine the weighting of each asset in the portfolio to maximize return while limiting losses to 5% of the outcomes.
Setting Up the Problem
Objective Measure: Mean value of output (portfolio return)
Optimization Goal: Maximize
Decision Variable Definitions: Cells B8:F8
Decision Variables: Cells B7:F7 (asset weights)
Side Constraints: Each weight must be >= .1 and <= .4
Cell Constraint: Cell B14 must = 1 (sum of weights = 1)
Statistic Constraint: 5th percentile of output >= 0
Using the Simulation Master Premium add-in, we will optimize/simulate the model with the following parameters.
Optimization iterations: 30
Iterations per simulation: 1000
While the software is running, you can see a running update of the best objective measure in the progress window, as shown below.
Once the run is complete, we can see the simulation results below. The mean portfolio return with the optimum decision variable values is 7.89%. Note that the 5th percentile is 0.29 which is greater than 0, so that constraint is not violated.
If we click on the "Optimization" tab, we can see the optimization data. In the Constraints box we see that the B14 constraint is violated by .000754. This constraint is that the asset weights must sum to 1. This is a small violation so we will assume it is OK.
The lower box shows the optimum decision variable values.
What to Do Next
Before we finish, we need to address the issue of how good is the result. We will discuss factors that affect the quality.
Determining optimization iterations is somewhat of a trial and error process. If you run an initial optimization, and the objective measure is still changing up to the final few iterations, you probably didn't run enough iterations. Another cause of instability is using too few simulation iterations.
In the example, we only used 1000 iterations per simulation. This was done because, during the course of an optimization, numerous simulations are run which results in a computationally intensive process.
A good practice is to run a full simulation (using a suitable number of iterations) with the optimum decision variable values. In our case, if the mean return from the full simulation is not close to the final optimization simulation (1000 iterations), then a more robust optimization should be run. This entails using more simulation iterations to get more stable statistics, which in turn are used to optimize decision variables.
Combining optimization with simulation has some powerful benefits. Decision variable values can be optimized, while allowing for random variables. This allows us to build models that avoid single point estimates. Optimization with simulation also avoids the manual trial and error required to find an optimum solution when random variables are present.