Decision Tree Basics
This tutorial covers decision tree basics, especially as they pertain to DTace. However, much of the material in this tutorial is the same for all decision trees.
A decision tree is structured as a one-to-many hierarchy in which a node can have one or more children but only one parent. Decision trees are composed of nodes, each of which is one of three types:
- Decision Node: A decision node is represented by a square and denotes a decision to be made.
- Chance Node: A chance node is represented by a circle and denotes outcomes from the node that happen by chance. Each outcome has an associated probability of occurrence.
- End Node: An end node is represented by a triangle and denotes the outcome of a path through the decision tree.
The following decision tree shows the text fields that can be displayed on a decision tree. Each field can be shown or hidden using tree settings. DTace can also display an optional label describing each field; these data labels can be omitted to reduce tree clutter.
Left/Top Text Box
The left/top text box shows user input data and the node ID. It is called the left/top text box because it can be shown on top or to the left of the node.
Right/Bottom Text Box
The right/bottom text box shows calculated values. It can be shown on the bottom or to the right of the node.
Node ID is a unique identifier used by the software to keep track of each node’s data in the diagram worksheet. The ID is automatically generated by the software. Showing the node ID is useful while constructing the tree, especially if more than one node shares the same name.
Node name is a descriptive name of the node entered by the user.
If a node’s parent is a chance node, then the probability of this node’s occurrence is shown. Probability is entered by the user.
Value is the cash flow (or other value) that occurs at each node. For example, if there is an initial investment of 10 at the root node, the root node’s value would be either 10 or -10 depending on your cash flow sign convention. It’s important to be consistent with signs when entering values, for example by always treating cash outflows as negative and cash inflows as positive. Value is entered by the user.
Expected Value/Expected Utility/Certainty Equivalent
When a tree is rolled back, one of the following three values is calculated, depending on settings:
- Expected monetary value
- Expected utility
- Certainty equivalent
These values are shown on decision and chance nodes.
Path probability of an end node is the probability of that node occurring if the optimal path is followed. If an end node is not on an optimal path, its path probability will be zero. Path probability is calculated when the decision tree is rolled back.
The sum of values along the path from the root node to each end node is that end node’s terminal payoff. This is the net payoff for each end node and is used for rollback calculations.
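As a minimal sketch of the terminal payoff calculation, the snippet below sums the values along one path. The path values and the function name `terminal_payoff` are hypothetical illustrations, not DTace data or functions.

```python
# Hypothetical node values along one path, listed from the root node to the end node.
# Sign convention assumed here: cash outflows negative, cash inflows positive.
path_values = [-10, 0, 30]  # initial investment, no cash flow, final inflow


def terminal_payoff(values_on_path):
    """Return the sum of node values from the root node to an end node."""
    return sum(values_on_path)


print(terminal_payoff(path_values))  # 20
```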
If exponential utility, certainty equivalent, or a custom utility function is used, the terminal payoff for each end node will be converted to a utility called terminal utility. This value will be used for rollback instead of terminal payoff.
Node data can reference spreadsheet cells, which in turn can reference a database or other external source of data.
Referenced Node Data
The following node data fields can get their values from cells in the workbook containing the diagram:
- Node Name
These data field values can be driven by the cells they reference. Referenced cells are also used when performing Monte Carlo simulation of a decision tree.
Referenced cells must not be sorted. A cell reference shows whatever is in a given cell address, so sorting data (for example, table data) changes what is shown in each node and destroys the integrity of the data relationships.
Tree settings, node data, cell references, and connector data are stored within the worksheet containing a decision tree. This data is contained in columns A through Z and will be hidden when a new tree is created. These columns are protected to maintain data integrity.
The visible portion of the tree worksheet can be modified as normal, except for the following:
- Rows cannot be inserted or deleted.
- Columns cannot be inserted or deleted.
- Tables and pivot tables cannot be placed on the tree sheet.
Node data in a decision tree can be extracted to either a new worksheet or a new workbook by creating a node report. The report shows all user-entered node data, including incoming and outgoing connections.
When a decision tree is rolled back, the expected value at the root node is calculated. The root node expected value is the expected value of all outcomes based on the decision rule chosen. The expected value that is calculated at the root node is one of the following:
- Expected monetary value.
- Expected utility.
- Certainty equivalent.
Expected monetary value is used in a risk-neutral situation. In this case, the goal is to either maximize or minimize expected value. This is often the case when a project being evaluated is small relative to the overall organization and a project failure would not endanger its ongoing viability.
Expected utility is used in a risk-averse situation. DTace has the exponential utility function built in and allows a custom utility function to be specified in the workbook. The utility function alters the values of the payoffs to adjust for the user’s risk appetite. Bad outcomes are heavily penalized, while increasingly good outcomes receive less weight when calculating expected utility.
DTace allows for the certainty equivalent from exponential utility to be calculated.
There are two possible decision rules in DTace:
- Choose maximum
- Choose minimum
A decision rule is invoked at each decision node to calculate expected value and to determine the optimal path through the tree. If the goal is to maximize the outcome, such as profit, the “choose maximum” decision rule would be used. If the goal is to minimize the outcome, such as cost, the “choose minimum” decision rule would be used.
The sign of values will also affect which decision rule is chosen. If costs are represented as positive values, then to minimize cost the minimum decision rule would be used. If costs are represented as negative numbers, then to minimize cost the maximum decision rule would be used (least negative value).
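The interaction between sign convention and decision rule can be illustrated with a short sketch. The option names and cost figures below are hypothetical; the point is only that both conventions select the same option when paired with the matching rule.

```python
# Hypothetical costs for two alternatives (not taken from the example tree).
costs = {"Option A": 120, "Option B": 95}

# Costs entered as positive values: use the "choose minimum" rule.
best_with_positive_costs = min(costs, key=costs.get)

# Costs entered as negative values: use the "choose maximum" rule,
# which picks the least negative value.
negative_costs = {name: -cost for name, cost in costs.items()}
best_with_negative_costs = max(negative_costs, key=negative_costs.get)

print(best_with_positive_costs, best_with_negative_costs)  # Option B Option B
```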
Expected value of a chance node is the sum, over its child nodes, of each child's probability multiplied by that child's value (its expected value, or its terminal payoff if the child is an end node).
Expected value of a decision node is based on choosing the maximum or minimum expected value out of its child nodes. Whether maximum or minimum is chosen depends on the decision rule used.
During a roll back, starting at the end nodes, the expected values of node parents are calculated until the root node is reached. Assuming we want to maximize expected value, we would use the choose maximum decision rule. The roll back calculation procedure for the example decision tree is shown below.
Calculate expected value for Chance 2:
Chance 2 expected value = (End 4 terminal payoff)(End 4 probability) + (End 5 terminal payoff)(End 5 probability)
Chance 2 expected value = (20)(.6) + (-5)(.4) = 10
Calculate expected value for Decision 1:
Decision 1 expected value = MAXIMUM[Chance 2 expected value, End 3 terminal payoff]
Decision 1 expected value = MAXIMUM[10, -10] = 10
Calculate expected value for Chance 1:
Chance 1 expected value = (End 1 terminal payoff)(End 1 probability) + (End 2 terminal payoff)(End 2 probability)
Chance 1 expected value = (19)(.3) + (-1)(.7) = 5
Calculate expected value for Root node:
Root node expected value = MAXIMUM[Decision 1 expected value, Chance 1 expected value]
Root node expected value = MAXIMUM[10, 5] = 10
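The rollback steps above can be sketched as a short recursive function. This is an illustration only: the dictionary layout (`type`, `prob`, `payoff`, `children`) and the function name `rollback` are assumptions made for this example and do not describe how DTace stores or calculates its data. The payoffs and probabilities are the ones used in the worked example.

```python
def rollback(node, rule=max):
    """Return the rolled-back expected value of a node.

    rule is max for "choose maximum" or min for "choose minimum".
    """
    if node["type"] == "end":
        return node["payoff"]  # terminal payoff of the end node
    child_values = [rollback(child, rule) for child in node["children"]]
    if node["type"] == "chance":
        # Probability-weighted sum of the child values.
        return sum(child["prob"] * value
                   for child, value in zip(node["children"], child_values))
    # Decision node: apply the decision rule to the child values.
    return rule(child_values)


# The example tree, with the root modeled as a decision node.
chance_2 = {"type": "chance", "children": [
    {"type": "end", "payoff": 20, "prob": 0.6},   # End 4
    {"type": "end", "payoff": -5, "prob": 0.4},   # End 5
]}
decision_1 = {"type": "decision", "children": [
    chance_2,
    {"type": "end", "payoff": -10},               # End 3
]}
chance_1 = {"type": "chance", "children": [
    {"type": "end", "payoff": 19, "prob": 0.3},   # End 1
    {"type": "end", "payoff": -1, "prob": 0.7},   # End 2
]}
root = {"type": "decision", "children": [decision_1, chance_1]}

print(round(rollback(root), 6))  # 10.0
```

Passing `rule=min` applies the choose minimum decision rule at every decision node instead.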
When expected utility is used, the roll back calculation procedure is the same as the expected value calculation above, except that terminal utility is used instead of terminal payoff. To get terminal utility, the software first determines terminal payoff and then uses the utility function to convert terminal payoff to terminal utility.
Exponential utility is determined using the following equation:
Exponential Utility = 1 - e^(-X/R)
R = risk tolerance constant
X = terminal payoff
Below is a plot of exponential utility vs. payoff. As R decreases, the curve becomes more concave, which corresponds to greater risk aversion. In other words, there is a bigger penalty in utility for lower payoffs.
When a custom utility function is supplied by the user, the calculation is the same except the custom utility function is used to convert terminal payoffs to terminal utilities.
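As a sketch of the exponential utility conversion, the snippet below converts the two terminal payoffs under Chance 2 to terminal utilities and then takes their probability-weighted sum. The risk tolerance R = 50 is a hypothetical value chosen for illustration, and the function name is not a DTace function.

```python
import math


def exponential_utility(x, r):
    """Exponential utility 1 - e^(-X/R) for terminal payoff x and risk tolerance r."""
    return 1.0 - math.exp(-x / r)


R = 50.0  # hypothetical risk tolerance constant

# Terminal payoffs and probabilities of End 4 and End 5 from the example tree.
outcomes = [(20, 0.6), (-5, 0.4)]

# Expected utility of Chance 2: convert each payoff to a utility, then weight it.
expected_utility = sum(p * exponential_utility(x, R) for x, p in outcomes)
print(expected_utility)

# A smaller R penalizes the same loss more heavily (greater risk aversion).
print(exponential_utility(-5, 10.0) < exponential_utility(-5, 50.0))  # True
```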
Certainty equivalent is determined at each node by first calculating expected exponential utility at each node as described in the Expected Utility Calculation section. The expected utilities are then converted to certainty equivalent using the following equation:
Certainty Equivalent = -R*ln(1 – EU)
R = risk tolerance constant
EU = expected exponential utility
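The conversion from expected exponential utility to certainty equivalent can be sketched as follows. As before, R = 50 is a hypothetical risk tolerance and the function names are illustrative, not part of DTace.

```python
import math


def exponential_utility(x, r):
    """Exponential utility 1 - e^(-X/R)."""
    return 1.0 - math.exp(-x / r)


def certainty_equivalent(eu, r):
    """Certainty equivalent -R*ln(1 - EU); the inverse of exponential utility."""
    return -r * math.log(1.0 - eu)


R = 50.0  # hypothetical risk tolerance constant

# Expected exponential utility of Chance 2 (End 4: 20 at 0.6, End 5: -5 at 0.4).
eu_chance_2 = 0.6 * exponential_utility(20, R) + 0.4 * exponential_utility(-5, R)

ce_chance_2 = certainty_equivalent(eu_chance_2, R)
print(ce_chance_2)
```

Because the exponential utility function is concave, the certainty equivalent of Chance 2 comes out lower than its expected monetary value of 10. For a payoff received with certainty, the two functions are inverses, so the certainty equivalent equals the payoff itself.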
One of the objectives for creating a decision tree is to find the optimal path through the tree. By calculating the highest (or lowest, depending on the decision rule) expected value we can determine the optimal path through the tree.
After calculating expected values, nodes on the optimal path will be flagged as TRUE, and nodes not on the optimal path will be flagged as FALSE. Nodes that have a decision node as a parent will have an optimal path flag. Nodes that have a chance node as a parent will not have an optimal path flag since all chance node outcomes are possible.
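The flagging described above, together with the path probability of each end node, can be sketched for the example tree as follows. The node layout and function names are assumptions made for illustration. Two further assumptions are not stated in the tutorial: ties at a decision node are broken in favor of the first child, and a child is flagged TRUE only when its decision-node parent is itself on the optimal path.

```python
def end(name, payoff, prob=None):
    return {"name": name, "type": "end", "payoff": payoff, "prob": prob}


def branch(name, kind, children, prob=None):
    return {"name": name, "type": kind, "children": children, "prob": prob}


def rollback(node, rule=max):
    """Store the rolled-back expected value on every node and return it."""
    if node["type"] == "end":
        node["ev"] = node["payoff"]
    else:
        values = [rollback(child, rule) for child in node["children"]]
        if node["type"] == "chance":
            node["ev"] = sum(c["prob"] * v for c, v in zip(node["children"], values))
        else:
            node["ev"] = rule(values)
    return node["ev"]


def flag_paths(node, rule=max, on_path=True, path_prob=1.0):
    """Flag children of decision nodes and set path probability on end nodes."""
    if node["type"] == "end":
        node["path_prob"] = path_prob if on_path else 0.0
    elif node["type"] == "decision":
        best = rule(node["children"], key=lambda c: c["ev"])  # first child wins ties
        for child in node["children"]:
            child["optimal"] = on_path and child is best
            flag_paths(child, rule, child["optimal"], path_prob)
    else:  # chance node: every outcome is possible, so no flag is set
        for child in node["children"]:
            flag_paths(child, rule, on_path, path_prob * child["prob"])


def end_nodes(node):
    if node["type"] == "end":
        yield node
    else:
        for child in node["children"]:
            yield from end_nodes(child)


tree = branch("Root", "decision", [
    branch("Decision 1", "decision", [
        branch("Chance 2", "chance", [end("End 4", 20, 0.6), end("End 5", -5, 0.4)]),
        end("End 3", -10),
    ]),
    branch("Chance 1", "chance", [end("End 1", 19, 0.3), end("End 2", -1, 0.7)]),
])

rollback(tree)
flag_paths(tree)
path_probs = {n["name"]: n["path_prob"] for n in end_nodes(tree)}
print(path_probs)
```

In this sketch, Decision 1 and Chance 2 are flagged TRUE, Chance 1 and End 3 are flagged FALSE, and only End 4 and End 5 receive a nonzero path probability (0.6 and 0.4).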