The documentation below provides step-by-step instructions illustrating how to use the software. It also describes the intent behind and meaning of the analyses. At the very least, please read the "Getting Started" section or view the video prior to using the software. "Getting Started" provides an overview of the three main features of the software: causally modeling the connections between stocks, forecasting stock performance and risk, and optimizing portfolios. The "Create an Investment Model" section shows you how to build a causal network using the software. "Analyze Unconditional Forecasts", "Create Unconditional Forecasts", and "Create Conditional Forecasts" show you how to forecast and interpret the performance and risk of individual nodes in the network. "Optimize and Analyze Portfolios" shows you how to optimize and analyze portfolios based upon your forecasts. "Evaluating Forecast Accuracy" shows you how to assess the accuracy of forecasts for the portfolio and individual nodes.

This tutorial will show you how to view investment models, forecasts, and optimized portfolios. Additional articles and case studies detail methods for analyzing, modifying, and creating each of these core components.

1. Click View Network

The software represents investment models using a mathematical framework called a “Bayesian network.”

2. Select GettingStarted and click OK

This selects an existing network named “GettingStarted.”

As the arrows suggest, the network represents a causal model between factors and stocks. Such a diagram is known as an “influence diagram.” Each rectangle is called a “node.” For example, both the United States and Google are nodes.

This network indicates that the performance of the United States stock market influences the stock prices of both Exxon and Google.

3. Click the Adjust button in the Judgment grouping on the ribbon.

4. Select SPY as the Node ID

The software identifies each node by its stock ticker. This model uses the SPDR S&P 500 ETF with ticker symbol “SPY” as a proxy for the US market.

The graphic above displays the unconditional probabilities and expectations for the US. It indicates that SPY will advance 50% of the time. When the market advances, it is expected to gain .55% of its value on average.

5. Select XOM

The network specifies that the US market influences Exxon Mobil. Investment forecasts for dependent nodes are represented using conditional probabilities and expectations.

The conditional forecasted probabilities and expectations indicate the following:

- If the US declines on any given day, the probability of an Exxon loss is 75% and the expected loss is 1% on average.
- If the US declines on any given day, the probability of an Exxon gain is 25% and the expected gain is .50% on average.
- If the US advances on any given day, the probability of an Exxon loss is 25% and the expected loss is .50% on average.
- If the US advances on any given day, the probability of an Exxon gain is 75% and the expected gain is 2% on average.

In summary, the column header shows the parent outcome, whereas the row header indicates the child outcome for each probability and expectation.
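The conditional table can be collapsed into a single unconditional expectation for Exxon by weighting each cell by the probability of its parent outcome. The sketch below is only an illustration of that arithmetic, not the software's own code. It reads the probability of an Exxon loss given a US decline as 75%, so that each column sums to 100%, and the names `P_US`, `XOM_GIVEN_US`, and `expected_return` are hypothetical.

```python
# Unconditional SPY forecast from step 4: the market advances 50% of the time.
P_US = {"decline": 0.50, "advance": 0.50}

# Conditional Exxon forecasts from step 5.
# Outer key = parent (column) outcome, inner key = child (row) outcome,
# value = (probability, expected daily return in %).
XOM_GIVEN_US = {
    "decline": {"loss": (0.75, -1.00), "gain": (0.25, 0.50)},
    "advance": {"loss": (0.25, -0.50), "gain": (0.75, 2.00)},
}

def expected_return(p_parent, child_given_parent):
    """Law of total expectation over every parent and child outcome."""
    total = 0.0
    for parent, p in p_parent.items():
        for prob, expectation in child_given_parent[parent].values():
            total += p * prob * expectation
    return total

xom_expected = expected_return(P_US, XOM_GIVEN_US)
print(f"Expected daily Exxon return: {xom_expected:.3f}%")
```

Under these inputs the result is 0.375% per day. Because Google shares the same forecasts, any mix of the two stocks has the same expected return, which appears consistent with the .38% Expected Portfolio Return reported by the optimizer in step 8.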

Forecasts are also needed for the conditional probabilities and expectations for Google. To simplify this tutorial, Google has the same forecasts as Exxon.

6. Click the Portfolio button in the Analysis Scope grouping on the ribbon

7. Click the Optimization button

8. Type “0” for the Minimum Portfolio Return and click Calculate.

This module calculates the optimal investment amounts for each stock based upon the forecasts.

The Expected Portfolio Return is one measure of reward. “.38%” means you expect the portfolio to gain .38% of its total value on any given day. The Expected Portfolio Stdev (Standard Deviation) is one measure of risk. This number has more relative than absolute meaning. For example, if portfolio A and B have standard deviations of 2 and 1, respectively, and all else is equal, then A is riskier than B.

This module determines the investment amounts that will minimize Expected Portfolio Stdev subject to some minimum threshold for Expected Portfolio Return. In this case, the Minimum Portfolio Return is set to 0. Therefore, the optimizer will attempt to minimize portfolio standard deviation subject to the constraint that the portfolio return must be greater than 0.

The optimal portfolio holds approximately equal proportions, or Weights, of both Google and Exxon as a percentage of portfolio value. This occurs because the forecasts for each are identical. As an aside, the dollar amounts are set in Analysis Properties in the Create Analysis grouping.

Notice that the weights of Google and Exxon are not exactly equal. This occurs for two reasons.

- The software simulates the forecasts using Monte Carlo, a numerical method. More simulations increase both result precision and calculation time. Therefore, a tradeoff exists between error and time. This tutorial set the number of simulations to 10,000 in Analysis Properties. You can verify the solution error by pressing Calculate multiple times, which will yield different results each time. In practice, significantly more simulations are usually required.
- The software also uses a decision optimizer, another numerical method. Similar to Monte Carlo, optimizers trade precision for calculation time; however, the optimizer's precision is set sufficiently high that it is unlikely to cause a significant amount of error in this example.
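The effect of simulation error on the weights can be reproduced outside the software. The sketch below is a simplified stand-in, not the software's algorithm: it simulates the two-stock network with Python's `random` module, then applies the textbook closed-form minimum-variance weight for two assets in place of a numerical optimizer. It ignores the Minimum Portfolio Return constraint, which should not bind here because both stocks share the same positive expected return. The function names are hypothetical.

```python
import random

def simulate(n_trials, seed):
    """Simulate daily % returns for Exxon and Google given the SPY outcome."""
    rng = random.Random(seed)

    def child(us_advances):
        # Conditional forecast from step 5 (the same table is used for both stocks).
        if us_advances:
            return 2.0 if rng.random() < 0.75 else -0.5
        return -1.0 if rng.random() < 0.75 else 0.5

    xom, goog = [], []
    for _ in range(n_trials):
        us_advances = rng.random() < 0.50  # SPY advances 50% of the time
        xom.append(child(us_advances))
        goog.append(child(us_advances))
    return xom, goog

def min_variance_weight(a, b):
    """Weight on asset `a` that minimizes the variance of a two-asset portfolio."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return (var_b - cov) / (var_a + var_b - 2 * cov)

# Each seed plays the role of pressing Calculate again with 10,000 simulations.
for seed in (1, 2, 3):
    w = min_variance_weight(*simulate(10_000, seed))
    print(f"seed {seed}: Exxon weight {w:.4f}, Google weight {1 - w:.4f}")
```

Each seed should give weights near, but not exactly at, 50%. Raising the trial count narrows the spread at the cost of a longer run.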

This tutorial covered the basic steps required to view an investment model, forecasts, and optimized portfolio holdings. The next tutorial details the steps required to build a network.

This tutorial will show you how to build an investment model.

1. Click Create Network

2. Name the network “BuildModel” and click OK

3. Select SPY as the Node ID, type United States as the Node Name, and select Level 1 (Gray) as the Node Level. Then click Add.

This adds the SPDR S&P 500 ETF with ticker symbol SPY as a node in the model. The node appears as a gray rectangle with United States as the text when the Add button is clicked.

4. Select XOM as the Node ID, type “Exxon Mobil” as the Node Name, select SPY as the Parent ID, and set the Node Level to Level 2 (Green). Then click Add.

Other than the top node, each node in the network must have a single parent. Therefore, the Parent ID dropdown appears after the top node (SPY) is added.

Once you click Add, the software creates the green Exxon node and draws a connector between it and the SPY node.

5. Select GOOG as the Node ID, type “Google” for the Node Name, select SPY as the Parent ID, and set the Node Level to Level 2 (Green). Then click Add.

The software will add another green node for Google and another connector from SPY.

If all went well, you should see the network above.

6. Click OK.

This concludes the steps required to build a model. In the next tutorial, you will learn to create and analyze an unconditional investment forecast for the top node.

This tutorial will show you several unconditional analyses that will aid the creation of both conditional and unconditional forecasts. You will use these to explore historical data, validate your forecasts, and evaluate the accuracy of your predictions.

- Introduce several analysis properties you will need to set prior to performing the analyses
- Explain each of the unconditional analyses including scenario analyses, descriptive statistics, charts, and scores

1. Type the following: “10/15/2014” into Begin Date, “04/15/2015” into End Date, “10000” into Number of Trials, and “30” into Rolling Days.

The Date Range fields set the historical period you will use as the benchmark for your forecasts.

As mentioned in the Getting Started Guide, the software simulates your forecasts using Monte Carlo. The Number of Trials parameter determines the number of simulations run per node. This parameter will influence the precision and speed of not only the Optimization but also the Validate analyses, which use simulated data derived from the forecasts.

Lastly, the Rolling Days parameter determines the number of days used to calculate the Rolling Stdev and Correlation Charts.

2. Type “-10,” “-3,” “-1,” “0,” “1,” “3,” and “10” for the Histogram interval edges and “Large Loss,” “Medium Loss,” “Small Loss,” “Small Gain,” “Medium Gain,” and “Large Gain” for the labels. To create a new row, click the “+” button. To remove a row, click the “-” button.

The forecasts, scenarios, and scores require you to define intervals and labels. The labels contain the scenario names, and the intervals determine the upper and lower bounds for each scenario. For example, a Large Loss is between -10% and -3%. Each observation falling between these two bounds is included in this category. Note: an observation less than -10% will be treated as an outlier and excluded from all scenario-based calculations.
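The mapping from a daily return to a scenario label can be sketched in a few lines. This is an illustration only: the tutorial does not say how the software treats a return that lands exactly on an edge, so the sketch assumes each interval includes its lower edge and excludes its upper edge, and it treats returns of 10% or more as outliers by symmetry. `EDGES`, `LABELS`, and `classify` are hypothetical names.

```python
from bisect import bisect_right

# Interval edges and labels entered in step 2.
EDGES = [-10, -3, -1, 0, 1, 3, 10]
LABELS = ["Large Loss", "Medium Loss", "Small Loss",
          "Small Gain", "Medium Gain", "Large Gain"]

def classify(daily_return, edges=EDGES, labels=LABELS):
    """Return the scenario label for a % return, or None for an outlier."""
    # Index of the interval whose lower edge is the last one <= daily_return.
    i = bisect_right(edges, daily_return) - 1
    if i < 0 or i >= len(labels):
        return None  # below the first edge or at/above the last edge
    return labels[i]

print(classify(-2.0))   # a return of -2% falls in the -3% to -1% interval
print(classify(-12.0))  # below -10%, so treated as an outlier
```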

A tradeoff exists between the realism of the scenarios and time required to forecast. More granular scenarios are likely to result in more realistic forecasts but will increase both the number of probabilities and expectations required for each forecast. As a rule of thumb, you should have no fewer than four scenarios and no more than ten.

3. Type “2000” for $ Value for both XOM and GOOG and click OK.

You can set the amount invested in each stock in Portfolio Holdings. The software automatically calculates the weights after you click outside of the input box. These values will be used in the calculations for the portfolio analyses.

4. After the network appears on the screen, click the Network, Explore, and Scenarios buttons on the Ribbon.

5. Click Unconditional Probabilities

The graphic above should appear. In step 4, clicking the Network button indicates to the software that you want to view the analysis within the context of the network. Clicking Explore selects the historical gains and losses from the time period 10/15/2014 to 04/15/2015 (Step 1). In step 5, clicking Unconditional Probabilities sets the type of analysis displayed under each node.

The unconditional probabilities for the United States are magnified above. The labels for the histogram are identical to those you set in Step 2. The probabilities are the proportion of the daily gains and losses falling within each interval. For example, the daily return of the US market fell between -3% and -1% on 7% of the days during the period. Note: no Large Gains or Large Losses occurred between 10/15/2014 and 04/15/2015.

6. Click the Scenarios button in the Ribbon

7. Click Unconditional Expectations

As before, you should see the analysis within the context of the network. This tutorial only shows the top node.

The expectations are calculated as the average of the observations within each interval. For example, the average of all Medium Losses is -1.5%. When no observations fall within an interval, the lower interval edge is assigned as the value of the expectation. In this case, no Large Losses occurred during the period, and Large Losses include the interval from -10% to -3%. Therefore, -10 is set as the default expectation.
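The Explore probabilities and expectations described in steps 5 through 7 can be approximated as follows. This is a minimal sketch under stated assumptions: intervals include their lower edge and exclude their upper edge, outliers are dropped before the proportions are computed, and the sample returns are made up for illustration. The function name `explore_scenarios` is hypothetical.

```python
def explore_scenarios(returns, edges, labels):
    """Per-scenario relative frequency and average return for a list of % returns."""
    # Drop outliers that fall outside the first and last edges.
    in_range = [r for r in returns if edges[0] <= r < edges[-1]]
    result = {}
    for label, low, high in zip(labels, edges, edges[1:]):
        bucket = [r for r in in_range if low <= r < high]
        probability = len(bucket) / len(in_range)
        # Empty interval: fall back to the lower edge, as the text describes.
        expectation = sum(bucket) / len(bucket) if bucket else float(low)
        result[label] = (probability, expectation)
    return result

edges = [-10, -3, -1, 0, 1, 3, 10]
labels = ["Large Loss", "Medium Loss", "Small Loss",
          "Small Gain", "Medium Gain", "Large Gain"]
sample = [-1.8, -1.2, -0.6, -0.2, 0.1, 0.3, 0.7, 0.9, 1.4, 2.5]  # made-up returns
table = explore_scenarios(sample, edges, labels)

for label, (p, e) in table.items():
    print(f"{label:12s} probability={p:.2f} expectation={e:.2f}")
```

With this sample, no observation falls in the Large Loss interval, so its expectation defaults to the lower edge of -10, mirroring the behavior described above.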

As you will see in Creating Unconditional Forecasts, the Explore Unconditional Probabilities and Expectations are set as the default forecasts for SPY.

8. Click Statistics

The summary statistics describe the daily percentage gains and losses, or returns, of the US market over the period. The largest loss was 1.8% and the largest gain was 2.5%. If the returns were ranked in ascending order, the median observation would have half the data above it. If there are an even number of observations, it is the average of the middle two observations. The mean is the average. As mentioned in the Getting Started Guide, standard deviation (stdev) is a proxy for risk. It indicates the dispersion of the observations about the mean. If you were to balance the data on a seesaw, the skewness indicates the direction and magnitude of the tilt. When no positive outliers exist and a few negative ones do, then the skewness will be negative, and vice-versa. Excess kurtosis indicates the distance of the outliers from the other observations. Outliers far from the other observations will result in large kurtosis; whereas, no outliers or ones close to the other observations will result in negative or small excess kurtosis.
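For readers who want to reproduce these statistics on their own data, the following sketch uses Python's standard library. The tutorial does not state which estimator variants the software uses, so the population forms of standard deviation, skewness, and excess kurtosis are assumed here, and the sample returns are made up. The function name `describe` is hypothetical.

```python
import statistics

def describe(returns):
    """Summary statistics for a list of daily % returns."""
    n = len(returns)
    mean = statistics.fmean(returns)
    stdev = statistics.pstdev(returns)          # population form (an assumption)
    z = [(r - mean) / stdev for r in returns]   # standardized observations
    return {
        "count": n,
        "min": min(returns),
        "max": max(returns),
        "median": statistics.median(returns),   # averages the middle two when n is even
        "mean": mean,
        "stdev": stdev,
        "skewness": sum(v ** 3 for v in z) / n,
        "excess_kurtosis": sum(v ** 4 for v in z) / n - 3.0,
    }

# Made-up returns with one negative outlier and no positive outlier.
stats = describe([-1.8, -0.4, -0.1, 0.2, 0.3, 0.5])
for name, value in stats.items():
    print(f"{name:16s} {value:.4f}")
```

The single negative outlier pulls the mean below the median and makes the skewness negative, which matches the seesaw description above.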

The historical data contain 125 observations, which might be too short a period to capture a variety of market conditions. However, a tradeoff exists between the number of historical observations and their relevance to the current forecast.

The annual statistics annualize the mean and standard deviation. These can be easier to interpret than the raw daily statistics.

9. Click Charts on the Ribbon

10. Click each chart type in turn:

- Click Prices: the historical prices for the node are plotted over time.
- Click Returns: the daily percentage gains and losses are plotted over time.
- Click Cumulative Returns: simulates the cumulative value of a $100 investment in each node starting at the beginning of the historical period. The returns are compounded daily and plotted over time.
- Click Rolling Standard Deviation: This is the same calculation as standard deviation in the statistics module, but it is only performed over the number of days set in the rolling period parameter. In this case, the standard deviation is calculated for each 30 day sub-period within the historical period and plotted against the ending date within each sub-period.
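The Cumulative Returns and Rolling Standard Deviation charts rest on two short calculations, sketched below with made-up data. Whether the software uses the population or the sample form of standard deviation is not stated, so the population form is assumed. The function names are hypothetical.

```python
import statistics

def cumulative_value(returns, start=100.0):
    """Value of `start` dollars after compounding each daily % return."""
    values, value = [], start
    for r in returns:
        value *= 1 + r / 100
        values.append(value)
    return values

def rolling_stdev(returns, window):
    """Standard deviation over each trailing `window`-day sub-period."""
    # One value per sub-period, aligned with the last day of that sub-period.
    return [statistics.pstdev(returns[i - window:i])
            for i in range(window, len(returns) + 1)]

returns = [1.0, -1.0, 2.0, 0.5, -0.5]  # made-up daily % returns
values = cumulative_value(returns)
rolling = rolling_stdev(returns, window=3)
print([round(v, 2) for v in values])
print([round(s, 3) for s in rolling])
```

With a 30-day window and the 125 observations in this tutorial, the same function would return 96 rolling values, one for each complete sub-period.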

**Prices**

**Returns**

**Cumulative Returns**

**Rolling Standard Deviation**

In the Getting Started Guide, you created forecasts for each node in the network. You might have wondered, “How do I forecast reasonable probabilities and expectations?” This tutorial will explain how to create and verify unconditional forecasts.

1. Click Adjust in the Judgment grouping

2. Select SPY as the Node ID

After you select the Node ID, default values appear for both the probability and expectation forecasts. These defaults are the historical unconditional probabilities and expectations you saw in the Unconditional Analysis Guide. Default probabilities are orange, and default expectations are purple. Although you could forecast using only your judgment, it is much easier to start with historical data and adjust probabilities and expectations until the forecasts reflect your views.

3. Input the Probabilities as follows: Large Loss = “.01”, Medium Loss= “.13”, Small Loss = “.33”, Small Gain = “.37”, Medium Gain = “.15”, Large Gain = “.01”. Then click Adjust.

The graphics above show the starting and ending probabilities. Increasing probabilities and expectations causes the cells to turn green; whereas, decreasing probabilities and expectations causes the cells to turn red. This highlights positive and negative bets compared with the historical data.

In this case, the probability of a Large Gain increased from 0 to 1%. Despite the lack of Large Gains in the historical period, there is some chance they will occur in the future, and this should be included in the forecasts. A 1% chance is equivalent to one in 100 days, or approximately one Large Gain occurring every five months.

These forecasted probabilities anticipate a more volatile US market with significantly fewer small moves and many more extreme ones. Since a Large Loss is slightly more probable than a Large Gain, skewness should decrease.

4. Type the expectations as follows: Large Loss = “-3.3”, Medium Loss = “-1.5”, Small Loss = “-.40”, Small Gain = “.40”, Medium Gain = “1.5”, Large Gain = “3.2”. Then click Adjust.

Since no Large Losses or Gains occurred in the historical period, the defaults are set to the lower bounds of each of these intervals. An expected Large Loss of -10% is unrealistic and must be changed. Based upon experience, the author set this value to -3.3%. The expected Large Gain is set to 3.2%, or .1% lower in magnitude than the Large Loss. This will decrease skewness. Otherwise, the expected losses and gains are symmetrical and close to the defaults.

As an aside, unconditional forecasts may be used for scenario analysis or stress testing.

The graphic above shows the forecast probabilities for a simple stress test. In this case, the forecast is set to be the scenario when a Large Loss occurs in the US market. This forecast will then influence each dependent node in the network. As a result, you can determine the effect a large market loss has on each position within your portfolio and on the portfolio as a whole.

5. Click OK to exit the Adjust Judgment module.

6. Click Network, Validate, and Scenarios on the Ribbon

7. Click Unconditional Probabilities

As you saw in the Unconditional Analysis Guide, selecting network and unconditional probabilities shows the unconditional probability graphics within the context of the network. Unlike before, you selected Validate as the solution type. With this option selected, calculations are performed using simulated data, which is derived from the forecasts. In contrast, Explore shows the results of calculations based upon historical data.

As shown above, the Validate unconditional probabilities appear to be equal to the forecasts. In reality, a small difference exists. This occurs because the Monte Carlo simulations constituting the Validate data are random and, consequently, will cause the proportions of simulated returns to differ from the forecast probabilities. In general, an increase in the number of simulations will decrease the difference between the forecast and Validate unconditional probabilities. However, as discussed in the Getting Started Guide, more simulations also increase calculation time. Therefore, the tradeoff between precision and calculation time exists for the analyses as well as the optimizer.

8. Click Network, Validate, and Scenarios on the Ribbon

9. Click Unconditional Expectations

The network will show the unconditional expectations for each node. The graphic above only shows the top node. The forecasted expectations are also shown for comparison.

The forecasted expectations are exactly equal to the Validate expectations. This always holds true. The simulated returns for each interval are set equal to the expectation for the interval. For example, each simulated Large Loss is -3.3%. The Validate expectations are the average of the simulated returns in each interval. Therefore, the Validate expected Large Loss is also -3.3%.
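Both Validate behaviors, approximate probabilities and exact expectations, can be demonstrated with a small simulation. The sketch below is not the software's sampler: it assumes scenarios are drawn independently with `random.choices` using the step 3 probabilities, and that each simulated return is set to the step 4 expectation for its scenario. The function name `validate` is hypothetical.

```python
import random

labels = ["Large Loss", "Medium Loss", "Small Loss",
          "Small Gain", "Medium Gain", "Large Gain"]
forecast_p = [0.01, 0.13, 0.33, 0.37, 0.15, 0.01]   # step 3
forecast_e = [-3.3, -1.5, -0.40, 0.40, 1.5, 3.2]    # step 4

def validate(n_trials, seed=0):
    """Draw scenarios from the forecast and summarize the simulated data."""
    rng = random.Random(seed)
    draws = rng.choices(range(len(labels)), weights=forecast_p, k=n_trials)
    counts = [0] * len(labels)
    totals = [0.0] * len(labels)
    for i in draws:
        counts[i] += 1
        totals[i] += forecast_e[i]  # each simulated return equals the scenario expectation
    probs = [c / n_trials for c in counts]
    means = [t / c if c else None for t, c in zip(totals, counts)]
    return probs, means

for n in (1_000, 100_000):
    probs, means = validate(n)
    worst = max(abs(p - f) for p, f in zip(probs, forecast_p))
    print(f"{n:>7} trials: largest probability gap = {worst:.4f}")
```

The probability gap should shrink as the number of trials grows, while the per-scenario means match the forecast expectations up to floating-point rounding regardless of the trial count.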

10. Click Score on the Ribbon

11. Click on Validate-Explore Probabilities

The network will show the scores analysis for each node. The graphic above only shows the top node. Validate-Explore probabilities show the difference between Validate and Explore unconditional probabilities. As you can see, Large Gain differs by 1%. This occurs because we set the forecasted probability equal to 1%, whereas the historical data contained 0% Large Gains.

The score shown at the top of the table refers to the Brier Score calculation applied to all of the intervals. It is the Mean Squared Error between the Validate and Explore Unconditional Probabilities and is a type of Proper Scoring Metric. These calculations can be used to evaluate forecasting performance and cannot be gamed. Please see the Evaluating Forecasts Guide for more information.
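Taking the description above literally, the score is the average squared difference across intervals. The sketch below follows that reading and is not taken from the software. The Validate values are the step 3 forecasts, and the Explore values are hypothetical except for the 7% Medium Loss and the 0% Large Loss and Large Gain mentioned earlier.

```python
def brier_score(validate_probs, explore_probs):
    """Mean squared difference between two probability vectors."""
    diffs = [(v - e) ** 2 for v, e in zip(validate_probs, explore_probs)]
    return sum(diffs) / len(diffs)

# Order: Large Loss, Medium Loss, Small Loss, Small Gain, Medium Gain, Large Gain.
validate_probs = [0.01, 0.13, 0.33, 0.37, 0.15, 0.01]  # forecast from step 3
explore_probs = [0.00, 0.07, 0.40, 0.44, 0.09, 0.00]   # partly hypothetical history
score = brier_score(validate_probs, explore_probs)
print(f"Score: {score:.6f}")
```

A score of 0 means the two sets of probabilities are identical, and larger values indicate a bigger departure from the historical data.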

12. Click Score on the Ribbon

13. Click Validate-Explore Expectations

The above graphic shows the scored expectations for the top node in the network.

Similar to the probabilities, the scored expectations calculate the difference between the Validate and Explore unconditional expectations. For example, the Validate unconditional expectation for the Large Loss interval is -3.3%, and the Explore unconditional expectation is -10%. The Explore scored expectation is the difference between these two, or 6.7%. The score at the top of the table is calculated the same way as for the probabilities.

Although you could view each analysis within the Network, you might want to compare the Explore and Validate analyses directly. This section will show you how to perform this type of comparison.

14. Click Node on the Ribbon

15. Click on Unconditional Statistics

16. Select SPY as the Node. Then click and hold the left mouse button down on the top of the Statistics border. As you hold, move the mouse to the top of the screen and release. The Statistics menu should follow your mouse and change position to the top of the list.

17. Click on the “+” button next to Statistics.

This should expand the statistics container and show the graphic above. Notice that the results for the Evaluate and Explore columns are identical. This occurs because the initial Evaluate data is set to be the Explore data by default. The Evaluating Forecasts Guide will show you how to set the Evaluate data appropriately.

You can now compare the Validate and Explore statistics. The outliers for the simulated data (minimum and maximum) are larger in magnitude than for the historical period. This partially explains the following: although the median of the simulated data exceeds that of the historical data, the mean is less than the historical mean. The negative outliers for the simulated data pull down the mean and also cause the skewness of the simulated data to be negative. In contrast, the skewness of the historical data is positive. These outliers also explain why the excess kurtosis of the simulated data exceeds that of the historical data.

As expected, the simulated data contains 100,000 observations.

Lastly, the annual statistics can help you verify the forecasts from an absolute perspective. An expected Annual Return of 11% for the US market is close to realistic; whereas, 26% is likely unrealistic. The simulated standard deviation of 15 can be compared to the implied volatility of options contracts on SPY. To be conservative, you might consider either increasing the probability of a Large Loss slightly in the forecast or decreasing its expectation to both lower the mean and increase standard deviation.

18. Click on the “-” button in the bottom right corner to collapse the statistics display

Similarly, the Scores and Unconditional Probabilities and Expectations are exactly the same as those shown in the Network.

19. Click the Node Button on the Ribbon

20. Click on Time Series

21. Select SPY as the Node

22. Click the “+” buttons next to Returns Charts, Cumulative Returns Charts, and Rolling Standard Deviation Charts.

23. The graphic above shows graphs of the Explore and Validate returns plotted over time.

The Validate returns graph only contains a sample of 100 observations rather than the full 100,000 simulations. This ensures it is readable. Notice that the Validate returns only have one Large Gain occur in the sample and no Large Losses. This occurs because the sample is small.

These two graphs enable you to verify that the “roughness” of the Validate returns is close to that of the Explore returns. In this case, the two data sets appear close; although, slightly more Medium observations occur in the simulated data and more Small observations arise in the historical data. The number of outliers appears approximately equal. These results are expected since the forecast probability of a Medium Loss is nearly double the historical value and both Small probability forecasts are less than the defaults.

24. The graphics above show Explore and Validate cumulative returns plotted over time.

The Validate cumulative returns are rougher than the Explore cumulative returns. This matches the observation that more Medium returns occur in the simulated data set. Although the simulated data differs from the historical, it is still plausible.

25. The graphics above show the Explore and Validate rolling standard deviation.

Unless you are forecasting significantly higher market volatility than the historical period, the Validate rolling standard deviations should typically fall within the range of the Explore rolling standard deviation. In this case, the simulated data is at the upper end of the historical range. This observation aligns with the intent to make the forecast slightly more volatile than history.

This tutorial has explained how to create unconditional forecasts and verify their plausibility. All of these analyses can also be used to verify conditional forecasts. In the Creating Conditional Forecasts Guide, you will learn additional techniques for creating and verifying forecasts for child nodes.

The Creating and Verifying Unconditional Forecasts Guide explained how to create and verify the forecasts for a single node. In contrast to unconditional forecasts, conditional forecasts require specific analyses to examine the connection between parent and child nodes. This guide will show you how to create conditional forecasts and verify the plausibility of the forecasted parent and child relationship.

Prior to creating conditional forecasts, you should determine if a relationship between factors and stocks exists in the historical data.

1. Click Network, Explore, and Regression on the Ribbon

You should see the graphic above. Notice that the United States node lacks an analysis underneath it. This occurs because regressions examine the connection between a parent (independent variable x) and the child (dependent variable y) and are shown underneath the child.

The first graphic above shows the results of the linear regression between Exxon and the United States. This regression fits a line to the Exxon and United States returns, which is shown in the second graphic. The slope, also known as Beta, in the second column measures the sensitivity of Exxon to changes in the US market. In this case, a 1% increase in the US market corresponded with a 1.1% advance in Exxon on average. TStat (T-Statistic) measures the statistical significance of the relationship. As a rule of thumb, a t-statistic with an absolute value greater than two indicates a statistically significant relationship between the parent and child. In the absence of a significant t-statistic, you might either rebuild the network with a better parent factor or determine why you think a relationship will occur in the future. In this case, the t-statistic is much larger than 2 and indicates that a causal relationship between the United States market and Exxon is plausible. R-Squared measures the amount of variation in Exxon explained by the United States. In this case, the US accounts for approximately 46% of the variation in Exxon.

The intercept tends to be less important. However, if the magnitude of its t-statistic or value is large relative to the slope, it may mean a better factor exists or another factor is needed to explain the variation in the child. In the former case, you might try rebuilding the network with a different factor. In the latter case, you may want to build multiple networks with a different parent for the child in each. In the above results, the absolute value of the intercept t-statistic and value are small, and no additional action is necessary.
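The regression outputs discussed above (slope, intercept, their t-statistics, and R-Squared) follow from ordinary least squares with one explanatory variable. The sketch below uses the standard textbook formulas with made-up returns and is not the software's implementation. The function name `simple_regression` is hypothetical.

```python
import math

def simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))  # unexplained
    sst = sum((b - mean_y) ** 2 for b in y)                            # total
    resid_var = sse / (n - 2)  # residual variance with two fitted parameters
    se_slope = math.sqrt(resid_var / sxx)
    se_intercept = math.sqrt(resid_var * (1 / n + mean_x ** 2 / sxx))
    return {
        "slope": slope,
        "intercept": intercept,
        "t_slope": slope / se_slope,
        "t_intercept": intercept / se_intercept,
        "r_squared": 1 - sse / sst,
    }

us = [-2.0, -1.0, 0.0, 1.0, 2.0]    # made-up parent returns (%)
xom = [-2.1, -1.3, 0.2, 0.9, 2.3]   # made-up child returns (%)
fit = simple_regression(us, xom)
for name, value in fit.items():
    print(f"{name:12s} {value:.4f}")
```

In this toy data set the slope is 1.1 and its t-statistic is well above 2, while the intercept and its t-statistic are close to zero, the pattern described above as requiring no further action.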

2. Click Network, Explore, and then Scenarios on the Ribbon

3. Click Conditional Probabilities

You should now see the graphic above. Notice that unconditional probabilities are shown for the United States. This occurs because the U.S. lacks a parent and, consequently, conditional probabilities.

The graphic above shows the enlarged Explore conditional probabilities for Exxon Mobil. As discussed in the Getting Started Guide, conditional probabilities and expectations have the following form: given the occurrence of x, the probability or expectation of y is z. The truncated version is: if x, then the probability or expectation of y is z. x is the column header, y is the row header, and z is the conditional probability. For example, given a Medium Loss for SPY, the probability of a Large Loss for Exxon is 11%.

Explore conditional probabilities are calculated as follows:

- For each date, the return of SPY is assigned to an interval based upon its value. For example, a return of -2% on 4/15/2015 would be assigned to Medium Loss. This determines the column for the observation.
- Likewise, the return of Exxon is also assigned to an interval based upon its value for the selected date. For example, a return of -3.5% on 4/15/2015 would be assigned to Large Loss. This determines the row of the observation.
- The total count for each cell and column is tallied, and the cell total divided by the column total determines the relative frequency. For example, if 11 observations occurred on dates in which the U.S. and Exxon had Medium and Large Losses, respectively, and the US had a total of 100 Medium Losses for the period, then the historical conditional probability would be 11%.
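The three steps above can be sketched as a tally over labeled return pairs. The code is illustrative only and assumes each date has already been assigned a parent label and a child label. The pairs below are hypothetical counts chosen to match the 11% example, and `conditional_probabilities` is a hypothetical name.

```python
from collections import Counter

def conditional_probabilities(pairs):
    """Map (child_label, parent_label) to P(child | parent) from labeled pairs."""
    cell_counts = Counter(pairs)                            # (parent, child) -> count
    column_counts = Counter(parent for parent, _ in pairs)  # parent -> count
    return {(child, parent): count / column_counts[parent]
            for (parent, child), count in cell_counts.items()}

# Hypothetical tallies: 11 of 100 US Medium Losses coincide with an Exxon Large Loss.
pairs = ([("Medium Loss", "Large Loss")] * 11
         + [("Medium Loss", "Medium Loss")] * 89)
table = conditional_probabilities(pairs)
print(table[("Large Loss", "Medium Loss")])  # row = Exxon outcome, column = US outcome
```

Because every cell is divided by its own column total, each column of the resulting table sums to 1.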

Notice that the probabilities in the Large Loss and Gain columns are both zero. This occurs because the U.S. market never lost or gained more than 3% in the historical period.

Lastly, the bulk of the observations occur on a diagonal from the upper left to the bottom right of the table. This indicates a high correlation between Exxon and the U.S. market. If all observations lay on this diagonal, then Exxon and the U.S. would be perfectly correlated. For example, each Large Loss in the US would correspond to a Large Loss in Exxon, and the upper left corner would be 1.

4. Click Network, Explore, and then Scenarios on the Ribbon

5. Click on Conditional Expectations

You should see the conditional expectations for each node within the context of the network. The graphic above shows the conditional expectations for Exxon. The values are read similarly to the conditional probabilities. For example, given a Medium Loss in the US market, the expected value of a Large Loss in Exxon is -3.3%. This is the average of all Exxon returns on dates when a Medium Loss in the U.S. and a Large Loss in Exxon occur. Similar to the unconditional expectations, the default value for the conditional expectations is the lower bound for the row interval. Therefore, the default value for any cell in the Large Loss row is -10.

6. Click the Adjust button on the Ribbon.

7. Select XOM as the Node ID

As shown in the graphic above, the default forecasted conditional probabilities and expectations will appear.

The Getting Started Guide only contained two intervals, Loss and Gain, for the purpose of simplicity. This is inadequate for realistic simulations. The graphic above illustrates the tradeoff between realism and effort. Each table requires a total of 36 estimates, or 72 per node. Increasing the intervals to 7, 8, and 9 results in 98, 128, and 162 total estimates, respectively. This mathematical reality underlies the recommendation to limit the number of intervals to between 4 and 10. It also explains a constraint the software imposes: only one parent per child node. Even if only two parents were allowed per node, the number of estimates for only five intervals would increase five-fold from 50 to 250.
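The counts quoted above follow from a simple rule: each child node needs one probability table and one expectation table, and each table has one cell for every combination of child and parent outcomes. A short sketch of that arithmetic, with the hypothetical function name `estimates_per_node`:

```python
def estimates_per_node(intervals, parents=1):
    """Probabilities plus expectations needed for one child node."""
    cells_per_table = intervals ** (parents + 1)  # one cell per outcome combination
    return 2 * cells_per_table                    # two tables: probabilities, expectations

for n in (6, 7, 8, 9):
    print(n, "intervals:", estimates_per_node(n), "estimates")

# Five intervals with one parent versus two parents.
print(estimates_per_node(5), estimates_per_node(5, parents=2))
```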

The graphics above show the default conditional probability forecasts. Similar to the default unconditional forecasts, the defaults are set equal to the historical values.

When creating conditional forecasts, each probability must be between 0 and 1, and the values in a column must sum to 1. As noted previously, the probabilities on the upper-left to lower-right diagonal are the largest in the table and vary between 32% and 74%. This informs suitable estimates for the upper-left and lower-right corners. You could also change the Begin and End Date in the Analysis Properties to include a larger historical period and, consequently, understand past estimates for these corner values.

Exxon has a Beta of 1.1 and an Annualized Standard Deviation of 21. These indicate the stock has medium volatility and sensitivity to the market. In the experience of the author, 60% should be a reasonable guess for the corner probabilities of such a stock. This means the following: if the US market declines or gains more than 3%, there is a 60% probability Exxon will decline or gain more than 3% as well, respectively. To simplify the other estimates in this column, the author assumed the following halving rule: the probability of a Medium Gain or Loss equals half the probability of a Large Gain or Loss, and the probability of a Small Gain or Loss equals half that of a Medium one. Therefore, in the US Large Loss column, the probability of a Medium Loss is 30% and that of a Small Loss is 15%. Since these three values total 105% and the probabilities in each column must sum to 1, the Small Loss is reduced to 10%.

Clearly, the forecast might benefit from a more thorough review of past data combined with some statistical analysis to determine a more precise relationship between the probabilities in a column. For many purposes, however, the above considerations should suffice. Also, the remaining probabilities in each column could be assigned small values rather than zeros. However, since a Large Loss or Gain in the market is itself unlikely, these outcomes will be very rare, and estimates for them can be omitted.
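The column constraints and the corner estimate described above can be checked with a few lines of code. This sketch only mirrors the arithmetic in the text (60%, then half of that, then half again, with the Small Loss entry absorbing the excess). The function name `check_column` and the tolerance are hypothetical choices.

```python
def check_column(probabilities, tolerance=1e-9):
    """True if every value is in [0, 1] and the column sums to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probabilities)
    return in_range and abs(sum(probabilities) - 1.0) <= tolerance

# US Large Loss column for Exxon, rows ordered from Large Loss to Large Gain.
large = 0.60
medium = large / 2   # 0.30
small = medium / 2   # 0.15 before adjustment
raw = [large, medium, small, 0.0, 0.0, 0.0]

# Shrink the Small Loss entry so the column sums to 1.
adjusted = raw.copy()
adjusted[2] = 1.0 - large - medium

print(check_column(raw), round(sum(raw), 2))
print(check_column(adjusted), [round(p, 2) for p in adjusted])
```

The raw column fails the check because it sums to 1.05, while the adjusted column passes with a Small Loss probability of 10%.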

After dealing with the extremes, it might be prudent to review the US Small Gain and Small Loss columns. When the market is placid, the returns of stocks tend to be mostly bell-shaped, although the tails of the bell tend to be thick and skewed slightly negative.

The values of the Small Gain column were set such that the largest probability, 40%, was assigned to the diagonal. In other words, a Small Gain in the US market causes a Small Gain in Exxon 40% of the time. The Large Gain and Large Loss cells should contain small probabilities instead of zero because positive or negative news about Exxon may cause its stock to perform differently than the market.

The remainder of the table is populated using a combination of historical data, statistical concepts, interpolation between existing values, and best guesses. Once you have assigned values to each probability in the table, click the Adjust button to store your forecasts in the database. This will enable you to easily access them later if you decide to alter your estimates. To complete the analysis, you should forecast conditional probabilities for Google as well. The Evaluating Forecasts Guide will show you how to measure the accuracy of your forecasts, so you can learn from them and improve them in the future.

The graphic above shows the default for the forecasted conditional expectations, which are also set to the historical values. You should only forecast expectations when the corresponding probabilities are nonzero. Otherwise, your forecasts will never be simulated.

The graphic above shows the forecasted conditional expectations. Similar to the conditional probabilities, the forecasted expectations are based on both historical data and judgement. Unlike the conditional probabilities, it is probably easiest to create forecasted expectations for an entire row at a time and to start with the extremes. The historical Exxon Large Losses occurred when the US market experienced Medium and Small Losses. These ranged between -3.3% and -4.2%. As with the probabilities, it would be worth conducting the historical analysis over a longer period to improve the precision of the historical values. In general, the expectations will likely be close to the upper bound of the interval because high magnitude losses are extremely rare. Based upon the experience of the author, the US Medium Loss expectation of -3.3% tends to be about right. In the event of a US Large Loss, this value might be expected to decrease to approximately -3.5%. If the US gains, then the magnitude should decrease and be close to the upper bound of the interval. A reasonable guess might be -3.2%. The rest of the Large Loss row values were interpolated. Next, the Large Gains might mirror the Large Losses, although these have a slightly lower magnitude so that the resulting simulated returns will have negative skewness. Each additional row can be estimated using similar logic.

8. After you have finished entering your forecasted probabilities and expectations, save them by clicking OK.

You should use the same unconditional analyses shown in the Creating Unconditional Forecasts Guide to verify that the conditional forecasts create reasonable simulated returns. As mentioned previously, you should also analyze the connections between nodes. This section will show you how to analyze these connections for simulated data based on the conditional forecasts.

9. Click the Node button on the Ribbon.

10. Click Conditional Statistics

11. Select XOM as the Node and click the “+” next to Validate Conditional Probabilities, Validate Conditional Expectations, and Regressions to expand them. Similar to the unconditional statistics, you can view each of these analyses within the context of the network as well.

The two graphics above show the forecasted conditional probabilities and the Validate conditional probabilities, which are calculated from the simulated results. Although the forecast and simulated values appear the same for the middle columns, the Large Loss and Large Gain columns differ significantly. This occurs because the probability of a Large Loss or Large Gain for the US market is small. Consequently, few simulations fall into these columns, and the statistical error of the relative frequencies calculated from them is high. You can decrease this statistical error by increasing the number of simulations. However, this will increase computation time.
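To get a feel for how the statistical error shrinks as simulations are added, the standard error of a relative frequency can be computed directly. The sketch below uses the textbook binomial formula and assumes independent simulated draws. The function name and the sample counts are illustrative and are not taken from the software.

```python
import math


def relative_frequency_std_error(p, n):
    """Standard error of a relative frequency estimated from n independent draws."""
    return math.sqrt(p * (1.0 - p) / n)


# Error around a 60% corner probability when only a few simulations land
# in the column, compared with larger counts (hypothetical counts).
for n in (25, 100, 400):
    print(n, round(relative_frequency_std_error(0.60, n), 3))
```

Quadrupling the number of simulations that fall into a column halves the error, which is why the rarely visited Large Loss and Large Gain columns need many more total simulations to settle down.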

In contrast, the Validate conditional expectations are exactly equal to the forecasted conditional expectations. This occurs for the same reason as for the unconditional forecasts: the simulated returns consist of the forecasted conditional expectations. For example, if a simulated US market return is a Large Loss and the simulated Exxon return is also a Large Loss, then its value is set to -3.5%.
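The mechanism described above can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions and is not the software's simulator: it uses two states instead of the full table, and every probability and expectation other than the -3.5% cell is a hypothetical placeholder.

```python
import random

# Hypothetical two-state tables, used only to illustrate the mechanism.
MARKET_PROBS = {"Loss": 0.5, "Gain": 0.5}
# P(stock state | market state); each inner dictionary sums to 1.
COND_PROBS = {
    "Loss": {"Loss": 0.7, "Gain": 0.3},
    "Gain": {"Loss": 0.3, "Gain": 0.7},
}
# Forecasted conditional expectations in percent, keyed by (market, stock).
COND_EXPECT = {
    ("Loss", "Loss"): -3.5, ("Loss", "Gain"): 1.0,
    ("Gain", "Loss"): -1.0, ("Gain", "Gain"): 3.2,
}


def draw(rng, probs):
    """Draw one state according to its probability."""
    states = list(probs)
    return rng.choices(states, weights=[probs[s] for s in states])[0]


def simulate(n, seed=0):
    """Return n tuples of (market state, stock state, stock return)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        market = draw(rng, MARKET_PROBS)
        stock = draw(rng, COND_PROBS[market])
        # The simulated return is the forecasted expectation of the cell.
        rows.append((market, stock, COND_EXPECT[(market, stock)]))
    return rows


rows = simulate(1000)
market_loss = [s for m, s, _ in rows if m == "Loss"]
cell = [r for m, s, r in rows if (m, s) == ("Loss", "Loss")]
print("simulated P(stock Loss | market Loss):", len(cell) / len(market_loss))
print("simulated expectation in that cell:", sum(cell) / len(cell))
```

The simulated probability only approximates the 0.7 forecast because it is a relative frequency, while the simulated expectation matches the forecast exactly because every return in the cell is the same number.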

The graphic above shows the results of regressions applied to both the historical and simulated data. Similar to the unconditional statistics, the Evaluate data is set equal to Explore by default. The Evaluating Forecasts Guide will show you how to set the Evaluate data.

The Validate regression results are similar to Explore. The Beta of Exxon to the US market is 1.01 compared with the historical value of 1.1. In other words, the forecast is for Exxon to be slightly less sensitive to the U.S. market in the future. The Validate R-squared, which is the square of the correlation between Exxon and the US market, is .434 compared with .457 for Explore. This indicates the forecasted correlation is similar to the past. Although the intercept for Validate appears to be statistically significant, it is an anomaly and should be ignored.
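For readers who want to see how the Beta, intercept, and R-squared relate to each other, the following sketch fits a simple regression of stock returns on market returns using ordinary least squares. It is an illustration only. The function name and sample returns are hypothetical, and the software's regression output may include additional statistics.

```python
def regress_on_market(market, stock):
    """Least-squares fit of stock = intercept + beta * market.

    Returns (intercept, beta, r_squared), where r_squared equals the
    squared correlation between the two return series.
    """
    n = len(market)
    mean_m = sum(market) / n
    mean_s = sum(stock) / n
    sxx = sum((x - mean_m) ** 2 for x in market)
    syy = sum((y - mean_s) ** 2 for y in stock)
    sxy = sum((x - mean_m) * (y - mean_s) for x, y in zip(market, stock))
    beta = sxy / sxx
    intercept = mean_s - beta * mean_m
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, beta, r_squared


# Hypothetical daily returns in percent.
market_returns = [-2.0, -1.0, 0.0, 1.0, 2.0]
stock_returns = [-2.5, -0.5, 0.2, 0.8, 2.0]
intercept, beta, r_squared = regress_on_market(market_returns, stock_returns)
print(round(intercept, 3), round(beta, 3), round(r_squared, 3))
```

Applying the same function to the historical and simulated series gives the Explore and Validate values that the paragraph above compares.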

12. Click the Node Button on the Ribbon.

13. Click the Time Series button

14. Select XOM as the Node. Click the “+” sign button next to Rolling Correlations Charts.

You should now see the graphics above. Unless you expect future correlations to drastically increase or decrease, the Validate rolling correlations should fall within the same range as the Explore values. In this case, the Explore 30-day correlations range between approximately .44 and .86, and the Validate correlations stay in that range for most periods.
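A rolling correlation and the range check described above can be sketched as follows. This assumes a simple trailing window of equally weighted returns. The function names and sample data are hypothetical, and the software may compute its charts differently.

```python
import math


def correlation(xs, ys):
    """Pearson correlation of two equally long series (neither constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rolling_correlation(xs, ys, window=30):
    """Correlation over each trailing window of the given length."""
    return [
        correlation(xs[end - window:end], ys[end - window:end])
        for end in range(window, len(xs) + 1)
    ]


def share_within(values, low, high):
    """Fraction of values that fall inside [low, high]."""
    return sum(low <= v <= high for v in values) / len(values)


# Hypothetical returns; a short window keeps the example small.
market = [0.5, -0.2, 0.1, 0.8, -0.6, 0.3]
stock = [0.7, -0.1, -0.2, 1.0, -0.9, 0.1]
rolling = rolling_correlation(market, stock, window=4)
print([round(c, 2) for c in rolling])
print(share_within(rolling, 0.44, 0.86))
```

The share_within helper turns the visual check into a number: the fraction of Validate windows that fall inside the Explore range of .44 to .86.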

In summary, the Exxon Validate connections are similar to the Explore ones. You should also verify the Exxon Validate unconditional statistics correspond with your views. After you repeat these analyses for the rest of the nodes in your model (e.g. Google in this case), you are ready to analyze and optimize a portfolio consisting of the bottom nodes in the network. The Portfolio Analysis and Optimization Guide will show you how to do this.

As you saw in the Unconditional Analysis Guide, you can set the amount invested in each stock in the Portfolio Holdings module. The portfolio is the aggregate of these holdings. This guide will show you how to analyze both historical and simulated portfolios and determine the optimal holdings for a given set of investment criteria.

1. Click the Portfolio button on the Ribbon

2. Click on Heat Maps

Note: The unconditional statistics and time series analyses are the same as those explained in the Unconditional Analysis Guide but applied to the portfolio instead of individual nodes. Therefore, this guide omits a discussion of these options.

3. Select Explore as the Solution Type.

As shown above, the expected returns, correlations, and standard deviations appear. Expected returns measure the reward for each node. In this case, the Explore expected returns are simply the arithmetic average of the historical returns. These are approximately zero for both Google and Exxon.

Correlations indicate the similarity in performance between nodes. For example, nodes that advance and decline simultaneously have a correlation of one. If one node declines when the other advances, they have a correlation of negative one. If they are independent, correlation is zero.

Each node is perfectly correlated with itself and, consequently, the upper-left to lower-right diagonal shows all ones. The heat map shows a correlation of one as red, zero as white, and negative one as green. The other colors are interpolated between these three.
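The color scheme can be sketched as a linear blend between the three anchor colors. The exact RGB values and the linear interpolation are assumptions made for this illustration, and the software may use different shades.

```python
def correlation_to_rgb(corr):
    """Map a correlation in [-1, 1] to an (R, G, B) color.

    Assumed anchors: +1 is pure red, 0 is white, -1 is pure green.
    """
    c = max(-1.0, min(1.0, corr))
    if c >= 0:
        # Blend from white (255, 255, 255) toward red (255, 0, 0).
        fade = round(255 * (1 - c))
        return (255, fade, fade)
    # Blend from white (255, 255, 255) toward green (0, 255, 0).
    fade = round(255 * (1 + c))
    return (fade, 255, fade)


for value in (1.0, 0.68, 0.28, 0.0, -1.0):
    print(value, correlation_to_rgb(value))
```

Under this mapping, a correlation of .68 appears as a noticeably deeper red than a correlation of .28, which is the visual cue the heat map relies on.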

In this case, Exxon has a higher historical correlation with the US (.68) than with Google (.28). Google has a lower correlation with the US (.45) than Exxon does.

The table at the right of the graphic shows the historical standard deviations of each node. Standard deviation is typically used as a measurement of risk. In this case, the US market has a lower risk than either Exxon or Google. Exxon and Google have very similar historical standard deviations.

4. Select Validate as the Solution Type

The graphic above shows the expected returns, correlation matrix, and standard deviations for the simulated data. The simulated expected returns compare to the historical ones as follows: Exxon's is higher, the U.S. market's is lower, and Google's is approximately the same. The simulated standard deviations and correlations are higher than the historical ones.

In the next step, you will determine the optimal combination of Exxon and Google given these risk and reward values. The graphic above can help you determine what the optimal solution will be. If the risk of one holding is larger than the other, then the optimizer will tend to choose the less risky holding when attempting to minimize the risk of the portfolio. In this case, the optimizer should slightly favor Google in the optimal portfolio. The optimizer needs to also account for the correlations between the stocks. If the correlation between two stocks is low, the optimizer will attempt to reduce risk through diversifying across the two and weighting them equally. Finally, the optimizer needs to ensure that the minimum expected portfolio return value is achieved. Both Google and Exxon have sufficiently high simulated expected returns, and this will not be a factor in the decision.
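The reasoning above can be checked with the closed-form minimum variance weights for two holdings. This is a textbook formula offered as a sketch, not the optimizer used by the software. It ignores the minimum expected return constraint (which the text notes is not binding here), clamps the result to long-only weights, and uses hypothetical standard deviations and correlation.

```python
def min_variance_weights(sd_a, sd_b, corr):
    """Long-only minimum variance weights for two holdings.

    Assumes the two holdings are not perfectly correlated with equal risk,
    which would make the denominator zero.
    """
    var_a, var_b = sd_a ** 2, sd_b ** 2
    cov = corr * sd_a * sd_b
    weight_a = (var_b - cov) / (var_a + var_b - 2 * cov)
    # Clamp to [0, 1] so that neither holding is sold short.
    weight_a = min(1.0, max(0.0, weight_a))
    return weight_a, 1.0 - weight_a


# Hypothetical inputs: holding A is slightly riskier than holding B.
weight_a, weight_b = min_variance_weights(sd_a=22.0, sd_b=21.0, corr=0.28)
print(round(weight_a, 3), round(weight_b, 3))
```

With similar risks and a low correlation, the weights land close to one half each, and the slightly less risky holding receives the slightly larger weight.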

5. Click the Portfolio button on the Ribbon

6. Click Optimization

7. Click Calculate in the Minimum Variance Portfolio form

The optimization module above shows several portfolio and holdings results. Portfolio Results shows the expected mean and standard deviation of the optimized portfolio. Portfolio Holdings shows both the original and optimized amounts invested and percentage weights for each holding. It also shows the difference between the optimized and original amounts, which suggests the trades required to obtain the optimal portfolio.
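The relationship between the original holdings, the optimized holdings, and the suggested trades is simple arithmetic, sketched below. The function names and dollar amounts are hypothetical and are used only to show the calculation.

```python
def percentage_weights(amounts):
    """Convert invested amounts to weights that sum to 1."""
    total = sum(amounts.values())
    return {name: amount / total for name, amount in amounts.items()}


def suggested_trades(original, optimized):
    """Optimized minus original amount for each holding.

    A positive value suggests a purchase and a negative value a sale.
    """
    names = sorted(set(original) | set(optimized))
    return {n: optimized.get(n, 0.0) - original.get(n, 0.0) for n in names}


# Hypothetical amounts invested.
original = {"Exxon": 5000.0, "Google": 5000.0}
optimized = {"Exxon": 4700.0, "Google": 5300.0}
print(percentage_weights(optimized))
print(suggested_trades(original, optimized))
```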

As expected, the optimized portfolio contains slightly more Google than Exxon, although they are close to being equally weighted. Such a small difference between optimized and actual holdings would require no rebalancing of holdings. If these differences are large, however, it may mean it is time to rebalance the portfolio.

8. Click Replace

This replaces the original portfolio holdings with the optimized ones. You can now analyze the resulting unconditional portfolio statistics and time series introduced earlier in this guide.

Thus far, you have built an investment model, created forecasts, and determined the optimal holdings for your portfolio. Suppose you invest in the optimal holdings from 4/15/2015 to 10/09/2015 and want to measure your performance. This guide will show you how to evaluate the performance of your portfolio and each investment relative to your forecasts.

1. Click the Network and then the Evaluate button on the Ribbon

2. Enter “04/15/2015” as the Begin Date, “10/09/2015” as the End Date. Click OK.

3. Click the Portfolio button on the Ribbon.

4. Click Unconditional Statistics

5. Click the “+” button next to all of the analyses to expand all of them.

The graphic above shows the historical, simulated, and realized values for the unconditional probabilities. The Evaluate probabilities are very similar to the forecast. One area of concern should be that the forecasted Large Gain probabilities are double the realized results. This indicates the forecasts are overly optimistic. This guide will analyze the source of this bias later.

The graphic above shows the historical, simulated, and realized unconditional expectations. The realized versus forecasted expectations diverge mostly in the tails, or Large Losses and Gains. This guide will examine the causes of these divergences later.

The graphic above shows the descriptive statistics for each dataset and indicates the source of the divergence between the forecast and realized expectations. The largest forecasted loss was 3.5% versus a realized loss of 4.5%; whereas, the maximum forecasted gain was 3.4% versus a realized gain of 8.4%. In general, a positive outlier of this magnitude is of less concern than a negative one. As a result of this outlier, the realized skew and kurtosis are both large and positive.
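The effect of a single positive outlier on skew and kurtosis can be reproduced with a short calculation. The sketch below uses population moments and reports excess kurtosis. The software may apply sample-size corrections, and the return series here is hypothetical apart from the 8.4% outlier taken from the text.

```python
def skew_and_excess_kurtosis(returns):
    """Population skewness and excess kurtosis of a return series."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# A symmetric, hypothetical set of daily returns in percent.
calm = [-1.0, -0.5, 0.0, 0.5, 1.0] * 4
print(skew_and_excess_kurtosis(calm))
# The same returns plus one large positive outlier.
print(skew_and_excess_kurtosis(calm + [8.4]))
```

Adding the one outlier moves both statistics from roughly zero or negative to clearly positive, which matches the pattern described for the realized results.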

The graphics above show the scores of the unconditional probabilities and expectations. The Validate column calculates the difference between realized and forecasted results, whereas Evaluate calculates the difference between realized and historical results.

The Validate column highlights the difference in the realized versus forecasted Large Gain probabilities and expectations.

6. Click the Portfolio button on the Ribbon

7. Click Time Series

8. Click the “+” next to the Values, Returns, and Rolling Standard Deviation Charts to expand them

The graphic above shows the value of the portfolio plotted over the historical, simulated, and realized time periods.

Large advances and declines in value occurred in mid-July and mid-to-late August 2015, respectively. These periods were likely the sources of the outliers.

The graphic above shows the portfolio returns plotted over time. It confirms that the outliers occurred in mid-July and late-August 2015.

The graphic above shows the portfolio rolling standard deviation over time. It indicates that the original volatility estimates were initially reasonable but the mid-July jump started a period of high volatility that only started declining towards the end of the evaluation period. The forecasts most likely should have been adjusted immediately after the first outlier to adjust to the new conditions.
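A rolling standard deviation can be sketched as shown below to see how one jump lifts the measured volatility for as long as it stays inside the window. The trailing window, the sample (n - 1) variance, and the optional annualization factor are assumptions made for this example, and the returns are hypothetical apart from the 8.4% jump.

```python
import math


def rolling_std(returns, window=30, periods_per_year=None):
    """Sample standard deviation over each trailing window.

    If periods_per_year is given (for example 252 for daily data),
    each value is annualized by the square root of that number.
    """
    out = []
    for end in range(window, len(returns) + 1):
        chunk = returns[end - window:end]
        mean = sum(chunk) / window
        variance = sum((r - mean) ** 2 for r in chunk) / (window - 1)
        sd = math.sqrt(variance)
        if periods_per_year:
            sd *= math.sqrt(periods_per_year)
        out.append(sd)
    return out


# Calm hypothetical returns, then one large jump, then calm again.
returns = [0.1, -0.1] * 5 + [8.4] + [0.1, -0.1] * 5
print([round(v, 2) for v in rolling_std(returns, window=5)])
```

The printed series stays low, rises sharply when the jump enters the window, and falls back only after the jump drops out, which mirrors the delayed decline visible in the chart.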

9. Click the Network, Evaluate, and Charts buttons on the Ribbon

10. Click Returns

Each Returns chart will be displayed within the context of the Network.

The above graphic shows the returns for Google. After you examine the returns for each node, it should be apparent that Google is the primary source of both portfolio return outliers. Interestingly, Google is almost solely responsible for the first outlier; whereas, a large advance in the U.S. market causes a large gain in Google and Exxon for the second. A further examination reveals that Google gained 16.26% on July 16 after it released its second quarter results. The second gain occurred when the US market advanced 3.9% on August 26th, which drove up the prices of both Exxon and Google.
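To see why a single holding can dominate a portfolio outlier, each holding's contribution can be computed as its weight times its return. In the sketch below, the equal weights are a hypothetical stand-in for the near-equal optimized holdings, the Google return is the 16.26% figure from the text, and the Exxon return for that day is a placeholder rather than actual data.

```python
def contributions(weights, returns):
    """Contribution of each holding to the portfolio return (weight x return)."""
    return {name: weights[name] * returns[name] for name in weights}


weights = {"Google": 0.5, "Exxon": 0.5}      # hypothetical, near equal weighting
returns = {"Google": 16.26, "Exxon": 0.0}    # percent; Exxon value is a placeholder
parts = contributions(weights, returns)
portfolio_return = sum(parts.values())
print(parts)
print(round(portfolio_return, 2))
```

Under these assumptions, Google alone accounts for a portfolio gain of roughly 8%, which is the same order of magnitude as the largest realized portfolio gain discussed earlier.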

11. Click Security on the Ribbon

12. Click Unconditional Statistics

13. Click the “+” signs next to each analysis to expand it

The graphic above shows the unconditional probabilities for Exxon. The forecasts are close to the realized results; although, the forecasts are too optimistic and have excessively fat tails. In particular, the Large Gain forecast is 2.5 times the realized results and most likely should be adjusted.

The graphic above shows the unconditional expectations for Exxon. In contrast to the probabilities, the tail expectations are not extreme enough and most likely should be adjusted. In particular, any improvements to the Large Loss forecasted expectation could cause an increase in the volatilities of Exxon and, consequently, may yield significantly different results for the optimized portfolio.

The graphic above shows the descriptive statistics for Exxon. Unsurprisingly, the largest magnitude gain and loss in the realized results exceed those for the simulations. The mean and median of the simulations are greater than those for the realized results and suggest the forecasts were overly optimistic. In general, a forecasted annual return of 21% is high and corroborates this conclusion. The kurtosis of the realized results is influenced by the outlier 5.5% gain. Although the forecasted skew of Exxon is reasonable, the forecasted kurtosis is too low. This could be improved by adjusting the tail expectations to be more extreme. On the other hand, the forecast for the standard deviation is accurate.

As shown above, the scored probabilities for the realized versus simulated results are much closer together than the realized versus historical data are. This suggests the analyst has forecasting skill.

The graphic above shows the scored expectations. These indicate that the analyst inaccurately forecast the Large Losses compared with historical data.

14. Click the Node button on the Ribbon.

15. Click Conditional Statistics

16. Select XOM as the Node and click the “+” buttons next to all Validate and Evaluate analyses and the Regressions

The graphic above shows both the simulated and realized conditional probabilities. The realized values appear more concentrated along the diagonal, which indicates the U.S. market and Exxon were more highly correlated than the forecasts anticipated.

The graphic above shows the regressions for all three periods. Although the forecast and realized slope are nearly identical, the higher realized R-squared confirms that the US market and Exxon were more correlated than forecast.

17. Click the Node button on the Ribbon

18. Click Time Series

19. Select XOM as the Node and click the “+” sign next to Rolling Standard Deviation Charts and Rolling Correlation Charts to expand them

The graphic above shows the rolling standard deviation for Exxon. It indicates that the initial forecasts were too high compared to actual results. The share price of Exxon then became more volatile starting in mid-July 2015, after which the forecasts were reasonable.

The graphic above shows the rolling correlation for Exxon and the U.S. market. Similar to the rolling standard deviation, it shows that the initial correlation forecasts were too high and then were accurate later in the evaluation period. Unlike the rolling standard deviation, correlations started increasing in early August 2015. In general, correlations between stocks tend to increase when the market becomes more volatile. The evaluation chart follows this pattern.