System Testing and Optimization: Friend or Foe?


Course: [ THE COMPLEAT DAY TRADER II : The Compleat Day Trader ]



The verdict of the world is final.

ST. AUGUSTINE

The days of untested systems are gone forever. In fact, the pendulum is now swinging in the other direction. While unscrupulous operators once sold systems and methods for which they claimed fantastic results, today's unethical operators use statistics as a tool of deception. Paradoxically, these individuals stand to benefit from the trend toward the statistical validation of systems, because they can easily dupe the public. Manipulating statistics is not difficult. Just as Archimedes once said, "Give me a place to stand on and I can move the earth," the modern systems promoter would likely say, "Give me enough statistics and I can prove anything."

This sermonette on system validation makes the point that merely testing a system and generating highly favorable hypothetical results does not guarantee success with that system. Nor should such statistics be used as a security blanket or crutch by traders. Statistics can easily be manipulated, systems can be (and are) curve-fitted, and results, unless realistic, will not reflect actual performance when the system is implemented.

While many systems are developed to show optimum performance, it is imperative that systems be tested to show the worst-case performance.

Why Test Trading Systems?

Traders test systems for various reasons. Some test a system merely to say they've done so, only to disregard the outcome or to accept mediocre results, rationalizing the negative aspects of their system. Other traders test systems in order to sell them to the public—their goal is to optimize systems in order to show maximum performance. Then there's the serious futures trader who tests systems to achieve several goals, including but not limited to the following:

  • To determine whether a theory or hypothetical construct is valid in historical testing
  • To summarize the overall hypothetical performance of a system and to analyse its various aspects in order to isolate its strong and weak points
  • To determine how different timing indicators interact with one another to produce an effective trading system
  • To explore the interaction of risk and reward variables (i.e., stop loss, trailing stop loss, position size, etc.) that would have returned the best overall performance with the smallest drawdown

Test Your Trading System

While it may seem that the last item listed above refers to optimization, you will see from the discussion of optimization later in this chapter that it is not optimization according to my definition of the term. The purpose of testing systems is simply to find what will work best for you based on what appears to have worked best in the past. In so doing, we must remember that what worked in the past in hypothetical testing may not necessarily work in the future.

A thorough test of your trading system should include at least the following information:

Number of Years Analysed. Although it is desirable to test as much data as possible, many trading systems and indicators do not withstand the test of time. The further back you test, the less effective most systems will be. Many system developers test only 10 years of historical data, since that shows their systems in the best light. You must make your own decision regarding the length of your test.

Number of Trades Analysed. More important than the number of years analysed is the number of trades. You need not analyse many years of data if you have a large sample size of trades. I recommend at least 100 trades, provided your system will generate this number of trades in back-testing. If you are truly interested in determining the effectiveness of your system, the more trades you test, the better. Remember that there will always be a tendency to test fewer trades when you realize that the system is not holding up under back-testing. Some traders argue that the factors underlying futures market trends 25 years ago were distinctly different from those during the past 10 years. They feel that testing 25 years of data distorts the picture. If they were correct, how would we know when the current market forces change and that we must therefore change our trading systems? We are much better off finding systems that work in all types of markets.

Maximum Drawdown. This is one of the most important aspects of a trading system. A very large drawdown is a negative factor, since it eliminates most traders from the game well before the system would have turned in its positive performance. Because most traders are not well capitalized, they cannot withstand a large drawdown. However, drawdown is a function of account size. Obviously, a $15,000 drawdown in a $100,000 account is not unusual; however, the same drawdown in a $35,000 account is serious. You may decide to risk large drawdown in order to achieve outstanding performance, but this is your decision.

Consider also the source of the drawdown by examining the largest losing trade. If the majority of the drawdown occurred on only one trade, you will be better off than if the drawdown was spread out over numerous successive losses.
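The drawdown figures above can be reproduced with a few lines of code. What follows is only a minimal sketch, assuming that drawdown is measured on closed-trade results as the largest peak-to-valley decline in cumulative profit; the function names and the trade list are hypothetical and chosen so that the drawdown comes to $15,000.

```python
from typing import Sequence


def max_drawdown(trade_pnl: Sequence[float]) -> float:
    """Largest peak-to-valley decline in cumulative closed-trade profit (dollars)."""
    equity = 0.0
    peak = 0.0   # highest cumulative profit so far (starts at break-even)
    worst = 0.0  # largest decline from a peak seen so far
    for pnl in trade_pnl:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def drawdown_pct(drawdown: float, account_size: float) -> float:
    """Drawdown expressed as a percentage of the starting account size."""
    return 100.0 * drawdown / account_size


# Hypothetical closed-trade results in dollars (illustration only)
trades = [2000, -5000, -4000, 3000, -6000, -3000, 12000, 4000]

dd = max_drawdown(trades)
largest_loss = -min(trades)

print(f"Maximum drawdown: ${dd:,.0f}")
print(f"  as % of a $100,000 account: {drawdown_pct(dd, 100_000):.1f}%")
print(f"  as % of a $35,000 account:  {drawdown_pct(dd, 35_000):.1f}%")
print(f"Share of drawdown from largest single loss: {100 * largest_loss / dd:.0f}%")
```

The same dollar drawdown is 15 percent of the larger account but roughly 43 percent of the smaller one, which is why the figure has to be read against account size.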

Maximum Consecutive Losses. This performance variable is more psychological than anything else. An otherwise excellent trading system may have lost money on many trades in succession. Few traders can maintain their discipline through four or more successive losing trades. Even after the third loss, many traders are ready either to abandon their system or to find ways of changing it. However, at times it is necessary to weather the storm of 10 or more successive losses. If you know ahead of time what the worst-case scenario has been, you will be prepared. That's why it's important for your system test to give you this information.
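If your testing software does not report this figure, it is simple to compute from the list of trade results. The sketch below assumes that a break-even trade ends a losing streak (a choice made here for illustration); the function name and trade list are hypothetical.

```python
def max_consecutive_losses(trade_pnl):
    """Length of the longest run of consecutive losing trades."""
    longest = 0
    current = 0
    for pnl in trade_pnl:
        if pnl < 0:
            current += 1
            longest = max(longest, current)
        else:
            # A winning or break-even trade resets the streak (assumption)
            current = 0
    return longest


# Hypothetical closed-trade results in dollars (illustration only)
trades = [500, -200, -300, -150, 900, -100, -250, -400, -50, 1200]
print("Longest losing streak:", max_consecutive_losses(trades))
```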

Largest Single Losing Trade. This important piece of information indicates how much of the maximum drawdown is the result of a single losing trade. It also allows you to adjust the initial stop loss when retesting the system to see how large the average losing trade has been. If the average losing trade, for example, was $1,055 and the largest single loser was $8,466, you can readily see that a good portion of the average losing trade was a function of the largest loser. This shows that if you had a better way of managing the large loser (in hindsight, of course), your overall system performance would have been considerably better.

I strongly recommend close examination of the trade that resulted in the single largest loss if this loss is clearly much higher than the average losing trade. Another question to ask is "Why was the largest single losing trade so much larger than the stop loss selected?" A single largest losing trade that is several times larger than your selected stop loss points to a potential problem, perhaps with the system test. You must investigate further in such cases.
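One way to run this check is to compare the average loss with and without the largest loser, and to flag a largest loss that is well beyond the stop. This is a rough sketch only: the function name, the trade list, the $750 stop, and the "more than twice the stop" threshold are all assumptions made for illustration, not figures from the text.

```python
def loss_profile(trade_pnl, stop_loss, flag_multiple=2.0):
    """Summarize losing trades and flag a largest loss far beyond the stop."""
    losses = [-pnl for pnl in trade_pnl if pnl < 0]  # losses as positive dollars
    if not losses:
        return None
    largest = max(losses)
    average = sum(losses) / len(losses)
    others = list(losses)
    others.remove(largest)  # drop one occurrence of the largest loss
    average_without_largest = sum(others) / len(others) if others else 0.0
    return {
        "largest_loss": largest,
        "average_loss": average,
        "average_loss_without_largest": average_without_largest,
        # Hypothetical threshold: investigate if the loss exceeds 2x the stop
        "exceeds_stop": largest > flag_multiple * stop_loss,
    }


# Hypothetical closed-trade results in dollars (illustration only)
trades = [-500, 1500, -700, -600, 2500, -8000, -400, -600]
profile = loss_profile(trades, stop_loss=750)
for name, value in profile.items():
    print(f"{name}: {value}")
```

In this made-up sample the average loss drops from $1,800 to $560 once the single $8,000 loser is set aside, which is the kind of gap that calls for a closer look at that one trade.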

Largest Single Winning Trade. Perhaps more important than the largest single losing trade is the largest single winning trade. If, for example, your hypothetical profits total $96,780, and $33,810 of this is attributed to only one trade, you have a distorted average trade figure. It's often a good idea to remove this one trade from the overall results and recompute them in order to show the performance without this extraordinary winner. You may find that the system you have tested is mediocre, perhaps even a loser, when the single largest trade has been eliminated from the performance summary. If you can wait 10 years for the one big trade, then use the system, but do so against my advice. What you're looking for in any system with regard to average winning and losing trades is consistency, which is far more important than one or two extremely large winning trades that give a distorted performance picture.

On occasion, only a few trades may account for a considerable portion of the net system profits. While some traders feel that this somehow diminishes the value of the system, I disagree. As long as at least one-half of the overall system performance is due to trades other than the largest single winning long and short trade combined, the system is valid. As far as numbers are concerned, I would not use any system that, after deducting reasonable slippage and commission as well as the largest single long and short winners, does not show at least $100 average profit per trade.
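The two tests described here can be sketched in code. The version below is one possible reading, not a definitive formula: it assumes costs are deducted from every trade before either test is applied, and that the average is taken over the trades that remain after the two winners are removed. The function name, the $100 cost figure, and the trade list are hypothetical.

```python
def outlier_check(trades, cost_per_trade=100.0, min_average=100.0):
    """Apply the one-half rule and the $100 average rule to (side, pnl) trades."""
    # Deduct slippage and commission from every trade (assumed flat per trade)
    net = [(side, pnl - cost_per_trade) for side, pnl in trades]
    total = sum(pnl for _, pnl in net)

    # Largest single winning long and largest single winning short, if any
    removed = []
    for side in ("long", "short"):
        winners = [pnl for s, pnl in net if s == side and pnl > 0]
        if winners:
            removed.append(max(winners))

    remaining_total = total - sum(removed)
    remaining_count = len(net) - len(removed)
    remaining_average = remaining_total / remaining_count if remaining_count else 0.0
    return {
        "net_total": total,
        "remaining_total": remaining_total,
        "remaining_average": remaining_average,
        # At least half of net profit must come from the other trades
        "half_rule": total > 0 and remaining_total >= 0.5 * total,
        # Remaining trades must average at least $100 each
        "average_rule": remaining_average >= min_average,
    }


# Hypothetical gross trade results in dollars (illustration only)
trades = [
    ("long", 3000), ("long", -500), ("short", 2500), ("long", 1200),
    ("short", -600), ("short", 900), ("long", -400), ("short", -300),
    ("long", 800), ("short", 700),
]
result = outlier_check(trades)
for name, value in result.items():
    print(f"{name}: {value}")
```

This made-up system clears the $100 average after the two big winners are removed, yet fails the one-half rule because those two trades supply most of the net profit.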

More importantly, because a large portion of profits in many systems derives from a very small number of trades, it is imperative that you follow each and every trade as closely to the rules as possible. Trading systems are not money machines; they don't grind out one profit after another. Trading systems make their money on the bottom line. There are many losers and few winners. The losers are kept in check by using money management stop losses that must, in most cases, be reasonably large.

And the winners, only a few of which are very large, make the game worth the candle. The trader who can't stick with a position, or let it ride, is the trader who will surely be disappointed with the results, because the big winners will be cut short.

Later in this book I will make a case for systematic market entry and less rigid market exit. Bear in mind, however, that when this procedure is followed, you must stick with the original system as closely as possible for market entry. Such adaptations are recommended for the skilled trader only!

Percentage Winning Trades. This statistic is not nearly as important as one might think. In actuality, few systems have more than 65 percent winning trades, and the more trades in your sample, the smaller this figure will be. Systems that are correct as little as 30 percent of the time can still be good systems, and systems that are accurate as much as 80 percent of the time can be bad systems. It's easy to see that even a high degree of accuracy with a large average losing trade and small average winning trade does not make a good system.
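A quick calculation of the expected result per trade shows why accuracy alone says little. This is a sketch with made-up numbers: the average win and loss sizes below are hypothetical, and the function simply weights each by how often it occurs.

```python
def expected_profit_per_trade(win_pct, avg_win, avg_loss):
    """Average dollars per trade given win percentage and average win/loss sizes.

    win_pct is a whole-number percentage (for example 30 for 30 percent);
    avg_loss is given as a positive dollar amount.
    """
    return (win_pct * avg_win - (100 - win_pct) * avg_loss) / 100


# Hypothetical systems (illustration only)
low_accuracy = expected_profit_per_trade(win_pct=30, avg_win=2000, avg_loss=500)
high_accuracy = expected_profit_per_trade(win_pct=80, avg_win=200, avg_loss=1000)

print(f"30% winners, $2,000 avg win, $500 avg loss:   ${low_accuracy:,.2f} per trade")
print(f"80% winners, $200 avg win, $1,000 avg loss: ${high_accuracy:,.2f} per trade")
```

With these assumed figures the 30 percent system makes money on average while the 80 percent system loses, before any deduction for slippage and commission.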

Average Trade. This statistic will tell you what the average hypothetical trade has been. You must make certain that when you test your system, you deduct slippage and commission from your average trade. Commissions add up, even discount commissions. And slippage is an important factor when determining system performance. As a rule of thumb, I recommend deducting between $75 and $100 per trade for slippage and commission.

Once this has been done, you will often significantly reduce the average trade figure. As I pointed out earlier, you must also pay close attention to the largest winning trade and the largest losing trade when evaluating the average trade. The average trade figure is important, since it considers all profits, all losses, slippage, and commission.
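The effect of the deduction is easy to see on a small sample. The sketch below assumes a flat per-trade charge for slippage and commission, using the $75 and $100 figures from the rule of thumb; the function name and the trade list are hypothetical.

```python
def average_trade(trade_pnl, cost_per_trade=0.0):
    """Average profit per trade after a flat deduction for slippage and commission."""
    if not trade_pnl:
        return 0.0
    net_total = sum(trade_pnl) - cost_per_trade * len(trade_pnl)
    return net_total / len(trade_pnl)


# Hypothetical gross trade results in dollars (illustration only)
trades = [1200, -400, 650, -300, 900, -450, 200, -200]

print(f"Gross average trade:          ${average_trade(trades):.2f}")
print(f"After $75 per trade in costs:  ${average_trade(trades, 75):.2f}")
print(f"After $100 per trade in costs: ${average_trade(trades, 100):.2f}")
```

Here a $200 gross average trade shrinks to $100 after a $100 deduction, so half of the apparent edge in this made-up sample goes to costs.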

Optimization

There has been considerable controversy about trading system optimization. What exactly is wrong with optimizing systems? Can you go too far? Is there a happy medium?

The real issues in system optimization are complex, and they've been exacerbated by the tendency of systems developers to optimize their programs beyond any reasonable degree. To optimize a system is to discover the parameters that provide the best results in hypothetical back-testing. In other words, optimization is a way of discovering what would have produced the best results by running numerous if-then scenarios.

Before affordable computer hardware and software were available, optimization was a long and laborious procedure. To discover the best fit, the systems developer would need to repeatedly backtrack and test several variables. If the system parameters were numerous, the process was virtually impossible. Obviously, computers have made this a quick and efficient task. Now any trader with several thousand dollars can develop optimized systems.

Such ease of testing and optimizing is both good and bad. On the one hand, it allows traders to develop, test, and refine (i.e., optimize) systems much more rapidly. On the other hand, it has opened the door to what is called curve-fitting. The simple fact is that the powerful system-testing programs now available allow traders as well as systems vendors to repeatedly test a host of timing variables, stop losses, and other risk management schemes in order to determine which combinations would have produced the best results. In effect, this procedure fits the best parameters on past history to produce the best hypothetical results. However, the conclusions reached by such methods are often specious.

The trader who tests and retests to find the best fit will eventually reach his or her goal, but the goal itself may be nothing more than a reflection of the curve-fitted results. Tests tell us what has worked in the past but may not reveal anything worthwhile about the future. Since the past is not a carbon copy of the future, it is doubtful that the optimized parameters will work in the future. The more parameters in the decision-making model, the less likely they are to work in the future.
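The following toy experiment illustrates the point; it is not a trading method. It assumes a purely random price series, where no parameter can have any real edge, and a hypothetical momentum rule with a single lookback parameter. The search finds whichever lookback happened to score best on the first half of the data, then applies it to the second half. All names and settings are invented for the illustration.

```python
import random


def random_walk(n_bars, seed):
    """Synthetic prices: each bar adds a random step, so there is nothing to find."""
    rng = random.Random(seed)
    price = 100.0
    prices = [price]
    for _ in range(n_bars - 1):
        price += rng.gauss(0.0, 1.0)
        prices.append(price)
    return prices


def momentum_pnl(prices, lookback):
    """Points earned by a toy rule: long if price is above its level
    `lookback` bars ago, otherwise short; held for one bar at a time."""
    pnl = 0.0
    for i in range(lookback, len(prices) - 1):
        position = 1 if prices[i] > prices[i - lookback] else -1
        pnl += position * (prices[i + 1] - prices[i])
    return pnl


prices = random_walk(2000, seed=1)
past, future = prices[:1000], prices[1000:]

# "Optimize": try every lookback on the past data and keep the best one
in_sample = {n: momentum_pnl(past, n) for n in range(2, 61)}
best = max(in_sample, key=in_sample.get)

# Apply the optimized parameter to data it has never seen
out_of_sample = momentum_pnl(future, best)

print(f"Best lookback on past data: {best}")
print(f"Points on past data:   {in_sample[best]:.1f}")
print(f"Points on future data: {out_of_sample:.1f}")
```

Because the search always reports its best past result, the in-sample figure will look favorable even on random data; whatever happens on the second half is a matter of chance. Changing the seed gives a different "best" lookback each time.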

Overly optimized results lead to false conclusions. The result will likely mean losses. For those who develop and sell futures trading systems as a business, optimization is an amazing tool that allows the creation of outstanding hypothetical performance results that, in turn, allow systems developers to make incredible claims. And claims sell systems.

Time will tell if I am wrong about overly optimized systems. Vast personal experience, however, strongly validates my conclusions. I recall recent developments regarding several popular trading systems sold by a software developer. The advertised claims were fantastic. Systems were sold for T-bond futures, S&P futures, and currency futures. The outstanding performance claims provided a strong media campaign.

Naturally, all of the proper disclaimers were made to comply with the then-current regulatory requirements. There were no disclaimers regarding optimized results, however, nor was it disclosed that not all buyers of the systems would be using the same system parameters. Because the systems were continually optimized for best results, the hypothetical track records were truly impressive. However, the results did not jibe with results experienced by those who had old versions of the software—versions that did not reflect the new optimized parameters. This is high-tech deception. Recognizing that there might be legal liability, the systems developers eventually disclosed this fact in small print. Few buyers understood the meaning of the disclosure and even fewer cared, given the impressive hypothetical performance record. Naturally, buyers of the software felt that they could match the hypothetical performance.

In many cases, these traders did well initially. A customer in my brokerage firm purchased one of these programs and began trading it strictly according to the rules. The results were impressive. I began to watch intently every time a trade was made. It was uncanny how well the system entered and exited trades. It was as if the system had internalized a sixth sense about the market.

Then, after several months and excellent results, the system began to unravel. Numerous large losses occurred and performance deteriorated more rapidly than it had climbed. The dangers of an overly optimized system became apparent once again.

A Rational Approach to System Development

I do not totally oppose optimizing trading systems; however, I do favor a rational approach to this procedure. My rule of thumb is simple: Your trading system should have no more than four to six variables. You should search for the best combination of entry and exit variables, as well as a reasonable combination of stop loss and trailing stop loss amounts. But this is where the optimization should end. The more variables you build into the system, the less likely the optimized parameters are to perform in the future.
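One way to see why the number of variables matters is to count how many parameter combinations an optimizer gets to choose from. The sketch below assumes, purely for illustration, that each variable is tested at 10 candidate values; the function name is hypothetical.

```python
from itertools import product


def grid_size(values_per_variable, n_variables):
    """Number of parameter combinations in an exhaustive search."""
    return values_per_variable ** n_variables


# Assumed: 10 candidate settings per variable (illustration only)
for n_variables in (2, 4, 6, 10):
    print(f"{n_variables:>2} variables -> {grid_size(10, n_variables):,} combinations")

# Small concrete grid: 3 candidate values for each of 4 variables
small_grid = list(product(range(3), repeat=4))
print("Combinations in the small grid:", len(small_grid))
```

Every added variable multiplies the number of candidate systems, and with it the chance that the winning combination owes its record to luck in the historical data.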

Another aspect of system development relates to market personality, a topic that has received little attention from most traders and market analysts. Rather than heavily optimizing a system, I recommend tailoring your system to the personality characteristics of the individual markets, provided that such characteristics exist and that they are sufficiently stable.

Summary

The development and testing of trading systems is perhaps one of the three most important issues in trading. System results can be specious if the developer uses faulty rules, optimizes excessively, or fails to understand the differences between reality and fantasy. Armed with a computer, historical data, and a few ideas, a trader can easily fall into the trench of highly optimized systems that look good on paper but that fail to produce results commensurate with the back-test.

In addition, guidelines for effective system development were presented with the proper caveats. I defined a number of terms and gave you some ideas of how to differentiate systems that were likely to go forward with similar results to their back-tests and systems that were unlikely to perform as expected.

 
