Information Gain Versus Gain Ratio: A Study of Split Method Biases
One corollary of Cullen Schaffer's Conservation Law of Generalization Performance is that no learner is generally better than another. If the first learner outperforms the second in some learning situations, it must underperform the second in others. Unfortunately, the corollary does not describe the circumstances in which a specific learner has the advantage. This article focuses on two decision tree learners: one uses the information gain split method and the other uses gain ratio. It presents a predictive method that helps characterize the problems on which information gain performs better than gain ratio, and vice versa. To support the practical relevance of this research, it shows that the predictive method works effectively on the contraceptive method choice problem from the UC Irvine (UCI) Machine Learning Repository. The article offers new insight into how these two split methods affect a decision tree learner's bias.
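
For readers unfamiliar with the two split methods, the following minimal sketch may help. It assumes the standard textbook definitions associated with Quinlan's ID3 and C4.5 (information gain as the reduction in class entropy, gain ratio as information gain divided by the entropy of the split itself); the abstract does not state the article's exact formulation, and the function names and toy data here are illustrative assumptions only.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def information_gain(values, labels):
    """Reduction in class entropy after partitioning on a categorical attribute."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted average entropy of the class labels within each partition.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalized by the entropy of the attribute's own values."""
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return information_gain(values, labels) / split_info


# Hypothetical toy data: four examples, two classes.
labels = ["yes", "yes", "no", "no"]
record_id = [1, 2, 3, 4]          # unique per example, so it separates classes trivially
group = ["a", "a", "b", "b"]      # two-valued attribute that also separates the classes

for name, attribute in [("record_id", record_id), ("group", group)]:
    print(name, information_gain(attribute, labels), gain_ratio(attribute, labels))
```

On this toy data both attributes have the same information gain (1 bit), but gain ratio scores the many-valued `record_id` at 0.5 and the two-valued `group` at 1.0. This is one well-known way the two split methods can disagree, and it illustrates the kind of difference in bias that the article studies.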