A different reason data science models fail and an ROI-first™ approach

Why do most data science models suck? In short: organizations focus on the wrong thing.

The last few years have been filled with the promise of efficiency gains from data science models, but relatively few companies have actually succeeded in capitalizing on model deployment. There are many possible reasons, including difficulty in framing the correct problem, lack of good data, or failure to plan an adoption strategy. We can help with that. However, a good data science team working with a good data science platform may overcome these difficulties and still fail to produce a model that delivers positive returns. One likely but not well-understood reason is that if the model objective is not aligned with the business objective, even seemingly excellent model results do not translate into business success. The focus is on model error and not on business success.

The solution to this problem requires thinking about ROI as a science rather than as a dream. While data science applies the scientific method to ensure that truth emerges from data, ROI-Science™ takes a similar approach, applying the scientific method to the combination of business data and business process with the direct goal of optimizing ROI. This approach deviates from standard data science because it requires including ROI directly in the model as an objective. The ROI-first™ approach considers return at the same time as data. Consider the following data science question from a traditional approach and from an ROI-first™ approach:

Traditional:

Question: What machine parts are most likely to fail?

Objective: Predict probability of failure

ROI-first™:

Question: How can I choose the correct replacement so that my total repair cost is minimized?

Objective: Minimize the total predicted repair costs of part replacement.

These approaches are really the same question from the data side, but differ in that the second question incorporates the repair process as an objective, and the machine learning model can actually be designed so that it learns how to minimize repair costs. In ROI-Science™, the objective is an explicit mathematical construct rather than an abstraction. Not only will the model perform better in terms of your business goals, but the model output also tells you exactly what your expected return is, a great improvement over standard data science approaches. The second question formulation directly leads to the proper business action.
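To make the difference between the two questions concrete, here is a minimal sketch in Python. It is not the ROI-Science™ method itself: it only shows the decision step, assuming a model has already produced failure probabilities. The function name replacement_plan and all of the numbers are hypothetical, chosen for illustration.

```python
import numpy as np

def replacement_plan(p_fail, replace_cost, failure_cost):
    """Decide, part by part, whether replacing now lowers the expected cost.

    All three inputs are hypothetical per-part arrays: predicted failure
    probability, cost of a planned replacement, and cost of an unplanned failure.
    """
    p_fail = np.asarray(p_fail, dtype=float)
    replace_cost = np.asarray(replace_cost, dtype=float)
    failure_cost = np.asarray(failure_cost, dtype=float)
    # Expected cost of leaving a part in service.
    cost_if_kept = p_fail * failure_cost
    # Replace only where the certain cost is below the expected failure cost.
    replace = replace_cost < cost_if_kept
    total_expected_cost = np.where(replace, replace_cost, cost_if_kept).sum()
    return replace, total_expected_cost

# Illustrative numbers only (not from any real data set).
p_fail = [0.30, 0.10, 0.05]
replace_cost = [100.0, 100.0, 100.0]
failure_cost = [200.0, 500.0, 5000.0]

plan, total = replacement_plan(p_fail, replace_cost, failure_cost)
print(plan, total)
```

With these made-up numbers, the part most likely to fail is not the one worth replacing: the third part has the lowest failure probability but the highest expected failure cost. Ranking by probability alone would answer the traditional question and miss the business one.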

Implementing this ROI-first™ approach takes a great deal of precision and expertise, because business processes must be embedded correctly into a machine learning objective function. To understand how objective functions work and the potential difficulties of an ROI-first™ approach, let's first consider at a high level the mathematics behind logistic regression, since it is a reasonable analog that demonstrates how these ideas can be implemented. Without equations, we can say that logistic regression solves the following:

What is the set of coefficients that maximizes the likelihood of the observed outcomes, given that the log-odds of the outcome are a linear combination of the inputs?

In a thorough derivation, we would write down the likelihood function in terms of a linear combination of the inputs and look for the coefficients at which the gradient of the likelihood equals zero, as in any standard optimization procedure. For logistic regression, modeling the log-odds as linear gives rise to the sigmoid function, which is the inverse of the logit function. There is no closed-form solution, so the coefficients are determined iteratively, typically with Newton's method, a second-order relative of gradient ascent. Ultimately, the logit (log-odds) of the probabilities is given as a linear combination of the inputs as determined by the coefficients.
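The procedure can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not production code: the function names, the data, and the "true" coefficients are all assumptions made for the example, and the Newton update has no step-size control or regularization.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood_gradient(Xb, y, beta):
    # Gradient of the Bernoulli log-likelihood: X^T (y - p).
    return Xb.T @ (y - sigmoid(Xb @ beta))

def fit_logistic_newton(X, y, n_iter=25, tol=1e-10):
    """Return ([intercept, coefficients...], design matrix) via Newton's method."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ beta)
        grad = Xb.T @ (y - p)
        # Negative Hessian of the log-likelihood: X^T diag(p(1-p)) X.
        neg_hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb
        step = np.linalg.solve(neg_hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta, Xb

# Synthetic, non-separable data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_beta = np.array([0.5, 1.0, -2.0])
y = (rng.random(200) < sigmoid(true_beta[0] + X @ true_beta[1:])).astype(float)

beta_hat, Xb = fit_logistic_newton(X, y)
grad_at_solution = log_likelihood_gradient(Xb, y, beta_hat)
print(beta_hat, np.max(np.abs(grad_at_solution)))
```

The second printed value is the largest component of the gradient at the fitted coefficients. It should be numerically indistinguishable from zero, which is the optimization condition in action.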

The key point is this: Optimizing relative to an objective requires finding coefficients that satisfy a zero-gradient condition.

Machine learning algorithms operate in much the same way, with an objective function, analogous to the likelihood function, used along with an iterative procedure that calculates optimal coefficients. Certain properties of the likelihood and logit functions make logistic regression appealing. Without regularization, its fitted probabilities are invariant to rescaling and rotating the inputs, which reduces the work of the data scientist in preparing data. Additionally, the log-likelihood is concave, so any maximum the optimization procedure finds is the global maximum of the likelihood function, and the algorithm converges whenever the classes are not perfectly separable. However, logistic regression, linear regression, and many machine learning algorithms have the drawback of being very sensitive to multicollinearity, or highly correlated inputs. The reason is that the Hessian (second-derivative) matrix of the likelihood function becomes ill-conditioned and, in the limit of perfectly correlated inputs, non-invertible. Whatever technique is used, multicollinearity cannot be ignored.
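The effect of correlated inputs on the Hessian can be seen directly from its condition number. The sketch below is an illustration on synthetic data, with hypothetical names throughout. It evaluates the negative Hessian of the logistic log-likelihood at zero coefficients, once for two independent inputs and once for two nearly identical inputs.

```python
import numpy as np

def hessian_condition_number(X, beta=None):
    """Condition number of X^T diag(p(1-p)) X, the negative Hessian of the
    logistic log-likelihood, evaluated at beta (zeros by default)."""
    X = np.asarray(X, dtype=float)
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    if beta is None:
        beta = np.zeros(Xb.shape[1])
    p = 1.0 / (1.0 + np.exp(-(Xb @ beta)))
    neg_hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb
    return np.linalg.cond(neg_hess)

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)

independent = np.column_stack([x1, x2])
# Second column is almost an exact copy of the first.
collinear = np.column_stack([x1, x1 + 1e-6 * x2])

cond_independent = hessian_condition_number(independent)
cond_collinear = hessian_condition_number(collinear)
print(cond_independent, cond_collinear)
```

A condition number many orders of magnitude larger in the collinear case means a Newton step divides by something close to zero, which is where large and unstable coefficients come from.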

In the ROI-first™ approach, let's ask not whether the log-likelihood is optimized, but whether a more general business profit function is optimized. If a suitable function is found, machine learning algorithms can be developed that directly lead to profit rather than to some esoteric function with little business relevance. Learning from logistic regression, we can look at the pitfalls that must be avoided and the properties that must be preserved.

  1. Multicollinearity will still lead to a singular or nearly singular Hessian, resulting in potentially large and incorrect coefficients.
  2. Additionally, the objective function itself must not collapse the data in a way that creates multicollinearity.
  3. For some problems, scale invariance, rotation invariance, or translation invariance may be required, and the function must be either designed to be invariant or the scale must be applied to results.
  4. For logistic regression, the concavity of the log-likelihood ensures that any maximum found is the global maximum, but in general we would not expect that to be true, and local maxima may cause a poor set of coefficients to be found. To ensure a global maximum, the objective function must be concave (equivalently, its negative must be convex).
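One simple way to bring business value into the objective without losing a guaranteed global maximum is to weight each observation's log-likelihood term by a nonnegative value. The weighted sum of concave terms is still concave (so its negative is convex). The sketch below is only one possible construction, not necessarily how an ROI-first™ objective is built in practice. The function name, the synthetic data, and the assumption that a missed failure costs five times as much as a false alarm are all hypothetical.

```python
import numpy as np

def fit_weighted_logistic(X, y, value, n_iter=50, tol=1e-10):
    """Maximize sum_i value_i * log-likelihood_i with Newton's method.

    `value` is a hypothetical nonnegative business weight per observation.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    value = np.asarray(value, dtype=float)
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ beta)))
        # Gradient and negative Hessian of the value-weighted log-likelihood.
        grad = Xb.T @ (value * (y - p))
        neg_hess = (Xb * (value * p * (1.0 - p))[:, None]).T @ Xb
        step = np.linalg.solve(neg_hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Synthetic data (illustrative only): one input, failures are the minority.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 1))
prob = 1.0 / (1.0 + np.exp(-(-1.0 + 1.5 * X[:, 0])))
y = (rng.random(400) < prob).astype(float)

# Assumed cost structure: a missed failure costs 5x a false alarm.
value = np.where(y == 1.0, 5.0, 1.0)

plain = fit_weighted_logistic(X, y, np.ones_like(y))
weighted = fit_weighted_logistic(X, y, value)
print(plain, weighted)
```

Weighting failures more heavily pushes the fitted intercept upward, so the model flags failures more readily. The optimization still satisfies a zero-gradient condition, now for the weighted objective rather than the plain likelihood.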

In summary, far more powerful models can be built by considering business optimization during model training and building the appropriate objective functions so that business optimization drives model optimization. Designing the right functions with the right structure can ensure that you get the greatest ROI from your machine learning solution.

Published on 12/17/2020

Authored by Blake Rutherford