Special Objective Functions

When building a trading strategy, we need to optimize financial KPIs, such as the Sharpe ratio or total profit with slippage, rather than the KPIs normally used in vanilla ML, such as squared or absolute error. We therefore cannot use standard ML methods.

Corresponding Fast Versions of ML Methods

The vast majority of ML regressors use squared deviation as the main KPI optimized during training. Many ML regressors also support other KPIs, but all of them are similar to squared deviation in the sense that they measure how close the model outcomes (predictions) are to the targets. In contrast, a trading strategy is optimized with financial KPIs such as the Sharpe ratio or total profit with slippage. These financial KPIs are very different in nature: they are not based on the difference between the model outcomes (predictions) and the targets. Instead, they are based on the product of the model outcomes (positions) and the targets (price changes). This makes standard implementations of ML regressors useless for our purposes.
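
The contrast between a difference-based KPI and a product-based KPI can be illustrated with a small sketch. The function names and the toy numbers are illustrative assumptions, and the Sharpe ratio is taken here as a plain, non-annualized mean-over-standard-deviation of per-period profit and loss.

```python
import numpy as np

def squared_error(predictions, targets):
    # Difference-based KPI: how close the predictions are to the targets.
    return float(np.mean((predictions - targets) ** 2))

def sharpe_ratio(positions, price_changes):
    # Product-based KPI: per-period PnL is position times price change.
    pnl = positions * price_changes
    return float(np.mean(pnl) / np.std(pnl))

# Toy price changes (assumed data, for illustration only).
price_changes = np.array([0.2, -0.1, 0.3, -0.2])

# Model A: small outputs with the wrong sign (numerically close to the targets).
model_a = np.array([-0.05, 0.05, -0.05, 0.05])
# Model B: large outputs with the right sign (numerically far from the targets).
model_b = np.array([1.0, -1.0, 1.0, -1.0])

for name, outputs in [("A", model_a), ("B", model_b)]:
    print(name,
          "squared error:", squared_error(outputs, price_changes),
          "Sharpe:", sharpe_ratio(outputs, price_changes))
```

In this toy case model A has the lower squared error yet loses money in every period, while model B has the higher squared error and a positive Sharpe ratio, so ranking models by squared error and by Sharpe ratio can give opposite answers.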

To overcome this problem, we have reimplemented many basic machine learning models so that they optimize financial KPIs while remaining as time-efficient as standard ML regressors. For example, the standard linear regressor, which optimizes squared error, is very efficient because it is based on an exact analytical solution. Similarly, we have derived an analytical solution for a linear regressor that optimizes the Sharpe ratio and implemented a very fast model based on it. We also have very efficient step-function models that optimize the Sharpe ratio or total profit with slippage. These fast basic models become building blocks for more complex and advanced models. Moreover, a speed-up of several orders of magnitude makes it possible to perform advanced, computationally intensive, non-parametric statistical tests.
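
To make the idea of a closed-form solution concrete, here is a minimal sketch of one textbook way to obtain Sharpe-maximizing linear weights. It is not necessarily the derivation used on the platform. It assumes the position is a linear function of the features, so the per-period PnL is itself linear in the weights and the in-sample Sharpe ratio is maximized by weights proportional to the inverse covariance of the per-feature PnL streams times their mean. The names `fit_sharpe_linear` and `ridge` are hypothetical.

```python
import numpy as np

def sharpe_ratio(positions, price_changes):
    # Non-annualized Sharpe ratio of the per-period PnL.
    pnl = positions * price_changes
    return float(np.mean(pnl) / np.std(pnl))

def fit_sharpe_linear(features, price_changes, ridge=1e-8):
    """Closed-form weights w maximizing the in-sample Sharpe of (features @ w) * price_changes.

    Assumption: position = features @ w, so PnL_t = w . z_t with
    z_t = features_t * price_change_t, and the maximizer is w ~ Cov(z)^-1 mean(z).
    """
    z = features * price_changes[:, None]            # PnL of a unit weight on each feature
    mu = z.mean(axis=0)
    cov = np.atleast_2d(np.cov(z, rowvar=False))
    cov = cov + ridge * np.eye(z.shape[1])           # tiny ridge for numerical stability
    w = np.linalg.solve(cov, mu)                     # one linear solve, no iterative search
    return w / np.linalg.norm(w)                     # Sharpe does not depend on the scale of w

# Synthetic data (assumed, for illustration only).
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 3))
price_changes = 0.1 * features[:, 0] + rng.normal(size=500)

weights = fit_sharpe_linear(features, price_changes)
print("weights:", weights)
print("in-sample Sharpe:", sharpe_ratio(features @ weights, price_changes))
```

Like ordinary least squares, the fit costs a single linear solve, which is what would make repeated refitting inside resampling-based statistical tests affordable.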

Redefinition of Training

Optimizing a trading model involves financial KPIs, such as the Sharpe ratio, which are quite unusual for standard vanilla machine learning. This has far-reaching consequences. If we have two predictive models, one very accurate and the other inaccurate, then adding the inaccurate model to the accurate one will most likely degrade performance. On the other hand, if we have two trading models, one with a very good Sharpe ratio and the other with a not-so-good Sharpe ratio, then by properly mixing the two we can, in most cases, obtain a new model that outperforms both original models. This example shows why in vanilla machine learning we usually want to find a single best-performing model, whereas when developing a trading model we should search for the best mixture (portfolio) of candidate models. This fact completely redefines the idea of ML training, which lies at the core of ML modeling. The shift should be seen as a generalization: instead of searching for a single point in the space of model parameters, we look for an optimal cloud of weights (a scalar field) over that space.
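
The claim about mixing two trading models can be checked with the standard two-asset formula. As a sketch under the usual mean-variance assumptions: if two strategies have Sharpe ratios S1 and S2 and their PnL streams have correlation rho, the best achievable Sharpe ratio of a linear mix satisfies S^2 = (S1^2 + S2^2 - 2*rho*S1*S2) / (1 - rho^2). The function name `best_mix_sharpe` is hypothetical.

```python
import math

def best_mix_sharpe(s1, s2, rho):
    """Sharpe ratio of the optimal linear mix of two strategies.

    s1, s2: stand-alone Sharpe ratios; rho: correlation of their PnL, |rho| < 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return math.sqrt((s1 ** 2 + s2 ** 2 - 2.0 * rho * s1 * s2) / (1.0 - rho ** 2))

# A good strategy (Sharpe 1.0) mixed with a weaker one (Sharpe 0.5).
for rho in (-0.3, 0.0, 0.3, 0.5):
    print(f"rho={rho:+.1f}  best mixed Sharpe={best_mix_sharpe(1.0, 0.5, rho):.3f}")
```

The mixed Sharpe ratio is never below the better of the two stand-alone values, and it is strictly higher except in the borderline case rho = S2/S1 (here rho = 0.5), where the weaker strategy adds nothing.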

The transition from searching for a single point to searching for a complex cloud makes the final outcome of training more flexible and expressive (there are many more degrees of freedom to define the final model). As a result, the danger of overfitting becomes even larger. To address this challenge, we have developed problem-specific mechanisms that protect against this kind of overfitting. In more detail, the definition of an optimal portfolio of sub-strategies relies largely on estimating the correlations between all possible pairs of sub-strategies. The covariance matrix that provides those correlations is a large and flexible object with many degrees of freedom and is therefore the main source of overfitting. We address the problem in two ways: (1) with a special in-house method for obtaining a good prior estimate of the correlations between sub-strategies, and (2) with a special in-house regularization of the covariance matrix. The final outcome of this approach is a trading strategy composed of a large number of carefully weighted sub-strategies.
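
The in-house prior and regularization methods are not described here, so the sketch below uses a generic stand-in: linear shrinkage of the sample correlation matrix toward a prior correlation matrix, followed by standard mean-variance weights. The function names, the identity prior, and the shrinkage strength of 0.5 are all assumptions made for illustration.

```python
import numpy as np

def shrink_correlation(sample_corr, prior_corr, strength):
    # Generic linear shrinkage (a stand-in, not the in-house method):
    # strength=0 keeps the noisy sample estimate, strength=1 uses the prior only.
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return (1.0 - strength) * sample_corr + strength * prior_corr

def portfolio_weights(mean_pnl, vol, corr):
    # Mean-variance weights w ~ Cov^-1 mean, scaled so absolute weights sum to 1.
    cov = corr * np.outer(vol, vol)
    w = np.linalg.solve(cov, mean_pnl)
    return w / np.abs(w).sum()

# Synthetic PnL of 20 sub-strategies over 60 periods (assumed data).
rng = np.random.default_rng(1)
pnl = rng.normal(loc=0.05, scale=1.0, size=(60, 20))

sample_corr = np.corrcoef(pnl, rowvar=False)
prior_corr = np.eye(pnl.shape[1])          # assumed prior: uncorrelated sub-strategies
shrunk_corr = shrink_correlation(sample_corr, prior_corr, strength=0.5)

print("condition number, sample:", np.linalg.cond(sample_corr))
print("condition number, shrunk:", np.linalg.cond(shrunk_corr))
print("weights:", portfolio_weights(pnl.mean(axis=0), pnl.std(axis=0), shrunk_corr))
```

Shrinking toward a well-conditioned prior lowers the condition number of the matrix that gets inverted, which tends to make the resulting weights less sensitive to noise in the pairwise correlation estimates.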