Our Use of AI

Arabesque AI Engine design

Model diversity

One of the most important design choices of the AI Engine is that it never relies on a single machine learning model to generate forecasts. This decision rests on the simple observation that different models (e.g. Neural Networks, Decision Trees) perform differently under different market conditions. The AI Engine runs a spectrum of models simultaneously and ensembles their results after inference to maximize their utility. The set of model architectures expands continually as our research progresses. Current model families include Support Vector Machines (SVM), Trees, Gaussian Processes (GP), and Neural Networks (NN). This range offers the potential for increased explainability and the capacity to analyze more diverse datasets.
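The idea of ensembling after inference can be illustrated with a minimal sketch. The text does not say how the AI Engine combines its models, so the weighted average below, the function name `ensemble_probability`, and the example probabilities are all assumptions made for illustration only.

```python
def ensemble_probability(model_probs, weights=None):
    """Combine per-model 'up' probabilities into one forecast by weighted averaging.

    Averaging is an assumed combination rule, not the AI Engine's documented method.
    """
    if not model_probs:
        raise ValueError("need at least one model forecast")
    if weights is None:
        # Default: every model family gets the same weight.
        weights = {name: 1.0 for name in model_probs}
    total = sum(weights[name] for name in model_probs)
    return sum(weights[name] * p for name, p in model_probs.items()) / total


# Hypothetical probabilities that one stock rises over the forecast window.
forecasts = {"svm": 0.55, "tree": 0.60, "gp": 0.50, "nn": 0.75}
combined = ensemble_probability(forecasts)

# Giving one family more weight shifts the combined forecast towards it.
tilted = ensemble_probability(forecasts, {"svm": 1.0, "tree": 1.0, "gp": 1.0, "nn": 3.0})
```

With equal weights the combined forecast is the plain mean of the four inputs; a weighting scheme that responds to market conditions would be one possible way to act on the observation that models perform differently in different regimes.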

Scale and classification

Around 8,000 models are trained on approximately 20 billion data points. Each model's learning task is framed as a classification problem: the model provides a probabilistic recommendation on whether a company's share price will increase or decrease during a given window.
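To make the classification framing concrete, here is a minimal sketch of how up/down training labels could be derived from a price series. The function name `direction_labels`, the window length, and the prices are hypothetical; the actual label construction is not described in the text.

```python
def direction_labels(prices, window):
    """Label each day 1 if the price is higher `window` days later, else 0.

    The last `window` days get no label because their outcome is not yet known.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        1 if prices[t + window] > prices[t] else 0
        for t in range(len(prices) - window)
    ]


# Hypothetical daily closing prices for one stock.
prices = [100.0, 101.0, 99.0, 102.0, 103.0, 101.0]
labels = direction_labels(prices, window=2)
```

A classifier trained on such labels outputs a probability of the "up" class, which is what makes the recommendation probabilistic rather than a point forecast of the price.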

Explainability of our models

DWS whitepaper

Arabesque has invested in the explainability of its models, including a whitepaper that we published with DWS, a top 20 global asset manager. The document is available on request.

Explainable AI approaches

On page 8 of the document we cover two approaches to explainable AI: perturbation correlation analysis (where we modify the inputs to see what the impact will be on the model) and the local surrogate model (where we apply a simpler model to the AI model output to investigate its behavior). Details are shown below:
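Independently of the whitepaper's own material, the two ideas can be sketched in a few lines. Everything here is an assumption for illustration: `black_box` is a toy stand-in for a trained model, the feature names are invented, the perturbation step is a simple one-feature shift rather than the full correlation analysis, and the surrogate is a linear model fitted by least squares one feature at a time.

```python
import math


def black_box(features):
    """Toy stand-in for a trained model: maps features to an 'up' probability."""
    score = 2.0 * features["momentum"] - 1.0 * features["volatility"]
    return 1.0 / (1.0 + math.exp(-score))


def perturb(point, name, shift):
    """Return a copy of `point` with one feature shifted."""
    changed = dict(point)
    changed[name] = point[name] + shift
    return changed


def perturbation_effects(model, point, shift=0.1):
    """Change in model output when each input is shifted on its own."""
    base = model(point)
    return {name: model(perturb(point, name, shift)) - base for name in point}


def fit_local_surrogate(model, point, radius=0.05, half=2):
    """Fit a linear surrogate around `point` from small symmetric perturbations."""
    shifts = [radius * (k - half) / half for k in range(2 * half + 1)]
    intercept = model(point)  # anchor the surrogate at the model's own output
    slopes = {}
    for name in point:
        ys = [model(perturb(point, name, s)) for s in shifts]
        mean_y = sum(ys) / len(ys)
        # Least-squares slope; the shifts are symmetric, so their mean is zero.
        slopes[name] = sum(s * (y - mean_y) for s, y in zip(shifts, ys)) / sum(
            s * s for s in shifts
        )

    def surrogate(features):
        return intercept + sum(
            slopes[n] * (features[n] - point[n]) for n in point
        )

    return slopes, surrogate


point = {"momentum": 0.5, "volatility": 0.2}
effects = perturbation_effects(black_box, point)
slopes, surrogate = fit_local_surrogate(black_box, point)
```

The signs and relative sizes of `slopes` give a readable, local summary of what drives the toy model near `point`; the surrogate is only trustworthy within the small neighborhood it was fitted on.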

Institutional grade backtest

The AI Engine's stock-level signals can be used both for portfolio rebalancing and historical backtesting.

This section provides a detailed explanation of our backtesting methodology.

Backtesting involves applying a predefined set of rules to historical data that was observable and investable at specific points in time to simulate trading decisions. The process moves forward chronologically, applying these rules at each subsequent time point to generate new trading decisions. This "out-of-sample" approach is particularly well-suited for rules-based trading strategies like ours.
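The chronological, point-in-time process described above can be sketched as a simple loop. The names `walk_forward` and `momentum_rule`, the toy rule itself, and the dates and prices are hypothetical and serve only to show that each decision sees nothing dated after its rebalance date.

```python
from datetime import date


def walk_forward(observations, rebalance_dates, rule):
    """Apply `rule` at each rebalance date using only data observable by that date."""
    decisions = {}
    for as_of in sorted(rebalance_dates):
        # Point-in-time filter: nothing dated after `as_of` is visible to the rule.
        visible = sorted((d, v) for d, v in observations if d <= as_of)
        decisions[as_of] = rule(visible)
    return decisions


def momentum_rule(visible):
    """Toy rule: hold the stock if the latest visible price exceeds the first one."""
    if len(visible) < 2:
        return "cash"
    return "long" if visible[-1][1] > visible[0][1] else "cash"


# Hypothetical (date, closing price) observations for one stock.
observations = [
    (date(2024, 1, 2), 100.0),
    (date(2024, 1, 3), 102.0),
    (date(2024, 1, 4), 98.0),
    (date(2024, 1, 5), 97.0),
]
decisions = walk_forward(
    observations, [date(2024, 1, 3), date(2024, 1, 5)], momentum_rule
)
```

Because the filter is applied inside the loop, the decision for 3 January cannot be influenced by the later price falls, which is the property that makes the simulation out-of-sample.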

Arabesque has made significant efforts to provide investors with a backtest performance calculation that we believe is a meaningful representation of the anticipated behavior of a strategy. For instance, we address the following common pitfalls:

Reliability bias - Open, high, and low price data is independent of volume and liquidity concerns. To avoid simulating a transaction that would not have been tradable in reality, Arabesque uses only simulated market-on-open or market-on-close orders during the backtest performance calculation (i.e. no limit orders and no stop orders).
Neglect/underestimation of trading costs - After consultation with brokers, realistic trading costs per transaction are used in our backtested performance calculations.
Neglect/underestimation of volume limits - Arabesque systems take into account daily trading limitations within their backtesting approach. Within our backtest we limit our trading in single stocks to a small percentage of the ADTV (Average Daily Trading Volume). In doing so, we ensure that Arabesque strategies are scalable and do not cause adverse market impacts.
Survivorship bias - The performance of our universes and the resulting strategies is not backtest-optimized.
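As an illustration of the volume limit mentioned in the list above, the sketch below caps an order at a fraction of ADTV. The function name `cap_to_adtv`, the 5% fraction, and the volumes are assumptions; the text only states that trading is limited to "a small percentage" of ADTV.

```python
def cap_to_adtv(desired_shares, daily_volumes, max_fraction):
    """Cap an order at `max_fraction` of average daily trading volume, keeping its sign."""
    if not daily_volumes:
        raise ValueError("need at least one day of volume data")
    adtv = sum(daily_volumes) / len(daily_volumes)
    limit = int(adtv * max_fraction)  # whole shares only
    capped = min(abs(desired_shares), limit)
    return capped if desired_shares >= 0 else -capped


# Hypothetical recent daily volumes (in shares) for one stock.
volumes = [900_000, 1_000_000, 1_100_000]

buy = cap_to_adtv(80_000, volumes, max_fraction=0.05)    # large buy, gets capped
sell = cap_to_adtv(-80_000, volumes, max_fraction=0.05)  # large sell, gets capped
small = cap_to_adtv(10_000, volumes, max_fraction=0.05)  # already within the limit
```

Any part of an order that exceeds the cap would have to be carried over to later days or dropped; which of the two a given backtest does is a separate design choice not covered here.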

Our backtesting framework prioritizes realism and accuracy. The underlying models and data strictly prevent "look-ahead bias" - all forecasts and models utilize only information available at each historical rebalance date, eliminating any potential information leakage or "time-travelling." Our framework incorporates these specific parameters:

Trading execution at daily closing prices
Transaction costs based on user inputs (e.g. 10 basis points)
Market impact modeling for large, illiquid trades
One business day implementation/execution delay
Elimination of impractical fractional or minimal share trades
Comprehensive daily cash reconciliation simulation
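Several of the parameters above (closing-price execution, a one business day delay, proportional costs, whole-share trades, and cash reconciliation) can be combined in a short sketch. The function `execute_buy`, its arguments, and the numbers are hypothetical; market impact modeling is deliberately left out, and only buy orders are handled.

```python
def execute_buy(cash, target_value, closes, signal_day, cost_bps=10.0, delay_days=1):
    """Fill a buy at the close `delay_days` after the signal, in whole shares, net of costs.

    `closes` is a list of closing prices indexed by business day.
    """
    fill_day = signal_day + delay_days
    if fill_day >= len(closes):
        raise ValueError("no closing price available on the fill day")
    price = closes[fill_day]
    shares = int(target_value // price)  # drop the fractional share
    notional = shares * price
    cost = notional * cost_bps / 10_000  # 1 basis point = 0.01%
    return {
        "fill_day": fill_day,
        "shares": shares,
        "cost": cost,
        "cash": cash - notional - cost,  # cash left after the trade settles
    }


# Hypothetical closing prices; the signal is generated on day 0.
closes = [50.0, 40.0, 41.0]
fill = execute_buy(cash=20_000.0, target_value=10_050.0, closes=closes, signal_day=0)
```

Note that the order is sized and filled at the day 1 close of 40.0 rather than the day 0 price of 50.0, so the delay changes both the share count and the cost.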

Google case study

Our earlier work, which underpins our AI infrastructure, is described in the following case study conducted together with Google: https://cloud.google.com/customers/arabesque-ai
