
Our Use of AI
Arabesque AI Engine design
Model diversity
One of the most important design choices of the AI Engine is that it never relies on a single machine learning model to generate forecasts. This decision is based on the simple observation that different models (e.g. Neural Networks, Decision Trees) perform differently under different market conditions. The AI Engine runs a spectrum of models simultaneously and ensembles their results after inference to maximise their utility. The set of model architectures in use is constantly expanding as research progresses. Current families of models include Support Vector Machines (SVM), Trees, Gaussian Processes (GP) and Neural Networks (NN), which have the potential for increased explainability and the capacity to analyse more diverse.
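As a rough illustration of ensembling after inference, the sketch below combines per-model probabilities of an upward move into a single score using a weighted average. The averaging scheme, the function name `ensemble_probability` and the example numbers are assumptions made for illustration; the text above does not specify how the AI Engine combines its models.

```python
def ensemble_probability(model_probs, weights=None):
    """Combine per-model P(up) values into one score by weighted averaging.

    model_probs: dict mapping a model-family name to its probability of an up move.
    weights:     optional dict of non-negative weights; equal weights if omitted.
    (Weighted averaging is an assumed scheme, not the AI Engine's documented method.)
    """
    if not model_probs:
        raise ValueError("need at least one model output")
    if weights is None:
        weights = {name: 1.0 for name in model_probs}
    total = sum(weights[name] for name in model_probs)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * p for name, p in model_probs.items()) / total


# Hypothetical outputs from four model families for one stock and one window.
probs = {"svm": 0.62, "tree": 0.55, "gp": 0.48, "nn": 0.71}
combined = ensemble_probability(probs)
tilted = ensemble_probability(probs, {"svm": 1.0, "tree": 1.0, "gp": 1.0, "nn": 3.0})
```

With equal weights the combined score is the plain mean of the four probabilities; raising the weight of one family tilts the score towards that family's view.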
Scale and classification
There are around 8,000 models trained on c.20 bn datapoints. Each model's learning task is framed as a classification problem: the model provides a probabilistic recommendation on whether a company's share class will increase or decrease during a given window.
Explainability of our models
DWS whitepaper
Arabesque has invested in the explainability of its models, including this whitepaper, which we published with DWS, a top 20 global asset manager. The document is available on request.
Explainable AI approaches
Within the document we cover two approaches to explainable AI on page 8, namely perturbation correlation analysis (where we modify the inputs to see what the impact will be on the model) and local surrogate models (where we apply a simpler model to the AI model's output to investigate its behaviour). Details are shown below:
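Separately from the whitepaper's own material, the two ideas can be illustrated with a toy sketch. Everything here is hypothetical: `black_box` is a made-up stand-in for an opaque model, and the step sizes are arbitrary. The first function bumps one input at a time and records the change in output; the second fits a linear surrogate around a single point using a small plus/minus design, for which ordinary least squares reduces to simple per-feature sums.

```python
import math
from itertools import product


def black_box(x):
    """Hypothetical opaque model returning P(up); the third input is ignored."""
    z = 2.0 * x[0] - 1.0 * x[1] + 0.0 * x[2]
    return 1.0 / (1.0 + math.exp(-z))


def perturbation_effects(model, x, delta=0.1):
    """Change in model output when each input is bumped by delta, one at a time."""
    base = model(x)
    effects = []
    for j in range(len(x)):
        bumped = list(x)
        bumped[j] += delta
        effects.append(model(bumped) - base)
    return effects


def local_linear_surrogate(model, x, delta=0.05):
    """Fit y ~ intercept + sum_j coef_j * d_j on all +/-delta offsets around x.

    The offset columns are mutually orthogonal and sum to zero, so the
    least-squares solution is the mean output plus per-feature ratios.
    """
    offsets = list(product((-delta, delta), repeat=len(x)))
    outputs = [model([xi + d for xi, d in zip(x, off)]) for off in offsets]
    intercept = sum(outputs) / len(outputs)
    coefs = [
        sum(off[j] * y for off, y in zip(offsets, outputs)) / (len(offsets) * delta ** 2)
        for j in range(len(x))
    ]
    return intercept, coefs


x0 = [0.2, 0.4, 1.0]
effects = perturbation_effects(black_box, x0)
intercept, coefs = local_linear_surrogate(black_box, x0)
```

In this toy case both views agree: the first input pushes the output up, the second pulls it down about half as strongly, and the third has no effect.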

Institutional grade backtest
The AI Engine's stock-level signals can be used both for portfolio rebalancing and historical backtesting.
This section provides a detailed explanation of our backtesting methodology.
Backtesting involves applying a predefined set of rules to historical data that was observable and investable at specific points in time to simulate trading decisions. The process moves forward chronologically, applying these rules at each subsequent time point to generate new trading decisions. This "out-of-sample" approach is particularly well-suited for rules-based trading strategies like ours.
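The chronological process described above can be sketched as a simple walk-forward loop. The rule used here (`momentum_rule`) and the price series are purely hypothetical; the point is only that each decision sees data up to its own time step and nothing later.

```python
def walk_forward(prices, rule, start):
    """Apply `rule` at each time step from `start`, using only data up to that step."""
    decisions = []
    for t in range(start, len(prices)):
        history = prices[: t + 1]  # everything observable at time t, nothing later
        decisions.append(rule(history))
    return decisions


def momentum_rule(history, lookback=3):
    """Hypothetical rule: 1 (invested) if the latest price exceeds the price
    `lookback` steps earlier, otherwise 0 (in cash)."""
    return 1 if history[-1] > history[-1 - lookback] else 0


prices = [100, 101, 99, 102, 104, 103, 101, 105]
decisions = walk_forward(prices, momentum_rule, start=3)

# Re-running on a shorter history must reproduce the earlier decisions unchanged.
early_decisions = walk_forward(prices[:6], momentum_rule, start=3)
```

Because each decision depends only on its own history slice, extending the price series never changes decisions that were already made.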
Arabesque has made significant efforts to provide investors with a backtest performance calculation that we believe is a meaningful representation of the anticipated behaviour of a strategy. For instance, it addresses the following pitfalls:
- Reliability bias - Open, high, and low price data is independent of volume and liquidity concerns. To avoid simulating a transaction that in reality would not have been tradable, Arabesque uses only simulated market-on-open or market-on-close orders during the backtest performance calculation (i.e. no limit orders and no stop orders).
- Neglect/underestimation of trading costs - After consultation with brokers, realistic trading costs per transaction are used in our backtested performance calculations.
- Neglect/underestimation of volume limits - Arabesque systems take into account daily trading limitations within their backtesting approach. Within our backtest we limit our trading in single stocks to a small percentage of the ADTV (Average Daily Trading Volume). In doing so, we ensure that Arabesque strategies are scalable and do not cause adverse market impacts.
- Survivorship bias - The performance of our universes and the resulting strategies are not backtest-optimized.
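The volume limit in the list above can be sketched as a cap on each trade relative to ADTV. The 5% participation rate, the three-day averaging window and the function name `cap_trade_to_adtv` are illustrative assumptions; the text states only that trading is limited to a small percentage of ADTV.

```python
def cap_trade_to_adtv(desired_shares, daily_volumes, max_participation=0.05):
    """Limit a signed trade (in shares) to a fraction of average daily trading volume.

    desired_shares:    positive to buy, negative to sell.
    daily_volumes:     recent daily share volumes used to estimate ADTV.
    max_participation: assumed cap; 0.05 means 5% of ADTV.
    """
    if not daily_volumes:
        raise ValueError("need at least one volume observation")
    adtv = sum(daily_volumes) / len(daily_volumes)
    limit = int(adtv * max_participation)  # whole shares only
    sign = 1 if desired_shares >= 0 else -1
    return sign * min(abs(desired_shares), limit)


volumes = [1_000_000, 1_200_000, 800_000]      # hypothetical volumes, ADTV = 1,000,000
capped_buy = cap_trade_to_adtv(80_000, volumes)    # exceeds the cap, so it is reduced
capped_sell = cap_trade_to_adtv(-80_000, volumes)  # same cap applies to sells
small_buy = cap_trade_to_adtv(10_000, volumes)     # within the cap, left unchanged
```

Any remainder of a capped order would have to be carried to later days or dropped; how that is handled is not described in the text and is not modelled here.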
Our backtesting framework prioritizes realism and accuracy. The underlying models and data strictly prevent "look-ahead bias" - all forecasts and models utilize only information available at each historical rebalance date, eliminating any potential information leakage or "time-travelling." Our framework incorporates these specific parameters:
- Trading execution at daily closing prices
- Transaction costs set by user input (e.g. 10 basis points)
- Market impact modeling for large, illiquid trades
- One business day implementation/execution delay
- Elimination of impractical fractional or minimal share trades
- Comprehensive daily cash reconciliation simulation
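Several of these parameters can be combined in a minimal single-stock sketch: the order is sized at the decision day's close, executed at the close one business day later, rounded down to whole shares, and charged a cost in basis points. The function name, the sizing convention and the example prices are assumptions for illustration, and market impact is deliberately left out.

```python
def simulate_rebalance(cash, shares_held, target_weight, closes,
                       decision_day, cost_bps=10.0, delay_days=1):
    """Rebalance one stock towards `target_weight` of portfolio equity.

    Sizing uses the close on `decision_day`; execution uses the close
    `delay_days` later. Returns (new_shares, new_cash, cost_paid).
    """
    decision_price = closes[decision_day]
    exec_price = closes[decision_day + delay_days]

    equity = cash + shares_held * decision_price
    target_shares = int(equity * target_weight / decision_price)  # no fractional shares
    trade = target_shares - shares_held

    notional = abs(trade) * exec_price
    cost = notional * cost_bps / 10_000
    new_cash = cash - trade * exec_price - cost
    return target_shares, new_cash, cost


closes = [100.0, 102.0, 101.0]  # hypothetical daily closing prices
shares, cash_after, cost = simulate_rebalance(
    cash=10_000.0, shares_held=0, target_weight=0.5, closes=closes, decision_day=0
)

# Cash reconciliation: starting cash = ending cash + amount spent on shares + costs.
reconciled = cash_after + shares * closes[1] + cost
```

Here the order for 50 shares is decided at a close of 100 but filled at 102, so the one-day delay and the 10 basis point cost both reduce the remaining cash.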
Google case study
Our earlier work, which underpins our AI infrastructure, is described in the following case study conducted together with Google: https://cloud.google.com/customers/arabesque-ai
