Closed-form Approximations in Multi-asset Market Making: Applied Mathematical Finance: Vol 28, No 2

In 2008, Avellaneda and Stoikov published a procedure for obtaining bid and ask quotes in high-frequency market making. The successive orders generated by this procedure maximize the expected exponential utility of the trader’s profit and loss (P&L) profile at a future time, T, for a given level of agent inventory risk aversion. Inventory management is therefore central to market making strategies, and particularly important in high-frequency algorithmic trading. In that influential paper, Avellaneda and Stoikov set out a strategy that addresses market maker inventory risk.

To use this strategy, enter avellaneda_market_making as the strategy name. After that, use config order_book_depth_factor and config risk_factor to set your custom values. On hummingbot, you choose the asset inventory target, and the bot calculates the value of q. This parameter measures the difference between the current inventory position and the desired one. For now, it is essential to know that by using a large κ value you are assuming that the order book is denser, so your optimal spread will have to be smaller, since there is more competition in the market. The paper contains a lot of mathematical detail explaining how the authors arrive at this factor by assuming exponential arrival rates.
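The effect of κ on the spread can be illustrated with the closed-form total spread from the original Avellaneda-Stoikov paper, γσ²(T − t) + (2/γ) ln(1 + γ/κ). The sketch below is only an illustration of that formula: the function name and the numeric values are hypothetical, and hummingbot's own implementation may scale or adjust these quantities differently.

```python
import math


def as_total_spread(gamma, kappa, sigma, time_left):
    """Total bid-ask spread around the reservation price (AS closed form).

    gamma: inventory risk aversion, kappa: order book depth factor,
    sigma: volatility, time_left: remaining time T - t.
    All argument names are illustrative, not hummingbot identifiers.
    """
    inventory_term = gamma * sigma ** 2 * time_left
    liquidity_term = (2.0 / gamma) * math.log(1.0 + gamma / kappa)
    return inventory_term + liquidity_term


# Hypothetical values: same risk aversion and volatility, two book densities.
sparse_book = as_total_spread(gamma=0.1, kappa=1.5, sigma=2.0, time_left=1.0)
dense_book = as_total_spread(gamma=0.1, kappa=15.0, sigma=2.0, time_left=1.0)
print(sparse_book > dense_book)  # a denser book (larger kappa) gives a tighter spread
```

The same function also shows why spreads shrink as the session ends: the first term vanishes as `time_left` goes to zero.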

Finally, as noted above, implementations of the AS procedure typically use the reservation price as an approximation for both the bid and ask indifference prices. The main contribution of this paper is a new integral deep LOB trading system that embraces model training, prediction, and optimization. Inspired by the model architecture in Zhang et al. (2018, 2019), we adopt the deep convolutional neural network model, which has a structure of convolutional layers and includes an inception module and an LSTM module.

Finally, the Avellaneda-Stoikov model can also be used over longer periods: we illustrate the algorithm over a period of two hours, selling a quantity of shares equal to 20 times the ATS, which here represents around 5% of the volume during that period. The resulting process models the number of stocks in the portfolio. However, the framework of the model requires trading with orders of constant size, a hypothesis that only approximates reality, since orders may in practice be partially filled.

Market making models: from Avellaneda-Stoikov to Guéant-Lehalle, and beyond

In most of the many applications of RL to trading, the purpose is to create or to clear an asset inventory. The more specific context of market making has its own peculiarities. DRL has generally been used to determine the actions of placing bid and ask quotes directly [23–26], that is, to decide when to place a buy or sell order and at what price, without relying on the AS model. Spooner proposed an RL system in which the agent could choose from a set of 10 spread sizes on the buy and the sell side, with the asymmetric dampened P&L as the reward function (instead of the plain P&L). Combining a deep Q-network (see Section 4.1.7) with a convolutional neural network, Juchli achieved improved performance over previous benchmarks. Kumar, who uses Spooner’s RL algorithm as a benchmark, proposes using deep recurrent Q-networks as an improved alternative to DQNs for a time-series data environment such as trading.
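As a rough illustration of the asymmetric dampened P&L reward mentioned above, one common formulation subtracts a fraction of the speculative gain (inventory times mid-price change) from the step P&L, but only when that gain is positive. The sketch below assumes this formulation; the function name, the damping factor `eta` and the sample numbers are hypothetical and not taken from this text.

```python
def asymmetric_dampened_pnl(pnl_change, inventory, mid_change, eta):
    """Step reward that dampens speculative profits but not speculative losses.

    pnl_change: change in total P&L over the step.
    inventory * mid_change: the part of that change due to holding inventory.
    eta: damping factor in [0, 1] (assumed hyperparameter).
    """
    speculative = inventory * mid_change
    return pnl_change - max(0.0, eta * speculative)


# Holding 10 units while the mid-price rises by 0.5: the gain is dampened.
print(asymmetric_dampened_pnl(6.0, 10, 0.5, eta=0.5))   # 3.5
# Holding 10 units while the mid-price falls by 0.5: the loss is kept in full.
print(asymmetric_dampened_pnl(-4.0, 10, -0.5, eta=0.5))  # -4.0
```

The asymmetry discourages the agent from earning its reward by holding inventory and betting on price moves, while still penalizing it fully when such bets go wrong.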


And as you can see, the ask offers will be created closer to the market mid-price since the optimal spread is calculated with the reservation price as reference. Another feature of the model that you can notice in the above picture is that the reservation price is below the market mid-price in the first half of the graphic. The second part of the model is about finding the optimal position the market maker orders should be on the order book to increase profitability.
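A minimal sketch of this behaviour, assuming the reservation price from the original paper, r = s − qγσ²(T − t), and a spread placed symmetrically around r. The function names and numbers are hypothetical, and q is taken here as the inventory excess over the target.

```python
import math


def reservation_price(mid, q, gamma, sigma, time_left):
    """Mid-price shifted against the inventory excess q (AS closed form)."""
    return mid - q * gamma * sigma ** 2 * time_left


def as_quotes(mid, q, gamma, kappa, sigma, time_left):
    """Bid and ask placed symmetrically around the reservation price."""
    r = reservation_price(mid, q, gamma, sigma, time_left)
    total_spread = (gamma * sigma ** 2 * time_left
                    + (2.0 / gamma) * math.log(1.0 + gamma / kappa))
    return r - total_spread / 2.0, r + total_spread / 2.0


# Hypothetical case: inventory is 2 units above target, so r sits below the mid.
mid = 100.0
bid, ask = as_quotes(mid, q=2, gamma=0.1, kappa=1.5, sigma=2.0, time_left=1.0)
print(ask - mid < mid - bid)  # the ask is closer to the mid-price than the bid
```

With excess inventory, the ask sits nearer the mid-price and is more likely to be hit, which pushes the inventory back toward the target.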

But as its value increases, the distance between the mid-price and the reservation price will increase when the trader inventory is different from his target. These are additional parameters that you can reconfigure and use to customize the behavior of your strategy further. To change its settings, run the command config followed by the parameter name, e.g. config max_order_age. Olivier Guéant, Charles-Albert Lehalle, and Joaquin Fernandez-Tapia. The last trade happens less than 1 minute before the end of the period.

We implement the proposed algorithm and its competitors on a widely used dataset. Extensive measurements show that the algorithm produces a WCVC with less weight, while its monitor count and time performance remain reasonable. Figure 3 depicts one simulation of the profit and loss function of the market maker at any time t during the trading session in the left panel. The profit and loss performance of the trading is displayed by the cash level histogram in the left panel.

Avellaneda Market Making

When this parameter is closer to 1, it increases the chances that one side of the bid/ask pair is executed rather than the other, thereby forcing the inventory to converge to the target while decreasing the final profit. This means that the further away from the “fair price” an order is posted, the fewer transactions it will obtain. In practice, if the limit order price is far above the best ask price, the trading gain may be high, but execution is far from guaranteed and the broker is exposed to the risk of a price decrease.
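The trade-off between distance from the fair price and execution can be sketched with the exponential arrival intensity assumed in the Avellaneda-Stoikov paper, λ(δ) = A·exp(−κδ), where δ is the distance of the quote from the reference price. The values of `A`, `kappa` and `dt` below are hypothetical, chosen only for illustration.

```python
import math


def fill_intensity(delta, A, kappa):
    """Arrival rate of orders that fill a quote posted at distance delta."""
    return A * math.exp(-kappa * delta)


def fill_probability(delta, A, kappa, dt):
    """Probability of at least one fill in an interval dt (Poisson arrivals)."""
    return 1.0 - math.exp(-fill_intensity(delta, A, kappa) * dt)


# Quotes posted further from the fair price are filled less often.
for delta in (0.0, 0.5, 1.0, 2.0):
    print(delta, round(fill_probability(delta, A=140.0, kappa=1.5, dt=0.005), 4))
```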

Moreover, Yang et al. have improved the existing models by using the Heston stochastic volatility model to characterize the volatility of the stock price with price impact, and have implemented an approximation method to solve the nonlinear HJB equation. They considered a constant price impact, using the same counting processes for both arriving and filled limit orders. More recently, Baldacci et al. have studied the optimal control problem for an option market maker with the Heston model for the underlying asset, using the vega approximation for the portfolio. For more developments in the optimal market making literature, we refer the reader to Guéant, Ahuja et al., Cartea et al., Guéant and Lehalle, Nyström and Guéant et al. In electronic markets, any trader can become a market maker who provides liquidity to the markets in Limit Order Books, and the trading mechanisms allow market makers to submit orders on both the buy and sell sides of the market. Deciding on the best bid and ask prices for a market maker to set is a hard and complex problem in many respects, because it must be tackled as a combined problem of modeling the asset price dynamics and determining the optimal spreads.

1 The Alpha-AS model

In Sect. 2, we present the framework in continuous time and formulate the optimization problem in terms of the expected return of the trader. Section 3 is dedicated to the study of the stochastic control and Hamilton-Jacobi-Bellman equations for the proposed model. In Sect. 3.2.1, we consider the case of jumps in the volatility of the price. The paper is also equipped with an Appendix on how to use the method of finite differences for the numerical solution of the corresponding nonlinear differential equation. The question of the truncation of the interval of possible state feature values remains open, or there seems to be some misunderstanding between the authors and the reviewer. For instance, how are market prices (or actually differences to the mid-price) truncated to the interval [-1,1]?

  • For every day of data the number of ticks occurring in each 5-second interval had positively skewed, long-tailed distributions.
  • The optimal bid and ask quotes are obtained from a set of formulas built around these parameters.
  • Hasselt, Guez and Silver developed an algorithm they called double DQN.
  • Most of the data, the Java source code and the results are accessible from the project’s GitHub repository.
  • So, as the trading session is getting closer to the end, order spreads will be smaller, and the reservation price position will be more “aggressive” on rebalancing the inventory.
  • By our numerical results, we deduce that the jump effects and comparative statistics metrics provide us with the information for the traders to gain expected profits.
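The double DQN idea mentioned in the list above can be sketched as follows: the online network selects the best next action, and the target network evaluates it, which reduces the overestimation that arises when a single network does both. The function below is a minimal illustration operating on plain lists of Q-values; its name and arguments are hypothetical.

```python
def double_dqn_target(reward, next_q_online, next_q_target, discount, done):
    """Bootstrapped target for one transition under double DQN.

    next_q_online / next_q_target: Q-values of the next state for every
    action, from the online and target networks respectively.
    """
    if done:
        return reward
    # Action selection uses the online network...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...while its value is read from the target network.
    return reward + discount * next_q_target[best_action]


target = double_dqn_target(
    reward=1.0,
    next_q_online=[0.2, 0.9, 0.5],
    next_q_target=[1.0, 0.4, 2.0],
    discount=0.5,
    done=False,
)
print(target)  # uses action 1, not the target network's own maximum (action 2)
```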

That is, these agents decide the bid and ask prices of their orderbook quotes at each execution step. The main contribution we present in this paper resides in delegating the quoting to the mathematically optimal Avellaneda-Stoikov procedure. What our RL algorithm determines are, as we shall see shortly, the values of the main parameters of the AS model.

Market making is a high-frequency trading problem for which solutions based on reinforcement learning are being explored increasingly. Two variants of the deep RL model (Alpha-AS-1 and Alpha-AS-2) were backtested on real data (L2 tick data from 30 days of bitcoin–dollar pair trading) alongside the Gen-AS model and two other baselines. The performance of the five models was recorded through four indicators (the Sharpe, Sortino and P&L-to-MAP ratios, and the maximum drawdown). Gen-AS outperformed the two other baseline models on all indicators, and in turn the two Alpha-AS models substantially outperformed Gen-AS on Sharpe, Sortino and P&L-to-MAP. Localised excessive risk-taking by the Alpha-AS models, as reflected in a few heavy dropdowns, is a source of concern for which possible solutions are discussed.
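For readers unfamiliar with these indicators, the sketch below computes three of them under common textbook definitions. These definitions are assumptions: the exact conventions used in the backtest (risk-free rate, annualization, absolute versus relative drawdown, and the P&L-to-MAP ratio, which is omitted here) are not given in this text.

```python
import math


def sharpe_ratio(returns):
    """Mean return divided by the sample standard deviation of returns."""
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return mean / math.sqrt(variance)


def sortino_ratio(returns, target=0.0):
    """Like Sharpe, but only returns below the target count as risk."""
    mean = sum(returns) / len(returns)
    downside = sum(min(0.0, r - target) ** 2 for r in returns) / len(returns)
    if downside == 0.0:
        raise ValueError("no returns below target; Sortino ratio undefined")
    return (mean - target) / math.sqrt(downside)


def max_drawdown(equity):
    """Largest peak-to-trough drop of an equity curve, in absolute terms."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


print(max_drawdown([100.0, 120.0, 90.0, 110.0, 80.0]))  # 40.0
```

Note that a higher maximum drawdown is worse, whereas higher Sharpe and Sortino ratios are better.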

An Avellaneda strategy feature that recalculates your hanging orders by aggregating them on a volume-weighted, volume-time-weighted, or volume-distance-weighted basis. The number of ticks used as a sample size for volatility calculation. The spread (from mid-price) to defer the order refresh process to the next cycle. Vol_to_spread_multiplier will act as a threshold value to override max_spread when volatility is higher. The minimum spread, relative to the mid-price, allowed by the user for bid/ask orders.

Risk measures and fine tuning of high frequency trading strategies. We also, alternatively, randomized the choice with probabilities that depend on the respective proximity to the neighboring quotes. And the only risk he faces is linked to the non-execution of his orders. All these special cases allow us to comment on the role of the parameters, before we carry out comparative statics in the next section. Once the whole portfolio is liquidated, we assume that the trader remains inactive.

Indeed, the differences in Max DD performance between Gen-AS and either of the Alpha-AS models, over all test days, are not statistically significant, despite the large differences in means. The latter are a result of extreme outliers for the Alpha-AS models from days in which these obtained a very poor (i.e., high) value for Max DD. The medians, however, are very similar to the median for the Gen-AS model. Mann-Whitney tests comparing the four daily performance indicator values (Sharpe, Sortino, Max DD and P&L-to-MAP) obtained for the Gen-AS model with the corresponding values obtained for the other models, over the 30 test days. Number of days either Alpha-AS-1 or Alpha-AS-2 scored best out of all tested models, for each of the four performance indicators.

We call a trading cycle the interval of time in which spreads start at their widest and end at their smallest. Once the cycle is reset, spreads start again at their widest. This parameter, denoted by the letter gamma, is related to the aggressiveness when setting the spreads to achieve the inventory target. It is directly proportional to the asymmetry between the bid and ask spread. The Avellaneda Market Making Strategy is designed to scale inventory and keep it at a specific target that the user defines.

Again, the probability of selecting a specific individual for parenthood is proportional to the Sharpe ratio it has achieved. A weighted average of the values of the two parents’ genes is then computed. Mean decrease accuracy, a feature-specific estimate of the average decrease in classification accuracy, across the tree ensemble, when the values of the feature are permuted between the samples of a test input set. To obtain MDA values we applied a random forest classifier to the dataset split in 4 folds.
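The selection and crossover steps described above can be sketched as follows. This is only an illustration under stated assumptions: Sharpe ratios are assumed to be strictly positive so they can serve directly as selection weights, the crossover weight is passed in explicitly because the text does not say how it is chosen, and all names are hypothetical.

```python
import random


def select_parents(population, sharpe_ratios, rng):
    """Pick two parents with probability proportional to their Sharpe ratio."""
    if any(s <= 0.0 for s in sharpe_ratios):
        raise ValueError("this sketch assumes strictly positive Sharpe ratios")
    # Sampling is with replacement, so the same individual may be drawn twice.
    return rng.choices(population, weights=sharpe_ratios, k=2)


def crossover(parent_a, parent_b, weight):
    """Child genes as a weighted average of the two parents' genes."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(parent_a, parent_b)]


# Hypothetical individuals: each gene list could hold AS parameters.
population = [[0.1, 1.5], [0.3, 2.0], [0.2, 1.0]]
parents = select_parents(population, [0.5, 1.5, 1.0], random.Random(0))
child = crossover(parents[0], parents[1], weight=0.25)
print(child)
```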

However, because of the characteristics of imbalanced classification, we replace the categorical cross-entropy loss function with the focal loss function. It is necessary to pay more attention to the minority cases and capture the patterns of these valuable long and short signals. Then, the model trained daily or weekly can predict trading actions and the probability of each choice at every tick.
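A minimal sketch of the focal loss for a single sample, assuming the standard form FL = −α(1 − p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The focusing parameter `gamma` here is unrelated to the risk-aversion γ of the AS model, and the default values are illustrative assumptions rather than the settings used by the authors.

```python
import math


def focal_loss(probs, true_class, gamma=2.0, alpha=1.0):
    """Focal loss for one sample given predicted class probabilities.

    With gamma = 0 and alpha = 1 this reduces to categorical cross-entropy.
    """
    p_t = probs[true_class]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss([0.9, 0.05, 0.05], true_class=0)  # confident and correct
hard = focal_loss([0.2, 0.7, 0.1], true_class=0)    # true class poorly predicted
print(easy < hard)  # well-classified samples contribute far less to the loss
```

Because easy examples (typically the majority class) are down-weighted by the factor (1 − p_t)^γ, the gradient is dominated by the hard, often minority, cases.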

On the contrary, if he expects the price to rise, he is going to post orders deeper in the book in order to slow down execution and benefit from the price increase. Table 11, which is obtained from all simulations, reports the results of these two strategies. We can see that when jumps occur in volatility, they cause not only larger profits but also a larger standard deviation of the profit and loss function. This is a small inventory-risk aversion value, but it is enough to force the inventory process to revert to zero at the end of trading.