
Modeling aggressive market order placements with Hawkes factor models


Authors: Hai-Chuan Xu aff001;  Wei-Xing Zhou aff001
Authors place of work: Research Center for Econophysics, East China University of Science and Technology, Shanghai, China aff001;  Department of Finance, East China University of Science and Technology, Shanghai, China aff002;  Department of Mathematics, East China University of Science and Technology, Shanghai, China aff003
Published in the journal: PLoS ONE 15(1)
Category: Research Article
doi: https://doi.org/10.1371/journal.pone.0226667

Summary

Price changes in stock markets are induced by aggressive market orders. We introduce a bivariate marked Hawkes process to model aggressive market order arrivals at the microstructural level. The order arrival intensity consists of an exogenous part and two endogenous processes reflecting self-excitation and cross-excitation respectively. We calibrate the model for a Shenzhen Stock Exchange stock. We find that the exponential kernel with a smooth cut-off (i.e. the difference of two exponentials) produces much better calibration than the monotonic exponential kernel (i.e. the sum of two exponentials). The exogenous baseline intensity explains the U-shaped intraday pattern. Our empirical results show that the endogenous submission clustering is mainly caused by self-excitation rather than cross-excitation.

Keywords:

Finance – Microstructure – Stock markets – Fluid flow – Kernel functions – Exponential functions – Operator theory – Commodity markets

Introduction

Self-exciting and mutually exciting point processes, first proposed by Alan G. Hawkes [1, 2], are a natural extension of Poisson processes. Hawkes processes have been applied to characterize clustered events in finance, particularly in high-frequency data and market microstructure [3, 4], because many types of events cluster in time, such as order submissions [5], mid-quote changes [6], transactions [7] and extreme return occurrences [8].

As a class of branching processes, self-exciting Hawkes models can be used to compute the so-called branching ratio, defined as the average number of first-generation events triggered per source event [9–11]. By calibrating the self-exciting Hawkes model on time series of price changes, the endogeneity and structural regime shifts in commodity markets have been quantified [12]. The branching ratio can also serve as an effective measure of endogeneity for autoregressive conditional duration point processes [13]. In addition, a marked self-exciting process model can successfully characterize intraday clustering of extreme fluctuations and the instantaneous conditional VaR [14]. Hawkes models have further been extended to a quadratic form that allows feedback effects in the jump intensity that are both linear and quadratic in past returns [15].

In addition to self-exciting processes, many researchers study cross-exciting effects through multivariate Hawkes processes. Multivariate Hawkes processes have been applied to the timing of trades and mid-quote changes for a New York Stock Exchange stock [16], to study complex interactions between the arrival times of orders and their sizes [17], to fit observations of trades-through [18], to measure the resilience of the London Stock Exchange order book [5], to account for the dynamics of market prices [19–21], to model price changes by a self-exciting mechanism and an exogenous component generated by the pre-announced arrival of macroeconomic news [22], and to model financial contagion across six international stock indices [23]. Multivariate Hawkes processes are prevalent in modeling high-frequency order books and have been extended to non-linear Hawkes processes [24] and to a new non-parametric kernel estimation procedure [25].

In this paper, we are interested in modeling aggressive market order placement, i.e. market orders whose size is greater than the volume at the opposite best quote. These orders consume liquidity and walk up the limit order book, causing the best quotes to change. Aggressive market orders are very important in price formation and microstructure. For example, the submission pattern of aggressive market orders may contain information about order splitting behavior according to the liquidity available in the order book. We introduce a bivariate marked Hawkes process to model aggressive market order arrivals. It is reasonable to apply a Hawkes process to model order events because inter-trade durations have fat tails and long memory [26–28]. The Autoregressive Conditional Duration (ACD) model, introduced in [29], can also characterize market order arrivals via the durations between events. However, the ACD model is specified in terms of durations and so describes arrivals only indirectly, whereas the Hawkes model gives the order arrival intensity directly. We find that the exponential kernel with a smooth cut-off at short times (i.e. the difference of two exponentials) produces better calibration than the monotonic exponential kernel (i.e. the sum of two exponentials). Our empirical results show that the endogenous submission clustering is mainly caused by self-excitation rather than cross-excitation. In addition, the exogenous aggressive order arrivals show a clear intraday pattern.

Materials and methods

The model

Let N1 and N2 denote the counting processes for aggressive market buy orders and aggressive market sell orders. These two processes are assumed to form a bivariate Hawkes process with intensities λ1 and λ2,

where μi > 0 is a baseline intensity describing the arrival of exogenous events, and the kernels ϕii and ϕij represent respectively the self-exciting and cross-exciting effects.
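For reference, the definitions above correspond to the standard form of a bivariate Hawkes intensity, sketched below. The time-varying baseline μi(t) and the integration range from 0 to t are assumptions made for this sketch; the exact layout of Eq (1) may differ.

```latex
\begin{aligned}
\lambda_1(t) &= \mu_1(t) + \int_0^t \phi_{11}(t-s)\,\mathrm{d}N_1(s) + \int_0^t \phi_{12}(t-s)\,\mathrm{d}N_2(s),\\
\lambda_2(t) &= \mu_2(t) + \int_0^t \phi_{21}(t-s)\,\mathrm{d}N_1(s) + \int_0^t \phi_{22}(t-s)\,\mathrm{d}N_2(s).
\end{aligned}
```

In this convention ϕij carries the influence of past events of type j on the intensity of type i, so ϕ11 and ϕ22 are the self-exciting kernels and ϕ12 and ϕ21 the cross-exciting ones.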

The kernel ϕ(t − s) describes the impact of a previous event at time s on the current intensity at time t. Previous studies advocate the use of exponential or power-law kernels. Here we use the difference of two exponentials as the kernel to account for the self- or cross-excitations:

where vj is the share volume of the order event. The negative exponential term provides a smooth cut-off at short times. This function has several advantages. First, it satisfies ϕij(0) = 0, since we cannot expect market participants to react instantaneously to events. Second, it allows the excitation to rise smoothly to a peak and then gradually fade over time (see Fig 1), which is a more reasonable description of how market participants react. A few studies also suggest similar kernel functions [9, 11]. In our empirical analysis, we will compare the kernel in Eq (2) with the sum-of-exponentials kernel below
In this case, we also set ϕij(0) = 0, as in the literature. Third, compared with power-law kernels, the use of exponential kernels can reduce the computational complexity of evaluating the likelihood from O(N²) to O(N).
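The shapes of the two kernels can be illustrated with a minimal sketch. It assumes a single hypothetical amplitude `c` and leaves out the volume mark vj, so it shows only the time dependence described above, not the full kernels of Eqs (2) and (3). The function names are illustrative.

```python
import math

def diff_exp_kernel(u, alpha, beta, c=1.0):
    """Difference of two exponentials, c * (exp(-alpha*u) - exp(-beta*u)).

    Assumes 0 < alpha < beta so the kernel is non-negative for u >= 0.
    The amplitude c is a hypothetical stand-in for the fitted scale.
    """
    if u < 0:
        return 0.0
    return c * (math.exp(-alpha * u) - math.exp(-beta * u))

def sum_exp_kernel(u, alpha, beta, c=1.0):
    """Sum of two exponentials, forced to 0 at u = 0 as stated in the text."""
    if u <= 0:
        return 0.0
    return c * (math.exp(-alpha * u) + math.exp(-beta * u))

def peak_time(alpha, beta):
    """Lag at which the difference kernel is largest.

    Setting the derivative -alpha*exp(-alpha*u) + beta*exp(-beta*u) to zero
    gives u* = ln(beta / alpha) / (beta - alpha).
    """
    return math.log(beta / alpha) / (beta - alpha)

if __name__ == "__main__":
    alpha, beta = 0.5, 2.0  # illustrative decay rates, not estimates
    u_star = peak_time(alpha, beta)
    print("difference kernel at 0:", diff_exp_kernel(0.0, alpha, beta))
    print("peak lag:", u_star, "peak value:", diff_exp_kernel(u_star, alpha, beta))
    print("sum kernel just after 0:", sum_exp_kernel(1e-6, alpha, beta))
```

The difference kernel starts at zero, rises to its maximum at the peak lag and then decays at the slower rate alpha, whereas the sum kernel jumps to its maximum immediately after an event and decays monotonically.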

Fig. 1. Illustration of the kernel with a subtraction of two exponentials (continuous blue line) and the kernel with a sum of two exponentials.

Stationarity condition

A multivariate point process is stationary if the joint distribution of any number of types of events on any number of given intervals is invariant under translation. According to Theorem 7 in [30], the stationarity condition of a multivariate point process is that the matrix Q with entries $q_{ij} = \int_0^{+\infty} |\phi_{ij}(u)|\,\mathrm{d}u$ has a spectral radius strictly less than 1. For our bivariate Hawkes model with the kernel in Eq (2), the matrix Q is

In the same way, the matrix Q for the kernel in Eq (3) is
We recall that the spectral radius of the matrix Q is defined as $\rho(Q) = \max_{a \in \mathcal{S}(Q)} |a|$, where $\mathcal{S}(Q)$ denotes the set of all eigenvalues of Q.
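The stationarity check can be sketched for a 2×2 matrix as follows. The sketch assumes each kernel has the simplified form c·(e^(−αu) − e^(−βu)) or c·(e^(−αu) + e^(−βu)) with a hypothetical amplitude `c` that would absorb any volume dependence at a fixed volume; the entries of Q in the paper may therefore look different.

```python
import math

def q_diff(c, alpha, beta):
    """Integral over [0, inf) of c*(exp(-alpha*u) - exp(-beta*u)), for 0 < alpha < beta."""
    return c * (1.0 / alpha - 1.0 / beta)

def q_sum(c, alpha, beta):
    """Integral over [0, inf) of c*(exp(-alpha*u) + exp(-beta*u))."""
    return c * (1.0 / alpha + 1.0 / beta)

def spectral_radius_2x2(q):
    """Largest absolute eigenvalue of a 2x2 matrix with non-negative entries.

    The eigenvalues are (tr +/- sqrt((a - d)^2 + 4*b*c)) / 2; the discriminant
    is non-negative when all entries are non-negative, so both are real.
    """
    (a, b), (c, d) = q
    tr = a + d
    root = math.sqrt((a - d) ** 2 + 4.0 * b * c)
    return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))

def is_stationary(q):
    """Stationarity condition: spectral radius strictly below 1."""
    return spectral_radius_2x2(q) < 1.0

if __name__ == "__main__":
    # Illustrative parameters only (not estimates from the paper).
    Q = [[q_diff(0.3, 0.5, 2.0), q_diff(0.05, 0.5, 2.0)],
         [q_diff(0.05, 0.5, 2.0), q_diff(0.25, 0.5, 2.0)]]
    print(spectral_radius_2x2(Q), is_stationary(Q))
```

Because the entries of Q are integrals of non-negative kernels, a larger amplitude or a slower decay pushes the spectral radius towards 1, which is the boundary of stationarity.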

Parameter estimation

For each type of order, μi is taken as a seasonal piecewise linear spline with 4 knots at 9:30, 10:00, 10:30 and 11:30 for the morning session, and at 13:00, 14:00, 14:30 and 15:00 for the afternoon session. Therefore, the intensity λi depends on the following parameter set θi,

In other words, there are 12 parameters to be estimated for each type of order and for each time interval. Suppose that the data are observed over the interval [0, T]; then the maximum likelihood estimates for θi can be obtained by maximizing
where {tn,i} is the sequence of the times of the events of type i (see Theorem 3.1 in [16]).
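A log-likelihood of this kind can be evaluated in O(N) time using the recursive structure of exponential kernels. The sketch below is a simplified univariate illustration, not the 12-parameter bivariate model: it assumes a constant baseline `mu`, a hypothetical amplitude `c`, no volume mark, and the standard point-process log-likelihood (sum of log-intensities at the events minus the integrated intensity, up to an additive constant).

```python
import math

def hawkes_loglik(times, T, mu, c, alpha, beta):
    """Log-likelihood of a univariate Hawkes process on [0, T].

    Intensity: mu + c * sum over past events of (exp(-alpha*u) - exp(-beta*u)),
    where u is the time since that event. `times` must be sorted ascending.
    """
    A = 0.0  # running sum of exp(-alpha * (t_n - t_k)) over earlier events k < n
    B = 0.0  # running sum of exp(-beta * (t_n - t_k)) over earlier events k < n
    log_sum = 0.0
    prev = None
    for t in times:
        if prev is not None:
            dt = t - prev
            # Recursion: decay the old sum and add the contribution of the previous event.
            A = math.exp(-alpha * dt) * (1.0 + A)
            B = math.exp(-beta * dt) * (1.0 + B)
        log_sum += math.log(mu + c * (A - B))
        prev = t

    # Integrated intensity over [0, T], available in closed form.
    compensator = mu * T
    for t in times:
        rest = T - t
        compensator += c * ((1.0 - math.exp(-alpha * rest)) / alpha
                            - (1.0 - math.exp(-beta * rest)) / beta)
    return log_sum - compensator

if __name__ == "__main__":
    events = [0.5, 1.2, 1.3, 4.0]  # illustrative event times in seconds
    print(hawkes_loglik(events, T=5.0, mu=0.3, c=0.2, alpha=0.1, beta=1.5))
```

The two running sums replace the double loop over pairs of events, which is what makes exponential kernels cheaper than power-law kernels. In practice the negative of this function would be passed to a numerical optimizer, with the baseline replaced by the spline and the cross-exciting terms added.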

Mis-specification testing

The quality of the fits is then assessed on the time-deformed series of durations {τn,i}, defined by

where λ̂i is the estimated intensity and {tn,i} are the empirical time stamps. If a Hawkes process describes the data correctly, the values of τn,i must be independent and exponentially distributed with the rate equal to 1. This can be verified visually in QQ-plots and rigorously with the Kolmogorov-Smirnov test [16].
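The residual check can be sketched as follows, taking the time-deformed duration to be the integral of the estimated intensity between consecutive events, which is the usual time-change construction. As a simplification, the sketch assumes a univariate process with constant baseline `mu`, a hypothetical amplitude `c` and no volume mark, and it measures the first duration from time 0.

```python
import math

def compensator(t, times, mu, c, alpha, beta):
    """Integrated intensity on [0, t] for the kernel c*(exp(-alpha*u) - exp(-beta*u)).

    `times` must be sorted ascending; only events strictly before t contribute.
    """
    total = mu * t
    for tk in times:
        if tk >= t:
            break
        u = t - tk
        total += c * ((1.0 - math.exp(-alpha * u)) / alpha
                      - (1.0 - math.exp(-beta * u)) / beta)
    return total

def time_deformed_durations(times, mu, c, alpha, beta):
    """Differences of the compensator between consecutive events."""
    taus = []
    prev = 0.0
    for t in times:
        cur = compensator(t, times, mu, c, alpha, beta)
        taus.append(cur - prev)
        prev = cur
    return taus

def ks_statistic_exp1(taus):
    """Kolmogorov-Smirnov distance between the sample and a unit-rate exponential."""
    xs = sorted(taus)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = 1.0 - math.exp(-x)
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

if __name__ == "__main__":
    events = [0.5, 1.2, 1.3, 4.0, 4.1, 7.5]  # illustrative event times
    taus = time_deformed_durations(events, mu=0.3, c=0.2, alpha=0.1, beta=1.5)
    print(taus, ks_statistic_exp1(taus))
```

The sketch stops at the test statistic. A p-value requires the Kolmogorov distribution, which a statistics library can supply, for example `scipy.stats.kstest(taus, "expon")`.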

Data

We use order flow data of the stock China Vanke (000002.SZ) traded on the Shenzhen Stock Exchange from April 10th, 2003 to May 20th, 2003. China Vanke is a highly liquid stock. We choose these 21 trading days because of the high activity of order events around the annual financial report announcement. In this paper we consider the order flow occurring in the continuous double auction period (9:30 AM to 11:30 AM and 1:00 PM to 3:00 PM). Note that there is a lunch effect: after the 1.5-hour lunch break, the morning orders have almost no impact on the submission of afternoon orders. Therefore, we estimate our model separately for the morning and afternoon sessions.

A problem arises regarding the granularity of the data. Because time stamps are rounded to the nearest 10 milliseconds, the data set contains multiple events with the same time stamp. In our sample, only 0.61% of events share a time stamp with another event. By comparison, in the data of [5], which are rounded to the nearest second, 40% of events share a time stamp. Because simultaneous events are so rare in our sample, we do not adjust the data and assume that events occurring within the same 10-millisecond interval are independent of one another.

We include the order size vi as a mark in our model. We collect all the aggressive orders, i.e. those with sizes greater than the opposite best quote, and calculate their median sizes: 3000 shares (30 lots) for aggressive market buy orders and 3200 shares (32 lots) for aggressive market sell orders. We also calculate the proportions of orders by penetration, defined as the number of price levels on the opposite side of the order book that the order consumes. The results are presented in Table 1. We can see that most aggressive market orders consume only the orders at the first price level.

Tab. 1. Proportions of market orders with different penetrations.

Fig 2(A) shows the empirical distributions of the inter-arrival durations of aggressive market buy (sell) orders. The distributions resemble a power law rather than an exponential, which indicates that the order events do not follow a Poisson process. The autocorrelations of the inter-arrival durations presented in Fig 2(B) confirm this point: for both market buy orders and market sell orders, the autocorrelation persists for more than 40 seconds. Therefore, it is reasonable to apply a Hawkes process to model the order events.

Fig. 2. The duration distributions and correlations for aggressive market buy (sell) orders.
A: The empirical distributions for the inter-arrival durations. B: The autocorrelation functions of inter-arrival durations. Time (the units on the X-axis) is shown in seconds.

Results

In Fig 3, we present a sample of the estimated intensity path of aggressive market buy orders in the morning of April 10th, 2003. The kernel function used here is the smooth cut-off biexponential function given in Eq (2). We rescale the instantaneous intensity to events per minute. To assess the goodness of fit, we also plot the observed intensity, which is the number of aggressive market buy orders in each minute. The figure shows that our model describes the intensity dynamics quite well.

Fig. 3. The estimated intensity of aggressive market buy orders in the morning of April 10th, 2003 (events per minute) (blue line).
The kernel function used in Hawkes model is given in Eq (2). The real intensity, i.e. the number of aggressive market buy orders in every minute, is charted by orange stars.

The QQ plots of the time-deformed durations defined in Eq (8) on April 10th, 2003 are presented in Fig 4. We carry out the tests on both types of market orders in both the morning and afternoon sessions. With the kernel in Eq (2), all the points collapse onto the corresponding diagonals, indicating that the time-deformed durations are exponentially distributed. All four fits therefore agree well with the theoretical exponential quantiles, which suggests that our Hawkes model with the kernel in Eq (2) describes the data correctly. For comparison, we also present in Fig 4 the QQ plots of time-deformed durations when the kernel in Eq (3) is used. We find that, except for the case of market sell orders in the morning, the time-deformed durations are clearly inconsistent with the exponential distribution. The case of market sell orders in the morning shows a good fit because the estimated second term in the kernel is very small. More specifically, we obtain α22 = 0.0089 and β22 = 0.1733. When 20 seconds have passed since the last market sell order arrival (u = 20), the first term is e^(−α22 u) = 0.8369 and the second term is e^(−β22 u) = 0.0312. Therefore, the second term contributes little to the self-excitation process, and both the sum-of-exponentials kernel and the difference-of-exponentials kernel provide a good fit. However, this is not the usual case. In the usual cases, like the other three plots in Fig 4, the second term in the kernel function is essential and the kernel in Eq (2) gives a much better goodness of fit than the kernel in Eq (3). Hence, we will only consider the smooth cut-off kernel in the following analyses.
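The two magnitudes quoted for the morning sell-order kernel can be checked with a few lines of arithmetic. The snippet only evaluates the two exponential terms at the reported estimates and u = 20; the variable names are illustrative.

```python
import math

alpha22 = 0.0089  # reported estimate of the slow decay rate
beta22 = 0.1733   # reported estimate of the fast decay rate
u = 20.0          # seconds since the last market sell order

first_term = math.exp(-alpha22 * u)
second_term = math.exp(-beta22 * u)

print(round(first_term, 4), round(second_term, 4))  # prints 0.8369 0.0312
print(second_term / first_term)  # relative size of the second term
```

At this lag the second term is only a few percent of the first, so adding or subtracting it changes the kernel very little, which is consistent with both kernels fitting that session about equally well.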

Fig. 4. QQ plots (April 10th, 2003) of time-deformed durations, i.e. residuals, against an exponential distribution of parameter 1 for the two kernel functions in Eqs (2) and (3).
A: aggressive market buy orders in the morning. B: aggressive market buy orders in the afternoon. C: aggressive market sell orders in the morning. D: aggressive market sell orders in the afternoon.

We now use the Kolmogorov-Smirnov test to analyze the goodness of fit for all sample days. Fig 5 shows the box plot of the p-values of the Kolmogorov-Smirnov test on all 21 sample days. With rare exceptions, almost all of the samples pass the Kolmogorov-Smirnov test by a large margin. This further confirms that our bivariate Hawkes model with smooth cut-off kernels fits the market order events well.

Fig. 5. The p-values of the Kolmogorov-Smirnov test on the 21 sample days.
A: aggressive market buy orders. B: aggressive market sell orders.

Next we examine whether the estimated parameters result in a stationary bivariate Hawkes process. We fix the share volume at v = 3000 (30 lots). Fig 6 presents the spectral radii for the 42 estimated bivariate marked Hawkes processes, comprising 21 morning sessions and 21 afternoon sessions over the 21 sample days. All 42 spectral radii are strictly less than 1, and thus all 42 bivariate Hawkes processes are stationary.

Fig. 6. The spectral radii for 42 estimated bivariate marked Hawkes processes.
It can be seen that all 42 Hawkes processes are stationary.

We recall that the baseline intensity μ(t) describes the arrival of exogenous events. In the left panel of Fig 7, we first plot the average number of market orders in each minute. The average number of orders displays the well-known U-shaped intraday pattern of order placement. Then, we plot the baseline intensity μ(t) in the right panel of Fig 7. The estimated exogenous part μ(t) clearly exhibits the same intraday pattern. As expected, the exogenous intensity is lower than the total intensity.

Fig. 7. Average number of orders (A) and the estimated intraday baseline intensity splines (events per minute)(B).
Error bars are computed for 2 standard deviations.

The endogenous intensity depends on the self- and cross-exciting kernel functions. In Fig 8, we show the four estimated kernel functions ϕij(u) with fixed share volume v = 3000 for aggressive market buy orders and aggressive market sell orders. These kernel functions have a similar shape, but their scales are remarkably different. The kernel functions representing the self-exciting impact have higher values than those representing the cross-exciting impact, especially for market buy orders. This indicates that self-excitation plays the major role in the endogenous part of aggressive market order placement.

Fig. 8. The estimated kernel functions ϕij(u) for aggressive market buy orders and for aggressive market sell orders.
The coefficients used in the kernel functions are the average values of 42 estimations.

Conclusion

In this work, a bivariate marked Hawkes model is proposed to characterize aggressive market order arrivals. The order arrival intensity consists of an exogenous part and two endogenous processes reflecting respectively the self-excitation and cross-excitation. The kernel function is crucial for characterizing the endogenous self-excitation and cross-excitation. We propose and compare two types of kernel function. One is a smooth cut-off exponential function (i.e. the difference of two exponentials), and the other is a monotonic exponential kernel (i.e. the sum of two exponentials). We calibrate the bivariate Hawkes models with the different kernel functions using order flow data of a stock traded on the Shenzhen Stock Exchange. The bivariate Hawkes model is well estimated when the kernel is a smooth cut-off exponential function, and the estimated parameters satisfy the stationarity condition. The exogenous baseline intensity explains the U-shaped intraday pattern. We confirm that the endogenous part of the order arrival intensity is mainly attributable to the self-exciting process, while the cross-exciting influence is weak, especially for aggressive market buy orders. With our model, high-frequency traders can better understand and predict market order arrivals, and then form their own order submission strategies. In addition, quantifying the endogenous contribution to the order arrival intensity may help to predict extreme events such as flash crashes.


References

1. Hawkes AG. Point spectra of some mutually exciting point processes. J R Stat Soc B. 1971;33(3):438–443.

2. Hawkes AG. Spectra of some self-exciting and mutually exciting point processes. Biometrika. 1971;58(1):83–90. doi: 10.1093/biomet/58.1.83

3. Bacry E, Mastromatteo I, Muzy JF. Hawkes Processes in Finance. Market Microstructure and Liquidity. 2015;1(59).

4. Hawkes AG. Hawkes processes and their applications to finance: A review. Quant Financ. 2018;18(2):193–198. doi: 10.1080/14697688.2017.1403131

5. Large J. Measuring the resiliency of an electronic limit order book. J Financ Markets. 2007;10:1–25. doi: 10.1016/j.finmar.2006.09.001

6. Filimonov V, Sornette D. Quantifying reflexivity in financial markets: Toward a prediction of flash crashes. Phys Rev E. 2012;85(5):056108. doi: 10.1103/PhysRevE.85.056108

7. Lallouache M, Challet D. The limits of statistical significance of Hawkes processes fitted to financial data. Quant Financ. 2016;16(1):1–11. doi: 10.1080/14697688.2015.1068442

8. Bormetti G, Calcagnile LM, Treccani M, Corsi F, Marmi S, Lillo F. Modelling systemic price cojumps with Hawkes factor models. Quant Financ. 2015;15(7):1137–1156. doi: 10.1080/14697688.2014.996586

9. Hardiman SJ, Bercot N, Bouchaud JP. Critical reflexivity in financial markets: A Hawkes process analysis. Eur Phys J B. 2013;86(10):442. doi: 10.1140/epjb/e2013-40107-3

10. Saichev A, Sornette D. Superlinear scaling of offspring at criticality in branching processes. Phys Rev E. 2014;89(1):012104. doi: 10.1103/PhysRevE.89.012104

11. Filimonov V, Sornette D. Apparent criticality and calibration issues in the Hawkes self-excited point process model: Application to high-frequency financial data. Quant Financ. 2015;15(8):1293–1314. doi: 10.1080/14697688.2015.1032544

12. Filimonov V, Bicchetti D, Maystre N, Sornette D. Quantification of the high level of endogeneity and of structural regime shifts in commodity markets. J Int Money Financ. 2014;42:174–192. doi: 10.1016/j.jimonfin.2013.08.010

13. Filimonov V, Wheatley S, Sornette D. Effective measure of endogeneity for the Autoregressive Conditional Duration point processes via mapping to the self-excited Hawkes process. Commun Nonlinear Sci Numer Simul. 2015;22(1):23–37. doi: 10.1016/j.cnsns.2014.08.042

14. Chavez-Demoulin V, Mcgill JA. High-frequency financial data modeling using Hawkes processes. J Bank Financ. 2012;36(12):3415–3426. doi: 10.1016/j.jbankfin.2012.08.011

15. Blanc P, Donier J, Bouchaud JP. Quadratic Hawkes processes for financial prices. Quant Financ. 2017;17(2):171–188. doi: 10.1080/14697688.2016.1193215

16. Bowsher CG. Modelling security markets in continuous time: Intensity based, multivariate point process models. J Econometrics. 2007;141(2):876–912. doi: 10.1016/j.jeconom.2006.11.007

17. Rambaldi M, Bacry E, Lillo F. The role of volume in order book dynamics: A multivariate Hawkes process analysis. Quant Financ. 2017;17(7):999–1020. doi: 10.1080/14697688.2016.1260759

18. Muni Toke I, Pomponio F. Modelling trades-through in a limited order book using Hawkes processes. Economics: The Open-Access, Open-Assessment E-Journal. 2012;6(2012-22):1–23.

19. Bacry E, Muzy JF. Hawkes model for price and trades high-frequency dynamics. Quant Financ. 2014;14(7):1147–1166. doi: 10.1080/14697688.2014.897000

20. Zheng B, Roueff F, Abergel F. Modelling bid and ask prices using constrained Hawkes processes: Ergodicity and scaling limit. SIAM J Financial Math. 2014;5(1):99–136. doi: 10.1137/130912980

21. Calcagnile LM, Bormetti G, Treccani M, Marmi S, Lillo F. Collective synchronization and high frequency systemic instabilities in financial markets. Quant Financ. 2018;18(2):237–247. doi: 10.1080/14697688.2017.1403141

22. Rambaldi M, Pennesi P, Lillo F. Modeling foreign exchange market activity around macroeconomic news: Hawkes-process approach. Phys Rev E. 2015;91(1):012819. doi: 10.1103/PhysRevE.91.012819

23. Aït-Sahalia Y, Cacho-Diaz J, Laeven RJA. Modeling financial contagion using mutually exciting jump processes. J Financ Econ. 2015;117(3):585–606. doi: 10.1016/j.jfineco.2015.03.002

24. Lu X, Abergel F. High-dimensional Hawkes processes for limit order books: Modelling, empirical analysis and numerical calibration. Quant Financ. 2018;(2):249–264. doi: 10.1080/14697688.2017.1403142

25. Bacry E, Jaisson T, Muzy JF. Estimation of slowly decreasing Hawkes kernels: Application to high-frequency order book dynamics. Quant Financ. 2016;16(8):1179–1201. doi: 10.1080/14697688.2015.1123287

26. Jiang ZQ, Chen W, Zhou WX. Scaling in the distribution of intertrade durations of Chinese stocks. Physica A. 2008;387:5818–5825. doi: 10.1016/j.physa.2008.06.039

27. Jiang ZQ, Chen W, Zhou WX. Detrended fluctuation analysis of intertrade durations. Physica A. 2009;388(4):433–440. doi: 10.1016/j.physa.2008.10.028

28. Ruan YP, Zhou WX. Long-term correlations and multifractal nature in the intertrade durations of a liquid Chinese stock and its warrant. Physica A. 2011;390(9):1646–1654. doi: 10.1016/j.physa.2011.01.001

29. Engle RF, Russell JR. Autoregressive conditional duration: A new model for irregularly spaced transaction data. Econometrica. 1998;66(5):1127–1162. doi: 10.2307/2999632

30. Brémaud P, Massoulié L. Stability of nonlinear Hawkes processes. Ann Probab. 1996;24(3):1563–1588.

