Transfer entropy as a variable selection methodology of cryptocurrencies in the framework of a high dimensional predictive model
Authors:
Andrés García-Medina aff001; Graciela González Farías aff003
Author affiliations:
aff001: Consejo Nacional de Ciencia y Tecnología, Av. Insurgentes Sur 1582, Col. Crédito Constructor 03940, Ciudad de México, México
aff002: Unidad Monterrey, Centro de Investigación en Matemáticas, A.C. Av. Alianza Centro 502, PIIT 66628, Apodaca, Nuevo Leon, Mexico
aff003: Probability and Statistics, Centro de Investigación en Matemáticas, A.C. Jalisco S/N, Col. Valenciana 36240, Guanajuato, Mexico
Published in:
PLoS ONE 15(1)
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pone.0227269
Abstract
We determine the number of statistically significant factors in a high-dimensional predictive model of cryptocurrencies using a random matrix test. The predictive model is of the reduced rank regression (RRR) type; in particular, we choose a variant that can be regarded as canonical correlation analysis (CCA). Variable selection on hourly cryptocurrency series is performed using the Symbolic estimation of Transfer Entropy (STE) measure from information theory. In simulation studies, STE outperforms the Granger causality approach for a nonlinear system and for a linear system with many drivers. In the application to cryptocurrencies, the directed graph associated with the variable selection shows a robust pattern of predictor and response clusters, where the community detection was contrasted with the modularity approach. The centralities of the network also discriminate between the two main types of cryptocurrencies, i.e., coins and tokens. For the factor determination of the predictive model, the results support retaining more factors than the usual visual inspection suggests, with the additional advantage that the subjective element is avoided. In particular, the number of factors over time is moderately anticorrelated with the dynamics of the constructed composite index of predictor and response cryptocurrencies. This finding offers new insight for anticipating possible declines in cryptocurrency prices on exchanges. Furthermore, our study suggests the existence of predictor-specific and response-specific factors, in which only a small number of currencies are predominant.
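To make the variable selection step more concrete, the following is a minimal sketch of a symbolic transfer entropy estimate, not the authors' implementation. It assumes ordinal (rank) patterns of length `m` as symbols, a history length of one symbol, and a plain plug-in estimator in bits; the function names `ordinal_symbols` and `symbolic_transfer_entropy` and the default `m=3` are illustrative choices that do not come from the article.

```python
from collections import Counter
from math import log2

import numpy as np


def ordinal_symbols(series, m=3):
    """Map each length-m window to its rank pattern (an ordinal symbol)."""
    x = np.asarray(series, dtype=float)
    return [tuple(int(k) for k in np.argsort(x[i:i + m], kind="stable"))
            for i in range(len(x) - m + 1)]


def symbolic_transfer_entropy(source, target, m=3):
    """Plug-in estimate (bits) of transfer entropy from source to target.

    Assumption: one-symbol histories, i.e. we compare how well the next
    target symbol is predicted from (target, source) versus target alone.
    """
    s = ordinal_symbols(source, m)
    t = ordinal_symbols(target, m)
    n = min(len(s), len(t)) - 1

    # Empirical counts of the joint and marginal symbol combinations.
    triples = Counter((t[i + 1], t[i], s[i]) for i in range(n))
    pairs_ts = Counter((t[i], s[i]) for i in range(n))
    pairs_tt = Counter((t[i + 1], t[i]) for i in range(n))
    singles = Counter(t[i] for i in range(n))

    te = 0.0
    for (t_next, t_now, s_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_ts[(t_now, s_now)]
        p_given_self = pairs_tt[(t_next, t_now)] / singles[t_now]
        te += p_joint * log2(p_given_both / p_given_self)
    return te
```

Under these assumptions, a candidate predictor series would be retained when its estimated transfer entropy toward a response series is large relative to the reverse direction or to a surrogate baseline; the exact selection rule used in the article is not reproduced here.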
Keywords:
Finance – Eigenvalues – Eigenvectors – Centrality – Entropy – Directed graphs – Covariance
Article published in the journal
PLOS One
2020, Issue 1