Predicting diabetes second-line therapy initiation in the Australian population via time span-guided neural attention network
Autoři:
Samuele Fiorini aff001; Farshid Hajati aff002; Annalisa Barla aff005; Federico Girosi aff002
Působiště autorů:
Iren S.p.A, Genoa, Italy
aff001; School of Information Technology and Engineering, MIT Sydney, Sydney, New South Wales, Australia
aff002; Translational Health Research Institute, Western Sydney University, Penrith, New South Wales, Australia
aff003; Capital Markets CRC, Sydney, New South Wales, Australia
aff004; Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy
aff005; Digital Health CRC, Sydney, New South Wales, Australia
aff006
Vyšlo v časopise:
PLoS ONE 14(10)
Kategorie:
Research Article
prolekare.web.journal.doi_sk:
https://doi.org/10.1371/journal.pone.0211844
Souhrn
Introduction
The first line of treatment for people with Diabetes mellitus is metformin. However, over the course of the disease metformin may fail to achieve appropriate glycemic control, and a second-line therapy may become necessary. In this paper we introduce Tangle, a time span-guided neural attention model that can accurately and timely predict the upcoming need for a second-line diabetes therapy from administrative data in the Australian adult population. The method is suitable for designing automatic therapy review recommendations for patients and their providers without the need to collect clinical measures.
Data
We analyzed seven years of de-identified records (2008-2014) of the 10% publicly available linked sample of Medicare Benefits Schedule (MBS) and Pharmaceutical Benefits Scheme (PBS) electronic databases of Australia.
Methods
By design, Tangle inherits the representational power of pre-trained word embedding, such as GloVe, to encode sequences of claims with the related MBS codes. Moreover, the proposed attention mechanism natively exploits the information hidden in the time span between two successive claims (measured in number of days). We compared the proposed method against state-of-the-art sequence classification methods.
Results
Tangle outperforms state-of-the-art recurrent neural networks, including attention-based models. In particular, when the proposed time span-guided attention strategy is coupled with pre-trained embedding methods, the model performance reaches an Area Under the ROC Curve of 90%, an improvement of almost 10 percentage points over an attentionless recurrent architecture.
Implementation
Tangle is implemented in Python using Keras and it is hosted on GitHub at https://github.com/samuelefiorini/tangle.
Klíčová slova:
Insulin – Sequence databases – Machine learning – Attention – Deep learning – Medicare – Word embedding
Zdroje
1. Australian Government—Australian Institute of Health and Welfare. Diabetes snapshot; 2018. https://www.aihw.gov.au/reports/diabetes/diabetes-compendium/contents/deaths-from-diabetes.
2. Diabetes Australia. Living with diabetes;. https://www.diabetesaustralia.com.au/managing-type-2.
3. Gottlieb A, Yanover C, Cahan A, Goldschmidt Y. Estimating the effects of second-line therapy for type 2 diabetes mellitus: retrospective cohort study. BMJ Open Diabetes Research and Care. 2017;5(1):e000435. doi: 10.1136/bmjdrc-2017-000435 29299328
4. Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I. Machine learning and data mining methods in diabetes research. Computational and structural biotechnology journal. 2017;15:104–116. doi: 10.1016/j.csbj.2016.12.005 28138367
5. Xing Z, Pei J, Keogh E. A brief survey on sequence classification. ACM Sigkdd Explorations Newsletter. 2010;12(1):40–48. doi: 10.1145/1882471.1882478
6. Chollet F. Deep learning with python. Manning Publications Co.; 2017.
7. Wallach HM. Topic modeling: beyond bag-of-words. In: Proceedings of the 23rd international conference on Machine learning. ACM; 2006. p. 977–984.
8. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems; 2013. p. 3111–3119.
9. Pennington J, Socher R, Manning CD. GloVe: Global Vectors for Word Representation. In: Empirical Methods in Natural Language Processing (EMNLP); 2014. p. 1532–1543. Available from: http://www.aclweb.org/anthology/D14-1162.
10. Friedman J, Hastie T, Tibshirani R. The elements of statistical learning. vol. 1. Springer series in statistics New York; 2001.
11. Breiman L. Random forests. Machine learning. 2001;45(1):5–32. doi: 10.1023/A:1010933404324
12. Freund Y, Schapire RE. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of computer and system sciences. 1997;55(1):119–139. doi: 10.1006/jcss.1997.1504
13. Guyon I, Elisseeff A. An introduction to variable and feature selection. Journal of machine learning research. 2003;3(Mar):1157–1182.
14. LeCun Y, Bengio Y, Hinton G. Deep learning. nature. 2015;521(7553):436. doi: 10.1038/nature14539 26017442
15. Hochreiter S, Schmidhuber J. Long short-term memory. Neural computation. 1997;9(8):1735–1780. doi: 10.1162/neco.1997.9.8.1735 9377276
16. Choi E, Bahadori MT, Song L, Stewart WF, Sun J. GRAM: Graph-based attention model for healthcare representation learning. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; 2017. p. 787–795.
17. Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:14061078. 2014.
18. Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:14090473. 2014.
19. Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E. Hierarchical attention networks for document classification. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; 2016. p. 1480–1489.
20. Choi E, Bahadori MT, Sun J, Kulas J, Schuetz A, Stewart W. Retain: An interpretable predictive model for healthcare using reverse time attention mechanism. In: Advances in Neural Information Processing Systems; 2016. p. 3504–3512.
21. Ma F, Chitta R, Zhou J, You Q, Sun T, Gao J. Dipole: Diagnosis prediction in healthcare via attention-based bidirectional recurrent neural networks. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; 2017. p. 1903–1911.
22. Australian Government—Department of Health. Public Release of Linkable 10% sample of Medicare Benefits Scheme (Medicare) and Pharmaceutical Benefits Scheme (PBS) Data; 2016. http://www.pbs.gov.au/info/news/2016/08/public-release-of-linkable-10-percent-mbs-and-pbs-data.
23. Hajati F, Atlantis E, Bell KJ, Girosi F. Patterns and trends of potentially inappropriate high-density lipoprotein cholesterol testing in Australian adults at high risk of cardiovascular disease from 2008 to 2014: analysis of linked individual patient data from the Australian Medicare Benefits Schedule and Pharmaceutical Benefits Scheme. BMJ open. 2018;8(3):e019041. doi: 10.1136/bmjopen-2017-019041 29523561
24. Iacus SM, King G, Porro G. Causal inference without balance checking: Coarsened exact matching. Political analysis. 2012;20(1):1–24. doi: 10.1093/pan/mpr013
25. Hinton GE, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov RR. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:12070580. 2012.
26. Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems; 2012. p. 1097–1105.
27. Chollet F, et al. Keras; 2015. https://keras.io.
28. Molinaro AM, Simon R, Pfeiffer RM. Prediction error estimation: a comparison of resampling methods. Bioinformatics. 2005;21(15):3301–3307. doi: 10.1093/bioinformatics/bti499 15905277
29. Everitt B, Skrondal A. The Cambridge dictionary of statistics. vol. 106. Cambridge University Press Cambridge; 2002.
30. Maaten Lvd, Hinton G. Visualizing data using t-SNE. Journal of machine learning research. 2008;9(Nov):2579–2605.
31. Johnson AE, Pollard TJ, Shen L, Li-wei HL, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Scientific data. 2016;3:160035. doi: 10.1038/sdata.2016.35 27219127
Článok vyšiel v časopise
PLOS One
2019 Číslo 10
- Metamizol jako analgetikum první volby: kdy, pro koho, jak a proč?
- Nejasný stín na plicích – kazuistika
- Masturbační chování žen v ČR − dotazníková studie
- Těžké menstruační krvácení může značit poruchu krevní srážlivosti. Jaký management vyšetření a léčby je v takovém případě vhodný?
- Fixní kombinace paracetamol/kodein nabízí synergické analgetické účinky
Najčítanejšie v tomto čísle
- Correction: Low dose naltrexone: Effects on medication in rheumatoid and seropositive arthritis. A nationwide register-based controlled quasi-experimental before-after study
- Combining CDK4/6 inhibitors ribociclib and palbociclib with cytotoxic agents does not enhance cytotoxicity
- Experimentally validated simulation of coronary stents considering different dogboning ratios and asymmetric stent positioning
- Risk factors associated with IgA vasculitis with nephritis (Henoch–Schönlein purpura nephritis) progressing to unfavorable outcomes: A meta-analysis