Cell fishing: A similarity based approach and machine learning strategy for multiple cell lines-compound sensitivity prediction
Authors:
E. Tejera aff001; I. Carrera aff003; Karina Jimenes-Vargas aff001; V. Armijos-Jaramillo aff001; A. Sánchez-Rodríguez aff002; M. Cruz-Monteagudo aff006; Y. Perez-Castillo aff002
Author affiliations:
aff001: Ingeniería en Biotecnología, Facultad de Ingeniería y Ciencias Aplicadas, Universidad de Las Américas, Quito, Ecuador
aff002: Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito, Ecuador
aff003: Departamento de Informática y Ciencias de la Computación, Escuela Politécnica Nacional, Quito, Ecuador
aff004: Departamento de Ciências de Computadores, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
aff005: Universidad Técnica Particular de Loja, Loja, Ecuador
aff006: Center for Computational Science (CCS), University of Miami (UM), Miami, FL, United States of America
aff007: West Coast University, Miami, Florida, United States of America
aff008: Escuela de Ciencias Físicas y Matemáticas, Universidad de Las Américas, Quito, Ecuador
Published in:
PLoS ONE 14(10)
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pone.0223276
Summary
Predicting the sensitivity of cell lines to a given set of compounds is an important factor in the optimization of in vitro assays. To date, the most common prediction strategies rely on machine learning or other quantitative structure-activity relationship (QSAR) approaches. In the present research, we propose and discuss a straightforward strategy that is not based on any learned model but relies exclusively on the chemical similarity of a query compound to reference compounds with annotated activity against cell lines. We also compare the performance of the proposed method with machine learning predictions on the same problem. A curated database of compound-cell line associations derived from ChEMBL version 22 was created for algorithm construction and cross-validation. Validation was performed using 10-fold cross-validation and by testing the models on new data obtained from ChEMBL version 25. In terms of accuracy, both methods perform similarly, with values around 0.65 across 750 cell lines in 10-fold cross-validation experiments. By combining both methods, it is possible to achieve a correct classification rate of 66% on more than 26,000 newly reported interactions comprising 11,000 new compounds. A web service implementing the described approaches (both similarity-based and machine learning-based models) is freely available at: http://bioquimio.udla.edu.ec/cellfishing.
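To make the similarity-based idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that compounds are represented as sets of "on" fingerprint bits, that similarity is measured with the Tanimoto coefficient, and that each cell line inherits the activity label of its most similar reference compound above a cutoff. The function names, the 0.5 threshold, and the toy data are all hypothetical; the summary does not specify the fingerprints, the cutoff, or how neighbours are aggregated.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Fingerprint = FrozenSet[int]  # assumption: a fingerprint is the set of its "on" bit indices


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits present in either set."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def predict_cell_lines(
    query: Fingerprint,
    references: Iterable[Tuple[Fingerprint, str, bool]],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> Dict[str, Tuple[bool, float]]:
    """For each cell line, copy the label of the most similar reference compound.

    `references` holds (fingerprint, cell_line, is_active) annotations.
    Cell lines whose best match falls below `threshold` get no prediction.
    """
    best: Dict[str, Tuple[bool, float]] = {}
    for fingerprint, cell_line, is_active in references:
        similarity = tanimoto(query, fingerprint)
        if similarity < threshold:
            continue
        if cell_line not in best or similarity > best[cell_line][1]:
            best[cell_line] = (is_active, similarity)
    return best


# Toy reference set (made-up bit sets and labels, for illustration only).
references = [
    (frozenset({1, 2, 3, 4}), "MCF7", True),
    (frozenset({1, 2, 9}), "MCF7", False),
    (frozenset({1, 2, 3, 5}), "A549", False),
    (frozenset({7, 8}), "HeLa", True),
]
query = frozenset({1, 2, 3, 4, 5})
predictions = predict_cell_lines(query, references)
```

In this toy run the query receives a prediction for MCF7 and A549 but none for HeLa, because its only HeLa reference shares no bits with the query. No model is trained at any point; the quality of a prediction depends entirely on how well the annotated reference set covers the chemistry of the query compound.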
Keywords:
Gene expression – Database and informatics methods – Algorithms – Optimization – Machine learning algorithms – Machine learning – Support vector machines – Kernel functions
Article published in the journal
PLOS One
2019, Issue 10