Prediction of retention in hydrophilic interaction liquid chromatography using solute molecular descriptors based on chemical structures
Taraji, M and Haddad, PR and Amos, RIJ and Talebi, M and Szucs, R and Dolan, JW and Pohl, CA, Prediction of retention in hydrophilic interaction liquid chromatography using solute molecular descriptors based on chemical structures, Journal of Chromatography A, 1486 pp. 59-67. ISSN 0021-9673 (2017) [Refereed Article]
Quantitative structure-retention relationship (QSRR) models are developed to predict the retention times of analytes on five hydrophilic interaction liquid chromatography (HILIC) stationary phases (bare silica, amine, amide, diol and zwitterionic), with a view to selecting the most suitable stationary phase(s) for the separation of these analytes. The study was conducted using six β-adrenergic agonists as target analytes. Molecular descriptors were calculated based only on chemical structures optimized using density functional theory. A genetic algorithm (GA) was then used to select the most relevant molecular descriptors, and these were used to build a retention model for each stationary phase using partial least squares (PLS) regression. Each model was then used to predict the retention of the test set of target analytes. This process created an optimized descriptor set, which enhanced the reliability of the developed QSRR models. Finally, the QSRR models developed in the work were used to provide some insight into the separation mechanisms operating in the HILIC mode. Three performance criteria (mean absolute error (MAE), root mean square error of prediction scaled to retention time (RMSEP), and the number of selected descriptors) were used to evaluate the developed models when applied to an external test set of six β-adrenergic agonists, and the models showed high predictive ability. MAE values ranged from 13 to 25 s on four of the stationary phases, with a somewhat higher error (50 s) being observed for the zwitterionic phase. RMSEP values of 4.88–11.12% were recorded. Validation was performed through Y-randomization and chemical domain applicability, from which it was evident that the developed optimized GA-PLS models were robust.
The high levels of accuracy, reliability and applicability of the models were to a large extent due to the optimization of the GA descriptor set and the presence of relevant structural and geometric molecular descriptors, together with descriptors based on important physicochemical properties, which establish a strong connection between retention time and meaningful chemical properties. The present strategy, while it is a pilot study, holds great promise for broader screening of HILIC stationary phases for a desired separation, as well as for acquiring information about the molecular mechanisms of separation under chromatographic conditions.
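The two error criteria named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula for "RMSEP scaled to retention time", so the scaling used here (RMSEP divided by the mean observed retention time, expressed as a percentage) is an assumption. The function names and the retention times are hypothetical and are not data from the study.

```python
import math


def mean_absolute_error(observed, predicted):
    """Mean absolute difference between observed and predicted retention times (s)."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmsep_percent(observed, predicted):
    """Root mean square error of prediction as a percentage of the mean
    observed retention time. This scaling is an assumption, not the paper's
    stated definition."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and of equal length")
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    mean_observed = sum(observed) / n
    return 100.0 * math.sqrt(mse) / mean_observed


# Hypothetical retention times in seconds for a three-compound test set.
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]

mae = mean_absolute_error(observed, predicted)
rmsep = rmsep_percent(observed, predicted)
print(f"MAE = {mae:.1f} s, RMSEP = {rmsep:.2f}%")
```

Expressing RMSEP relative to retention time makes errors comparable across stationary phases whose absolute retention times differ, which may be why the abstract reports both an absolute figure (MAE in seconds) and a relative one.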