Towards a chromatographic similarity index to establish localized quantitative structure-retention models for retention prediction: use of retention factor ratio
Tyteca, E and Talebi, M and Amos, R and Park, S and Taraji, M and Wen, Y and Szucs, R and Pohl, CA and Dolan, JW and Haddad, PR, Towards a chromatographic similarity index to establish localized quantitative structure-retention models for retention prediction: use of retention factor ratio, Journal of Chromatography A, 1486 pp. 50-58. ISSN 0021-9673 (2017) [Refereed Article]
Quantitative Structure-Retention Relationships (QSRR) have the potential to speed up the screening phase of chromatographic method development, as the initial exploratory experiments are replaced by prediction of analyte retention based solely on the structure of the molecule. The present study offers further proof-of-concept of localized QSRR modelling, in which the retention of any given compound is predicted using only the most chromatographically similar compounds in the available dataset. To this end, each compound in the dataset was sequentially removed from the database and individually utilized as a test analyte. In this study, we propose the retention factor k as the most relevant chromatographic similarity measure and compare it with the Tanimoto index, the most popular similarity measure based on chemical structure. Prediction error was reduced by up to 8-fold when QSRR was based only on chromatographically similar compounds rather than on the entire dataset. The study therefore shows that a practically useful structural similarity index should select the same compounds in the dataset as the k-similarity filter does in order to establish accurate predictive localized QSRR models. While low average prediction errors (Mean Absolute Error (MAE) < 0.5 min) and slopes of the regression lines through the origin close to 1.00 were obtained using k-similarity searching, the use of the structural Tanimoto similarity index, considered the gold standard in Quantitative Structure-Activity Relationship (QSAR) studies, generally resulted in much higher prediction errors (MAE > 1 min) and significant deviations from the reference slope of 1.00. The Tanimoto similarity index therefore appears to have limited general utility in QSRR studies.
Future studies therefore aim at designing a more appropriate chromatographic similarity index that can then be applied for unknown compounds (that is, compounds which have not been tested previously on the chromatographic system used, but for which the chemical structures are known).
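The leave-one-out procedure with a k-similarity filter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things in it are assumptions made for illustration only: the similarity criterion is taken as the ratio of the larger to the smaller retention factor, the cutoff `max_ratio` is an arbitrary value, the local model is a one-descriptor least-squares line (a real QSRR model would use many molecular descriptors), the descriptor `x` and the toy data are hypothetical, and the model predicts k directly rather than retention time. The Tanimoto function uses the standard definition on sets of "on" fingerprint bits and is included only for comparison.

```python
from statistics import mean


def tanimoto(fp_a, fp_b):
    """Tanimoto index of two fingerprints given as sets of 'on' bit positions."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


def k_ratio(k_a, k_b):
    """Assumed k-similarity measure: larger k divided by smaller k (always >= 1)."""
    return max(k_a, k_b) / min(k_a, k_b)


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # All descriptor values identical: fall back to the mean response.
        return 0.0, y_bar
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar


def loo_mae(compounds, max_ratio):
    """Leave-one-out mean absolute error of localized models.

    Each compound in turn is the test analyte; the model is trained only on
    the remaining compounds whose k lies within max_ratio of the test k.
    """
    errors = []
    for i, test in enumerate(compounds):
        neighbours = [
            c for j, c in enumerate(compounds)
            if j != i and k_ratio(c["k"], test["k"]) <= max_ratio
        ]
        if len(neighbours) < 2:
            continue  # too few similar compounds to fit a line
        slope, intercept = fit_line(
            [c["x"] for c in neighbours], [c["k"] for c in neighbours]
        )
        errors.append(abs(slope * test["x"] + intercept - test["k"]))
    return mean(errors) if errors else None


# Hypothetical toy data: two groups of compounds that follow different
# descriptor-retention relationships (k = x and k = 5x + 3).
toy = [
    {"x": 1.0, "k": 1.0}, {"x": 1.2, "k": 1.2},
    {"x": 1.4, "k": 1.4}, {"x": 1.6, "k": 1.6},
    {"x": 1.0, "k": 8.0}, {"x": 1.2, "k": 9.0},
    {"x": 1.4, "k": 10.0}, {"x": 1.6, "k": 11.0},
]

local_mae = loo_mae(toy, max_ratio=2.0)            # k-similar compounds only
global_mae = loo_mae(toy, max_ratio=float("inf"))  # entire remaining dataset
```

On this toy set the localized models give a much smaller error than the models trained on the whole dataset, because each group is fitted separately. Note that the filter uses the measured k of the test analyte, so, as in the study, it serves as a proof-of-concept benchmark rather than as a tool for compounds whose retention is unknown.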
QSRR, similarity analysis, retention factor, liquid chromatography, HILIC, RPLC, IC