Use of dual-filtering to create training sets leading to improved accuracy in quantitative structure-retention relationships modelling for hydrophilic interaction liquid chromatographic systems

Taraji, Maryam; Haddad, Paul; Amos, RIJ; Talebi, Mohammad; Szucs, R; Dolan, JW; Pohl, CA

File(s) under permanent embargo

journal contribution

posted on 2023-05-19, 10:27 authored by Maryam Taraji; Paul Haddad; Amos, RIJ; Mohammad Talebi; Szucs, R; Dolan, JW; Pohl, CA

The development of quantitative structure-retention relationships (QSRR) with sufficient accuracy to support high performance liquid chromatography (HPLC) method development remains a major challenge. To address it, this study presents a novel QSRR methodology for selecting a training set of compounds for QSRR modelling (i.e. filtering the database to identify the most appropriate compounds for the training set). The selection is based on a dual-filtering strategy that combines Tanimoto similarity (TS) searching as the primary filter and retention time (t_R) similarity clustering as the secondary filter, using a database of pharmaceutical compound retention times collected over a wide range of hydrophilic interaction liquid chromatography (HILIC) systems. To apply t_R similarity filtering, correlation to a molecular descriptor is used as a measure of retention time. To model the retention time of a compound, a relationship between experimental chromatographic data and various molecular descriptors is calculated using genetic algorithm-partial least squares (GA-PLS) regression. The proposed dual-filtering-based QSRR model significantly improves retention time predictability compared with the diverse, global, and TS-based QSRR models, with an average root mean square error in prediction (RMSEP) of 11.01% over five different HILIC stationary phases. The average CPU time for implementing the proposed approach is less than 10 min, which makes it well suited to rapid method development in HILIC. In addition, interpretation of the molecular descriptors selected by this novel approach provided some insight into the HILIC mechanism.
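
The dual-filtering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Compound` type, the set-of-bits fingerprint, the `ts_min` and `proxy_tol` thresholds, and the use of a simple tolerance window in place of t_R similarity clustering are all assumptions made for illustration. The `proxy` field stands in for a molecular descriptor assumed to correlate with retention time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    # Hypothetical record type; field names are illustrative only.
    name: str
    fingerprint: frozenset  # indices of "on" bits in a structural fingerprint
    proxy: float            # descriptor assumed to correlate with retention time


def tanimoto(a, b):
    """Tanimoto coefficient |A & B| / |A | B| for two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dual_filter(target, database, ts_min=0.5, proxy_tol=1.0):
    """Select a training set for `target` using two successive filters.

    ts_min and proxy_tol are assumed, illustrative thresholds.
    """
    # Primary filter: keep compounds structurally similar to the target.
    similar = [
        c for c in database
        if c.name != target.name
        and tanimoto(target.fingerprint, c.fingerprint) >= ts_min
    ]
    # Secondary filter: keep compounds whose retention proxy is close to the
    # target's (a tolerance window stands in for t_R similarity clustering).
    return [c for c in similar if abs(c.proxy - target.proxy) <= proxy_tol]


# Toy data, invented for the example.
target = Compound("T", frozenset({1, 2, 3, 4}), 2.0)
db = [
    Compound("A", frozenset({1, 2, 3}), 2.5),        # similar, close proxy
    Compound("B", frozenset({1, 2, 3, 4, 5}), 5.0),  # similar, distant proxy
    Compound("C", frozenset({7, 8}), 2.0),           # dissimilar structure
]
training = dual_filter(target, db)
print([c.name for c in training])
```

In this toy case only compound A passes both filters. In a workflow like the one the abstract describes, the retained compounds would then serve as the training set for a regression model such as GA-PLS, which is outside the scope of this sketch.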

Funding

Australian Research Council

Pfizer

Thermo Fisher Scientific Australia

Publication title

Journal of Chromatography A

Volume

1507

Pagination

53-62

ISSN

0021-9673

Department/School

School of Natural Sciences

Publisher

Elsevier Science BV

Place of publication

PO Box 211, Amsterdam, Netherlands, 1000 AE

Repository Status

Restricted

Socio-economic Objectives

Expanding knowledge in the chemical sciences

Keywords

QSRR; prediction accuracy; dual-filtering; retention prediction; similarity searching; HILIC mechanism

Licence

In Copyright
