Linking ordinal log-linear models with correspondence analysis: an application to estimating drug-likeness in the drug discovery process

Zafar, S.; Cheema, S. A.; Beh, E. J.; Hudson, I. L.; Hudson, S. A.; Abell, A. D.

Relation: MODSIM2013, 20th International Congress on Modelling and Simulation. Proceedings of the 20th International Congress on Modelling and Simulation (Adelaide, S.A. 1-6 December, 2013) p. 1945-1951
Relation: http://www.mssanz.org.au/modsim2013
Publisher: Modelling and Simulation Society of Australia and New Zealand
Resource Type: conference paper
Date: 2013
Description: Ordinal log-linear models (OLLMs) are among the most widely used and powerful techniques for modelling association among ordinal variables in categorical data analysis. The parameters of such models are traditionally estimated using iterative algorithms, such as the Newton-Raphson method and iterative proportional fitting. More recent advances involve a non-iterative estimation method that performs equally well for estimating the linear-by-linear association in OLLMs for a two-way table. This paper establishes a link between the Beh-Davy non-iterative estimation method (BDNI) (Beh & Davy, 2003) and the well-known ordinal correspondence analysis (CA) technique for two-dimensional tables. The BDNI estimator of association relies on orthogonal polynomials (OPs), an approach dating from Lancaster (1953) to Beh and Davy (2003). OPs provide insight into the origin and development of non-iterative estimation in OLLMs, as an alternative to popular iterative procedures. The main advantage of OPs is that the resultant parameters enable estimation of the linear, and also quadratic and higher-order, association structures amongst the ordered categories. Ordinal CA was first introduced by Beh (1997). We compare the linear-by-linear BDNI association procedure with the linear-by-linear association method depicted via graphical representation in ordinal CA. To demonstrate this link and theory, we analysed the relationships between predictors of drug-likeness used in drug discovery to filter out small molecules (drugs) that may fail clinical trials. In vitro absorption, distribution, metabolism and elimination (ADME) assays are now being conducted throughout the drug discovery process, from hit to lead optimization (Kerns & Li, 2008).
The analytical community still needs to develop faster and better analytic methods to enhance the 'developability' of drug leads, and to formalize strategies for ADME assessment of candidates in the discovery and pre-clinical stages (Kassel, 2004). Assessing drug-likeness depends on the nature of the relationships between surrogate measures of drug-likeness (aqueous solubility, permeability) and physicochemical properties (lipophilicity, molecular weight (MW)). To date, lipophilicity is expressed quantitatively as log P, the most popular predictor for permeation. We apply our methods to test rules of druggability (Lipinski, 2000). In this study, 1,279 small molecules from Hudson et al. (2012), based on the DrugBank 3.0 database (Knox et al., 2011), a unique chem-informatics resource, are analysed. The pair-wise associations between categorised variants of 2 of the 4 traditional parameters of Lipinski’s rule of five (Ro5), namely MW and log P, and an additional parameter, polar surface area (PSA), introduced by Veber et al. (2002), are shown to differ in magnitude or swap sign across strata, where strata are defined by a molecule’s druggable (Ro5 compliant) versus non-druggable (Ro5 violation) status. The association of log P with MW, assumed to be positive, is shown to: [1] change sign from significantly negative to positive for non-druggable vs druggable strata, when data are tertiled within the stratum and the first level category (0) satisfies the new cutpoints for violation developed by Hudson et al. (2012), i.e. log P ≤ 1.9 and MW ≤ 305, in contrast to Ro5’s cutpoints of log P ≤ 5 and MW ≤ 500; or [2] be lower (positive) for non-druggable vs druggable, for data stratified within quartiles. These findings support recent criticisms about using log P (Bhal et al., 2007) in ADME assessment.
Similarly, the association of PSA with MW, traditionally assumed to be positive, is shown to change sign from significantly negative to significantly positive for non-druggable vs druggable molecules, for data stratified within quartiles, with the first level category (0) satisfying the cutpoints for violation of Hudson et al. (2012), i.e. PSA ≤ 65 and MW ≤ 305, in contrast to conventional cutpoints of 140 and 500, respectively. This study shows that assumed relationships between predictors need to be questioned. Log D, as a distribution coefficient (Bhal et al., 2007), may be a better surrogate than log P.
Subject: ordinal log-linear model; non-iterative estimation; linear-by-linear association parameter; ordinal correspondence analysis; Lipinski’s rule of 5 (Ro5)
Identifier: http://hdl.handle.net/1959.13/1052202
Identifier: uon:15391
Identifier: ISBN:9780987214331
Language: eng
Full Text
Reviewed
