1 Department of Chemistry, Payame Noor University (PNU), P. O. Box 19395-3697, Tehran, Iran

A comparative workflow including linear and non-linear QSAR models was carried out to evaluate the predictive accuracy of the models and to predict the inhibitory activity of a series of aryl-substituted isobenzofuran-1(3H)-ones. The data set of 34 compounds was randomly divided into training and test sets. Molecular descriptors were selected using a genetic algorithm (GA) as the feature selection tool. Linear models based on multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS), and non-linear models based on artificial neural network (ANN), adaptive network-based fuzzy inference system (ANFIS) and support vector machine (SVM) methods, were developed and compared. The accuracy of the models was assessed by leave-one-out cross-validation ($Q_{\mathrm{LOO}}^2$), a Y-randomization test and an external test set of compounds. Six descriptors were selected by GA to develop the predictive models. Among the linear models, the GA-PCR method was more accurate than the rest, with $R_{\mathrm{train}}^2 = 0.883$, $R_{\mathrm{test}}^2 = 0.897$, $R_{\mathrm{adj,train}}^2 = 0.829$, $R_{\mathrm{adj,test}}^2 = 0.849$, $F_{\mathrm{train}} = 24.07$ and $F_{\mathrm{test}} = 34.17$. Among the non-linear models, GA-SVM ($R_{\mathrm{train}}^2 = 0.992$ and $R_{\mathrm{test}}^2 = 0.997$) showed high predictive accuracy for the inhibitory activity. The selected descriptors were found to play major roles in the interpretation of the biological activities of the compounds.
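The validation statistics named above ($Q_{\mathrm{LOO}}^2$, adjusted $R^2$ and the Y-randomization test) can be illustrated with a minimal sketch for a plain MLR model. This is not the authors' code. The data are synthetic placeholders (34 rows, 6 columns, chosen only to mirror the counts in the abstract), and the names `fit_mlr`, `q2_loo`, `adjusted_r2` and `y_randomization` are hypothetical. The standard textbook definitions are assumed, since the abstract does not state the exact formulas used.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict responses from fitted coefficients (intercept first)."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r2(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    return 1.0 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)

def adjusted_r2(r2_value, n, p):
    """Adjusted R^2 for n samples and p predictors (standard definition)."""
    return 1.0 - (1.0 - r2_value) * (n - 1) / (n - p - 1)

def q2_loo(X, y):
    """Leave-one-out Q^2: refit without each compound, predict it, sum the errors."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict_mlr(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def y_randomization(X, y, n_rounds=50, seed=0):
    """R^2 of models refitted on shuffled activities; values should be low."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_rounds):
        y_perm = rng.permutation(y)
        scores.append(r2(y_perm, predict_mlr(fit_mlr(X, y_perm), X)))
    return np.array(scores)

# Synthetic placeholder data (NOT the isobenzofuranone data set).
rng = np.random.default_rng(0)
X = rng.normal(size=(34, 6))                      # 34 "compounds", 6 "descriptors"
true_w = np.array([1.0, -0.5, 0.8, 0.3, -1.2, 0.6])
y = X @ true_w + 0.3 * rng.normal(size=34)        # activity = linear signal + noise

r2_fit = r2(y, predict_mlr(fit_mlr(X, y), X))
r2_adj = adjusted_r2(r2_fit, n=len(y), p=X.shape[1])
q2 = q2_loo(X, y)
r2_random = y_randomization(X, y)
```

A model is usually taken to be free of chance correlation when the shuffled-response $R^2$ values in `r2_random` stay well below both `r2_fit` and `q2`. The same three checks apply to the PCR, PLS and non-linear models once `fit_mlr` and `predict_mlr` are replaced by the corresponding fit and predict routines.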

