# Performance indicators

4.2. Performance indicators

Table 5
Pearson correlation coefficients between patient identification number and other attributes.

| Correlated variables | Pearson correlation coefficients |
|---|---|
| Marital status at diagnosis | −0.052∗ |
| Reason no cancer-directed surgery | 0.112∗∗ |

∗∗ significant at the 1% level.
∗ significant at the 5% level.

The performance measures for classification methods are accuracy, sensitivity, specificity and G-mean. Accuracy is the proportion of all test samples that are correctly classified, sensitivity is the proportion of positive samples that are correctly classified, specificity is the proportion of negative samples that are correctly classified, and G-mean is the geometric mean of sensitivity and specificity. The indicators are computed as follows:

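$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

$$\text{Specificity} = \frac{TN}{TN + FP}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{G-mean} = \sqrt{\text{Sensitivity} \times \text{Specificity}}$$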

where TP (True Positive) and TN (True Negative) are the numbers of positive samples and negative samples that are correctly identified, respectively, while FP (False Positive) and FN (False Negative) are the numbers of negative samples and positive samples that are mistakenly identified, respectively. That is, an FP is a negative sample classified as positive, and an FN is a positive sample classified as negative.
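The four classification indicators follow directly from the confusion counts. The snippet below is a minimal sketch, not the authors' implementation: the function name `classification_indicators` and the example counts are hypothetical, and it assumes that each class contains at least one test sample so that no denominator is zero.

```python
import math


def classification_indicators(tp, tn, fp, fn):
    """Compute sensitivity, specificity, accuracy and G-mean from confusion counts.

    Assumes tp + fn > 0 and tn + fp > 0 (both classes are present in the test set).
    """
    sensitivity = tp / (tp + fn)  # share of positive samples correctly classified
    specificity = tn / (tn + fp)  # share of negative samples correctly classified
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all samples correctly classified
    g_mean = math.sqrt(sensitivity * specificity)  # geometric mean of the two rates
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "g_mean": g_mean,
    }


# Hypothetical counts for an imbalanced test set (50 positives, 950 negatives).
scores = classification_indicators(tp=40, tn=900, fp=50, fn=10)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced data such as this, accuracy is dominated by the majority class, which is one reason G-mean is reported alongside it: it drops sharply when either sensitivity or specificity is low.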

Meanwhile, regression methods are evaluated in terms of three indicators: root mean square error (RMSE), mean absolute error (MAE) and coefficient of determination (R²). Let $\hat{y}_i$ be the predicted value of the ith sample, $y_i$ the corresponding true value, and $n_s$ the number of samples. The three indicators are defined as follows:

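$$\text{RMSE} = \sqrt{\frac{1}{n_s}\sum_{i=1}^{n_s}\left(y_i - \hat{y}_i\right)^2}$$

$$\text{MAE} = \frac{1}{n_s}\sum_{i=1}^{n_s}\left|y_i - \hat{y}_i\right|$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n_s}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n_s}\left(y_i - \bar{y}\right)^2} \tag{11}$$

where $\bar{y}$ is the mean of the true values. R² indicates the extent to which the model explains the variability of the target variable and measures how well new samples are likely to be predicted by the model. The best possible value is 1.0, and it can be negative when the model is very poor.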
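The three regression indicators can be computed directly from paired true and predicted values. The snippet below is a minimal sketch under stated assumptions: the function name `regression_indicators` and the toy data are hypothetical, and R² is taken in its usual form, with the total sum of squares around the mean of the true values in the denominator.

```python
import math


def regression_indicators(y_true, y_pred):
    """Return (RMSE, MAE, R2) for paired lists of true and predicted values.

    Assumes the true values are not all identical, so the total sum of
    squares used by R2 is non-zero.
    """
    n_s = len(y_true)
    residuals = [y - y_hat for y, y_hat in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)  # residual sum of squares
    rmse = math.sqrt(ss_res / n_s)
    mae = sum(abs(r) for r in residuals) / n_s
    y_mean = sum(y_true) / n_s
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical toy data: one reasonable model and one very poor model.
y_true = [3.0, 5.0, 7.0, 9.0]
rmse, mae, r2 = regression_indicators(y_true, [2.0, 5.0, 8.0, 9.0])
_, _, r2_bad = regression_indicators(y_true, [9.0, 7.0, 5.0, 3.0])
print(f"RMSE={rmse:.4f}, MAE={mae:.4f}, R2={r2:.4f}, R2 (poor model)={r2_bad:.4f}")
```

The second call illustrates why R² can be negative: predictions that are worse than simply predicting the mean of the true values give a residual sum of squares larger than the total sum of squares.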

Table 6
Parameters of SRRT-SEM.

| Parameter | Value |
|---|---|
| Initial number of trees | 300 |
| Size of feature subset | 22 |
| Number of feature subsets for screening | 10 |
| Number of initialized solutions | 100 |
| Crossover probability | 1 |
| Mutation probability | 0.05 |
| Threshold of MPEI (for one-stage regression) | 0.46 (fold 1) |

A 10-fold cross validation procedure is adopted to assess the performance of both the classification and the regression methods. K-fold cross validation divides the complete dataset into K disjoint subsets of approximately equal size. Training and testing are conducted K times; each time, a different subset is used for testing and the remaining K−1 subsets are used for training. The mean accuracy over the K runs is then used as the estimate of the accuracy of the method.

The accuracy of K-fold cross validation, $\bar{A}$, is calculated as:

$$\bar{A} = \frac{1}{K}\sum_{k=1}^{K} A_k$$

where $\bar{A}$ is the accuracy of K-fold cross validation, $A_k$ is the accuracy obtained on the kth fold, and K is the number of folds.
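The K-fold procedure can be sketched as follows. This is an illustration only, not the experimental code: the helper names (`k_fold_indices`, `cross_validated_accuracy`) are hypothetical, and a majority-class predictor stands in for the actual learners so that the example stays self-contained.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split sample indices into k disjoint folds of approximately equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(X, y, fit, predict, k=10, seed=0):
    """Average the per-fold accuracy over k train/test rounds."""
    fold_accuracies = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        predictions = predict(model, [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(predictions, test_idx))
        fold_accuracies.append(correct / len(test_idx))
    return sum(fold_accuracies) / k


# Stand-in learner (an assumption for illustration): always predicts the
# majority class seen in the training labels.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, X_test):
    return [model for _ in X_test]


X = [[i] for i in range(100)]
y = [1 if i < 30 else 0 for i in range(100)]
folds = k_fold_indices(len(X), 10)
mean_accuracy = cross_validated_accuracy(X, y, fit_majority, predict_majority, k=10)
print(f"mean accuracy over 10 folds: {mean_accuracy:.2f}")
```

Each sample appears in exactly one test fold, so every sample is used for testing once and for training K−1 times.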

4.3. Experimental design and parameter tuning

To examine the performance of the proposed two-stage model, 10-fold cross validation is performed in all the comparative experiments. The result reported for each fold is the average over 30 independent runs, which reduces the influence of run-to-run random variation on the comparison.

For the classification stage, the proposed tree-based imbalanced ensemble classification method is compared with a classification tree combined with undersampling and with oversampling.

For the regression stage, SRRT-SEM is compared against the regression tree, which is the base learner of the proposed method; three popular ensemble regression methods, namely the random forest regressor, the gradient boosting regression tree and the AdaBoost regression tree; and two other well-known ensemble methods, the random subspace method and GEFS. GEFS is a popular ensemble generation algorithm that uses a genetic algorithm to select neural-network base learners trained on different feature subspaces. For a fair comparison, we extended GEFS by replacing the neural networks with decision trees.

In the proposed method, semi-random feature selection and base learner preselection are conducted when generating the pool of semi-random regression trees. The size of the feature subsets, the number of feature subsets for screening and the threshold of MPEI are three important parameters. After extensive experiments, the parameters of the proposed method were set as summarized in Table 6. In each fold, the threshold of MPEI is adjusted according to the performance of the regression trees generated from the training set of that fold.
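The two resampling baselines used with the classification tree can be illustrated with a minimal sketch. The text does not state which undersampling and oversampling variants are used, so the snippet assumes the simplest random versions; the function names and the toy data are hypothetical, labels are assumed to be binary (1 = positive, 0 = negative), and both classes are assumed to be present.

```python
import random


def _minority_majority(y):
    """Return (minority_indices, majority_indices) for binary labels 0/1."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    return (pos, neg) if len(pos) <= len(neg) else (neg, pos)


def random_undersample(X, y, seed=0):
    """Randomly drop majority-class samples until both classes have equal size."""
    rng = random.Random(seed)
    minority, majority = _minority_majority(y)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [X[i] for i in kept], [y[i] for i in kept]


def random_oversample(X, y, seed=0):
    """Randomly duplicate minority-class samples until both classes have equal size."""
    rng = random.Random(seed)
    minority, majority = _minority_majority(y)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    kept = minority + majority + extra
    rng.shuffle(kept)
    return [X[i] for i in kept], [y[i] for i in kept]


# Hypothetical imbalanced training set: 4 positives, 16 negatives.
X = [[i] for i in range(20)]
y = [1] * 4 + [0] * 16
X_under, y_under = random_undersample(X, y)
X_over, y_over = random_oversample(X, y)
print(len(y_under), y_under.count(1), len(y_over), y_over.count(1))
```

In a cross-validation setting, resampling of this kind would normally be applied to the training folds only, so that the test fold keeps the original class distribution.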