The input of iMaxDriver algorithm is

    2020-08-28


    The input of the iMaxDriver algorithm is the graph representing the TRN and a list containing the threshold values. The output of the algorithm is a list of genes sorted by their coverage count in descending order. Algorithm 1, which is provided for weighted graphs, begins in line 1 by creating an empty list of tuples, each containing a gene name and its coverage count. In line 3, we calculate MaxIncome as the maximum, over all nodes, of the sum of the weights of incoming edges (equation (4)). In lines 4-6, the edge weights are normalized (equation (4)). Afterwards, in lines 7-10, the FindCoverage function is called for each node in the graph, and the returned coverage count is stored in Result. Finally, in line 11, the Result list is sorted by CoverageCount in descending order.
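The steps described for Algorithm 1 can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the dictionary-based graph layout, the function names `normalize_by_max_income` and `rank_genes`, and the reading of the normalization step as "divide every weight by MaxIncome" are assumptions, since equation (4) is not reproduced here. The coverage routine is passed in as a callable so the sketch stays independent of any particular FindCoverage implementation.

```python
from typing import Callable, Dict, List, Tuple

# Assumed layout: graph[target][source] = weight of the edge source -> target.
# Every node appears as a key, even if it has no incoming edges.
Graph = Dict[str, Dict[str, float]]


def normalize_by_max_income(graph: Graph) -> Graph:
    """Divide every edge weight by MaxIncome, the largest incoming-weight sum.

    Assumption: this is what the normalization step amounts to.
    """
    max_income = max((sum(inc.values()) for inc in graph.values()), default=0.0)
    if max_income == 0:
        # No weighted edges at all: return an unchanged copy.
        return {v: dict(inc) for v, inc in graph.items()}
    return {v: {u: w / max_income for u, w in inc.items()}
            for v, inc in graph.items()}


def rank_genes(
    graph: Graph,
    thresholds: Dict[str, float],
    find_coverage: Callable[[Graph, Dict[str, float], str], float],
) -> List[Tuple[str, float]]:
    """Return (gene, coverage count) tuples sorted by coverage, descending."""
    normalized = normalize_by_max_income(graph)
    result = [(gene, find_coverage(normalized, thresholds, gene))
              for gene in normalized]
    result.sort(key=lambda item: item[1], reverse=True)
    return result
```

Storing incoming edges per target node keeps both the MaxIncome computation and the later activation test to a single dictionary lookup per node.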
    ACCEPTED MANUSCRIPT
    In Algorithm 2, which is provided for non-weighted graphs, line 1 creates an empty list of tuples, each containing a gene name and its coverage (the number of activated genes). In line 3, a loop is defined to run the coverage discovery 100 times, and in each iteration, lines 4-6 assign new random values to the edges. The edge weights are regenerated randomly in every iteration to overcome the effect of any single random assignment on the final results. Next, to normalize the randomly generated edge weights, a for loop in line 7 iterates over all nodes and normalizes their incoming edges. In lines 8-9, the sum of the weights of the incoming edges of the current node is calculated. In lines 10-12, the weight of each incoming edge of that node is divided by this Sum value according to equation (5). Next, in lines 14-16, the core coverage-finding function (FindCoverage) is called for each node, and the results are integrated by averaging the coverage values.
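A rough sketch of the randomized procedure described for Algorithm 2 is given below. The function name `average_coverage`, the edge-list input format, the uniform [0, 1) weight distribution, and the optional `seed` argument are illustrative assumptions; the text only states that weights are random, normalized per node by the sum of incoming weights, and that coverage is averaged over 100 runs. The coverage routine is again supplied by the caller.

```python
import random
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Graph = Dict[str, Dict[str, float]]  # graph[target][source] = weight


def average_coverage(
    nodes: Iterable[str],
    edges: List[Tuple[str, str]],          # (source, target) pairs, unweighted
    thresholds: Dict[str, float],
    find_coverage: Callable[[Graph, Dict[str, float], str], float],
    iterations: int = 100,
    seed: Optional[int] = None,            # hypothetical, for reproducibility
) -> Dict[str, float]:
    """Average each gene's coverage over repeated random edge weightings."""
    rng = random.Random(seed)
    nodes = list(nodes)
    totals = {v: 0.0 for v in nodes}

    for _ in range(iterations):
        # Assign a fresh random weight to every edge.
        graph: Graph = {v: {} for v in nodes}
        for source, target in edges:
            graph[target][source] = rng.random()

        # Normalize: divide each incoming weight by the node's incoming sum.
        for incoming in graph.values():
            total = sum(incoming.values())
            if total > 0:
                for source in incoming:
                    incoming[source] /= total

        # Run the coverage routine for every node and accumulate.
        for v in nodes:
            totals[v] += find_coverage(graph, thresholds, v)

    return {v: totals[v] / iterations for v in nodes}
```

After this normalization, the incoming weights of every node that has at least one incoming edge sum to 1, which puts them on a comparable scale with the thresholds regardless of the node's in-degree.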
    The FindCoverage function is provided in Algorithm 3. In this algorithm, the weighted graph and the list of threshold values are the inputs, and the coverage count (ActiveCount) is the output. At the beginning, in line 1, the gene under evaluation is considered the only active gene and is assigned to ActiveNodes. Subsequently, in line 2, the number of genes currently activated by this gene is set to zero. The algorithm loops through lines 3-14 until an iteration produces no new gene activation; this is expressed by the while statement in line 3. In line 4, an empty list (NewActiveNodes) is created for the genes that are going to be activated in this iteration. Subsequently, in lines 7-10, each inactive gene is activated if the sum of its incoming edge weights from currently active genes, minus the threshold of the gene, is larger than α. Finally, in line 13, the newly activated nodes (NewActiveNodes) are added to the current active nodes (ActiveNodes).
    In line 9, we use α as a parameter to alleviate the differences between threshold values and edge weights that would otherwise prevent the algorithm from working properly. We set α = 0.15 to obtain the best result. A larger value of α causes fewer genes to be activated at the final step of the algorithm, and vice versa. If α is too small, all of the genes will be activated, and if α is too large, none of the genes will be activated; both cases result in trivial outcomes.
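The activation loop and the role of α described above might look like the following sketch. It assumes the same dictionary layout as before (incoming edges stored per target gene) and a hypothetical function name `find_coverage`; the toy graph and threshold values are made up purely to illustrate how the coverage count shrinks as α grows, and are not taken from the paper's data.

```python
from typing import Dict, Set

Graph = Dict[str, Dict[str, float]]  # graph[target][source] = weight


def find_coverage(graph: Graph, thresholds: Dict[str, float],
                  seed_gene: str, alpha: float = 0.15) -> int:
    """Count the genes activated, directly or indirectly, by seed_gene."""
    active: Set[str] = {seed_gene}   # ActiveNodes
    active_count = 0                 # ActiveCount, excludes the seed itself
    changed = True
    while changed:                   # stop after an iteration with no activation
        new_active: Set[str] = set()  # NewActiveNodes
        for gene, incoming in graph.items():
            if gene in active:
                continue
            # Sum of incoming weights coming from currently active genes.
            income = sum(w for src, w in incoming.items() if src in active)
            if income - thresholds[gene] > alpha:
                new_active.add(gene)
        active |= new_active
        active_count += len(new_active)
        changed = bool(new_active)
    return active_count


# Toy example (made-up values): A -> B, A -> C, B -> C.
toy_graph = {"A": {}, "B": {"A": 0.9}, "C": {"B": 0.6, "A": 0.1}}
toy_thresholds = {"A": 0.5, "B": 0.5, "C": 0.4}
sweep = [find_coverage(toy_graph, toy_thresholds, "A", a)
         for a in (0.0, 0.15, 0.35, 0.5)]
print(sweep)  # [2, 2, 1, 0]
```

New activations are collected in a separate set and merged only at the end of each pass, so a gene activated in one pass can influence its targets only from the next pass onward, matching the description of NewActiveNodes being added in line 13.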
    2.3 Evaluation Method
    We evaluated iMaxDriver by comparing its results with fifteen popular computational CDG prediction methods. The lists of genes introduced as CDGs by the fifteen computational methods were obtained from DriverDB v2 [24], so that all methods are evaluated with the same input. The details of the computational methods used for evaluation are available in Table 2. These lists are reported on the DriverDB webpage (http://driverdb.tms.cmu.edu.tw/driverdbv2/) based on three different cancer datasets, namely breast invasive carcinoma (BRCA) [25], lung squamous cell carcinoma (LUSC) [26] and colon adenocarcinoma (COAD) [27], which are parts of The Cancer Genome Atlas (TCGA) [28]. We also retrieved the lists of genes identified as CDGs by each of the fifteen methods for benchmarking.
    Table 2 The details of the computational methods used for comparison with iMaxDriver
    Method Name       | Feature(s)                               | Data type
    ActiveDriver [3]  | Protein phosphorylation signaling sites  | Single nucleotide variants related to phosphorylation
    CoMDP [12]        | Co-occurred mutated driver pathways      | Mutation data
    Dendrix [9]       | Coverage and exclusivity of mutations    | Mutation data
    DawnRank [4]      | Downstream expression impact in network  | GE and network and mutation data
    e-Driver [5]      | Protein functional region mutation rates | Mutation profiles
    DriverNet [13]