For the APA analyses, APADB data were filtered for CS identified only in human peripheral blood samples, and supporting read alignments for each CS were visualized in a genome browser. This database was built by 3′ end sequencing using massive analysis of complementary DNA ends, a high-throughput next-generation sequencing-based technique (Müller et al., 2014). Additionally, APASdb allowed mapped CS to be counted across all transcript variants of each gene in 22 normal human tissue samples (search dataset: hg19 human-all22-tissues). This database, based on the sequencing APA sites (SAPAS) method reported previously (Sun et al., 2012), provided both the position and the usage quantification of a given alternative CS among transcripts derived from the same gene by computing their corresponding normalized reads (You et al., 2015).
5.3. MicroRNA-mediated regulation analysis
To investigate the miRNAs that may regulate the expression of selected tumor suppressor genes (n = 81) and oncogenes (n = 17), experimentally validated data on miRNA-target gene interactions in humans were collected from the miRTarBase release 6.0 (Chou et al., 2016), starBase v2.0 (Li et al., 2014), TarBase v5.0 (Sethupathy et al., 2016), and miRecords v4.0 (Xiao et al., 2009) databases. Data derived from miRTarBase were restricted to interactions classified as functional, including those with weak experimental evidence, while starBase data were filtered to keep only interactions predicted by two or more prediction programs and supported by at least one experiment (“low stringency” parameter). For the other databases, all available interactions were included in the analysis.
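The source-specific inclusion rules above can be sketched in a few lines of Python. This is a minimal illustration, not the in-house R scripts used in the study: the function name `keep_validated`, the source labels, and the record fields `support_type`, `n_programs`, and `n_experiments` are hypothetical and assume each database export has already been parsed into dictionaries.

```python
def keep_validated(source, rec):
    """Return True if a validated interaction passes the rule for its source.

    `source` and the record keys are hypothetical names for this sketch.
    """
    if source == "mirtarbase":
        # Keep functional interactions, including those with weak evidence;
        # labels starting with "Non-Functional" are rejected.
        return rec["support_type"].startswith("Functional")
    if source == "starbase":
        # "Low stringency": at least two prediction programs, one experiment.
        return rec["n_programs"] >= 2 and rec["n_experiments"] >= 1
    # TarBase and miRecords: all available interactions are kept.
    return True
```

Keeping the rules in one function makes it explicit that only two of the four sources were filtered before the interactions were pooled.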
Experimental data were complemented by computational target prediction tools, namely TargetScan v7.2 (Agarwal et al., 2015), Diana MicroT-CDS (Paraskevopoulou et al., 2013), and miRanda-mirSVR (August 2010 Release) (Betel et al., 2010). To control false positive rates and restrain the large list of predicted miRNAs, the following filtering criteria were adopted: (a) for TargetScan, only interactions involving 8mer, 7mer and 6mer conserved miRNA sites and with a Context++ score lower than −0.2 were considered; (b) for Diana MicroT-CDS, only predictions with a score higher than 0.9 were kept; (c) for miRanda-mirSVR, only interactions with a mirSVR score lower than −0.1 (regarded as a “good mirSVR score”) involving conserved miRNAs were included; (d) the final set of predicted interactions was defined by miRNA-target gene pairs suggested by at least two computational tools.
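Criteria (a) to (d) amount to a per-tool threshold followed by a consensus vote, which the following Python sketch illustrates. The thresholds are those stated in the text; everything else (tool labels, record fields, the toy miRNA and gene names) is hypothetical and does not reproduce the authors' R scripts or the actual export formats of the three tools.

```python
from collections import defaultdict

# Per-tool filters; thresholds come from the text, field names are assumed.
def passes_targetscan(rec):
    return (rec["site_type"] in {"8mer", "7mer", "6mer"}
            and rec["conserved"]
            and rec["context_score"] < -0.2)

def passes_microt(rec):
    return rec["score"] > 0.9

def passes_miranda(rec):
    return rec["conserved"] and rec["mirsvr_score"] < -0.1

FILTERS = {
    "targetscan": passes_targetscan,
    "microt": passes_microt,
    "miranda": passes_miranda,
}

def consensus_pairs(predictions, min_tools=2):
    """Return (miRNA, gene) pairs supported by at least `min_tools` tools."""
    tools_by_pair = defaultdict(set)
    for tool, rec in predictions:
        if FILTERS[tool](rec):
            tools_by_pair[(rec["mirna"], rec["gene"])].add(tool)
    return {pair for pair, tools in tools_by_pair.items()
            if len(tools) >= min_tools}

# Toy records with placeholder identifiers (not real data).
toy_predictions = [
    ("targetscan", {"mirna": "miR-A", "gene": "GENE1", "site_type": "8mer",
                    "conserved": True, "context_score": -0.35}),
    ("microt", {"mirna": "miR-A", "gene": "GENE1", "score": 0.95}),
    ("miranda", {"mirna": "miR-A", "gene": "GENE2", "conserved": True,
                 "mirsvr_score": -0.5}),
    ("microt", {"mirna": "miR-A", "gene": "GENE2", "score": 0.85}),
]
```

In this toy input only the miR-A/GENE1 pair survives, because the second MicroT-CDS record falls below the 0.9 cutoff and leaves miR-A/GENE2 with a single supporting tool.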
Finally, the validated and predicted miRNA-target gene interactions collected were filtered to identify interactions whose target gene was either a tumor suppressor gene or an oncogene included in this study. Moreover, whenever possible, miRNAs were clustered into families according to the miRBase 21 annotations (some miRNAs do not have a family classification). These analyses were conducted using in-house R scripts (further information upon request). To better visualize the overlap between validated and predicted data, regulatory networks were built and rendered with Cytoscape version 3.1.0 (Shannon et al., 2003). All miRNA and gene identifiers were mapped to miRBase 21 and HUGO Gene Nomenclature Committee (HGNC) standards, respectively, to maintain uniformity and enable consistent data integration across the different sources.
5.4. Statistical analyses
SPSS version 18.0 (IBM) was used for data handling and statistical analyses. Fisher's exact test was employed to compare the frequencies of the different PAS hexamer types between oncogenes and tumor suppressor genes. The same test was applied to identify miRNAs and miRNA families overrepresented in the regulation of the groups of oncogenes and tumor suppressors studied, i.e., those with a greater number of target genes among the genes of interest included in this study than would be expected by chance. The complete list of validated and predicted interactions collected from the aforementioned sources, without filtering for interactions involving the CPGs evaluated in this study, was adopted as the background set in this over-representation analysis. Further details on the composition of the background set are provided in Table S2. Statistical tests were two-sided, and P-values < 0.01 were considered statistically significant.
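For readers who wish to reproduce the test outside SPSS, a two-sided Fisher's exact test on a 2 × 2 table can be sketched in plain Python. This is an illustration under stated assumptions: the two-sided P-value is taken as the sum of the probabilities of all tables no more likely than the observed one (a common definition, which may differ in detail from the SPSS implementation), and the table layout suggested in the comments is one plausible way to set up the over-representation analysis, not the layout confirmed by the study.

```python
from math import comb

ALPHA = 0.01  # significance threshold stated in the text

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]].

    Assumed layout for over-representation (hypothetical):
      rows    = interactions of the miRNA of interest / of all other miRNAs
      columns = target is a gene of interest / any other background gene
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at most as probable as the observed one; the small
    # tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

def is_significant(a, b, c, d, alpha=ALPHA):
    return fisher_exact_two_sided(a, b, c, d) < alpha
```

The function relies only on the standard library, so it can serve as an independent check on the P-values reported by a statistics package.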