Research Projects

Advances in high-throughput transcriptomic, genomic, and deep sequencing technologies have generated a flood of data in public and private repositories. However, laboratory discoveries based on the analysis of these data have met with limited success. Our laboratory applies a multidisciplinary approach, combining bioinformatics, molecular genetics, cancer cell biology, and translational studies, to discover driver genetic and epigenetic aberrations and to qualify viable cancer targets on the basis of next-generation sequencing and genomic profiling data.

A. Identification of Oncogene Targets for the Development of Precision Therapeutics in Breast Cancer

Our laboratory's research focus is to discover and characterize novel therapeutic targets and predictive biomarkers in breast cancer for the development of precision therapeutics. By interrogating multidimensional genomic datasets with a drug-target database and a “concept signature” analysis that we developed to reveal oncogenes (Nature Biotechnology, 2009), we identified several key oncogene targets deregulated by genomic rearrangements or amplifications in aggressive breast cancers. The clinical relevance and biological function of these targets, as well as their role in breast cancer therapeutic resistance, are being investigated. These projects have been funded by the National Cancer Institute (two active R01 awards), the Department of Defense Congressionally Directed Medical Research Programs (Idea Award and two Postdoctoral Fellowship Awards), the Susan G. Komen for the Cure Foundation (two Postdoctoral Fellowship Awards), the Nancy Owens Memorial Foundation, the Breast Cancer Research Foundation, and the Commonwealth of Pennsylvania. We expect that our discoveries will yield novel insights into the recurring genetic and epigenetic abnormalities that lead to breast cancer, and establish robust targets for effective and personalized therapies.

B. Characterization of Pathological Recurrent Gene Fusions in Breast Cancer and Other Solid Tumors

The discovery of TMPRSS2-ETS fusions in ~70% of prostate tumors and EML4-ALK in ~7% of lung cancers revealed gene fusions as a crucial class of genetic lesions driving epithelial tumorigenesis. To examine the key characteristics that assist in the discovery of recurrent gene fusions in solid tumors, we performed a multi-dimensional characterization of known cancer-related gene fusions. Placing the array of cancer genes in the context of a compilation of “molecular concepts”, including molecular interactions, gene annotations, and pathways, revealed the “signature concepts” that define the genes driving cancer initiation and progression. Using this information, we developed a concept signature (ConSig) technology that nominates biologically important genetic aberrations from high-throughput data by assessing their association with molecular concepts characteristic of cancer genes.
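The idea of ranking genes by their association with concepts characteristic of cancer genes can be illustrated with a toy sketch. This is not the published ConSig formula: the weighting scheme (the fraction of a concept's member genes that are known cancer genes), the gene and concept names, and both function names are assumptions made purely for illustration.

```python
from collections import defaultdict

def concept_weights(gene_concepts, cancer_genes):
    """Weight each concept by the fraction of its member genes that are known cancer genes."""
    members = defaultdict(set)
    for gene, concepts in gene_concepts.items():
        for concept in concepts:
            members[concept].add(gene)
    return {c: len(genes & cancer_genes) / len(genes) for c, genes in members.items()}

def concept_score(gene, gene_concepts, weights):
    """Average weight of the concepts annotated to a gene (0.0 if the gene is unannotated)."""
    concepts = gene_concepts.get(gene, ())
    if not concepts:
        return 0.0
    return sum(weights[c] for c in concepts) / len(concepts)

# Hypothetical annotations; real input would come from interaction,
# annotation, and pathway databases.
gene_concepts = {
    "GENE_A": {"kinase activity", "MAPK pathway"},
    "GENE_B": {"kinase activity"},
    "GENE_C": {"structural protein"},
    "GENE_D": {"MAPK pathway", "structural protein"},
}
cancer_genes = {"GENE_A", "GENE_B"}  # hypothetical reference set

weights = concept_weights(gene_concepts, cancer_genes)
ranked = sorted(
    gene_concepts,
    key=lambda g: concept_score(g, gene_concepts, weights),
    reverse=True,
)
print(ranked)
```

In this toy version a gene contributes to the weights used to score it; a real analysis would likely hold the scored gene out of the reference set to avoid that circularity.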

To incorporate high-throughput genomic data, we analyzed the genomic imbalances associated with known gene fusions and found that recurrent gene fusions exhibit distinctive patterns of copy number alterations corresponding to distinct portions of the fusion partner genes. We formulated this pattern as the “fusion breakpoint principle” and developed a genome-wide breakpoint mapping analysis to identify recurrent unbalanced rearrangements from copy number data. This principle also laid the foundation for an amplification breakpoint analysis (ABRA) to discover amplified gene fusions in cancer from copy number data (Cancer Discovery 2011).
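A minimal sketch of how intragenic copy number breakpoints might be mapped and counted across samples is given here. The segment format (chromosome, start, end, log2 ratio), the 0.3 shift threshold, and all gene names, coordinates, and function names are assumptions for illustration, not the laboratory's implementation.

```python
from collections import Counter

def breakpoints(segments, min_shift=0.3):
    """Return (chrom, position) wherever copy number shifts between adjacent segments."""
    segs = sorted(segments)
    found = []
    for (c1, _s1, _e1, v1), (c2, s2, _e2, v2) in zip(segs, segs[1:]):
        if c1 == c2 and abs(v2 - v1) >= min_shift:
            found.append((c1, s2))
    return found

def genes_with_breakpoints(segments, genes, min_shift=0.3):
    """Names of genes whose body contains at least one copy number breakpoint."""
    hits = set()
    for chrom, pos in breakpoints(segments, min_shift):
        for name, (g_chrom, g_start, g_end) in genes.items():
            if chrom == g_chrom and g_start < pos < g_end:
                hits.add(name)
    return hits

def recurrence(samples, genes, min_shift=0.3):
    """Count, per gene, the number of samples with an intragenic breakpoint."""
    counts = Counter()
    for segments in samples.values():
        counts.update(genes_with_breakpoints(segments, genes, min_shift))
    return counts

# Hypothetical gene coordinates and segmented copy number profiles.
genes = {"GENE_X": ("chr1", 1000, 5000), "GENE_Y": ("chr1", 8000, 9000)}
samples = {
    "S1": [("chr1", 0, 3000, 0.0), ("chr1", 3000, 10000, 0.8)],
    "S2": [("chr1", 0, 2500, 0.1), ("chr1", 2500, 10000, 0.9)],
    "S3": [("chr1", 0, 8500, 0.0), ("chr1", 8500, 10000, 0.1)],  # shift too small
}
counts = recurrence(samples, genes)
print(counts.most_common())
```

Genes that recur at the top of such a count are candidates for unbalanced rearrangements; which side of the breakpoint is gained or lost would then indicate which portion of the gene is retained.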

Based on these principles, we then developed a powerful integrative pipeline called “Fusion Zoom” to reveal recurrent pathological gene fusions from RNA sequencing data (Figure 2). We postulate that the detection of authentic driver gene fusions would be greatly improved by applying more sensitive parameters to comprehensively capture the authentic fusion sequences from the RNAseq data, and by integrating distinct types of genomic data to prioritize the driving fusion events based on the aforementioned principles. The Fusion Zoom pipeline detects recurrent chimeras potentially encoding in-frame protein products from RNAseq data, catalogs the unbalanced breakpoints at the genomic loci of these fusion partner genes from copy number data, and prioritizes pathological gene fusions through the ConSig analysis.
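The three stages described above (in-frame recurrent chimeras, copy number breakpoint support, concept-based prioritization) can be sketched as a simple filter-and-rank step. The record fields, the recurrence cutoff of two samples, the sort order, and the example values are all assumptions for illustration; the page does not specify how Fusion Zoom combines these lines of evidence.

```python
from dataclasses import dataclass

@dataclass
class FusionCandidate:
    five_prime: str
    three_prime: str
    in_frame: bool             # chimera predicted to encode an in-frame product
    n_samples: int             # samples with supporting RNA-seq reads
    n_breakpoint_samples: int  # samples with an unbalanced breakpoint in a partner gene
    concept_score: float       # hypothetical concept-signature score

def prioritize(candidates, min_samples=2):
    """Keep recurrent in-frame candidates, then rank by breakpoint support and score."""
    kept = [c for c in candidates if c.in_frame and c.n_samples >= min_samples]
    return sorted(
        kept,
        key=lambda c: (c.n_breakpoint_samples > 0, c.concept_score, c.n_samples),
        reverse=True,
    )

# Hypothetical candidates with made-up values.
candidates = [
    FusionCandidate("GENE_A", "GENE_B", True, 5, 3, 0.9),
    FusionCandidate("GENE_C", "GENE_D", True, 4, 0, 0.95),
    FusionCandidate("GENE_E", "GENE_F", False, 6, 2, 0.8),  # out of frame
    FusionCandidate("GENE_G", "GENE_H", True, 1, 1, 0.7),   # not recurrent
]
for c in prioritize(candidates):
    print(c.five_prime, c.three_prime, c.concept_score)
```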

The above analyses have led to the discovery of recurrent ESR1-CCDC170 fusions in more aggressive breast cancers (Figure 3) (Nature Communications 2014), recurrent NFE2 rearrangements in lung adenocarcinoma (Nature Biotechnology 2009), and oncogenic KRAS gene fusions in a rare subset of prostate cancers (Cancer Discovery 2012). Our ongoing work is to further develop and apply this integrative platform to discover recurrent gene fusions in breast cancer and other major solid tumors. Paired-end RNA sequencing and whole genome sequencing data from The Cancer Genome Atlas (TCGA) will be leveraged to discover chimeric transcripts and genomic rearrangements, and the large body of cancer genomic data in the public domain will be interrogated to facilitate the prioritization of fusion candidates. This project is funded by the National Cancer Institute (R01 award) and the Department of Defense (postdoctoral fellowship).

C. Discovery of an Amplified Kinase Target in Aggressive Luminal Breast Cancers

More aggressive and therapy-resistant ER-positive breast cancers remain a great clinical challenge. To identify new kinase targets for effective intervention, we applied our integrative ConSig-Amp analysis (Figure 4a) to the multi-dimensional genomic datasets from The Cancer Genome Atlas. This analysis revealed tousled-like kinase 2 (TLK2) as a lead candidate kinase target that is amplified in ~10.5% of ER-positive breast tumors. The resulting overexpression of TLK2 is more pronounced in aggressive and advanced tumors, and correlates with worse clinical outcome regardless of endocrine therapy (Figure 4b). Ectopic expression of TLK2 enhances the aggressiveness of breast cancer cells, which may involve EGFR/SRC/FAK signaling. Conversely, TLK2 inhibition selectively inhibits the growth of TLK2-high breast cancer cells, downregulates ERα, BCL2, and SKP2, impairs G1/S cell-cycle progression (Figure 4c), induces apoptosis, and significantly improves progression-free survival in vivo (Figure 4d). We have identified two potential TLK2 inhibitors that could serve as backbones for future drug development. This study represents the first comprehensive analysis of TLK2 function in aggressive luminal breast cancers (Nature Communications, in press).

In addition, we discovered that TLK2 overexpression impairs Chk1/2-induced DNA-damage checkpoint signaling, leading to a G2/M checkpoint defect, delayed DNA repair, and increased chromosomal instability. This is the first observation linking TLK2 function to chromosomal instability. The finding yields new insight into the deregulated DNA damage pathway and the increased genomic instability in aggressive luminal breast cancers (Mol. Cancer Res. pii: molcanres.0161.2016).

Together, these findings present TLK2 amplification as an attractive genomic target in aggressive ER-positive breast cancers. Of note, the latest phosphoproteomic study of TCGA breast tumors by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) independently identified TLK2 as an amplicon-associated, highly phosphorylated kinase in luminal breast cancer (Nature 2016 doi:10.1038/nature18003). This further supports the significance of TLK2 amplification in luminal breast cancer.


Figure 4. Identification of TLK2 as an amplified kinase target in aggressive luminal breast cancer. (a) The ConSig-Amp bioinformatics workflow for discovering therapeutically relevant oncogene targets in cancer at genome-wide scale, based on TCGA copy number and RNAseq datasets. (b) Kaplan-Meier plots based on multiple gene expression datasets showing the correlation of TLK2 overexpression with the outcome of systemically untreated or endocrine-treated ER+ breast cancer patients. (c) A schematic of normal G1/S cell cycle signaling and its alterations following TLK2 inhibition (black arrows). (d) The effect of TLK2 inhibition in MCF7 xenograft tumors inducibly expressing a TLK2 shRNA, in the presence or absence of concomitant tamoxifen treatment. The figure shows a Kaplan-Meier plot comparing the progression-free survival of the different treatment groups.

D. Genome-Wide Detection of Cancer-Specific Antigen Targets Using an Integrated Computational and Laboratory Technology

Tumor-specific antigens (TSAs) have been widely adopted in the clinic as diagnostic and therapeutic targets in cancer. In our previous research project aimed at genome-wide detection of immunological targets, we analyzed the antigens widely adopted as clinical targets and observed that they usually present a distinctive heterogeneous gene expression profile in large-scale microarray datasets (Figure 5a). We therefore developed the Heterogeneous Expression Profile Analysis (HEPA), which preferentially identifies clinically useful tumor antigens from the human genome. We then evaluated the immunogenicity of the TSAs by detecting specific autoantibodies in cancer patients. To deal with the large number of candidates, we developed a novel assay called Protein A/G based Reverse Serological Evaluation (PARSE), in which radio-labeled, in vitro translated proteins are used as probes for the presence of serum antibodies (Figure 5b). This allows quick detection of autoantibodies across a wide array of serum samples without the need to produce purified recombinant proteins. Further, in this assay the in vitro translated tumor antigens retain the natural protein conformation and post-translational modifications, thus generating a precise picture of autoantibody responses against these antigens in cancer. Seven of the twelve novel antigens evaluated by PARSE elicited highly tumor-specific autoantibody responses in 4-15% of patients with selected cancers, resulting in distinctive autoantibody signatures in lung and stomach cancers.
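One way to picture a heterogeneous expression profile is a gene that stays near normal levels in most tumors but is strongly overexpressed in a subset. The sketch here summarizes that pattern with two numbers; it is an assumed stand-in, not the HEPA scoring method, and the function name, the use of the highest normal value as a ceiling, and the expression values are all hypothetical.

```python
from statistics import mean

def heterogeneous_profile(tumor, normal):
    """Summarize subset overexpression relative to normal tissue.

    Returns (fraction of tumor samples above the highest normal value,
             mean excess over that ceiling among those samples).
    """
    ceiling = max(normal)
    over = [x - ceiling for x in tumor if x > ceiling]
    fraction = len(over) / len(tumor)
    excess = mean(over) if over else 0.0
    return fraction, excess

# Hypothetical expression values (arbitrary units).
tumor = [1.0, 1.0, 9.0, 1.0, 11.0]
normal = [1.0, 2.0, 1.0]
fraction, excess = heterogeneous_profile(tumor, normal)
print(fraction, excess)
```

A gene with a modest fraction but a large excess would fit the subset-restricted pattern described above, whereas a gene elevated uniformly across all tumors would show a fraction close to 1.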

Together, HEPA and PARSE constitute an integrative computational-experimental technology for detecting the cancer-specific immunome. In addition, the HEPA platform can also be modified and trained to nominate tumor-specific membrane targets, which are considered near-term drug candidates. Combined with a membrane localization database, HEPA can quickly reveal the membrane proteins specific to certain tumor entities. The ConSig technology can then be applied to evaluate the functional significance of putative membrane targets in cancer progression (Cancer Research 2012).


Wang Laboratory @ University of Pittsburgh Cancer Institute.