Welcome to NBBC

NBBC: A Non-B DNA Burden Explorer in Cancer

Multi-level Queries, Motif Clustering and Non-B Burden heterogeneity.



Non-B DNA and Genomic instability in Cancer

Alternate (non-B) DNA-forming structures, such as Z-DNA, G-quadruplexes, and mirror repeats, have demonstrated a potential role in cancer etiology. Non-B DNA-forming sequences have been found to stimulate genetic instability in human cancer genomes, implicating them in the development of cancer and other genetic diseases. Although several non-B prediction databases have been published, they lack the ability to both analyze and visualize non-B data within the context of cancer gene signature sets.

Here, we introduce NBBC, a Non-B DNA Burden Explorer in Cancer, for analyzing and visualizing non-B motif data. We introduce 'Non-B Burden' as a metric that summarizes the prevalence of non-B DNA motifs at the gene level or within a genomic region. The non-B DNA data are collected from the Non-B DB v2.0 database.



I. Computation of Non-B Burden

(Goal: To query non-B forming regions in genes and calculate non-B burdens.)

Based on the user's input gene list, the computation module calculates the non-B burden composition for the gene signature and offers multiple normalizations to enable comparisons across genes or non-B structures.


II. Module1: Gene Screen

(Goal: To help users select a subset of genes with high genomic instability, as evaluated by non-B burden.)

The gene layer analyzes non-B burdens and provides several visualizations for descriptive analysis of burden values, burden distribution, and burden-based gene clustering. A stacked barplot visualizes the total non-B burden. A bubble plot shows the non-B burden by gene and type. A burden clustering function is also available in heatmap format. In addition, the “Burden in Batch” (BiB) module is provided to process multiple queries in batch.


III. Module2: Motif Screen

(Goal: To further select high-quality motifs that are more likely to form non-B structures in the genes of interest and to provide the specific sequences that can be used for wet lab experiments.)

The motif layer performs sequence-level motif clustering for high-quality non-B motif detection. For example, motif length and guanine content (%G) are two major factors in determining how likely a motif is to form a non-B structure. We employ unsupervised clustering to detect non-B motifs with both high G content and suitable length. The app supports multiple clustering features, including the length and the guanine and adenine composition of the non-B motif sequences.
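As an illustration of the clustering features named above, the following minimal Python sketch computes length, %G, and %A for a motif sequence. The function name `motif_features` and the example sequences are hypothetical and are not taken from NBBC; the sketch only shows one way such features could be derived before clustering.

```python
def motif_features(sequence: str) -> dict:
    """Compute simple per-motif features: length, %G and %A.

    Hypothetical helper for illustration; not the NBBC implementation.
    """
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("empty motif sequence")
    return {
        "length": length,
        "pct_g": 100.0 * seq.count("G") / length,  # guanine content (%G)
        "pct_a": 100.0 * seq.count("A") / length,  # adenine content (%A)
    }


# Illustrative sequences only (not real NBBC output).
features = [motif_features(s) for s in ["GGGTTAGGGTTAGGGTTAGGG", "ATATATATAT"]]
```

A feature table like this could then be passed to any unsupervised clustering method; the specific method used by the app is not stated here.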


Browser compatibility


To cite our work

Example References:
Xu Q, Kowalski J (2023). NBBC: A Non-B DNA Burden Explorer in Cancer. <https://kowalski-labapps.dellmed.utexas.edu/NBBC/>.
Example Text Citation:
e.g. “We used the Non-B DNA Burden Explorer in Cancer. (Xu and Kowalski, 2023)”

Kowalski-Muegge Lab
Dell Medical School, University of Texas at Austin


Input options

The input options for NBBC include a single gene, a list of genes, a gene signature, or genomic coordinates. To accommodate these input types, we have provided four options below for users to choose from.

Option 1 offers pre-populated gene sets related to cancer, while Option 2 provides molecular signatures of cancer cell lines. Alternatively, users may manually input gene symbols in Option 3 or upload genomic coordinates of regions of interest in Option 4.


Please select an option and then input a gene set below.

(default example: Homologous recombination Gene set, 35 Genes, Option 1)

Option 1

Signatures

(Query by built-in genesets)

Option 2

Cell lines

(Query by cell line genetic features)

Option 3

Manually Input

(Query by gene symbols)

Option 4

Genomic Coordinates

(Query by genomic coordinates, hg19)

This example includes multiple groups of mutation sites. The mutation sites are the query genomic regions; the gene labels serve only as group names and can be any strings.
This example includes a list of genomic regions with [gene symbols] as labels.
This example includes a list of genomic regions with [tags] as labels.

Preview of selected genes/regions


              

Table: Genomic Coordinates uploaded from Option 4:

In the example below, the mutation sites (chromosome, start, end) of multiple genes (hgnc_symbol) are listed.
In this case, the query regions are mutation sites and the groups are tagged by gene symbols.
Each row in the table represents one query region.
The four column names must be the same as in the example.
- [hgnc_symbol] can be either a gene name or any tag used to annotate the region.
- [chromosome, start, end] takes genomic coordinates as input.
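The table requirements above can be checked before upload with a short script. This is a minimal Python sketch under stated assumptions: the file is assumed to be comma-separated, the function name `read_regions` is hypothetical, and the labels and coordinates in the example are placeholders rather than real data. Only the four column names come from the description above.

```python
import csv
import io

# The four required column names, as listed in the description above.
REQUIRED_COLUMNS = ["hgnc_symbol", "chromosome", "start", "end"]


def read_regions(handle):
    """Read a region table and verify the required columns are present.

    Assumes a comma-separated file; the actual upload format may differ.
    """
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    regions = []
    for row in reader:
        start, end = int(row["start"]), int(row["end"])
        if end < start:
            raise ValueError(f"end < start for label {row['hgnc_symbol']!r}")
        regions.append({
            "hgnc_symbol": row["hgnc_symbol"],  # gene name or any tag
            "chromosome": row["chromosome"],
            "start": start,
            "end": end,
        })
    return regions


# Placeholder rows for illustration only.
example = io.StringIO(
    "hgnc_symbol,chromosome,start,end\n"
    "GENE_A,1,1000,2000\n"
    "tag_1,2,500,900\n"
)
regions = read_regions(example)
```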




Non-B Burden per Gene in Total



Table for total non-B burdens per gene

Burden per Gene by non-B type

(Double click the legend to isolate the track.)

Table for non-B Burden by type

Burden distribution




Please select a region in the graph for details of non-B burden


Gene clustering on non-B burdens


motifs_n: Counts

motifs_gln: Normalized by Gene/Region length (per kilobase)

motifs_cpm: Normalized by Motif Library Size (per million)

motifs_gln_cpm: Normalized by Gene/Region length (per kilobase) and Motif Library Size (per million)

*For genomic coordinates, region lengths are used for normalization.
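The four burden measures listed above can be sketched in Python as follows. This is not the NBBC implementation: the function name `normalize_burden` is hypothetical, and "motif library size" is assumed here to mean the total number of motifs across the whole query, by analogy with counts-per-million in sequencing data. The actual definition used by the app may differ.

```python
def normalize_burden(counts, lengths_bp):
    """Compute the four burden measures for each gene or region.

    counts:     dict mapping gene/region -> number of non-B motifs
    lengths_bp: dict mapping gene/region -> length in base pairs
    ASSUMPTION: library size = total motif count over the query.
    """
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("no motifs in query")
    result = {}
    for gene, n in counts.items():
        length_kb = lengths_bp[gene] / 1000.0
        result[gene] = {
            "motifs_n": n,                                   # raw count
            "motifs_gln": n / length_kb,                     # per kilobase
            "motifs_cpm": n * 1e6 / library_size,            # per million motifs
            "motifs_gln_cpm": n * 1e6 / (library_size * length_kb),
        }
    return result


# Illustrative numbers only.
burden = normalize_burden({"GENE_A": 30, "GENE_B": 70},
                          {"GENE_A": 2000, "GENE_B": 7000})
```

Length normalization makes genes of different sizes comparable, while library-size normalization makes queries with different total motif counts comparable.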


Full List of Non-B Motifs in query

Non-B DNA Motif Clustering





Select a region for multiple motif sequences


Click a Dot for the motif sequence


Motifs with flanking regions

Burden in Batch

(Non-B Burden for groups of regions)

NBBC allows users to upload multiple groups of genomic regions (genes, mutation regions) in batch and calculates the non-B burden for each group.

In the example below, we used 104 groups of mutation regions from 104 TCGA-PAAD early-stage samples.

We used 'Burden in Batch' to calculate non-B burdens for each sample and performed sample clustering on them.

A significant survival difference was observed between the patient sample clusters derived from non-B burden.



Input

Genomic regions by Group

Output

Non-B Burdens by group

A two-column table with columns named [geneset_id] and [gene].
See the example on the right.

Each row represents a genomic region.
Please use [group_id] to group the regions.


For each group, non-B burdens are calculated
and normalized for genomic regions in it.
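The per-group calculation described above can be sketched in Python as follows. This is only an illustration: the function name `burden_by_group`, the row fields other than [group_id], and the assumption that coordinates are 1-based and inclusive are all hypothetical, and the numbers are placeholders.

```python
from collections import defaultdict


def burden_by_group(rows):
    """Sum motif counts and region lengths per group, then normalize.

    Each row is a dict with keys group_id, start, end, motifs_n.
    ASSUMPTION: coordinates are 1-based and inclusive.
    """
    totals = defaultdict(lambda: {"motifs_n": 0, "length_bp": 0})
    for row in rows:
        group = totals[row["group_id"]]
        group["motifs_n"] += row["motifs_n"]
        group["length_bp"] += row["end"] - row["start"] + 1
    return {
        gid: {
            "motifs_n": t["motifs_n"],
            # burden per kilobase of the group's combined region length
            "motifs_gln": t["motifs_n"] / (t["length_bp"] / 1000.0),
        }
        for gid, t in totals.items()
    }


# Placeholder rows for illustration only.
rows = [
    {"group_id": "sample_1", "start": 1, "end": 1000, "motifs_n": 5},
    {"group_id": "sample_1", "start": 2001, "end": 3000, "motifs_n": 15},
    {"group_id": "sample_2", "start": 1, "end": 500, "motifs_n": 1},
]
group_burden = burden_by_group(rows)
```

The resulting group-by-burden table is the kind of input a downstream clustering step could take.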




Example downstream analysis:

Clustering on non-B burden








Kowalski-Muegge Lab


Lab Website

Web Apps



Contact

Jeanne Kowalski-Muegge, Ph.D.

Professor, Department of Oncology
Dell Medical School
University of Texas at Austin
Email

Qi Xu, Graduate Student

Interdisciplinary Life Sciences (ILS) Graduate Programs
Dell Medical School
University of Texas at Austin
Email