Frequently asked questions

What is Sleeping Beauty insertional mutagenesis? What are transposons?

Sleeping Beauty insertional mutagenesis is a gene discovery tool that uses a DNA transposable element to mutagenize the mouse genome. Transposons act as mutagens by altering gene regulation in ways that may confer a selective growth advantage on cells. The accumulation of selected insertions over time leads to tumor formation. You can learn more from Wikipedia or the glossary.

Why are some links orange?

These links take you to external pages, whereas the blue links point to pages within this application.

How do you determine cancer drivers?

To determine patterns of transposon insertions in genes, we first determine the orientation of each insertion with respect to the gene and calculate the number of forward (5' to 3') and reverse (3' to 5') insertions in that gene across the population of tumors. A binomial test is performed to determine whether the number of forward insertions is greater than expected by chance. If so, we next use a Kolmogorov-Smirnov test to determine whether the forward insertions are non-uniformly distributed. If so, and if at least 80% of the total insertions in the gene are in the forward orientation, we say the pattern is activating. If the pattern is not activating, and if fewer than 60% of insertions are in the forward orientation and there are two or more reverse insertions, we say the pattern is inactivating. If the pattern is neither activating nor inactivating, we say it is indeterminate. An activating pattern is often visually discernible as a tight cluster of insertions or a non-random pattern of forward insertions with a defined boundary, indicating selection for a downstream product. Inactivating patterns show insertions distributed throughout the coding region, indicative of gene disruption.

Why do you exclude data from the donor chromosome in the population-level analysis?

Insertions are not randomly distributed across the donor chromosome, and the insertion density on this chromosome tends to be much higher than on the non-donor chromosomes, which would skew the population-level analysis.

How do you determine if an insertion pattern is activating or inactivating?

For each driver, the insertion pattern is initially assigned the 'indeterminate' label. We then identify the orientation of each insertion with respect to the gene and calculate the number of forward (5' to 3') and reverse (3' to 5') insertions in that gene across the population of tumors. A binomial test is performed to determine whether the number of forward insertions is greater than expected by chance. If so, we next use a Kolmogorov-Smirnov test to determine whether the forward insertions are non-uniformly distributed. If so, and if at least 80% of the total insertions in the gene are in the forward orientation, we call the pattern 'activating'. If the pattern is not activating, and if fewer than 60% of insertions are in the forward orientation and there are two or more reverse insertions, we call the pattern 'inactivating'.
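
The decision steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SBCDDB pipeline itself: the function names, the 0.05 significance threshold, the null forward probability of 0.5, the scaling of insertion positions to the interval [0, 1] along the gene, and the asymptotic approximation used for the Kolmogorov-Smirnov p-value are all assumptions that the text does not specify.

```python
import math

ALPHA = 0.05  # assumed significance threshold; not stated in the FAQ


def binomial_upper_p(k, n, p=0.5):
    """One-sided binomial p-value P(X >= k) for X ~ Binomial(n, p).

    p=0.5 (forward and reverse equally likely by chance) is an assumption.
    """
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def ks_uniform_p(positions):
    """Approximate one-sample KS p-value against Uniform(0, 1).

    Positions are assumed to be scaled to [0, 1] along the gene. The p-value
    uses the asymptotic Kolmogorov series with a small-sample correction,
    so it is only approximate for small numbers of insertions.
    """
    xs = sorted(positions)
    n = len(xs)
    if n == 0:
        return 1.0
    # Largest gap between the empirical CDF and the uniform CDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:  # the series converges slowly here, and p is essentially 1
        return 1.0
    p = 2 * sum((-1) ** (j - 1) * math.exp(-2 * j * j * lam * lam) for j in range(1, 101))
    return min(max(p, 0.0), 1.0)


def classify_pattern(forward_positions, reverse_positions, alpha=ALPHA):
    """Label a gene's insertion pattern following the steps described above."""
    n_fwd, n_rev = len(forward_positions), len(reverse_positions)
    total = n_fwd + n_rev
    if total == 0:
        return "indeterminate"  # the default label
    frac_fwd = n_fwd / total

    # Step 1: are there more forward insertions than expected by chance?
    excess_forward = binomial_upper_p(n_fwd, total) < alpha
    # Step 2: are the forward insertions non-uniformly distributed?
    # Step 3: is the forward fraction 80% (read here as "at least 80%")?
    if excess_forward and ks_uniform_p(forward_positions) < alpha and frac_fwd >= 0.8:
        return "activating"
    # Otherwise: fewer than 60% forward and two or more reverse insertions.
    if frac_fwd < 0.6 and n_rev >= 2:
        return "inactivating"
    return "indeterminate"


# Hypothetical toy genes (positions are made up for illustration).
clustered_forward = [0.10 + 0.002 * i for i in range(18)]
print(classify_pattern(clustered_forward, [0.4, 0.7]))
print(classify_pattern([0.2, 0.5, 0.8], [0.1, 0.3, 0.5, 0.7, 0.9]))
```

With these toy inputs, the first gene has 18 of 20 insertions in the forward orientation, tightly clustered near one end, and the second has only 3 of 8 forward insertions with 5 reverse insertions spread across the gene.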

Why do some Gene Report pages contain a warning message?

A warning message appears at the top of some gene reports to highlight specific information about the queried gene that may be of interest to the user. Some genes (e.g. En2, Foxf2, Serinc3, Dpp10) are known Sleeping Beauty insertion hotspots because these loci contain sequences found in the transposon. Other warning messages highlight a gene that might be considered exciting but was not found to be reproducible using a complementary sequencing technology (e.g. Pds5b), or highlight a gene annotation issue that is known to the SB community (e.g. Sfi1). For example, the complex imprinted locus on mouse chromosome 12, containing at least six genes including Dlk1, Meg3, Rtl1, Rian, Mir3071, and Dio3, has been extensively studied in the context of SB-induced liver tumorigenesis (see Riordan et al., 2013). The data presented in SBCDDB for HCA and HCC agree with those reported by Riordan et al., specifically that Rtl1 appears to be the only gene under positive selection during liver tumorigenesis despite the false detection of Rian as the predominant trunk and progression driver in this region. This highlights some of the challenges associated with applying a set of statistical analyses uniformly across the mouse genome. Thus, the authors have made hardcoded changes to the insertion data for this region in order to prioritize the driver status of Rtl1, in deference to the evidence provided by Riordan et al. that this gene contributes to liver tumorigenesis. These changes are noted as flagged comments in the Gene Report for Rian. In an effort to keep SBCDDB users informed, additional warning messages may appear as needed.

I would like to use the SB insertion data underlying the SBCDDB for my research. Is this possible?

Yes, please send an email request for additional details.

How often will the SBCDDB be updated?

It depends on the type of update:

May I contribute SB insertion data from my Sleeping Beauty transposon mouse study to the SBCDDB?

Yes, new SB insertion datasets from end-stage mouse tumors will be accepted on an ongoing basis. Please send an email inquiry for additional details. Note that newly submitted SB insertion datasets obtained by 454 sequencing may appear, along with other minor changes (as outlined above in "How often will the SBCDDB be updated?"), in future updates to this version of the SBCDDB. Newly submitted SB insertion datasets obtained on other sequencing platforms (Illumina, Ion Torrent, etc.) may not appear until there is a new major release of the SBCDDB.

How do I contact you with questions related to the SBCDDB?

Questions about the SBCDDB can be directed here.

Do you have a great idea regarding a new feature that could appear in a future update to the SBCDDB?

Please email your suggestion to us.