Knowledge of cancer genomic DNA sequences has created unprecedented opportunities for mutation studies. Computational analyses have begun to decipher mutational signatures that identify underlying causes. A recent analysis encompassing 30 cancer types reported 20 distinct mutation signatures, resulting from ultraviolet light, deficiencies in DNA replication and repair, and unexpectedly large contributions from both spontaneous and APOBEC-catalyzed DNA cytosine deamination. Mutational signatures have the potential to become diagnostic, prognostic, and therapeutic biomarkers as well as factors in therapy development.
Germline versus somatic mutations
Every cancer is distinct. We are all conceived with nearly equal amounts of genetic information from each parent, and yet the resulting genetic blueprint is different for everyone (except identical twins). During development, copying and partitioning of DNA takes place during cell division such that every daughter cell receives a full genetic complement. Individuals can thus directly inherit mutations (known as germline mutations) that predispose to cancer later in life. Additionally, a variety of factors combine to diminish the fidelity of DNA copying, resulting in DNA alterations, termed somatic mutations, that distinguish a daughter cell from its sister or parent (Figure 1). Because each tumor is derived from a somatic cell, the repertoire of somatic mutations that accumulate in each tumor is distinct for each individual and reflects the underlying processes that contributed to its development.
Figure 1. External and internal sources of mutation in cancer. A schematic depiction of major external and internal sources of DNA damage, a variety of DNA repair mechanisms that serve to counteract damage, and mutation as an outcome of unrepaired DNA damage.
Driver versus passenger somatic mutations in cancer
A major rationale for sequencing large numbers of cancer genomes is to identify commonly mutated genes to inform diagnoses and treatments. The mutations themselves range from simple base substitutions to larger-scale aberrations such as translocations and copy number changes. The recurrent involvement of a single gene in cancers of the same type provides strong evidence for a mechanistic contribution at some stage of tumor development. Such genes are considered cancer drivers because their alteration is frequently required for tumor formation. Approximately 140 drivers have been identified and, given the massive amounts of existing data, probably only a few drivers remain to be identified.
As much as 90% to 99% of all mutations are considered passenger events. These mutations can be silent base substitutions in coding sequences but the majority occur in non-coding sequences. Such mutations are less likely to be biased by selective forces during tumor outgrowth and, therefore, can provide ‘signatures’ reflecting the original source of DNA damage and insights into causal mechanisms.
Global analyses of somatic mutations in cancer
Alexandrov and colleagues recently reported a comprehensive analysis of mutational signatures, examining nearly 5 million somatic mutations from over 7,000 tumors that represented 30 different cancer types. This study was remarkable in three ways. First, it demonstrated the huge (1,000-fold) range in somatic mutation frequencies in human cancers. Second, computational methods enabled the deduction of over 20 distinct mutational signatures. Third, the mutation pattern of each cancer comprised at least two, and in many instances three or more, distinct mutational signatures and therefore major sources of DNA damage. Some of the DNA damage mechanisms are already established, some can be inferred based on current knowledge, and others will require more work to be fully understood.
Cancer mutation signatures from external sources of DNA damage
A major external source of DNA damage is ultraviolet (UV) light, which can crosslink adjacent pyrimidine bases (CC, CT, TC and TT) (Figure 1). If such a pyrimidine dimer is not repaired and becomes a substrate for DNA replication (or local synthesis), then most DNA polymerases will follow the ‘A-rule’ and insert two adenines opposite the dimer. Late repair or another round of replication can then immortalize the original lesion as a C-to-T transition mutation. Thus, the mutational signature of UV light is predominantly C-to-T transitions in dipyrimidine contexts. Other features of UV-induced mutagenesis include the occurrence of adjacent mutations (mostly CC-to-TT) and a nontranscribed strand bias due to preferential repair of the transcribed DNA strand.
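The dipyrimidine context described above can be checked mechanically. The following is a minimal Python sketch, not a method from the cited study: the function name `is_dipyrimidine_c_to_t` and the input convention (a reference trinucleotide written 5' to 3' with the mutated base in the middle) are assumptions made for illustration. A G-to-A change on the reference strand is treated as a C-to-T change on the opposite strand.

```python
# Illustrative sketch; names and input convention are assumptions, not from the source.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PYRIMIDINES = {"C", "T"}

def is_dipyrimidine_c_to_t(trinucleotide: str, alt: str) -> bool:
    """Return True if the substitution is a C-to-T change next to another pyrimidine.

    `trinucleotide` is the reference sequence 5'->3' with the mutated base in the
    middle; `alt` is the base that replaces the middle base.
    """
    tri, alt = trinucleotide.upper(), alt.upper()
    if len(tri) != 3 or any(b not in COMPLEMENT for b in tri) or alt not in COMPLEMENT:
        raise ValueError(f"expected a 3-base context and one alternate base: {tri!r}, {alt!r}")
    # G-to-A on this strand is C-to-T on the opposite strand: view it from there.
    if tri[1] == "G" and alt == "A":
        tri = "".join(COMPLEMENT[b] for b in reversed(tri))
        alt = "T"
    if tri[1] != "C" or alt != "T":
        return False
    # The mutated C must have a pyrimidine neighbor on the same strand.
    return tri[0] in PYRIMIDINES or tri[2] in PYRIMIDINES

print(is_dipyrimidine_c_to_t("TCA", "T"))  # True: TC dipyrimidine
print(is_dipyrimidine_c_to_t("ACG", "T"))  # False: C flanked by purines
print(is_dipyrimidine_c_to_t("TGA", "A"))  # True: reads as TCA on the opposite strand
```

A check like this only flags mutations that are compatible with UV damage; it does not by itself establish UV light as the cause of any individual mutation.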
Tobacco smoke is another external source of DNA damage (Figure 1), but it leads to a more complex array of DNA damaging agents and lesions than UV does. For instance, polycyclic aromatic hydrocarbons are converted by cellular cytochrome P450 enzymes into activated epoxides, which can then react to form alkylated guanine adducts. These lesions can erroneously base pair with adenine during DNA replication and, if unrepaired, lead to G-to-T transversions (equivalent to C-to-A on the opposing DNA strand), which comprise the most abundant class of mutations in smoking-associated cancers.
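The strand equivalence noted above (G-to-T on one strand is C-to-A on the other) is the reason base substitutions are often collapsed into six classes, each reported relative to the pyrimidine base. A minimal Python sketch of that bookkeeping follows; the function name `to_pyrimidine_class` is hypothetical and chosen only for illustration.

```python
# Illustrative sketch; the function name is hypothetical, not from the cited studies.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def to_pyrimidine_class(ref: str, alt: str) -> str:
    """Report a single-base substitution relative to the pyrimidine (C or T) strand."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: describe the same event on the opposite strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"

print(to_pyrimidine_class("G", "T"))  # C>A
print(to_pyrimidine_class("G", "A"))  # C>T
print(to_pyrimidine_class("C", "T"))  # C>T
```

Collapsing strands this way means that a G-to-T count and a C-to-A count are pooled into one class, which is appropriate unless strand bias itself is the quantity of interest.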
Many chemotherapeutics are DNA-damaging agents and, by definition, external sources of mutation. An effective chemotherapeutic should eradicate a target cancer and leave no trace for downstream analysis by sequencing. The study by Alexandrov and colleagues raises a cautionary note for treatment of glioblastomas and melanomas with the DNA methylating agent temozolomide. The presence of a temozolomide-induced mutational signature in these cancers (G-to-A transition mutations at non-CpG sites) suggests not only that the intended therapy may have been ineffective but also that the drug itself may have increased the tumor mutation rate, and possibly contributed to tumor evolution, therapy resistance, and/or poor outcome. Future studies should consider mutational signatures before and after chemotherapy and strive to minimize potentially adverse outcomes.
Cancer mutation signatures from internal sources of DNA damage
Hydrolytic deamination of cytosine bases, and particularly 5-methyl-cytosine (5meC) bases in a CpG context, appears to be the most prevalent mechanism of mutagenesis (Figure 1). Deamination of C-to-U or 5meC-to-T and subsequent DNA replication or misrepair results in a C-to-T transition mutation biased to CpG dinucleotide motifs. Interestingly, this is the only mechanism that correlates with age, suggesting it may be the only source of mutagenesis that accrues significantly over a lifetime. Some tumors lack this signature, which suggests that these cancers might have existed for short periods and/or that they employ a mechanism of preferential repair. Other sources of chemical damage, such as oxidation, are less prevalent and may be eclipsed by more dominant mutational mechanisms.
Defects in DNA repair processes have already been linked to mutagenesis and carcinogenesis, such as in hereditary nonpolyposis colorectal cancer, which is due to inherited defects in mismatch repair. Somatic inactivation and epigenetic silencing can also result in defective mismatch repair. The study by Alexandrov and colleagues confirmed the telltale signature of mismatch repair deficiency: enhanced C-to-T transitions and microsatellite instability. By comparison, elevated frequencies of C-to-A transversions and C-to-T transitions occurred in a specific trinucleotide context in colorectal and uterine tumors with defects in the proofreading domain of DNA polymerase ϵ. In addition, an elevated frequency of insertions and deletions (without enhanced C-to-T mutagenesis) was evident in BRCA1- and BRCA2-mutant tumors, consistent with underlying defects in recombination repair.
This study also highlighted the breadth of genomic DNA deamination by members of the apolipoprotein B mRNA editing catalytic subunit-like (APOBEC)/activation-induced deaminase (AID) family of DNA cytosine deaminases (Figure 1). These proteins catalyze the conversion of C-to-U in single-stranded DNA, which can be converted by replication into C-to-T transition mutations or by uracil DNA glycosylase into an abasic site. This lesion can then lead to a variety of mutagenic outcomes, including C-to-T transitions, C-to-G transversions, and DNA breaks that can precipitate larger-scale aberrations. Most human cells express up to nine active DNA cytosine deaminases, with one family member (AID) functioning in antibody gene diversification and most family members protecting against virus and transposon replication.
Sixteen different tumor types showed evidence of an APOBEC mutational signature, characterized by both dispersed and clustered C-to-T transitions and C-to-G transversions at TC dinucleotides. Mutation clusters, also called kataegis, implicated extended regions of single-stranded DNA, the preferred substrate of these enzymes. Two B cell cancers had an APOBEC signature and an additional signature consistent with AID activity. Prior studies have converged upon APOBEC proteins, particularly APOBEC3B, as a major source of mutation in several types of cancer [7-9]. Because this mutational signature was similar across all sixteen tumor types, it is likely that APOBEC3B is broadly involved in cancer mutagenesis. However, additional studies are needed to assess whether one or more of the other APOBEC family members may also be involved. An additional intriguing possibility, given the innate immune function of APOBEC3B and other family members, is that parasite infection may contribute to their induction and/or aberrant regulation. In terms of overall impact, APOBEC involvement in cancer mutagenesis is second only to spontaneous deamination of cytosine and 5meC.
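As a rough illustration of what distinguishes clustered from dispersed mutations, the Python sketch below groups mutation positions on one chromosome into runs of closely spaced events. It is a simplified stand-in, not the procedure used in the cited studies: the function name `find_mutation_clusters` and the default thresholds (`max_gap` of 1,000 bp between neighbors, `min_size` of 6 mutations per cluster) are assumptions chosen only to make the example concrete.

```python
# Illustrative sketch; the name and thresholds are assumptions, not values from the source.
from typing import Iterable, List

def find_mutation_clusters(positions: Iterable[int],
                           max_gap: int = 1000,
                           min_size: int = 6) -> List[List[int]]:
    """Group positions (one chromosome) into clusters of closely spaced mutations.

    Consecutive mutations stay in the same cluster while the distance between
    neighbors is at most `max_gap`; clusters with fewer than `min_size`
    mutations are discarded as dispersed events.
    """
    clusters: List[List[int]] = []
    current: List[int] = []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            # Gap too large: close the running cluster and start a new one.
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters

example = [100, 350, 600, 900, 1200, 1500, 50000, 120000, 120400]
print(find_mutation_clusters(example))
# [[100, 350, 600, 900, 1200, 1500]]
```

In practice one would run this per chromosome and per tumor, and might additionally require the clustered substitutions to share the TC-context C-to-T or C-to-G pattern described above before attributing a cluster to APOBEC activity.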
Epidemiological, translational, and clinical implications
Each of the cancers studied by Alexandrov and colleagues appeared to be influenced by two or more sources of DNA damage, as deduced by their mutational signatures. This knowledge has a number of important implications. First, novel signatures, such as the strong C-to-A bias in neuroblastomas and T-to-C bias in glioblastomas, will spur research to determine additional DNA damage sources. The quest to account for all mutational signatures is as much a mechanistic problem as it is epidemiological. If some of the unknown signatures are due to external sources (like UV light and tobacco carcinogens), then measures should be taken to minimize exposures.
Second, mutational signatures may act as biomarkers for the underlying mechanisms, and may become diagnostic. They will likely be even more beneficial if the mutational signatures and underlying processes correlate with clinical outcomes or specific treatments, because chemotherapeutic agents may synergize with underlying DNA damage sources (for example, PARP inhibition in BRCA-mutant cells). Finally, it is important to emphasize that most internal sources of DNA damage are unavoidable and/or due to mistakes in DNA maintenance processes. By contrast, APOBEC/AID mutagenesis is through the aberrant action of normal enzymes, which raises the additional prospect of inhibiting these enzymes to slow down rates of tumor evolution, drug resistance, and metastasis.
5meC: 5-methyl-cytosine; AID: Activation-induced deaminase; APOBEC: Apolipoprotein B mRNA editing catalytic subunit-like; UV: Ultraviolet.
RSH is a cofounder of ApoGen Biotechnologies LLC.
I thank D Harki for comments and apologize to colleagues whose work could not be cited directly due to bibliography length constraints. The Harris Laboratory is grateful for support from the Minnesota Ovarian Cancer Alliance, V Foundation for Cancer Research, Department of Defense Breast Cancer Research Program (BC121347), and US National Institutes of Health (R01-AI064046 and P01-GM091743).
Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, Carter SL, Stewart C, Mermel CH, Roberts SA, Kiezun A, Hammerman PS, McKenna A, Drier Y, Zou L, Ramos AH, Pugh TJ, Stransky N, Helman E, Kim J, Sougnez C, Ambrogio L, Nickerson E, Shefler E, Cortés ML, Auclair D, Saksena G, Voet D, Noble M, DiCara D, et al.: Mutational heterogeneity in cancer and the search for new cancer-associated genes.
Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SA, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, Børresen-Dale AL, Boyault S, Burkhardt B, Butler AP, Caldas C, Davies HR, Desmedt C, Eils R, Eyfjörd JE, Foekens JA, Greaves M, Hosoda F, Hutter B, Ilicic T, Imbeaud S, Imielinski M, Jäger N, Jones DT, Jones D, Knappskog S, Kool M, et al.: Signatures of mutational processes in human cancer.
Front Biosci 2002, 7:d1024-d1043.
Burns MB, Lackey L, Carpenter MA, Rathore A, Land AM, Leonard B, Refsland EW, Kotandeniya D, Tretyakova N, Nikas JB, Yee D, Temiz NA, Donohue DE, McDougle RM, Brown WL, Law EK, Harris RS: APOBEC3B is an enzymatic source of mutation in breast cancer.
Roberts SA, Lawrence MS, Klimczak LJ, Grimm SA, Fargo D, Stojanov P, Kiezun A, Kryukov GV, Carter SL, Saksena G, Harris S, Shah RR, Resnick MA, Getz G, Gordenin DA: An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers.