The genetic connections between DNA repair pathways and human cancer predisposition have fueled interest in the proteins that recognize and repair specific sites of DNA damage. The repair enzymes are remarkably conserved from bacteria to fungi to humans, underscoring the premium placed on maintaining genomic integrity in the face of a mutagenic burden. DNA is susceptible to damage caused by errors committed during replication and by environmental factors, such as radiation, oxidants, or alkylating agents. Repair reactions involve the excision of chemically altered or mispaired bases from the DNA duplex. Resulting gaps are filled in by DNA polymerases; this reaction leaves a nick at or flanking the site of repair. An analogous process occurs during chromosomal DNA replication, whereby the 5’-RNA segments that prime discontinuous synthesis of Okazaki fragments are excised, and the intervening gaps are filled in by DNA polymerase.
The DNA repair and replication pathways converge on a common final step in which the continuity of the repaired DNA strand is restored by DNA ligase, an enzyme that converts nicks into phosphodiester bonds. Nicks are potentially deleterious DNA lesions that, if not corrected, may give rise to lethal double-strand breaks. Accordingly, the total loss of DNA ligase function is lethal.
DNA Ligase Reaction
DNA ligases catalyze the joining of a 5’-phosphate-terminated strand to a 3’-hydroxyl-terminated strand. Ligation depends on magnesium and a high-energy cofactor, either ATP (adenosine triphosphate) or NAD (nicotinamide adenine dinucleotide). The reaction mechanism involves 3 sequential nucleotidyl transfer reactions. In the first step, nucleophilic attack on the α-phosphorus of ATP or NAD by ligase results in release of pyrophosphate or NMN (nicotinamide mononucleotide) and formation of a covalent intermediate (ligase-adenylate) in which AMP is linked via a phosphoamide (P-N) bond to the ε-amino group of a lysine. In the second step, the AMP is transferred to the 5’-end of the 5’-phosphate-terminated DNA strand to form DNA-adenylate, an inverted pyrophosphate bridge structure (AppN). In this reaction, the 5’-phosphate oxygen of the DNA strand attacks the phosphorus of ligase-adenylate; the active-site lysine side chain is the leaving group. In the third step, ligase catalyzes attack by the 3’-OH of the nick on DNA-adenylate to join the 2 polynucleotides and liberate AMP.
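The three steps described above can be followed as a simple sequence of state changes. The sketch below is only a bookkeeping illustration, not a kinetic or chemical model; the names LigationState, STEP1_LEAVING_GROUP, and ligate are hypothetical and were chosen for this example.

```python
from dataclasses import dataclass

# Leaving group released in step 1 depends on the cofactor (from the text).
STEP1_LEAVING_GROUP = {"ATP": "pyrophosphate", "NAD": "NMN"}


@dataclass
class LigationState:
    """Hypothetical bookkeeping record for one nick being sealed."""
    ligase_adenylated: bool = False  # AMP on the active-site lysine
    dna_adenylated: bool = False     # AppN at the 5' end of the nick
    sealed: bool = False             # phosphodiester bond formed


def ligate(cofactor):
    """Walk a nick through the three nucleotidyl transfer steps.

    Returns the final state and the species released at each step.
    """
    if cofactor not in STEP1_LEAVING_GROUP:
        raise ValueError(f"unsupported cofactor: {cofactor!r}")
    state = LigationState()
    released = []

    # Step 1: ligase attacks the alpha-phosphorus of the cofactor,
    # forming ligase-adenylate and releasing pyrophosphate or NMN.
    state.ligase_adenylated = True
    released.append(STEP1_LEAVING_GROUP[cofactor])

    # Step 2: AMP moves from the lysine to the 5'-phosphate of the nick;
    # the active-site lysine side chain is the leaving group.
    state.ligase_adenylated = False
    state.dna_adenylated = True
    released.append("active-site lysine")

    # Step 3: the 3'-OH attacks DNA-adenylate, sealing the nick
    # and liberating AMP.
    state.dna_adenylated = False
    state.sealed = True
    released.append("AMP")

    return state, released


final_state, leaving_groups = ligate("ATP")
print(final_state.sealed, leaving_groups)
```

Calling ligate("NAD") differs only in the first leaving group (NMN instead of pyrophosphate), which mirrors the fact that the two cofactors diverge only at step 1.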
Phylogenetic Distribution and Function
Living organisms comprise 3 domains: eubacteria, archaebacteria, and eukaryotes. All organisms encode 1 or more DNA ligases. The ligases are grouped into 2 families, ATP-dependent ligases and NAD-dependent ligases, according to the cofactor required for ligase-adenylate formation. The ATP-dependent DNA ligases are found in all 3 domains, whereas the NAD-dependent enzymes have been described only in eubacteria.
Eukaryotic Cellular ATP-dependent Ligases
ATP-dependent DNA ligases are found in all eukaryotic species. Mammalian cells contain multiple DNA ligase isozymes encoded by at least 3 genes. Amino acid-sequence comparisons suggest that a core catalytic domain common to all ATP-dependent ligases is embellished by additional isozyme-specific domains located at the amino or carboxyl termini of the proteins. It is thought that these flanking segments mediate the binding of mammalian DNA ligases to other proteins involved in DNA replication, repair, and recombination. The mammalian isozymes are referred to as ligase I, ligase IIIα, ligase IIIβ, and ligase IV. DNA ligase I is a 919-amino acid polypeptide, expressed in all tissues, which catalyzes the joining of Okazaki fragments during DNA replication and also plays a role in DNA repair. DNA ligases IIIα (922 amino acids) and IIIβ (862 amino acids) are the products of a single gene; they differ in amino acid sequence only at their carboxyl termini as a consequence of alternative mRNA splicing. Ligase IIIα is expressed ubiquitously and is implicated in DNA repair. Ligase IIIβ expression is restricted to the testis, specifically to spermatocytes undergoing meiosis. This restricted expression suggests that ligase IIIβ is involved in meiotic recombination. DNA ligase IV is a 911-amino acid polypeptide that plays a role in the repair of double-strand DNA breaks.
Yeast cells contain 2 separately encoded DNA ligases, which are homologous to mammalian DNA ligases I and IV, respectively. The DNA ligase I (Cdc9p) of the budding yeast Saccharomyces cerevisiae is essential for cell growth. Genetic experiments implicate ligase I in sealing Okazaki fragments and in the completion of DNA excision repair. In contrast, yeast DNA ligase IV is not essential for cell growth. However, deletion of the LIG4 gene elicits phenotypes indicating that ligase IV catalyzes the repair of double-strand breaks. Budding yeast have no apparent homologue of mammalian DNA ligase III.
Viral ATP-Dependent DNA Ligases
Bacterial DNA viruses, such as the E. coli bacteriophages T4, T6, T7, and T3, encode their own ATP-dependent DNA ligases. ATP-dependent DNA ligases are also encoded by eukaryotic DNA viruses that conduct some or all of their replication cycle in the cytoplasm. These include vaccinia virus, African swine fever virus, and Chlorella virus PBCV1. The bacteriophage and eukaryotic viral DNA ligases are smaller than their cellular counterparts. Vaccinia DNA ligase, a 552-amino acid polypeptide, is strikingly similar at the amino acid-sequence level to mammalian DNA ligase III. Indeed, ligase III is more closely related to vaccinia ligase than to mammalian ligases I and IV. The ligases of T4 (487 amino acids), T7 (359 amino acids), T3 (346 amino acids), and Chlorella virus (298 amino acids) are smaller still. We have shown that the Chlorella virus ligase can complement the growth of a yeast strain in which the DNA ligase I gene has been deleted. This result suggests that the protein segments unique to the much larger DNA ligase I are not essential for yeast cell growth.
Archaebacterial ATP-dependent Ligases
The archaea are unicellular organisms that have remarkable biosynthetic capacity and the ability to thrive under extreme environmental conditions. Archaea are thought to be the forerunners of the eukaryotes. Kletzin identified the first DNA ligase gene from an archaeon, Desulfurolobus ambivalens, and noted amino acid-sequence similarity of the Dam ligase polypeptide to eukaryotic viral and cellular ATP-dependent ligases. Genes encoding putative DNA ligases have since been sequenced from at least 9 other archaeal species. The archaeal ligases are of fairly uniform size (557 to 619 amino acids), and their primary structures are extensively conserved.
We have purified and characterized the DNA ligase encoded by the thermophilic archaeon Methanobacterium thermoautotrophicum. As predicted from sequence comparisons, Mth ligase resembles eukaryotic DNA ligases in its specificity for ATP as the energy cofactor. NAD+, the cofactor for classical eubacterial ligases, did not support strand joining. The striking feature of Mth ligase is its activity at elevated temperatures. Mth ligase requires high temperatures for nick joining, and the optimal temperature for ligation in vitro (60°C) agrees well with the favored in vivo growth temperatures for M. thermoautotrophicum.
Eubacterial NAD-dependent Ligases
The NAD-dependent DNA ligases are monomeric enzymes found in eubacteria. Genes encoding NAD-dependent DNA ligases have been identified and sequenced from at least 50 eubacterial species. The NAD-dependent ligases are of fairly uniform size (647 to 841 amino acids), and there is extensive amino acid-sequence conservation throughout the entire lengths of the polypeptides.
We found that the NAD-dependent E. coli DNA ligase can support the growth of Saccharomyces cerevisiae strains deleted singly for CDC9 or doubly for CDC9 plus LIG4. This is the first demonstration that an NAD-dependent enzyme is biologically active in a eukaryotic organism. Yet yeast cells containing only E. coli ligase are defective in the repair of DNA damage induced by UV irradiation or treatment with methyl methanesulfonate (MMS). This result suggests that the structural domains unique to yeast DNA ligases are not essential for mitotic growth, but may facilitate DNA repair.
Nick-Sensing by DNA Ligases
We are examining the interaction of eukaryotic ligases with DNA using virus-encoded enzymes as models. In the absence of magnesium, vaccinia virus DNA ligase and Chlorella virus DNA ligase each form a discrete complex with a singly nicked DNA ligand; this complex can be resolved from free DNA by native polyacrylamide gel electrophoresis. The viral ligases do not form stable complexes with the following ligands: (i) DNA containing a 1-nucleotide or 2-nucleotide gap; (ii) the sealed duplex DNA product of the ligation reaction; (iii) a singly nicked duplex containing a 5’-OH terminus at the nick instead of a 5’-phosphate; or (iv) a singly nicked duplex containing an RNA strand on the 5’-phosphate side of the nick (10 to 15). Thus, viral ATP-dependent DNA ligases have an intrinsic nick-sensing function.
Nick recognition by vaccinia DNA ligase and Chlorella virus DNA ligase also depends on occupancy of the AMP binding pocket on the enzyme: mutations of the ligase active site that abolish the capacity to form the ligase-adenylate intermediate also eliminate nick recognition, whereas a mutation that preserves ligase-adenylate formation but inactivates downstream steps of the strand joining reaction has no effect on binding to nicked DNA. Sequestration of an extrahelical nucleotide by DNA-bound ligase is reminiscent of the “base-flipping” mechanism of target site recognition and catalysis used by other DNA modification and repair enzymes.
Although the 5’-phosphate moiety is essential for the binding of Chlorella virus ligase to nicked DNA, the 3’-OH moiety is not required for nick recognition. Chlorella virus ligase binds to a nicked ligand containing 2’,3’-dideoxy and 5’-phosphate termini but cannot catalyze adenylation of the 5’-end. Thus, the 3’-OH is important for step 2 chemistry even though it is not itself chemically transformed during DNA-adenylate formation.
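The binding observations reported above for the viral ligases can be collected into a small set of rules. The following is a minimal sketch under stated assumptions: the Ligand record and its field names are hypothetical, only 1- and 2-nucleotide gaps were tested (treating every gap as non-binding is an extrapolation), and the 3’-OH rule is reported only for Chlorella virus ligase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ligand:
    """Hypothetical description of a duplex DNA ligand."""
    gap_nt: int = 0                 # 0 = nick; 1 or 2 = gap sizes tested
    sealed: bool = False            # True for the ligated duplex product
    five_prime: str = "phosphate"   # "phosphate" or "OH" at the nick
    five_prime_strand: str = "DNA"  # "DNA" or "RNA" on the 5'-phosphate side
    three_prime_oh: bool = True     # False for a 2',3'-dideoxy terminus


def binds_stably(ligand, ligase_adenylated=True):
    """True if the reported observations predict a stable complex."""
    return (
        ligase_adenylated                     # AMP pocket must be occupied
        and not ligand.sealed                 # sealed product is not bound
        and ligand.gap_nt == 0                # assumption: any gap blocks binding
        and ligand.five_prime == "phosphate"  # 5'-OH nick is not bound
        and ligand.five_prime_strand == "DNA" # RNA on the 5' side is not bound
    )


def can_adenylate_dna(ligand, ligase_adenylated=True):
    """Step 2 needs a bound nick and, additionally, a 3'-OH at the nick."""
    return binds_stably(ligand, ligase_adenylated) and ligand.three_prime_oh


dideoxy_nick = Ligand(three_prime_oh=False)
print(binds_stably(dideoxy_nick), can_adenylate_dna(dideoxy_nick))
```

Keeping binding and step 2 chemistry as separate predicates reflects the distinction drawn in the text: the dideoxy ligand is recognized as a nick yet does not support DNA-adenylate formation.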
To delineate the ligase-DNA interface, we have footprinted the ligase binding site on DNA. The size of the exonuclease III footprint of ligase bound to a single nick in duplex DNA is 19 to 21 nucleotides. The footprint is asymmetric, extending 8 to 9 nucleotides on the 3’-OH side of the nick and 11 to 12 nucleotides on the 5’-phosphate side.
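The total footprint follows from adding the protected lengths on the two sides of the nick. A short arithmetic check, with the function name footprint_span chosen only for this illustration:

```python
def footprint_span(three_prime_side, five_prime_side):
    """Add per-side (min, max) protected lengths into a total (min, max)."""
    shortest = three_prime_side[0] + five_prime_side[0]
    longest = three_prime_side[1] + five_prime_side[1]
    return shortest, longest


# Protected nucleotides on each side of the nick, as reported in the text.
THREE_PRIME_OH_SIDE = (8, 9)
FIVE_PRIME_PHOSPHATE_SIDE = (11, 12)

total = footprint_span(THREE_PRIME_OH_SIDE, FIVE_PRIME_PHOSPHATE_SIDE)
print(total)  # (19, 21)
```

The sum reproduces the 19 to 21 nucleotide footprint and makes the asymmetry explicit: the 5’-phosphate side accounts for about 3 more protected nucleotides than the 3’-OH side.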
Crystal Structure of Eukaryotic DNA Ligase-Adenylate
Chlorella virus DNA ligase is the smallest eukaryotic ATP-dependent ligase known. As the “minimal” DNA ligase, it presented an attractive target for structure determination. We have crystallized the Chlorella virus ligase and, in collaboration with our SKI colleague Dimitar Nikolov, determined its structure at 2 Å resolution. The enzyme consists of a larger N-terminal domain and a smaller C-terminal domain with a cleft between them. The experimental MIR map showed that an AMP moiety was covalently linked to Nζ of Lys27 at the active site, even though the ligase preparation was not intentionally exposed to ATP during purification and crystal growth. Thus, we have the structure of a genuine catalytic intermediate. The experimental density map of the ligase-adenylate crystal also revealed a single well-ordered sulfate ion located on the protein surface approximately 5 Å from the phosphorus of AMP.
The surface topology and electrostatics suggest that the ligase-adenylate intermediate is ready to bind DNA. The sulfate, which is taken to represent the 5’-PO4 of the nick, is situated just above the alpha-phosphate of the adenylate on the positively charged surface of domain 1. The surface positive charge is highly concentrated at the nick-binding site and radiates outward over the top surface of domain 1. The electrostatic potential of this surface of domain 1 is favorable for interaction with the negatively charged phosphodiester backbone of nicked DNA. Note that there is little positive charge elsewhere on the surface of the molecule. The relatively flat, slightly concave surface of the ligase molecule is also well suited to accommodate 1 face of the DNA double helix.
The adenosine binding pocket is composed of 5 conserved motifs that define the polynucleotide ligase/mRNA capping enzyme superfamily of nucleotidyl transferases. The ligase-adenylate structure reveals a network of interactions between amino acid side chains and the adenine base, the sugar hydroxyls, and the phosphate of AMP. The structure accounts for the effects of mutations in the conserved motifs on DNA ligase activity, and it extends our understanding of the structural basis for catalysis and DNA binding.