Thursday, 24 October 2019

Site-Specific Recombination

General Principles

Generalized recombination involves large regions of homologous DNA sequences, but site-specific recombination involves considerably smaller segments of DNA in which the recombination event occurs at a specific sequence (the recognition sequence). The essential components for site-specific recombination include enzyme(s) specific for the recognition sequence and two DNA duplexes, at least one of which carries that sequence. Under proper conditions, reciprocal (or nearly so) recombination occurs within or adjacent to the recognition sequence. If both DNA duplexes must carry the recognition sequence, the process is described as double site-specific recombination. The term single site-specific recombination has been used when only one DNA duplex is carrying the recognition sequence.

Double Site-Specific Recombination

Phage Lambda Integration

One especially well-studied example of double site-specific recombination is the integration of λ DNA to form a prophage. The λ integrase is the definitive member of a group of similar tyrosine recombinases and is the one to which the others are compared. The recognition sequence, in this case, is the att site, which was represented in the article by the letters PP′. The corresponding region of the Escherichia coli genome is attB, which was represented by the letters BB′. Analysis of these two regions by both genetic and heteroduplex techniques has led to the surprising conclusion that they are not homologous at all. Although the minimum size of the att site is about 240 bp, there is a small 7-bp sequence embedded within each att site that is the actual point at which the integrase enzyme acts to produce staggered cuts.

In recognition of this small, centrally located binding site, the terminology for designating the att sites has changed to POP′ and BOB′, where O designates the short homologous sequence (overlap region). The integration event produces prophage endpoints (BOP′ and POB′) that are slightly different from the original sequence. In fact, in vitro experiments have shown that integrase has difficulty binding to the right prophage end (POB′ or attR), and the role of the xis protein is to assist the folding of the structure so that integrase can act. This situation would account for the genetic observations that only int function is necessary for integration but both int and xis are required during prophage excision unless the level of Int protein is high. In all cases, integration host factor (IHF), the product of the himA and hip genes, is necessary for normal activity. IHF also binds to cII DNA, but more weakly than to att DNA.
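The site nomenclature above can be sketched as a toy model. This is only a bookkeeping illustration, assuming each att site is reduced to three labels (left arm, overlap, right arm); the function name `exchange` is hypothetical and nothing here models the proteins or the DNA sequence.

```python
# Toy model of lambda integration and excision.
# Each site is a (left arm, overlap, right arm) tuple using the labels from the text.
# The function name and the tuple representation are illustrative assumptions.

def exchange(site1, site2):
    """Reciprocal exchange within the shared overlap region O."""
    left1, core1, right1 = site1
    left2, core2, right2 = site2
    if core1 != core2:
        raise ValueError("both sites must carry the same overlap region")
    # Each product keeps the left arm of one parent and the right arm of the other.
    return (left2, core2, right1), (left1, core1, right2)

attP = ("P", "O", "P'")
attB = ("B", "O", "B'")

# Integration: POP' x BOB' gives the two prophage ends.
attL, attR = exchange(attP, attB)
print("".join(attL), "".join(attR))  # BOP' POB'

# Excision is the reverse exchange between the prophage ends.
print(exchange(attL, attR) == (attP, attB))  # True
```

The model makes one point from the text explicit: the prophage ends are hybrid sites, so the substrates for excision are not the same as the substrates for integration.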

A general diagram for λ recombination at att is shown in Fig.1. The phage att region is 234 bp and can be subdivided into P and P′ arms. The bacterial att region is only about 25 bp and is subdivided into B and B′ arms. Within each arm, footprinting experiments have identified specific sequences protected by the Int protein (identified by circles in Fig.1): three in the P′ arm, two in the P arm, two in the core region, and one each in the B and B′ arms. IHF has three binding sites designated H in the figure, all within the phage DNA. Two of these sites flank the region of the phage DNA in which exchange occurs. Arthur Landy and his collaborators have shown that the H1 site appears to be of critical importance in regulating the integration event. Catalysis and cleavage of the core sites are functions of the carboxy-end of the integrase molecule (Tirumalai et al. 1998).

Figure.1. Lambda site-specific recombination. The att sites involved in the integration and excision reactions, as well as the proteins required for each, are shown. Gray lines are bacterial DNA, and black lines are phage DNA. The locations of the binding sites of the Int (O), IHF (?), and Xis (◊) have been deduced from footprinting experiments. There are two kinds of Int binding sites: those in the phage arm DNA (P) and two core sites (C). The IHF consensus sequence and coordinates relative to the center of the overlap region are shown under the sequences of the H1 sites. The asterisks indicate the side of the overlap region where the first DNA cuts are made. (Adapted from Thompson, J.F., Waechter-Brulla, D., Gumport, R.I., Gardner, J.F., Moitoso de Vargas, L., Landy, A. [1986]. Mutations in an integration host factor-binding site: Effect on lambda site-specific recombination and regulatory implications. Journal of Bacteriology 168: 1343–1351.)

Rutkai et al. (2003) have examined the extent to which DNA pairing is important for the action of integrase. They deleted the normal bacterial att site and looked at the DNA sequences at which λ now managed to integrate. The leftmost portion of the overlap region was conserved in the new sites along with substantial similarity in the imperfect repeats of the flanking arms.

Kim and Landy (1992) have considered the problem of how the two ends of the prophage can find one another so that recombination can occur. The role of the additional proteins apparently is to provide appropriately bent DNA. IHF is a known DNA-bending protein. If the concentration of Xis is limiting, the host Fis protein (see next section) can substitute. It too is a known DNA-bending protein. The necessity for DNA bending arises from the fact that Int is a monomeric protein with two specific binding activities. The amino terminus binds with high affinity to a site in the arm, while the carboxy terminus binds with low affinity to the core site where strand exchange occurs. Therefore, when excision is to occur, each arm must be folded over a core site and the core sites juxtaposed so that exchange is possible.

Overall regulation of integration and excision is provided by several mechanisms. The end of the xis gene overlaps the pI promoter so that when Xis is being produced, Int expression is reduced. Furthermore, the weak binding of IHF to the cII region can occur only after the attP binding sites are all occupied. This requirement ensures that the phage DNA is ready to integrate before the temperate mode of transcription becomes predominant.

Circular Chromosome Segregation

Organisms with circular chromosomes must be prepared to deal with the possibility that replication or recombination might accidentally produce a single concatemeric molecule instead of two unit circles. Before chromosome segregation can occur, the chromosomes or low copy number plasmids must be monomeric. The E. coli system that accomplishes this task is the XerC and XerD system. The two proteins show sequence similarities to the essential regions of the λ integrase protein. The actual site of exchange is dif, which is located in the terminus region between terA and terC. Mutations in xerC, xerD, or dif result in filamentous cells with aberrantly segregating chromosomes. The FtsK protein, a protein known to be required early in cell division, participates in the process. Li et al. (2003) bound a protein fused to the green fluorescent protein to the chromosome terminus of replication and tracked the position of the terminus using light microscopy. Movement of the terminus to the center of the daughter cell occurred normally in xer mutants but not in ftsK mutants.

Inversion Systems

Bacteria possess a variety of invertible segments whose orientation is controlled by DNA invertases (for a review, see Johnson [2002]). Invertases are enzymes that catalyze site-specific recombination between the ends of defined DNA segments. Normally, the invertible segment of DNA contains two oppositely oriented genes, only one of which is adjacent to a promoter. Inversion has the effect of turning on the expression of one member of the paired genes while inactivating the other. Site-specific inversion systems are known to control flagellar proteins in Salmonella, fimbriae production in E. coli, and tail fiber production (host range determination) in phages P1 and Mu.
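The on/off logic of such a switch can be sketched in a few lines. This is a minimal illustration, assuming a single fixed promoter that sits just outside the left end of the invertible segment and reads rightward; the `Gene` class and the gene names are hypothetical, and real systems differ in where the promoter lies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    name: str
    strand: int  # +1 reads left to right, -1 reads right to left

def invert(segment):
    """Invert a segment: reverse the gene order and flip every strand."""
    return [Gene(g.name, -g.strand) for g in reversed(segment)]

def expressed(segment):
    """Name of the gene read by the assumed fixed promoter at the left edge, if any."""
    first = segment[0]
    return first.name if first.strand == +1 else None

# Two oppositely oriented genes, only one of which faces away from the promoter.
segment = [Gene("geneA", +1), Gene("geneB", -1)]

print(expressed(segment))                  # geneA
print(expressed(invert(segment)))          # geneB
print(invert(invert(segment)) == segment)  # True
```

Because inversion both reverses the order and flips the strands, a second inversion restores the original state, which is why the switch is reversible.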

The Mu system is a particularly well-studied one. The invertible segment is bounded by two inverted gix sites, each consisting of two inversely oriented 12-bp half-sites separated by two asymmetric spacer bases. In the approximate middle of the invertible segment is a recombination enhancer element that functions independently of orientation or distance to the gix sites. For inversion to occur, host Fis protein (factor for inversion stimulation) must be present. Fis binds to the enhancer and causes synapsis with the two gix sites (Fig.2). Gin protein then causes double-strand breaks at each gix site and catalyzes rotation of the two strands and resealing of recombined ends. The process can be processive, meaning that the reaction may occur more than once on the same DNA molecule. The result of such multiple reactions is a knotted DNA molecule.

Figure.2. Ribbon diagrams of DNA illustrating changes caused by Gin recombination. A DNA substrate for Gin is represented schematically by a ribbon in which the edges are the complementary strands. One side of the ribbon is gray; the other is white. The split arrows indicate the inversely oriented gix sites bound by Gin, and the darker box at the bottom of the loop indicates the enhancer bound by Fis. The DNA substrates for Gin need to be (−) supercoiled. Synapsis of the two gix sites and the enhancer traps two (−) supercoils as shown on the left. The trapped crossings, or nodes, are indicated by −1. Recombination occurs via a double-strand break in each gix site, a 180° right-handed rotation of one pair of half-sites relative to the other and religation to generate the product diagrammed on the right. The rotation of the DNA creates one additional (−) node, while simultaneously overtwisting both sites by a half turn (indicated by +1/2); after deproteinization, these nodes cancel. (Reproduced with permission from Klippel, A., Kanaar, R., Kahmann, R., Cozzarelli, N.R. [1993]. Analysis of strand exchange and DNA binding of enhancer-independent Gin recombinase mutants. The EMBO Journal 12: 1047–1057.)


Integrons

Integrons are small pieces of modular DNA found in bacteria. They have assumed enormous importance because it has been established that they represent a way for pathogenic organisms to share and/or exchange drug resistance genes. The basic integron codes for an integrase (intI) that is a member of the tyrosine recombinase superfamily whose primary member is λ integrase. Other members of the superfamily that have been discussed earlier include Cre (phage P1) and XerCD. As with λ integrase, each intI gene has an associated attI site that is specific to that integrase (Collis et al. 2002), and this difference serves to distinguish different classes of integrons. The final element in an integron is a strong promoter near the att site. In order for the system to operate, there needs to be the equivalent of the attB site (Fig.1), which in this case is the 59-be (59-base element) site found on gene cassettes. A gene cassette is one or more promoterless genes with an associated 59-be. As in the case of phage λ, the actual point of recombination is a core element within the att sites. In some cases, 59-be sites can recombine directly. The process is reminiscent of conjugative transposition because a circular intermediate is formed during the capture of the mobile cassette. While a single integron is a simple structure (Fig.3), more complexity is possible. Vibrio cholerae, for example, contains a super integron that is 126 kb long and contains 179 cassettes (Rowe-Magnus et al. 1999), most of which are inactive. Holmes et al. (2003) argue that integrons represent a major element in genome evolution. They used PCR to sample the population of cassettes available to organisms and showed that there were both protein-encoding and noncoding cassettes to serve as the raw material for evolution.


Figure.3. The general structure of an integron.
(a) An integron includes a gene for an integrase, an att site, and two divergent promoters. One promoter transcribes integrase while the other transcribes the cassette. The gray arrows show the directions of transcription.
(b) In a super integron, additional cassettes are present beyond the actively transcribed one.

Transposons: Single Site-Specific Recombination

Transposons are genetic elements that maintain their own integrity (i.e., their site of recombination is preserved) while integrating into a variety of sites on the target DNA (the target DNA site is not preserved). Included here are phenomena such as the highly promiscuous integration of phage Mu DNA, the integration of the R100 plasmid to form an Hfr, and the movement of various pure transposons (e.g., Tn10). However, all of these phenomena reduce to single site-specific recombination events catalyzed by the insertion elements bounding various transposons. For example, when R100 integrates, it usually loses its Tn10 transposon, which means that the entire transfer region of the R plasmid as well as the antibiotic resistance genes is acting as a large transposon and hopping from one DNA molecule (the R100 plasmid) to another (the E. coli chromosome). To emphasize the size differences, the process has been called inverse transposition (i.e., Tn10 stays where it is, and the rest of the DNA moves).

The molecular mechanisms proposed for transposition basically come down to two. Each involves a transposon located on a donor DNA molecule and a target site that may be located on the same or different DNA molecule. In one model, there is DNA replication involved in the process, and an intermediate cointegrate molecule is formed if the target site is on a separate molecule. A cointegrate structure is one in which two DNA molecules are fused into one. The other model requires no DNA replication as an intrinsic part of the transposition process. This mode of transposition was first suggested by Douglas Berg and is characteristic of several transposons, including Tn5 and Tn10.

Transposon Tn10

Tn10 is the 9.3-kb transposon encoding tetracycline resistance that is found within the R100 plasmid. It carries inverted IS10 elements at its boundaries. As detailed in the article, it has moved into and out of the DNA of a variety of bacteria and their phages and therefore must use an extremely versatile recombination system. A close examination of Tn10 has revealed several important aspects of its behavior. As shown in Fig.4, it can excise itself from a molecule precisely or imprecisely. It can also invert a region of DNA or delete a region of DNA. Prior to the discovery of transposons, these recombination events would have been classified as examples of illegitimate (nonhomologous) recombination.

Lack of replication in this process was demonstrated by preparing λ phages carrying slightly different Tn10 moieties. Strand separation and reannealing were used to make heteroduplexes differing at specific bases. After packaging and infection, when the products of transposition were examined, the transposon was still a heteroduplex. This could happen only if there were no replication as a part of transposition. Despite this observation, transpositions have been observed in which Tn10 apparently remained where it was and also appeared in a new site.

During transposition, double-strand cuts occur at the ends of the transposon (the outer edge of the IS10 elements) catalyzed by a cis-acting transposase enzyme. The IS10 elements at the transposon ends are necessary and sufficient for transposition, implying that they code for the transposase. However, the two IS10 elements are not identical. The left-hand element in the conventional genetic map is vestigial, as all but 13 bp at the tip can be deleted and still give transposition. Similar deletions of the right-hand element cut the frequency of transpositions by 90%.

Figure.4. Conservative transposition of Tn10. (a) Tn10 is composed of a central region coding for resistance to tetracycline with inverted repeat sequences at its ends. Each repeat is an IS10 element, but only IS10R has functional transposase. The major IS10 transcript is RNA-IN, which is named for its direction of synthesis. A minor transcript from the other DNA strand is RNA-OUT. The transcripts overlap at their 5′-ends. IHF is an integration host factor. (b) When transposase is synthesized, it binds to the outer edges of IS10 and makes a single nick. The hydroxyl group thus produced engages in a nucleophilic attack on the opposite DNA strand catalyzed by the transposase. The product is a hairpin loop. Transposase then cleaves the hairpin and uses the hydroxyl group to attack the recipient molecule. If the donor molecule is reassembled correctly, the process was precise excision. If portions of the transposon remain, the process was nearly precise excision.

The departing transposon leaves a double-strand gap in the donor DNA. The observed transposon duplication may be due to the nature of the repair process that acts on this gap. If there is no repair, the gap destroys the integrity of the donor DNA and the molecule cannot replicate. Proteins such as RecA are known to have the capability of joining the ends of broken DNA to repair damage such as that induced by x-rays. A similar phenomenon might occur in this case, and the result would be the precise excision of the transposon. A final alternative would be recombinational repair in which case the missing transposon DNA would be replaced by transposon DNA looped from another DNA molecule and then copied. Note carefully that in this instance the replication is part of the recombination process and not a part of transposition. The net result would be the appearance of an additional copy of the transposon within the cell.

The transposition frequency for Tn10 is about 10⁻⁴ per cell per generation when the cells are growing on a minimal medium. At a frequency of 10⁻⁵ or less, Tn10 also promotes deletions or inversions and deletions. These rearrangements occur preferentially near the transposon itself, whereas transposition targets are located more or less randomly. The normal frequency of Tn10 transposition is of the order of 10⁻⁶ to 10⁻⁷. The regulation of this rate is obtained by several mechanisms. The major system involves the DNA adenine methylase (dam) DNA modification pathway. Transposase primarily acts on the ends of IS10 when they are hemimethylated (i.e., immediately after replication).

A second regulatory mechanism for transposition involves the synthesis of complementary RNA in a manner similar to that used to regulate copy number in R plasmids. The start of the transposase coding region in IS10 is overlapped by a small RNA (RNA-OUT, Fig.4) transcribed from an outwardly directed promoter called pOUT. Inactivation of that promoter allows extra translation of the mRNA transcribed from pIN (RNA-IN) and increases the transposition frequency. Transcription of the transposase gene from outside the transposon is prevented by a double-strand RNA region that sequesters the AUG codon of the transposase so that translation initiation would be difficult if not impossible.

RNA-OUT by itself forms a stem-loop structure. However, the pairing between RNA-IN and RNA-OUT has more hydrogen bonds and is more stable than the intramolecular loop. Therefore, given the opportunity, RNA-IN will pair with RNA-OUT. The coexistence of multiple copies of Tn10 within a single cell is difficult owing to a trans-acting repressor of transposase whose gene lies within IS10. The only transposon-specific protein necessary for transposition is the transposase. However, cellular integration host factor (IHF) plays an important regulatory role, both positive and negative, with respect to transposition. Binding of IHF and transposase to the IS10 element sharply bends the DNA, producing a transposase loop—a stable complex of DNA and protein. The nucleoid DNA-binding protein H-NS is also necessary for normal transposition. Mutant cells can excise the transposon but cannot form the circular intermediate, so Swingle et al. (2004) suggest that its function is to stabilize the bent DNA.

The effect of IHF binding depends on the supercoiling state of the local DNA. If supercoiling is lacking, the effect is inhibitory, presumably on the grounds that there is already a problem with the DNA that transposition would only make worse. Kennedy et al. (1998) have shown that the transposase catalyzes four sequential reactions at each end of the transposon (Fig.4): hydrolysis, transesterification by the hydroxyl group created in the first reaction, hydrolysis, and transesterification by the newly created hydroxyl group. In the process, the enzyme creates a hairpin structure at each end of the transposon. The hairpin breaks as part of the reaction joining the end of the transposon to the new target DNA.

Transposon Tn10 is, comparatively speaking, a specific transposon. An extensive analysis of the DNA sequence at its insertion sites has shown that there is some specificity involved. Approximately 85% of all insertions are found at the sequence NGCTNAGCN, where N represents any base. The sequence is symmetric, so there is no preferred orientation. Pribil et al. (2004) have shown that changes that make DNA bending easier (e.g., a nick) can compensate for alterations in the target sequence. After insertion of Tn10, there is duplication of the 9 bp forming the target site. This duplication presumably indicates that the enzyme making the incision in the target site does so with offset cuts in the manner of a type II restriction enzyme. The transposon ends are ligated to the offset cuts and the gaps filled in by DNA repair, generating the duplications.
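The target-site reasoning above can be checked with a short sketch. It assumes an invented 15-base test sequence and uses the placeholder string "[Tn10]" for the element; the helper names are hypothetical. It shows three things: the consensus equals its own reverse complement, a scan finds matching 9-bp windows, and an insertion at staggered cuts leaves the 9 bp duplicated on both sides of the element.

```python
import re

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]

CONSENSUS = "NGCTNAGCN"
# N matches any base; the lookahead lets overlapping windows be reported too.
PATTERN = re.compile("(?=(" + CONSENSUS.replace("N", "[ACGT]") + "))")

def find_targets(dna):
    """0-based start positions of 9-bp windows that match the consensus."""
    return [m.start() for m in PATTERN.finditer(dna)]

def insert_with_duplication(dna, start, element, size=9):
    """Insert element at a target; the size-bp target ends up on both flanks."""
    target = dna[start:start + size]
    return dna[:start] + target + element + target + dna[start + size:]

# The consensus reads the same on both strands, so no orientation is preferred.
print(reverse_complement(CONSENSUS) == CONSENSUS)  # True

dna = "TTGAGCTTAGCCAAT"  # invented test sequence with one consensus match
print(find_targets(dna))  # [3]
print(insert_with_duplication(dna, 3, "[Tn10]"))
# TTGAGCTTAGCC[Tn10]AGCTTAGCCAAT
```

The duplication is modeled directly as a string operation; in the cell it arises because repair synthesis fills the single-stranded gaps left by the offset cuts.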

Phage Mu Transposition

Phage Mu can carry out transposition in either a replicative or a nonreplicative mode. The latter reaction is called the simple insertion mode and occurs during a new infection of a cell. It is a special case because the donor DNA is linear, whereas other types of transposition involve circular, supercoiled DNA molecules. Mu replication is via replicative transposition.

Consider first the replicative transposition mechanism. Initially, the Mu DNA must be bent into the appropriate configuration. Tetramers of the transposase, product of gene A, bind cooperatively to three sites at each end of the phage DNA to generate a transpososome. There is an enhancer sequence approximately 1 kb in from the left-hand end to which IHF binds, and a sharp bend in the DNA results. The net effect is to bring together the ends of the prophage. By itself, protein A is only 1% effective at transposition. It requires the presence of protein B for maximum effect.

The B protein is a DNA binding protein and ATPase that polymerizes onto DNA (Green and Mizuuchi 2002). In that state, it assists in the selection of the target DNA. If it binds to donor DNA in the vicinity of protein A, the presence of protein A causes ATP hydrolysis and release of the B protein. Therefore, phage Mu does not transpose close to its original location. What protein B does accomplish is to bring together the target DNA and the donor DNA (Fig.5). 

Figure.5. Replicative transposition of phage Mu. (a) Overview of molecular movements. The Mu prophage (thick gray line), located in one DNA molecule, produces two proteins, A and B. The transposase A and host IHF bind to and bend the donor DNA. Meanwhile, the B proteins assemble on a target DNA and participate in binding of the target to the transposase. ATP hydrolysis occurs during the process, and then the B proteins are released. The transposase nicks both target and donor DNA. (b) Magnified view of the strand exchange. Gray lines are donor DNA and Mu prophage, dark lines are target DNA. The transposase makes offset cuts in both DNA molecules. The donor DNA separates and binds to the nicked ends of the prophage. Replication occurs primed by the free ends of the target DNA. The results are two copies of the prophage, duplication of target DNA sequences at the ends of the prophage, and formation of a cointegrate molecule (concatemer). Normal concatemer separating processes will resolve the monomeric DNA molecules.

In Fig.5a, single-strand nicks are made at the ends of the transposon and five-base offset nicks are made in the target region. As in the case of Tn10, it is the 3′-OH ends of the transposon that are first linked to the target. The strand transfer reaction requires one host protein, HU. Proper folding of the helices generates the strand transfer complex (STC, X-shaped structure in Fig.5b) that is held together by ligating the transposon ends to the offset nicks in the target DNA. After ATP hydrolysis and exit of protein B, the ClpX chaperonin arrives to remodel the STC and disassemble the transposases. The gaps remaining in the STC can serve as primers for DNA replication to yield the two structures shown in Fig.5b (see the review by Nakai et al. [2001]). As the gaps of the structure are filled in, the typical 5-base repeat of the target DNA is generated. Note that if the original DNA molecules were circular, the left and right sides of each molecule were linked. Those linkages are not affected by the transposition, and so the final structure is then a single circle (cointegrate) containing two copies of Mu. A recombination event between the two copies of Mu then resolves the cointegrate into single circles each containing a recombinant Mu prophage.
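The topology of cointegrate formation and resolution can be sketched with circles written as lists of segment labels. This is a bookkeeping model only, assuming hypothetical labels (d1, d2, t1, t2), ignoring strand orientation and the 5-base target repeats, and opening the target circle at the start of its list to stand in for the chosen target site.

```python
def cointegrate(donor, target, element="Mu"):
    """Fuse donor and target circles, leaving one element copy at each junction.

    Circles are lists of labels read in one direction; the end joins the start.
    """
    i = donor.index(element)
    # Donor backbone read from just after the element around to just before it.
    backbone = donor[i + 1:] + donor[:i]
    return backbone + [element] + target + [element]

def resolve(circle, element="Mu"):
    """Recombine between the two element copies to give two circles, one copy each."""
    first = circle.index(element)
    second = circle.index(element, first + 1)
    inner = circle[first + 1:second] + [element]
    outer = circle[:first] + circle[second + 1:] + [element]
    return outer, inner

donor = ["d1", "Mu", "d2"]
target = ["t1", "t2"]

fused = cointegrate(donor, target)
print(fused)           # ['d2', 'd1', 'Mu', 't1', 't2', 'Mu']
print(resolve(fused))  # (['d2', 'd1', 'Mu'], ['t1', 't2', 'Mu'])
```

Reading the fused list as a circle shows the donor backbone and the target joined through a Mu copy at each junction, and resolution returns a donor circle that still carries Mu plus a target circle that has gained a copy.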

To derive a simple insertion, only a slight modification must be made in the model. Instead of filling in the gaps in Fig.5b, the remaining links of the transposon DNA to the donor molecule (links to gray lines) are degraded. A cointegrate structure then cannot develop. Instead, the donor DNA is left with a gap, as in the case of Tn10, and the target DNA (black lines) is left with the transposon plus one gap at each end of the transposon. A simple gap-filling reaction restores the molecular integrity of the target.
