Meera Atreya ‘09*, Kevin M. Esvelt, and David R. Liu

*Harvard University

I designed and created enzymes consisting of the catalytic domain of Hin-H107Y, a hyperactive mutant DNA invertase, fused via a flexible linker sequence to engineered zinc finger DNA-binding domains. Individually, the engineered enzymes show both activity and substrate specificity with regard to their target sequences on CCR5. To optimize the activity and specificity of each enzyme, I developed and validated a substrate-linked protein evolution selection scheme. I generated enzyme libraries in E. coli, selected active mutants, and performed recombinase activity assays side-by-side with starting pool samples. Future experiments will use the starting libraries for iterative rounds of directed evolution. After identifying efficient recombinase variants individually, I will test the evolved enzymes for recombinase-mediated excision on CCR5, first in E. coli and subsequently in human cells. These engineered recombinase enzymes have the potential to provide protection from HIV-1 in a manner suitable for delivery in developing nations.


Chemokine receptors

In response to foreign antigens, the body’s immune system concentrates specific white blood cells at the site of infection. Cell migration and activation in this process is mediated by signaling molecules known as chemokines (chemoattractant cytokines) which are secreted early in the immune response (de Silva and Stumpf 2004).

There are four subfamilies of chemokines and their respective receptors, distinguished by patterns of cysteine residues (abbreviated “C”) in the former. For example, the nomenclature “CCR5” refers to the 5th chemokine receptor of the CC subfamily. Several ligands bind to each chemokine receptor, and many ligands bind to multiple receptors. Due to this redundancy, a chemokine receptor knockout rarely results in phenotypic changes (Allen et al. 2007).

CCR5 and HIV infection

Chemokine receptor 5, encoded by gene CCR5, plays a central role in cellular infection by the human immunodeficiency virus (HIV). CCR5 is expressed on the surface of numerous cells of the immune system, including macrophages and CD4+ T cells (Berger et al. 1999). Infection of these cells by HIV-1 is often described in two consecutive stages, the M-tropic and T-tropic phases, corresponding to the target cells: macrophage or T-lymphocyte cells. During viral infection, a glycoprotein on the surface of the virus, gp120, binds to the immune cell-surface molecule CD4 receptor (Janeway et al. 2008). Gp120 undergoes a conformational change upon CD4 binding, allowing the virus to bind to a chemokine receptor in the cell membrane as a co-receptor. Co-receptor binding allows viral glycoprotein gp41 to fuse the cell membrane and viral envelope together, allowing the HIV-1 genome and proteins to enter the cell (Allen et al. 2007; Janeway et al. 2008).

Figure 1
Figure 1. HIV-1 (top) infects a macrophage cell (bottom) via binding of viral protein gp120 to cell-surface molecule CD4 and to co-receptor CCR5, poising gp41 for membrane fusion (BBC 2007).

CCR5 is a required co-receptor for HIV-1 infection of macrophages in the first (M-tropic) phase. The corresponding HIV-1 variants are deemed “R5.” Almost all primary HIV-1 isolates are “R5” viruses, as this is the dominant viral phenotype of new infections (Janeway et al. 2008). It is estimated that “R5” viral strains account for 90% of all HIV-1 infections (O’Brien and Moore 2000).

The CCR5-∆32 mutation

In studies of disease progression following exposure to HIV-1, a genetic mutation in CCR5 has been linked to resistance to HIV-1 infection (de Silva and Stumpf 2004). The CCR5-∆32 mutation is a naturally-occurring deletion polymorphism in CCR5 that yields a non-functional cellular receptor. Thirty-two nucleotides of the CCR5 coding region are absent in the mutation, resulting in a frameshift and premature stop codon (Janeway et al. 2008). The resulting truncated protein is not transported to the cell surface and instead accumulates in the cytoplasm (de Silva and Stumpf 2004). Consequently, individuals homozygous for the CCR5-∆32 allele do not have detectable CCR5 receptors on lymphoid cell surfaces (Carrington et al. 1997) and heterozygous allele carriers express fewer functional receptors on their cell surfaces (Y. Huang et al. 1996).
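The frameshift follows from simple arithmetic: codons are read three nucleotides at a time, so any deletion whose length is not a multiple of three changes the reading frame of everything downstream. The short sketch below illustrates this check; the helper name `causes_frameshift` is hypothetical and is not part of any analysis reported here.

```python
def causes_frameshift(deletion_length: int) -> bool:
    """Return True if deleting this many nucleotides from a coding sequence
    shifts the downstream reading frame (length not divisible by three)."""
    if deletion_length < 0:
        raise ValueError("deletion length must be non-negative")
    return deletion_length % 3 != 0


# The natural delta-32 deletion removes 32 nucleotides: 32 = 3 * 10 + 2.
print(causes_frameshift(32))  # True
# A hypothetical 30-nucleotide deletion would remove 10 whole codons in frame.
print(causes_frameshift(30))  # False
```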

Although rare in Africans and Asians, up to 20% of Caucasians are heterozygous Δ32 allele carriers. In addition to having fewer functional CCR5 receptors due to Δ32 heterozygosity, the truncated protein may reduce wild-type CCR5 and CXCR4 cell-surface expression by forming heterodimers with them, which are retained in the endoplasmic reticulum (de Silva and Stumpf 2004; Agrawal et al. 2007). As “R5” HIV-1 variants require co-receptor CCR5 for infection, heterozygous carriers of the CCR5-∆32 allele experience a delayed onset of AIDS by two to three years if infected with HIV-1. Furthermore, the 1% of Caucasians who are homozygous for the allele are nearly fully resistant to infection by the HIV-1 virus (Y. Huang et al. 1996; Stephens et al. 1998).
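As a rough consistency check on the two frequencies quoted above, the sketch below applies the Hardy-Weinberg relationship. This is an assumption introduced here for illustration (random mating, no selection); the cited studies report observed genotype frequencies and do not rely on this calculation. The function name is hypothetical.

```python
import math


def hardy_weinberg_from_homozygotes(hom_freq: float) -> tuple:
    """Given the homozygous-variant genotype frequency (q squared), return
    (allele frequency q, expected heterozygote frequency 2pq)."""
    if not 0.0 <= hom_freq <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    q = math.sqrt(hom_freq)
    p = 1.0 - q
    return q, 2.0 * p * q


# About 1% of the population is reported as homozygous for the allele.
q, het = hardy_weinberg_from_homozygotes(0.01)
print(f"allele frequency q = {q:.2f}")           # 0.10
print(f"expected heterozygotes 2pq = {het:.2f}")  # 0.18
```

Under these assumptions, a 1% homozygote frequency implies roughly 18% heterozygous carriers, which is in line with the "up to 20%" figure.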

Although extremely rare, there are 12 documented cases of individuals homozygous for the CCR5-∆32 allele who have been successfully infected with HIV-1 (D. Oh et al. 2008). The viral strains in these individuals appear to use CXCR4 as a co-receptor during infection (Agrawal et al. 2007; Lama and Planelles 2007). A contrasting example, however, involves a patient infected with HIV-1 who received a stem cell transplant from a homozygous CCR5-∆32 allele donor and was effectively cured. As a result of transplantation, the patient’s genotype changed from heterozygous to homozygous for the CCR5-∆32 allele. Following the transplant, the patient discontinued highly active antiretroviral therapy (HAART), HIV-1 RNA levels in the patient’s blood and bone marrow became negligible, and CD4+ T cell counts steadily increased. At the time of this writing, more than 20 months post-transplant, no detectable replicating, active HIV-1 has been reported in the patient (Hutter et al. 2009). The CCR5-∆32 allele appears to not only provide resistance to initial infection, but also potentially to treat current infections.

Homozygous allele carriers of the CCR5-∆32 mutation are immunologically healthy. The allele has no known serious side-effects (Carrington et al. 1997), although some studies indicate that heterozygous CCR5-∆32 allele carriers may have increased risks of cervical cancer development (Singh et al. 2008) and susceptibility to symptomatic West Nile virus infection (Glass et al. 2006). Chemokine-receptor functional redundancy has been hypothesized to compensate for the lack of CCR5 in those individuals (Premack and Schall 1996).

A strategy for conferring resistance to HIV-1 infection

Given the promising phenotype of the CCR5-∆32 mutation, many researchers have striven to mimic its effects. Recently, zinc finger nucleases have been targeted to CCR5 to replace the wild-type allele with the ∆32 by homologous recombination (Perez et al. 2008). These genome-editing enzymes introduce a double-stranded DNA break which is corrected via homologous recombination with a provided ∆32 template strand. As a therapy, this method requires adoptive transfer of ex vivo-expanded, zinc finger nuclease-modified CD4+ T cells (Perez et al. 2008), which is not a feasible treatment for use in developing nations. Zinc finger nucleases have been shown to induce toxicity as a result of DNA cleavage at non-target sites, resulting in cell death and apoptosis, and so would be ill-suited for in vivo delivery to target cells (D. Carroll 2008; T. Cathomen and J. K Joung 2008). Furthermore, DNA double-strand break repair systems have been linked to human genome instability and cancer (van Gent et al. 2001). This represents a major limitation for zinc finger nuclease-mediated gene therapy to become suitable for use in humans.

Figure 2
Figure 2. Site-specific recombinase-mediated excision. The sequence of interest (arrow) between recombinase enzyme recognition sites (boxes) is excised, creating a freed loop of DNA (Akopian et al. 2003).

Using the natural ∆32 deletion mutation as a model, I aim to emulate the CCR5-∆32 mutation at the DNA level in the genomes of individuals lacking the allele. Specifically targeting CCR5 on chromosome three, I will use engineered, genome-editing enzymes capable of safely excising DNA to create a deletion in CCR5 which results in a similarly-truncated receptor. If successful, this project will further efforts to confer resistance to HIV-1 infection, or to possibly treat infected individuals by generating a reserve of resistant cells. Furthermore, recombinase enzymes are unlikely to suffer the same disadvantages as zinc finger nuclease enzymes because recombinase-mediated DNA excision does not rely on the generation of DNA damage as a means to effect a CCR5-null phenotype. Given an appropriate delivery mechanism, this therapy should be safe for direct in vivo application, which is a necessity for conditions in the developing world.

Serine recombinase enzymes

Site-specific recombinase enzymes are ideal for genome modification applications due to their ability to recognize precise DNA sequences and remove, replace, or invert the flanked sequence (Gordley et al. 2007).

The serine family of recombinase enzymes (named for their active-site serine residues) catalyzes excision or inversion depending on the orientation of a two-base pair overhang at the center of the recombination sites. If the sites are in direct repeat, the sequence between them is excised. Conversely, a sequence can be integrated into a recombination site via the reverse reaction. Inverted repeats lead to inversion of the intervening sequence (Gordley et al. 2007).
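The dependence of the outcome on site orientation can be illustrated with a toy string model. This is a minimal sketch under simplifying assumptions: each recombination site is treated as an opaque, non-palindromic string, strand exchange within the site is ignored, and the site sequence `GATTC` and the flanking bases are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def recombine(dna: str, site: str) -> str:
    """Toy model of recombination between two sites on one molecule.

    Direct repeat (second site identical): excise the intervening
    sequence together with one copy of the site.
    Inverted repeat (second site is the reverse complement): invert the
    intervening sequence in place.
    """
    first = dna.find(site)
    if first == -1:
        raise ValueError("first site not found")
    after_first = first + len(site)
    direct = dna.find(site, after_first)
    inverted = dna.find(reverse_complement(site), after_first)
    if direct != -1:
        return dna[:after_first] + dna[direct + len(site):]
    if inverted != -1:
        middle = dna[after_first:inverted]
        return dna[:after_first] + reverse_complement(middle) + dna[inverted:]
    raise ValueError("second site not found")


site = "GATTC"  # hypothetical, non-palindromic site
print(recombine("CC" + site + "AAAA" + site + "TT", site))   # CCGATTCTT
print(recombine("CC" + site + "AAGG" + "GAATC" + "TT", site))  # CCGATTCCCTTGAATCTT
```

In the first call the sites are in direct repeat and the intervening `AAAA` is lost; in the second the sites are inverted repeats and `AAGG` is replaced by its reverse complement `CCTT`.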

Figure 3
Figure 3. Serine recombinase enzyme (γδ-resolvase) dimer bound to DNA. Catalytic domains in yellow, interdomain linkers in orange, DNA-binding domains in green, and DNA in gray (Akopian et al. 2003).

Recombinase-mediated excision or inversion in the serine recombinase family occurs via an enzyme tetramer. Enzyme dimers recognize the two sites to be recombined, interact to form a catalytic tetramer between crossover sites, and coordinately cleave all four DNA strands with catalytic serine residues, covalently attaching each recombinase monomer to the DNA backbone (Gordley et al. 2007). The dimers then exchange partners by 180° rotation about a flat, hydrophobic interface, preventing dissociation. Subsequently, the four 3’ hydroxyls attack the serine esters, forming new phosphodiester bonds and ligating the strands without DNA loss (W. Li et al. 2005; Gordley et al. 2007).

The serine recombinase family is attractive for protein engineering applications due to the enzymes’ structural modularity. One domain is responsible for DNA binding, while a distinct, catalytic domain mediates all subsequent enzymatic steps (Gordley et al. 2007). These domains are physically separated and connected by a flexible linker.

Because sequence recognition and catalysis functions of a recombinase can be specified by unrelated protein domains, it is possible to replace the enzyme’s native helix-turn-helix DNA-binding motif with a zinc finger DNA-binding motif (Akopian et al. 2003).

Zinc finger DNA-binding motifs

A zinc finger is a protein domain that binds to DNA via sequence-specific base contacts by an α-helix in the major groove (Segal et al. 1999). Proteins containing zinc fingers are common in eukaryotes; in humans they comprise approximately two percent of all genetically-encoded proteins. Many zinc finger-containing proteins are involved in gene regulation, conferring sequence specificity to transcription factors and other DNA-localized proteins (Matthews and Sunde 2002).

Figure 4
Figure 4. Illustration depicting the full-length CCR5 gene (protein-coding sequence marked by the blue arrow) within a segment of genomic DNA (light and dark blue horizontal lines). The Δ32 deletion region (purple bracket), recombinase-mediated deletion region (green bracket), and central dinucleotides of both recombination sites (green arrows and “TT,” “AA”) are indicated. Chimeric recombinase enzyme target sequences are marked by their respective zinc finger DNA-binding domain recognition sites (orange, pink, gray, turquoise brackets). Although not distinguished in the illustration, the zinc finger domains of chimeric recombinase enzymes A and C bind to the CCR5 coding strand, while those for B and D bind to the non-coding DNA strand. Stop codons for the Δ32 mutant, wild-type, and recombinase-modified CCR5 versions are represented by red brackets.
Figure 5
Figure 5. Illustration of genomic DNA before and after recombinase-mediated excision of the C-terminal region of gene CCR5. Central dinucleotides (marked by “TT” and “AA” in white and yellow) are in direct repeat. The two-base pair overhangs base pair with exchange partners during recombinase-mediated excision. The stop codon introduced by the resulting frameshift is denoted by a red bracket. The resulting protein will lack the proper C-terminal region and will instead contain a segment of amino acids derived from non-coding genomic DNA.

Of all zinc finger classes in the human genome, Cys2His2 zinc fingers, named for pairs of cysteine and histidine residues which interact with the zinc ion, are the most prevalent (Matthews and Sunde 2002). Each zinc finger generally interacts with three DNA base pairs. Unlike helix-turn-helix DNA-binding domains, which often dimerize to recognize symmetric DNA sequences, zinc finger DNA-binding domains are modular and can be combined to recognize extended, asymmetric sequences (Segal et al. 1999; Pavletich and Pabo 1991). For example, three or more zinc fingers together can confer DNA sequence specificity to a protein, allowing it to target a particular gene (Matthews and Sunde 2002).

Several researchers have categorized the zinc finger protein sequences needed to recognize specific triplets of DNA base pairs. These functionally modular sequences can be combined to make proteins that can bind with nanomolar affinity to DNA sequences up to 18 base pairs in length (Segal et al. 1999). Not all triplets can be recognized; while there is nearly complete coverage of 5’-GNN-3’, 5’-ANN-3’, and 5’-CNN-3’ DNA-recognition sequences (Segal et al. 1999; Dreier et al. 2001; Dreier et al. 2005), the 5’-TNN-3’ sequences are largely unknown. Furthermore, not all characterized sequences exhibit specific binding, and reported success rates of the strict modular assembly methods vary considerably. Library selection for improved binders is often required (Ramirez et al. 2008).
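The modular logic described above can be sketched as a first-pass screen of a candidate target sequence: split it into triplets (one per finger) and flag triplets whose first base falls in the poorly characterized 5’-TNN-3’ class. This is only an illustrative filter, not the evaluation procedure used in this work. The target sequence and function names are hypothetical, and passing the screen does not guarantee that a specific, high-affinity module exists for a given triplet.

```python
# First bases with nearly complete published triplet coverage, per the text
# (5'-GNN-3', 5'-ANN-3', 5'-CNN-3'); 5'-TNN-3' is largely uncharacterized.
COVERED_FIRST_BASES = {"G", "A", "C"}
MAX_TARGET_LENGTH = 18  # up to six fingers, three base pairs each


def split_into_triplets(target: str) -> list:
    """Split a target site (written 5' to 3') into consecutive triplets."""
    target = target.upper()
    if len(target) % 3 != 0 or not 0 < len(target) <= MAX_TARGET_LENGTH:
        raise ValueError("target must be 3 to 18 bp and a multiple of 3")
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    return [target[i:i + 3] for i in range(0, len(target), 3)]


def triplets_without_modules(target: str) -> list:
    """Return triplets that fall outside the well-covered GNN/ANN/CNN classes."""
    return [t for t in split_into_triplets(target)
            if t[0] not in COVERED_FIRST_BASES]


hypothetical_target = "GCAACTCGGTAC"  # illustrative only, not a CCR5 site
print(split_into_triplets(hypothetical_target))       # ['GCA', 'ACT', 'CGG', 'TAC']
print(triplets_without_modules(hypothetical_target))  # ['TAC']
```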

As mentioned previously, replacement of the helix-turn-helix DNA-binding motif of a serine recombinase with a zinc finger DNA-binding motif changes the specificity of the resulting chimeric enzyme to a site recognized by the zinc finger (Akopian et al. 2003). Chimeric enzymes of the serine recombinase family have been retargeted to sites precisely specified by Zif268 zinc finger DNA-binding domains (Akopian et al. 2003). Importantly, such chimeras are capable of recombining their newly-specified target sites in mammalian cells (Gordley et al. 2007).

Materials and Methods

Selection of zinc finger protein domains necessary to bind to enzyme target DNA sequences

After identifying candidate CCR5 recombination sites and recombinase target sequences (see Supplementary Text), I evaluated each of the candidate recombination sites and chose the two sites yielding the four enzyme target sequences with the most promising designed-zinc finger binding affinities and least competition (ability of other zinc fingers to bind to the sequences) (Segal et al. 1999; Dreier et al. 2001; Dreier et al. 2005). Recombination between the selected sites will excise the entire C-terminal coding sequence of CCR5, a targeted deletion of 571 base pairs of genomic DNA.

Creation of chimeric recombinase enzymes with designed zinc finger DNA-binding domains

Hin DNA invertase is a serine recombinase similar in size and organization to γδ-resolvase and Tn3-resolvase (Figures 3 and S3) (Grindley et al. 2006). The hyperactive Hin mutant H107Y, which is able to catalyze efficient DNA inversions and deletions without the Fis regulatory protein, recombinational enhancer sequence, or supercoiled DNA substrate required by the wild-type enzyme, was selected as the catalytic core (Sanders and Johnson 2004).

Oligonucleotides (Integrated DNA Technologies, Coralville, IA) encoding the designed zinc fingers were assembled and ligated to generate DNA sequences encoding the respective zinc finger domains. These zinc fingers were cloned into Hin-H107Y hyperactive recombinase mutants, replacing their C-terminal domains, in a manner analogous to that reported by Stark and coworkers using hyperactive mutant Tn3 resolvase. The flexible peptide sequence linking the catalytic and engineered DNA-binding domain of the most active reported chimeric recombinase was retained (Akopian et al. 2003). These clones yielded chimeric recombinase enzymes designed to target CCR5.

Assaying the activity of each engineered chimeric recombinase enzyme individually

Enzymes were assayed individually using substrate DNA plasmids with target recombination sites containing a total of four identical zinc finger DNA binding sequences. Each enzyme acted as a homotetramer to carry out recombinase-mediated DNA inversion or excision. This strategy allowed determination of the activity of each individual engineered recombinase enzyme, permitting individual optimization prior to testing the four variants together for heterotetramer activity on CCR5.

Recombination sites used in these assays had the following sequences:





For excision assays, both recombination sites were identical. For inversion assays, the second recombination site was the reverse complement of the sequence listed above. Enzyme activity on target recombination sites in E. coli was determined using PCR and gel electrophoresis (see Supplementary Text).

Figure 6
Figure 6. CCR5 coding sequence excerpts of wild-type, Δ32 deletion mutant, and recombinase-treated CCR5 versions. Sequence continuation is abbreviated as an ellipsis. Three of the four chimeric recombinase enzyme target sequences are marked by locations of their respective zinc finger DNA-binding domain recognition sites (orange, pink, and turquoise underlines). Zinc fingers for B and D bind to the noncoding DNA strand, not pictured. The target sequence for recombinase C is not marked as it is excised during recombinase treatment. The “TAG” stop codon, introduced by the recombinase-mediated excision and resulting frameshift, is followed by another stop codon “TAA” just six base pairs downstream, ensuring that the translated protein will be truncated.

Generation of a selection system for directed evolution of engineered chimeric recombinase enzymes

As enzymes capable of acting upon the plasmid encoding them, recombinases are suitable for substrate-linked protein evolution. Using a selection scheme similar to that demonstrated by Buchholz and coworkers, albeit adapted to zinc finger-recombinase recognition sequences, I constructed recombinase selection plasmids (Buchholz and Stewart 2001).

I first crafted DNA plasmids containing a gene for a chimeric recombinase X (A, B, C, or D) as well as two recombination sites for recognition by chimeric recombinase Y (a different recombinase). By design, such plasmids would be stable and show no recombination. Pairings were as follows: recombinase A with recombination sites C, recombinase B with recombination sites D, recombinase C with recombination sites B, and recombinase D with recombination sites A.

Figure 7
Figure 7. Selection plasmid RecX-SitesY construct.

The selection plasmid for a given recombinase enzyme contained several restriction endonuclease sites flanked by recombinase recognition sites. Adjacent to this segment was the recombinase gene itself. Upon transformation into competent E. coli, plasmids carrying an active recombinase library member would remove the restriction endonuclease sites by recombinase-mediated excision. After DNA isolation and restriction digest, active library members would remain uncut, while inactive members would be digested. The subsequent amplification with PCR primers flanking the excision region and the recombinase gene should positively select for library members active on the desired site. In the event of incomplete restriction endonuclease digests, a successfully recombined product would still be distinguishable by size via gel electrophoresis, significantly reducing the background.

Figure 8
Figure 8. Recombinase selection scheme. Active enzymes excise the region subjected to restriction endonuclease digestion, while inactive library members are cleaved. After complete digestion, only active library members yield PCR products. In the event of incomplete digestion, recombined products can be gel-purified before subsequent amplification and diversification steps.
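The logic of the selection scheme can be summarized in a small simulation. This is a simplified sketch, not the protocol itself: each library member is reduced to an active/inactive flag, incomplete digestion is modeled as the worst case in which every inactive template survives, and the function names are hypothetical. The two band sizes are those reported in the gel captions of this paper.

```python
UNRECOMBINED_BP = 2037  # band size reported for un-recombined plasmids
RECOMBINED_BP = 1017    # band size reported for recombined plasmids


def pcr_products(library, digestion_complete=True):
    """Return (name, product size) for each member that yields a PCR product.

    library: iterable of (name, is_active) pairs. Active members have excised
    the restriction sites, so their templates survive digestion. Inactive
    members are cut and give no product unless digestion was incomplete.
    """
    products = []
    for name, is_active in library:
        if is_active:
            products.append((name, RECOMBINED_BP))
        elif not digestion_complete:
            products.append((name, UNRECOMBINED_BP))
    return products


def gel_purify(products, size=RECOMBINED_BP):
    """Keep only members whose product runs at the recombined size."""
    return [name for name, bp in products if bp == size]


library = [("variant1", True), ("variant2", False), ("variant3", True)]
print(pcr_products(library))
# [('variant1', 1017), ('variant3', 1017)]
print(gel_purify(pcr_products(library, digestion_complete=False)))
# ['variant1', 'variant3']
```

Even when digestion is incomplete, gel purification of the shorter band recovers the same active members, which is the background-reducing property the scheme relies on.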

With these plasmids, I used HindIII and EcoRI restriction enzymes (New England Biolabs, NEB, Ipswich MA) to cleave the zinc finger regions and therefore separate the DNA-binding domain of recombinase X from the backbone containing sites Y. However, only purification of the zinc finger region, not the backbone, was feasible due to the similar sizes of digested, backbone DNA and undigested vector.

Figure 9
Figure 9. Selection plasmid Rec-Chl-SitesY construct to facilitate backbone purification following HindIII and EcoRI digests.
Figure 10
Figure 10. Test constructs used to assay engineered chimeric recombinase activity in E. coli for enzymes A (top) and B (bottom). Upon recombinase-mediated inversion, the binding site for primer 2 is inverted, allowing it to form a PCR product with primer 1.
Figure 11
Figure 11. Left: 1Kb+ DNA ladder image (Ingenesys, Co.). Right: Agarose gel depicting the presence or absence of PCR products, reflecting enzyme activity on target recombination sites or lack thereof. Lanes 1 and 15: 1Kb+ DNA Ladder. Lanes 2-4 and 9-11: PCR templates were plasmids with the correct target recombination sites but different chimeric recombinase zinc finger regions as controls. No product was expected. Lanes 5-8 and 12-15: PCR templates were plasmids with the correct recombination sites and respective zinc finger regions. Expected products were ~770 base pairs long for A and ~700 base pairs long for B.

To facilitate backbone gel-purification, I created four new plasmids expressing the comparatively large chloramphenicol resistance gene in place of the short zinc finger region. This resulted in plasmids encoding a gene for the chimeric recombinase catalytic domain fused to the chloramphenicol resistance gene. Like the zinc finger region, the resistance gene was flanked by HindIII and EcoRI restriction sites. These plasmids, too, would be unable to recombine due to the recombinase-chloramphenicol fusions and lack of zinc finger DNA-binding domains.

Figure 12
Figure 12. Test constructs used to assay engineered chimeric recombinase activity in E. coli for enzymes C (top two plasmids) and D (bottom two plasmids). Upon recombinase-mediated inversion one primer binding site was inverted, permitting PCR product formation.
Figure 13
Figure 13. Agarose gel depicting the results of an enzyme activity assay for chimeras C and D. Lane 1: 1Kb+ DNA ladder. Lanes 3 and 5: PCR on plasmid DNA collected from cells transformed with the proper recombination sites plasmid only. No products were expected. Lanes 4 and 6: PCR on plasmid DNA collected from cells transformed with both the proper recombination sites plasmid and recombinase plasmid. A 375-base pair long product was expected to result from recombinase-mediated inversion and those bands are surrounded by a maroon box. All other bands are unknown.

Restriction digests cleave the zinc finger region, or chloramphenicol resistance gene, from the backbone vector (containing the recombination sites and the Hin-H107Y recombinase catalytic domain). Subsequent ligation reactions pair each recombinase zinc finger region with the catalytic domain on plasmids containing the appropriate recombination sites. The chimeric enzymes recombine the plasmid DNA encoding them, excising a large fragment. Therefore, active variants were identifiable as those with proper DNA deletions. Chimeric recombinase libraries were created and analyzed as described in the supplementary text.


Engineered chimeric recombinase enzymes have activity on target recombination sites

After creating chimeric recombinase enzymes engineered to bind to CCR5, I assayed the enzymes individually in E. coli for homotetramer activity on their respective sites. To evaluate the first two chimeras, A and B, I expressed the plasmids shown in Figure 10.

Figure 14
Figure 14. Selection plasmid RecX-SitesX (left) and RecX-Library-SitesX (right) constructs resulting from cloning X zinc finger region inserts into backbone vector containing X recombination sites.

Recombination sites on the plasmids were oriented for recombinase-mediated DNA inversion. To assay for enzyme activity, PCR reactions were performed using primers that bind in the same orientation, and therefore cannot generate a product, unless a recombinase inverts the segment containing one primer’s binding sequence. PCR product formation thus indicated recombinase activity, as no product was expected from non-inverted substrates.
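The reasoning behind this assay can be made explicit with a small model of primer geometry: two primers amplify only if they anneal to opposite strands and extend toward each other. The sketch below is illustrative only; the coordinates, class name, and function names are hypothetical and do not correspond to the actual constructs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerSite:
    position: int  # coordinate of the binding site along the template
    strand: str    # "+" extends rightward, "-" extends leftward


def yields_product(p1: PrimerSite, p2: PrimerSite) -> bool:
    """Primers amplify only when on opposite strands and pointing inward."""
    plus, minus = (p1, p2) if p1.strand == "+" else (p2, p1)
    if plus.strand != "+" or minus.strand != "-":
        return False
    return plus.position < minus.position


def invert_segment(site: PrimerSite, start: int, end: int) -> PrimerSite:
    """Inverting [start, end] mirrors the position and flips the strand
    of any primer site that lies inside the segment."""
    if start <= site.position <= end:
        flipped = "-" if site.strand == "+" else "+"
        return PrimerSite(start + end - site.position, flipped)
    return site


primer1 = PrimerSite(position=100, strand="+")  # outside the invertible segment
primer2 = PrimerSite(position=600, strand="+")  # inside the invertible segment

print(yields_product(primer1, primer2))                            # False
print(yields_product(primer1, invert_segment(primer2, 400, 800)))  # True
```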

As confirmed via sequencing, recombinase-mediated inversion occurred with engineered chimeric enzymes A and B.

To assess the activities of recombinases C and D, I transformed E. coli with two plasmids instead of one. In this activity assay, substrate plasmids were separate from those that encoded the recombinase enzymes. The two sets of plasmids expressed for activity assays of chimeras C and D are depicted in Figure 12.

The activity assays described above support the conclusion that I successfully conferred CCR5 binding specificity to the four chimeric recombinase enzymes. All engineered enzymes were active on their target recombination sites. Activity levels cannot be directly compared because PCR amplification bias renders this a non-quantitative assay. Given that the four engineered chimeric recombinases displayed some level of activity, all were suitable candidates for directed evolution.

Successful enzyme library generation

Following purification of insert X zinc finger regions and backbone vector containing X recombination sites, DNA was ligated and transformed into cells. Inserts for library generation were first subjected to mutagenic PCR to introduce diversity. Ligations produced RecX-SitesX plasmids or RecX-Library-SitesX plasmids, depicted in Figure 14.

The library of engineered zinc finger-recombinase enzyme mutants appeared to be of substantial size for all four enzyme libraries as determined by sample dilutions (Figure S7). Based on those plates, library sizes ranged from 10^7 to 10^8. Sixteen random library members were sequenced to determine the mutation rate for each library, which appeared to be fairly low (Table S1).
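For readers unfamiliar with these estimates, the sketch below shows the standard arithmetic behind a library size estimated from dilution plating and a per-base mutation rate estimated from sequenced clones. All numbers in the example (colony count, dilution, volumes, mutation counts, and the 250-base pair region length) are hypothetical placeholders, not the values from Figure S7 or Table S1.

```python
def estimate_library_size(colonies, dilution_factor, plated_ul, total_ul):
    """Estimate total transformants from colonies counted on one dilution plate."""
    if min(colonies, dilution_factor, plated_ul, total_ul) <= 0:
        raise ValueError("all inputs must be positive")
    transformants_per_ul = colonies * dilution_factor / plated_ul
    return transformants_per_ul * total_ul


def per_base_mutation_rate(mutations_per_clone, region_length_bp):
    """Average mutations per base pair across the sequenced clones."""
    if not mutations_per_clone or region_length_bp <= 0:
        raise ValueError("need at least one clone and a positive region length")
    return sum(mutations_per_clone) / (len(mutations_per_clone) * region_length_bp)


# Hypothetical plate: 50 colonies from 100 uL of a 10^4-fold dilution,
# scaled to a 10 mL (10,000 uL) recovery culture.
print(estimate_library_size(50, 10_000, 100, 10_000))  # 50000000.0

# Hypothetical mutation counts for 16 sequenced clones, 250 bp region.
counts = [0, 1, 0, 2, 0, 0, 1, 0, 0, 1, 0, 0, 3, 0, 0, 0]
print(per_base_mutation_rate(counts, 250))  # 0.002
```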

Library analysis

Libraries of engineered zinc finger-recombinase enzyme mutants were generated in order to isolate mutant enzymes with improved activity and specificity via iterative rounds of selection. Due to time constraints, here I report analyses of the initial libraries only, not those after multiple rounds of directed evolution. Further rounds will be completed in the near future.

Recombinase-mediated excision on normal RecX-SitesX and RecX-Library-SitesX plasmids was assayed using PCR and gel electrophoresis. Normal and library plasmid samples were analyzed after either four or 21 hours in cells, and with or without restriction endonuclease digestion to cleave substrates that did not recombine. Results are shown below:

Figure 15
Figure 15. Agarose gel depicting PCRs on normal RecX-SitesX plasmids and RecX-Library-SitesX plasmids after four hours in E. coli cells but without restriction endonuclease digestion. Lanes 1 and 10: 1Kb+ DNA ladder. Lanes 2-5: RecA-SitesA, RecB-SitesB, RecC-SitesC, and RecD-SitesD. Lanes 6-9: RecA-Library-SitesA, RecB-Library-SitesB, RecC-Library-SitesC, and RecD-Library-SitesD. Un-recombined plasmids are 2037 base pairs long, while recombined plasmids are 1017 base pairs long.
Figure 16
Figure 16. Agarose gels depicting PCRs on normal RecX-SitesX plasmids and RecX-Library-SitesX plasmids after four hours in E. coli cells and after restriction endonuclease digestion. Left lanes 1 and 6: 1Kb+ DNA ladder. Left lanes 2-5: RecA-SitesA, RecB-SitesB, RecC-SitesC, and RecD-SitesD. Right lanes 1 and 6: 1Kb+ DNA ladder. Right lanes 2-5: RecA-Library-SitesA, RecB-Library-SitesB, RecC-Library-SitesC, and RecD-Library-SitesD. Un-recombined plasmids are 2037 base pairs long, while recombined plasmids are 1017 base pairs long.
Figure 17
Figure 17. Agarose gel depicting PCRs on normal RecX-SitesX plasmids and RecX-Library-SitesX plasmids after 21 hours in E. coli cells but without restriction endonuclease digestion. Lanes 1 and 10: 1Kb+ DNA ladder. Lanes 2-5: RecA-SitesA, RecB-SitesB, RecC-SitesC, and RecD-SitesD. Lanes 6-9: RecA-Library-SitesA, RecB-Library-SitesB, RecC-Library-SitesC, and RecD-Library-SitesD. Un-recombined plasmids are 2037 base pairs long, while recombined plasmids are 1017 base pairs long.
Figure 18
Figure 18. Agarose gel depicting PCRs on normal RecX-SitesX plasmids and RecX-Library-SitesX plasmids after 21 hours in E. coli cells and after restriction endonuclease digestion. Lanes 1, 10, and 15: 1Kb+ DNA ladder. Lanes 2-5: RecA-SitesA, RecB-SitesB, RecC-SitesC, and RecD-SitesD. Lanes 6-9: RecA-Library-SitesA, RecB-Library-SitesB, RecC-Library-SitesC, and RecD-Library-SitesD. Lanes 11-14: Rec-Chl-SitesA, Rec-Chl-SitesB, Rec-Chl-SitesC, and Rec-Chl-SitesD after restriction endonuclease digestion serve as the negative controls. Un-recombined plasmids are 2037 base pairs long, while recombined plasmids are 1017 base pairs long.
Figure 19
Figure 19. Agarose gel depicting PCRs on RecX-SitesY plasmids after restriction endonuclease digestion as negative controls. Lanes 1 and 6: 1Kb+ DNA ladder. Lanes 2-5: RecA-SitesC, RecB-SitesD, RecC-SitesB, and RecD-SitesA. Un-recombined plasmids measure 2037 base pairs long, while recombined plasmids measure 1017 base pairs long.

The above gels show that the original engineered recombinase enzymes have some activity, and that library members may have at least as much activity, if not more. Furthermore, PCRs performed on restriction endonuclease-digested templates are biased toward producing the shorter, recombined bands. That digestion is needed to reveal these bands shows both that it is an important step in the selection scheme and that recombinase activity levels are still quite low. The data also show that recombination events appeared considerably more prevalent when plasmids had been in cells for 21 hours rather than four hours. Finally, excision catalyzed by the engineered recombinases appears to be sequence specific; RecX-SitesY plasmids do not recombine (Figure 19).

Selection of active library members and subsequent selected library analysis

Active recombinase enzyme sequences were selected via gel purification of the recombined, 1017-base pair long band. Zinc finger regions were amplified by PCR, cleaved with restriction enzymes, and cloned back into the proper vector backbone. For each library, 16 of these selected RecX-Library-SitesX members were compared to two normal RecX-SitesX plasmids in PCR-based recombination assays. All were sequenced to aid in the categorization of interesting mutants. As these variants were the result of only a single diversification and selection round, however, this selected library pool was not expected to significantly exceed normal enzyme activity. Additionally, the gels below are not quantitative; a darker band may be attributed to differing enzyme activity, PCR bias or a greater amount of DNA being loaded into the lane. However, to make samples as comparable as possible, the PCR reactions were completed after only 20 amplification cycles.

The supplementary material contains gel and sequence analyses that are representative of the first round of directed evolution. The data are presented at this time as an illustration of the current state of the project.


Significance of results

My experimental results demonstrate that chimeric recombinase enzymes can be successfully engineered to recognize target sites, using modular zinc fingers constructed as described by Barbas III and coworkers (Segal et al. 1999; Dreier et al. 2001; Dreier et al. 2005). Furthermore, the enzymes appear to only recombine their specific targets; they have not demonstrated promiscuity in my analyses using RecX-SitesY plasmids.

One notable limitation of the presented data is the absence of quantification. PCR bands visualized on agarose gels indicate the presence or absence of recombination events, but tell little about what fraction was recombined. Based on the data, it appears that in all cases only a small portion of substrate DNA was successfully recombined; quantitative PCR reactions could be done to verify this estimate.
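The quantitative PCR check suggested above could be read out with a standard ΔCt calculation. The following Python sketch is an illustration under stated assumptions, not a protocol used in this work: it assumes one primer pair specific to the recombined product, a second pair that amplifies every plasmid copy, and equal amplification efficiency for both. The function name and the Ct values are hypothetical.

```python
def recombined_fraction(ct_recombined: float, ct_total: float,
                        efficiency: float = 2.0) -> float:
    """Estimate the fraction of plasmid copies that are recombined.

    Assumes (hypothetically) that both amplicons amplify with the same
    per-cycle efficiency; 2.0 means perfect doubling in every cycle.
    """
    delta_ct = ct_recombined - ct_total
    return efficiency ** (-delta_ct)


# Hypothetical Ct values: the recombined-specific amplicon crosses the
# threshold five cycles later than the total-plasmid amplicon.
print(recombined_fraction(ct_recombined=25.0, ct_total=20.0))
```

With these made-up inputs the estimate is roughly 3% recombined. In practice the two primer pairs would need measured efficiencies before such a number could be trusted.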

To perform directed evolution experiments, all that is required is initial activity, regardless of how minimal. As all four candidates satisfied this requirement, I was able to develop and validate a substrate-linked protein evolution selection scheme to select for variants with improved enzymatic activity and specificity.

I was unable to perform iterative rounds of selection due to time constraints, but the analyses of my initial libraries showed promise. Ranging from 10^7 to 10^8 in size, recombinase enzyme libraries were large, although mutation rates per base pair within the mutated zinc finger region were rather low at less than 0.5%. As a whole, library members from this first round of mutagenesis appeared to be more active than the starting pool; this observation was particularly pronounced with library A, which was more active to start.

After selection of successful recombinants, I compared the activity of 16 selected variants to normal, non-mutagenized starting pool enzymes. Both starting pool and library members demonstrated successful recombination, although I did not identify any mutants that were definitively more active than normal variants. This is unsurprising, given that only 16 selected variants from each library were analyzed, and all were the result of just one round of selection. Within the libraries, there appears to be significant background recombinase activity resulting from normal, non-mutagenized variants, as many of the selected library members had no mutations in the zinc finger region.
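The prevalence of unmutated library members is consistent with the low per-base mutation rate. As a rough illustration, the Python sketch below computes the chance that a clone carries no mutation at all, assuming mutations occur independently at each base. The region lengths are hypothetical placeholders (the length of the mutagenized region is not restated here), and 0.5% is the upper bound on the rate reported above, so the true unmutated fraction would be at least this large.

```python
def fraction_unmutated(per_base_rate: float, region_length_bp: int) -> float:
    """Probability that a clone has zero mutations in the region,
    assuming each base mutates independently at per_base_rate."""
    return (1.0 - per_base_rate) ** region_length_bp


# 0.005 is the upper bound on the mutation rate given in the text;
# the region lengths below are hypothetical examples.
for length_bp in (100, 250, 500):
    share = fraction_unmutated(0.005, length_bp)
    print(f"{length_bp} bp region: {share:.1%} of clones unmutated")
```

Under these assumptions, a 100-base pair region would leave about 60% of clones unmutated, which would explain a substantial background of starting-pool sequences after a single round.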

Upcoming experiments

Given the consistent activity of the chimeric enzymes as originally designed, in subsequent experiments I will need to give the enzymes significantly less time in cells to enrich for the most active variants. This strategy will be adopted in all forthcoming experiments. For library enrichment, I will isolate active mutants, clone them into the backbone vector, give them two hours to recombine, and repeat. Following two rounds of enrichment, I will amplify selected library members via mutagenic PCRs and perform iterative rounds of directed evolution, with additional enrichment steps in between.

If necessary, negative selection against a particular undesired target site can be achieved by incorporating “false” recognition sites, differing slightly from the correct site, into the selection plasmids. Recombination using “false” sites surrounding a PCR primer binding site would prevent amplification of less-specific recombinase variants, allowing selection against promiscuous variants.

After evolving highly-effective copies of the four zinc finger-recombinase enzymes individually, I will verify their concerted activity on the CCR5 gene sequence in E. coli to determine if the enzymes can effect a CCR5 deletion. I have already constructed a plasmid designed to assay activity in bacteria by cloning the CCR5 gene into a bacterial vector. The four evolved recombinase genes will be cloned onto one expression plasmid to facilitate transformation into cells containing the CCR5 test plasmid. Only cells containing both constructs will be selected. A restriction endonuclease digest of the CCR5-containing plasmid will enable detection of the proper deletion via gel electrophoresis, with the relative band intensities yielding an estimate of excision efficiency (Thomson and Ow 2006).
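To show how relative band intensities could be turned into an excision estimate, here is a minimal Python sketch. It assumes that stain signal is proportional to DNA mass, so each band's intensity is divided by its fragment length to compare molecule counts. The function name, intensities, and fragment lengths are hypothetical and are not taken from the planned CCR5 digest.

```python
def molar_excision_fraction(intensity_excised: float,
                            intensity_intact: float,
                            length_excised_bp: int,
                            length_intact_bp: int) -> float:
    """Fraction of plasmid molecules carrying the deletion.

    Assumes band intensity scales with DNA mass (length x molecules),
    so intensity / length is proportional to the number of molecules.
    """
    if length_excised_bp <= 0 or length_intact_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    molecules_excised = intensity_excised / length_excised_bp
    molecules_intact = intensity_intact / length_intact_bp
    total = molecules_excised + molecules_intact
    if total == 0:
        raise ValueError("no signal in either band")
    return molecules_excised / total


# Hypothetical gel: equally bright bands of 1000 bp (excised) and
# 2000 bp (intact) mean two excised molecules per intact one.
print(molar_excision_fraction(100.0, 100.0, 1000, 2000))
```

The length correction matters because a shorter, excised fragment gives less signal per molecule; comparing raw intensities alone would understate the excision efficiency.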

Even more accurate quantification of activity could be achieved by assays in vitro. To quantitatively evaluate tetramer activity in vitro, recombinase enzymes could be purified and mixed with the substrate selection vector. DNA sampled at various time points could be digested with restriction endonucleases and electrophoresed on agarose gels to detect the proper deletion. This assay would illuminate the time-frame and efficiency of recombinase-mediated excision.

Should only weak recombinase-mediated excision of CCR5 be observed, further directed evolution will be performed to select for stronger activity of the enzyme tetramer. The method will be similar to before, using restriction endonucleases that will cut at sites inserted in the middle of CCR5.

Following successful CCR5-segment excision in bacteria and in vitro, in vivo assays will be conducted with human cells. Given the results of Barbas III and coworkers, there is reason to expect some degree of activity in human cells (Gordley et al. 2007). An empty selection vector modified for human cells will be co-transfected into human cells with a vector containing all four recombinase genes, downstream of mammalian promoters. Both vectors will also encode different fluorescent protein genes, enabling isolation of co-transfected cells by fluorescence-activated cell sorting (FACS). After fixed time intervals, co-transfected cells will be selected via FACS and their DNA extracted. Recombinase activity in human cells will be assayed using qPCR to determine the percentage of transfected cells harboring a CCR5 deletion.

Engineered recombinase enzymes as gene therapy agents conferring protection from HIV-1

The deletion in CCR5 created by the engineered recombinase enzymes is expected to be as effective as the natural Δ32 deletion conferring HIV-1 resistance because the enzymes will generate a truncated protein in an analogous manner. Although a number of amino acids are added to CCR5 prior to the frameshift-induced stop codon, they are not especially hydrophobic. There is no reason to expect that they would allow efficient CCR5 membrane insertion, but this requires verification in human cells. CCR5 Δ32 and the recombinase-treated protein version should be directly compared in human cells to determine if the mutation was properly emulated and resistance to HIV-1 conferred.

The development of safe, efficient recombinase enzymes for genome modification would represent a significant step towards gene therapy; however, it is only one component of a successful treatment. A method for safe and reliable enzyme delivery to the appropriate cells is equally necessary, and is being pursued by numerous researchers (Lu et al. 2004; Li and Huang 2006).

At present, there are few genome modification methods with the potential to generate a precise deletion in a gene such as CCR5. To date, the only enzymes capable of performing such a deletion are zinc finger nucleases (Perez et al. 2008), which have been demonstrated on this specific target. However, their potential for development into an efficacious therapy is doubtful due to the necessity of generating a double-strand break in the target cell, which leads to non-homologous end-joining and the potential for deleterious and possibly carcinogenic chromosomal rearrangements in at least a fraction of the modified cells (van Gent et al. 2001). While very effective in modifying cells in culture, which could conceivably be cultured and transferred to the patient, this therapy is unlikely to be suitable for direct delivery. As such, the development of safe and specific enzymes for direct gene modification in patients remains an open problem.

A more suitable method for delivery of engineered recombinase enzymes is direct viral vector injection. For example, the HIV-1-based lentivirus vector VRX496 might be modified to deliver the engineered chimeric recombinase genes (encoded with a viral or human cell promoter) (Lu et al. 2004). This method would naturally target all of the relevant immune cells, delivering the recombinase therapy precisely where it is needed.

The purpose of this protein engineering research is to advance efforts to prevent HIV-1 infection; however, the project only addresses the gene-modification aspect of gene therapy—one of many important challenges. Engineered chimeric recombinase enzymes show promise towards the generation of an HIV-1 gene therapy that would be safe for use with direct delivery. Therefore, such a therapy could be suitable for treatment of individuals in developing countries, where a more involved gene delivery method is not a feasible option and where need for HIV-1 protection is dire.


Agrawal, L. et al., 2007. CCR5Δ32 Protein Expression and Stability Are Critical for Resistance to Human Immunodeficiency Virus Type 1 In Vivo. J Virol, 81(15), 8041-8049.

Akopian, A. et al., 2003. Chimeric recombinases with designed DNA sequence recognition. Proc Natl Acad Sci U S A, 100(15), 8688-91.

Allen, S.J., Crown, S.E. and Handel, T.M., 2007. Chemokine: receptor structure, interactions, and antagonism. Annu Rev Immunol, 25, 787-820.

Berger, E.A., Murphy, P.M. and Farber, J.M., 1999. Chemokine receptors as HIV-1 coreceptors: roles in viral entry, tropism, and disease. Annu Rev Immunol, 17, 657-700.

Buchholz, F. and Stewart, A.F., 2001. Alteration of Cre recombinase site specificity by substrate-linked protein evolution. Nat Biotechnol, 19(11), 1047-52.

Carrington, M. et al., 1997. Novel alleles of the chemokine-receptor gene CCR5. Am J Hum Genet, 61(6), 1261-7.

Carroll, D., 2008. Progress and prospects: zinc-finger nucleases as gene therapy agents. Gene Ther, 15(22), 1463-8.

Cathomen, T. and Joung, J.K., 2008. Zinc-finger nucleases: the next generation emerges. Mol Ther, 16(7), 1200-7.

BBC, 2007. The Process of HIV Infection.

Dreier, B. et al., 2001. Development of zinc finger domains for recognition of the 5’-ANN-3’ family of DNA sequences and their use in the construction of artificial transcription factors. J Biol Chem, 276(31), 29466-78.

Dreier, B. et al., 2005. Development of zinc finger domains for recognition of the 5’-CNN-3’ family DNA sequences and their use in the construction of artificial transcription factors. J Biol Chem, 280(42), 35588-97.

van Gent, D.C., Hoeijmakers, J.H.J. and Kanaar, R., 2001. Chromosomal stability and the DNA double-stranded break connection. Nat Rev Genet, 2(3), 196-206.

Glass, W.G. et al., 2006. CCR5 deficiency increases risk of symptomatic West Nile virus infection. J Exp Med, 203(1), 35-40.

Gordley, R.M. et al., 2007. Evolution of programmable zinc finger-recombinases with activity in human cells. J Mol Biol, 367(3), 802-13.

Grindley, N.D., Whiteson, K.L. and Rice, P.A., 2006. Mechanisms of Site-Specific Recombination. Annu Rev Biochem.

Huang, Y. et al., 1996. The role of a mutant CCR5 allele in HIV-1 transmission and disease progression. Nat Med, 2(11), 1240-3.

Hutter, G. et al., 2009. Long-Term Control of HIV by CCR5 Delta32/Delta32 Stem-Cell Transplantation. N Engl J Med, 360(7), 692-698.

Ingenesys, Co., 1Kb+ DNA Ladder. Available at: [Accessed April 6, 2009].

Janeway, C. et al., 2008. Janeway’s immunobiology 7th ed., New York, NY: Garland Science.

Lama, J. and Planelles, V., 2007. Host factors influencing susceptibility to HIV infection and AIDS progression. Retrovirology, 4, 52.

Li, S.D. and Huang, L., 2006. Gene therapy progress and prospects: non-viral gene therapy by systemic delivery. Gene Ther, 13(18), 1313-9.

Li, W. et al., 2005. Structure of a synaptic gammadelta resolvase tetramer covalently linked to two cleaved DNAs. Science, 309(5738), 1210-5.

Lu, X. et al., 2004. Antisense-mediated inhibition of human immunodeficiency virus (HIV) replication by use of an HIV type 1-based vector results in severely attenuated mutants incapable of developing resistance. J Virol, 78(13), 7079-88.

Matthews, J.M. and Sunde, M., 2002. Zinc Fingers–Folds for Many Occasions. IUBMB Life, 54(6), 351-355.

Mecsas, J. et al., 2004. Evolutionary genetics: CCR5 mutation and plague protection. Nature, 427(6975), 606.

Moreno, C. et al., 2005. CCR5 deficiency exacerbates T-cell-mediated hepatitis in mice. Hepatology, 42(4), 854-62.

O’Brien, S.J. and Moore, J.P., 2000. The effect of genetic variation in chemokines and their receptors on HIV transmission and progression to AIDS. Immunological reviews, 177, 99.

Oh, D. et al., 2008. CCR5Δ32 Genotypes in a German HIV-1 Seroconverter Cohort and Report of HIV-1 Infection in a CCR5Δ32 Homozygous Individual. PLoS ONE, 3(7), e2747.

Pavletich, N.P. and Pabo, C.O., 1991. Zinc finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 A. Science, 252(5007), 809-17.

Perez, E.E. et al., 2008. Establishment of HIV-1 resistance in CD4+ T cells by genome editing using zinc-finger nucleases. Nat Biotechnol, 26(7), 808-16.

Premack, B.A. and Schall, T.J., 1996. Chemokine receptors: gateways to inflammation and infection. Nature Medicine, 2(11), 1174-1178.

Ramirez, C.L. et al., 2008. Unexpected failure rates for modular assembly of engineered zinc fingers. Nat Meth, 5(5), 374-375.

Sanders, E.R. and Johnson, R.C., 2004. Stepwise dissection of the Hin-catalyzed recombination reaction from synapsis to resolution. J Mol Biol, 340(4), 753-66.

Segal, D.J. et al., 1999. Toward controlling gene expression at will: selection and design of zinc finger domains recognizing each of the 5’-GNN-3’ DNA target sequences. Proc Natl Acad Sci U S A, 96(6), 2758-63.

de Silva, E. and Stumpf, M.P., 2004. HIV and the CCR5-Delta32 resistance allele. FEMS Microbiol Lett, 241(1), 1-12.

Singh, H. et al., 2008. CCR5-Delta32 polymorphism and susceptibility to cervical cancer: association with early stage of cervical cancer. Oncol Res, 17(2), 87-91.

Stephens, J.C. et al., 1998. Dating the origin of the CCR5-Delta32 AIDS-resistance allele by the coalescence of haplotypes. Am J Hum Genet, 62(6), 1507-15.

Thomson, J.G. and Ow, D.W., 2006. Site-specific recombination systems for the genetic manipulation of eukaryotic genomes. Genesis, 44(10), 465-76.


