Wednesday, 23 October 2019

Synthetic Biology: An Emerging Approach for Strain Engineering

Synthetic Biology

Owing to nature’s inherent complexity, biological systems have traditionally been recalcitrant to both quantitative study and engineering. However, recent advances in fields such as metabolic engineering, applied molecular biology, and genetics have produced a large number of emerging technologies with the potential to address worldwide issues such as the production of chemicals and fuels, the improvement of human health, and the efficient utilization of plant biomass. Among these technologies, “synthetic biology” has quickly risen into the spotlight as a paradigm for rewiring cellular systems.

Since the term synthetic biology was first used in the early twentieth century, it has existed as a paradigm at the interface of disciplines such as molecular biology, metabolic engineering, systems biology, mathematics, and physics with the goals of designing and building novel proteins, genetic circuits, metabolic networks, and multicellular consortia from the ground up. Over the past decades, many research groups have developed and expanded this field to realize applications in the chemical, pharmaceutical, agricultural, and food industries. Even more so, these applications (and the field as a whole) have been advanced by several main technological driving forces:
(i) advances in novel biological parts construction, such as precise control of gene expression through promoter engineering,
(ii) cost-effective DNA synthesis and sequencing, and
(iii) computational protein design coupled with high-throughput screening.

These technologies have led to advances such as the rapid integration of multiple biological parts to reprogram cellular networks, model-based design of biological systems, construction and optimization of biochemical pathways, design principles for genetic circuits, and the engineering of multicellular systems that use cell-cell communication, including the development of quorum-sensing networks and the construction of microbial consortia.

"Basic Elements"

Synthetic biology views the cell as a collection of many biological “parts” (e.g., DNA, RNA, proteins, regulatory circuits) that are assembled to generate complex biological functions. These components exhibit defined functions in a pathway, network, or cell. Efforts to rewire synthetic systems have thus focused on three layers: transcriptional control, translational control, and protein regulation. This section discusses recent advances in gene synthesis, transcriptional control, and various gene-expression optimization tools to illustrate the importance of engineering these basic elements, in isolation and in combination, for synthetic networks. Each of these elements is essential for complete control of strain engineering, because pathway-engineering applications require predictable control of synthetic parts. From the promoters that regulate individual pathway enzymes to the global regulators that can impart bulk physiological traits, well-characterized synthetic parts play a role in each stage of strain engineering.

Gene Synthesis

The capacity to generate DNA sequences de novo enables biology to be “written” as opposed to “copied” from a template. Dozens of gene synthesis companies have emerged in the past decade, translating new advances in DNA synthesis technologies into a continual decrease in synthesis cost that is reminiscent of Moore’s law. As a result, it is now highly feasible to synthesize genes rapidly for most synthetic biology applications, such as codon optimization for heterologous gene expression, generation of biosynthetic pathways, and even creation of artificial genomes. The first synthetic DNA sequence, created in 1970, was 75 bp long and required 20 man-years of labor. Significantly longer DNA sequences, such as those encoding pathways, can now be assembled by in vitro or in vivo methods such as Gibson Isothermal Assembly, DNA Assembler, Overlap Extension PCR (OE-PCR), Ordered Gene Assembly in Bacillus subtilis (OGAB), Sequence and Ligation-independent Cloning (SLIC), In-Fusion, and Ligase Cycling Reaction (LCR). Moving even further up the scale, the first synthesis of a Mycoplasma genitalium genome, approximately 582,970 bp in length, was reported in 2008 by the Craig Venter Institute. In this work, overlapping cassettes of 5–7 kb were assembled from chemically synthesized oligonucleotides.

These cassettes were combined to create intermediate clones of approximately 1/8 and 1/4 of the length of the genome. All four quarter genomes were then assembled into a complete synthetic genome by transformation-associated recombination cloning in yeast. A similar strategy was applied to the marine cyanobacterium Prochlorococcus, whose 1.66 Mb genome was assembled in yeast. Additionally, a synthetic Mycoplasma mycoides genome could be transplanted into Mycoplasma capricolum to replace its native genome and form synthetic cells with the same phenotypic properties and capacity for self-replication as M. mycoides. Thus, gene synthesis has rapidly transformed our capacity to write DNA, the central code of cells.

Gene synthesis is also important for codon optimization. It is often desirable to produce valuable compounds using proteins naturally found in rare microorganisms. However, either the genetic engineering tools for these organisms are not well developed or the native expression level is too low. Therefore, these proteins are usually expressed heterologously in well-established hosts such as Escherichia coli and yeast for large-scale production. Evolution, however, has led different organisms to prefer different codons. This bias can manifest in heterologous hosts as suboptimal protein expression due to mismatched codon usage. Successful codon optimization strategies usually involve replacing rare codons in the gene with synonymous codons that are used more frequently in the host. As a demonstration of the influence of synonymous codon choice, a synthetic library of 154 variants of green fluorescent protein (GFP), varied randomly at synonymous sites, was expressed in E. coli, yielding a 250-fold range of expression levels across the library. Numerous studies also suggest that synonymous codon usage beyond the initiation region can affect expression.

E. coli strains overexpressing rare tRNAs have been shown to achieve significantly improved gene expression. Further, Welch et al. generated 40 synonymous variants of two different proteins, the DNA polymerase of Bacillus phage Φ29 and a synthetic antibody fragment (scFv), resulting in more than a 40-fold difference in expression level across each library. Additionally, an in vitro programmed microfluidic droplet system was used to autonomously generate customized DNA libraries, successfully synthesizing libraries of yeast ribosome binding sites and of the bacterial protein azurin. These recent advances in inexpensive de novo DNA synthesis enable the creation of large libraries of gene variants that can subsequently be screened for expression in a high-throughput manner, thus providing a large pool of basic elements for synthetic biology applications.
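
To make the codon-replacement idea concrete, the following minimal Python sketch swaps each codon in a coding sequence for the synonymous codon with the highest usage in a host. It is only an illustration under stated assumptions: the usage table is a small placeholder (not measured frequencies), it covers just a few amino acids, and the names HOST_USAGE, translate, and optimize_codons are hypothetical.

```python
# Toy host codon-usage table: amino acid -> {codon: relative usage}.
# ASSUMPTION: the numbers are illustrative placeholders, not measured data,
# and only a few amino acids are included to keep the sketch short.
HOST_USAGE = {
    "M": {"ATG": 1.0},
    "R": {"CGT": 0.5, "CGC": 0.3, "AGA": 0.1, "AGG": 0.1},
    "L": {"CTG": 0.6, "TTA": 0.3, "CTA": 0.1},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "*": {"TAA": 0.6, "TGA": 0.3, "TAG": 0.1},
}

# Reverse lookup: codon -> amino acid (one-letter code, "*" = stop).
CODON_TO_AA = {
    codon: aa for aa, codons in HOST_USAGE.items() for codon in codons
}


def translate(cds):
    """Translate a coding sequence using the toy table above."""
    cds = cds.upper()
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimize_codons(cds):
    """Replace every codon with the host's most-used synonymous codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TO_AA[cds[i:i + 3]]
        usage = HOST_USAGE[amino_acid]
        optimized.append(max(usage, key=usage.get))
    return "".join(optimized)


if __name__ == "__main__":
    original = "ATGAGGCTAAAGTAG"  # Met-Arg-Leu-Lys-stop with rare codons
    print(optimize_codons(original))
```

The encoded protein is unchanged by construction, since only synonymous codons are exchanged. As the library studies above indicate, choosing the most frequent codon everywhere is only one of several possible strategies.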

Transcriptional Control

Gene transcription is regulated by several factors, such as promoter strength, cis- and trans-acting factors, cell growth phase, and expression level of RNA polymerase. This section details the typical promoter structure in prokaryotes and eukaryotes, and describes relevant examples of promoter engineering – a synthetic biology approach for tuning the expression of genes and generating novel synthetic expression parts. As a complement to well-characterized promoters, optimized gene expression vectors (also mentioned in this section) are critical for fine-tuned transcriptional control.

Promoter Engineering

The promoter is a key regulatory part that plays important roles in the performance of a gene, a gene cluster, or a designed gene circuit. A promoter can be thought of as a sequence of DNA, usually located upstream of the genes it controls, that provides an initial binding site for transcription factors and RNA polymerase. In prokaryotes such as E. coli, there are two conserved motifs in a “consensus” promoter sequence that are typically located 35 and 10 bp upstream from the transcription start site. The primary sigma subunit of prokaryotic RNA polymerase and almost all alternative sigma factors constitute a set of transcription factors known in E. coli as the σ70 family. These subunits interact with core RNA polymerase, recognize promoter DNA, and direct the process of transcription initiation.
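
The two-motif promoter structure described above can be illustrated with a short Python sketch that scans a sequence for the best-matching pair of −35 and −10 hexamers. The hexamers TTGACA and TATAAT are the commonly cited E. coli σ70 consensus sequences; the 15 to 19 bp spacer window, the simple match-count score, and the function names are assumptions made for illustration, not a validated promoter predictor.

```python
MINUS_35 = "TTGACA"  # commonly cited E. coli sigma-70 consensus (-35 box)
MINUS_10 = "TATAAT"  # commonly cited E. coli sigma-70 consensus (-10 box)


def count_matches(a, b):
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def best_promoter_hit(seq, spacers=range(15, 20)):
    """Return (score, start_of_minus_35, spacer_length) for the best hit.

    The score is the number of bases (out of 12) that match the two
    consensus hexamers. Returns None if the sequence is too short.
    ASSUMPTION: the 15-19 bp spacer window is an illustrative choice.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - len(MINUS_35) + 1):
        for spacer in spacers:
            j = i + len(MINUS_35) + spacer  # start of candidate -10 box
            if j + len(MINUS_10) > len(seq):
                continue
            score = (count_matches(seq[i:i + 6], MINUS_35)
                     + count_matches(seq[j:j + 6], MINUS_10))
            if best is None or score > best[0]:
                best = (score, i, spacer)
    return best


if __name__ == "__main__":
    example = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GG"
    print(best_promoter_hit(example))
```

A perfect consensus match scores 12. Natural promoters usually deviate from the consensus at several positions, which is one reason measured strengths span such a wide range.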

Promoter Strength Characterization

Knowledge of promoter strength is critical for synthetic circuit and pathway design. Ultimately, such information is needed to select appropriate promoters for the expression of multiple genes in a pathway. To this end, many native promoters have been well studied and subsequently diversified into a wide range of promoter strengths. There are several well-characterized reporter genes used to characterize promoter strength, such as GFP, β-galactosidase (LacZ), and β-lactamase. However, promoter strengths are not uniform and are influenced by growth conditions, among other factors.

For E. coli, a set of promoters associated with stationary-phase genes was characterized by inserting 300–500 bp DNA fragments of a variety of E. coli promoters, taken upstream from the translation initiation codon, into a reporter vector. When the culture changes from exponential to stationary phase, expression of growth-related genes is decreased, whereas a number of stationary-phase genes are turned on. In yeast, several reports have compared the strengths of constitutive and inducible promoters in great detail. Mumberg et al. inserted four constitutive yeast promoters (CYC1, ADH, GPD, and TEF) into plasmids with low (CEN/ARS) or high (2μ) copy number and characterized promoter activity using LacZ. LacZ activity was found to vary by approximately three orders of magnitude across the different promoter and copy-number combinations. The highest expression level was achieved with the 2μ GPD construct, and the lowest with the CEN/ARS CYC1 construct. The expression levels of CYC1 and ADH changed by 2.6- and 30-fold, respectively, when moving between the CEN/ARS and 2μ vectors. Fang et al. further showed evidence that different selective markers, in both plasmids and genomic locations, could affect protein expression.

Partow et al. cloned seven constitutive yeast promoters into an integrative plasmid carrying the reporter gene lacZ. The different reporter systems were stably integrated as a single copy into the genome of S. cerevisiae CEN.PK 113-5D at the URA3 locus, thus avoiding gene copy number variations.

Promoter Library Synthesis

Identification and characterization of native promoters is time-consuming, especially for newly identified organisms that may be of biotechnological interest. However, proven workflows enable one to start from a well-studied promoter and vary its expression strength using library-based screening approaches. The most commonly invoked technique, directed evolution, involves the introduction of mutations through methods such as error-prone PCR, DNA shuffling, and saturation mutagenesis. For example, tuning of constitutive gene expression in lactic acid bacteria through library-based approaches has progressed over the last decade. First, a library of synthetic promoters with strengths spanning three to four orders of magnitude was constructed in Lactococcus lactis. By saturation mutagenesis of the spacer regions between the consensus −35 and −10 motifs, a 400-fold change in activity was observed. However, promoter strength was highly dependent on the organism, as the strengths of promoter variants in L. lactis did not correlate well with their respective strengths in E. coli. A similar approach was applied to chromosomal genes in L. lactis. The phosphofructokinase (pfk) gene was fused to a library of synthetic promoters, and an additional gene copy was introduced into a phage attachment site on the chromosome, resulting in a range of pfk activities from 1.4- to 11-fold higher than the wild type. The simultaneous modulation of pfk, pyruvate kinase (pk), and lactate dehydrogenase (ldh) activities was further investigated by integrating a truncated pfk fragment fused with a library of synthetic promoters at the pfk locus. Downstream gene activity was thus altered from 50% to 350% of the wild-type level.
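
The spacer-randomization strategy can be sketched in silico as follows. This Python fragment holds the −35 and −10 motifs fixed and randomizes the bases between them, which is the design logic behind a saturation-mutagenesis promoter library. The hexamers used are the commonly cited E. coli σ70 consensus, standing in here as placeholders (the L. lactis libraries mentioned above may use different flanking and consensus sequences), and the 17 bp spacer length, seed, and function name are likewise assumptions.

```python
import random

MINUS_35 = "TTGACA"  # placeholder consensus -35 motif (assumption)
MINUS_10 = "TATAAT"  # placeholder consensus -10 motif (assumption)


def spacer_library(size, spacer_len=17, seed=0):
    """Generate `size` unique promoter variants with randomized spacers.

    Only the spacer between the two fixed motifs is varied, mimicking
    saturation mutagenesis of that region. A fixed seed keeps the
    sketch reproducible.
    """
    rng = random.Random(seed)
    variants = set()
    while len(variants) < size:
        spacer = "".join(rng.choice("ACGT") for _ in range(spacer_len))
        variants.add(MINUS_35 + spacer + MINUS_10)
    return sorted(variants)


if __name__ == "__main__":
    for promoter in spacer_library(5):
        print(promoter)
    # The theoretical diversity of a fully randomized 17 bp spacer:
    print(4 ** 17, "possible spacers")
```

Because the theoretical diversity (4^17 spacers in this sketch) far exceeds what can be screened, only a small sample of the sequence space is ever characterized experimentally.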

Optimization of Gene Expression Vectors

Beyond promoters, expression vectors and genetic context influence synthetic construct behavior. High-copy-number vectors (which are usually maintained at tens to hundreds of copies per cell) have long been used for recombinant gene expression, particularly because of their easy manipulation and high expression levels. However, recent studies have shown that such overexpression systems sometimes impose a metabolic burden on the host cell and may also be structurally and segregationally unstable. The mRNA stability and copy number of multicopy (pMB1-based) and low-copy (F-based) plasmids were evaluated using an inducible promoter, a lacZ reporter gene, and 5′-hairpin structures. Increasing the inducer concentration significantly decreased cell growth with the high-copy plasmid, whereas it had little effect with the low-copy plasmid. In an experiment with similar results, the isopentenyl diphosphate (IPP) pathway (which includes the genes dxs and dxr) was expressed from several expression vectors under the control of three different promoters and transformed into three different E. coli strains.

The comparison of different strains revealed that the dxs gene under the control of an arabinose-inducible promoter on a medium-copy plasmid resulted in twofold higher lycopene production than under the control of isopropyl β-D-1-thiogalactopyranoside (IPTG)-inducible trc and lac promoters on medium-copy and high-copy plasmids. When the IPP pathway was investigated in pMB1-based plasmids and F-plasmids, the accumulation of metabolites in the stationary phase was enhanced in both cases, although the cell density was 24% lower with the high-copy plasmid than with the low-copy plasmid. Overexpression of dxs on a high-copy plasmid also significantly decreased cell growth and lycopene production. Thus, the highest-copy plasmid is not always the most effective choice for pathway-engineering applications. Sometimes, a low- or single-copy plasmid expressing a critical enzyme in a pathway can reduce metabolic burden, resulting in a better overall yield of the desired compound.

Functional and Robust Modules

Assembly of basic elements into functional and robust modules is a pressing challenge in synthetic biology. Modules are not simply combinations of random genes. Rather, the construction of functional modules requires computational and experimental rational design to dictate the composition and arrangement of basic elements. Ultimately, synthetic biology can achieve a DNA-level specification of the desired function. In this regard, modules will be developed in a customized fashion, with performance as the prescribed set of parameters. This section discusses recent advances in pathway and gene circuit design in synthetic biology, with strain engineering as the major application.

Synthetic Pathway Modules

Module design at the pathway level usually includes biological parts such as promoters, genes, proteins, and terminators, as described above. Traditional methods usually modify native pathways or express a partial or entire pathway in a heterologous host. The repertoire of parts and enzymes was therefore rather limited, and the potential products were restricted to a small subset of central metabolites and known secondary metabolites. However, recent advances have allowed the design of novel pathways for the production of desired compounds through computational tools. In particular, many databases are designed to help identify the required biological parts and maximize the efficiency of the designed pathway, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG), the BRaunschweig ENzyme DAtabase (BRENDA), the Universal Protein Resource (UniProt), and the BioCyc database. For example, an uncharacterized enolase gene, hpbD, from Pelagibaca bermudensis was identified as an amino acid racemase/epimerase by in silico ligand docking of a library of 87,098 metabolites from the KEGG database. Further, using predictions for the neighboring genes hpbJ and hpbB1 in the genome, the substrates of HpbD were found to be a group of small betaines, on which the enzyme acts through a 1,1-proton transfer reaction. It was thus found that the entire hpbD genome neighborhood constitutes a catabolic pathway that can degrade betaine type-B into α-ketoglutarate.

Pathway Assembly Tools

Several common pathway assembly tools have been developed recently to tackle the goals described above. Gibson assembly technology builds on the two-step thermocycler method that was used to synthesize a complete genome. By using exonuclease III and an antibody-bound Taq DNA polymerase, Gibson assembly enabled one-step in vitro recombination. Specifically, exonuclease III removes nucleotides from the 3′-ends of double-stranded DNA, exposing complementary single-stranded 5′ overhangs that can anneal. Then, the Taq DNA polymerase fills in the remaining gaps to form a double-stranded DNA product. Finally, a ligase joins the 3′-ends and the 5′-ends of the annealed DNA fragments into a covalently sealed molecule. This technology can enable the construction of DNA molecules as large as 583 kb and of clones in E. coli as large as 300 kb. For this reason, Gibson assembly has been used in a number of pathway and strain engineering applications recently.

Another novel one-step cloning method is called the DNA assembler. Based on the traditional homologous recombination mechanism in yeast, the DNA assembler allows the assembly of an entire biochemical pathway by combining each gene cassette with a linearized vector through a single in vivo homologous recombination event in S. cerevisiae. As a proof of concept, a functional D-xylose utilization pathway (∼9 kb, three genes), a functional zeaxanthin biosynthesis pathway (∼11 kb, five genes), and a functional D-xylose-to-zeaxanthin pathway (∼19 kb, eight genes) were constructed using this method.
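
Both methods rely on homologous sequence shared between the ends of adjacent fragments. The following Python sketch shows that principle in silico: fragments are joined wherever the end of one matches the start of the next. It is a string-level illustration only (single strand, exact matches, fragments supplied in order); the function names and the default 20 bp minimum overlap are assumptions rather than parameters of either published protocol.

```python
def join_by_overlap(left, right, min_overlap=20):
    """Join two fragments that share a terminal overlap.

    Finds the longest suffix of `left` that equals a prefix of `right`
    and returns the merged sequence with the overlap included once.
    """
    longest = min(len(left), len(right))
    for k in range(longest, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError("fragments share no overlap of the required length")


def assemble(fragments, min_overlap=20):
    """Assemble an ordered list of overlapping fragments into one product."""
    product = fragments[0]
    for fragment in fragments[1:]:
        product = join_by_overlap(product, fragment, min_overlap)
    return product


if __name__ == "__main__":
    # Short toy fragments; a 4 bp overlap is used only to keep them readable.
    parts = ["AAAACCCCGGGG", "GGGGTTTTACGT", "ACGTCATG"]
    print(assemble(parts, min_overlap=4))
```

In practice, overlaps must be long and unique enough that each fragment end pairs with only its intended neighbor, which is why overlap design is an important step in planning a multi-fragment assembly.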

Pathway Metabolic Flux Optimization Approaches

The number of characterized enzymes and enzymes with solved crystal structures is rapidly growing. Still, it is challenging to identify the required enzymes for a synthetic pathway, let alone the exact balance of enzyme activities for optimal pathway performance. To maximize product titer and productivity, several groups have attempted to balance metabolic flux through a combinatorial approach, in which the expression level of each gene in the pathway is varied simultaneously to find the optimal pathway configuration. However, most of these studies suffer from limited libraries of metabolic pathways or inefficient in vitro cloning techniques. A combinatorial assembly of homologous pathway enzymes was developed in S. cerevisiae, termed “Customized Optimization of Metabolic PAthways by Combinatorial Transcriptional EngineeRing” (COMPACTER). This library contained multiple mutant pathways under the control of several mutant promoter and terminator pairs, generated in a single DNA Assembler step. The ability of COMPACTER to generate improved pathways was demonstrated in both model strains and industrial strains. After a single round, a xylose-utilizing industrial strain was generated that exhibited 69% of the xylose consumption rate of the fastest reported xylose-utilizing strain. A cellobiose-utilizing industrial strain was also improved through COMPACTER, demonstrating the highest reported cellobiose consumption rate and ethanol productivity.

An alternative approach is to rebuild the pathway from the ground up through a technique known as pathway refactoring. In this process, all native regulation is replaced by well-characterized synthetic parts, and genes are codon-optimized. As an example, the nitrogen fixation pathway from Klebsiella oxytoca was refactored into a completely synthetic version of the natural pathway. Thus, the combination of synthetic parts and module development enables rapid strain engineering.
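
The scale of the combinatorial approach is easy to underestimate, so a small Python sketch may help. It enumerates every assignment of a promoter variant to each gene in a pathway, which is the design space that a library-based method samples. The gene and promoter names are hypothetical placeholders, and this enumeration is only an illustration of the combinatorics, not a reproduction of the COMPACTER workflow.

```python
from itertools import product


def combinatorial_designs(genes, promoters):
    """List every way of assigning one promoter variant to each gene.

    Each design is a dict mapping gene name -> promoter name.
    """
    return [
        dict(zip(genes, combo))
        for combo in product(promoters, repeat=len(genes))
    ]


if __name__ == "__main__":
    # Hypothetical three-gene pathway and three promoter strengths.
    genes = ["geneA", "geneB", "geneC"]
    promoters = ["weak", "medium", "strong"]

    designs = combinatorial_designs(genes, promoters)
    print(len(designs), "designs")  # 3 promoters ** 3 genes
    print(designs[0])
```

The number of designs grows as (number of promoter variants) raised to the (number of genes), so even modest pathways quickly exceed what can be built one construct at a time. This is the motivation for assembling the library in a single step and screening it.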

Synthetic Circuit Modules

Another major focus in the development of functional and robust modules in synthetic biology is the design of gene regulatory circuits at the network level. Despite the significant efforts to balance metabolic flux mentioned above, a metabolic pathway in an alternate biological system can often lead to the accumulation of metabolic intermediates, resulting in toxicity to the cell and causing a metabolic burden. To optimize cellular bioprocesses and precisely control gene expression in response to environmental stimuli, significant efforts have been directed toward engineering responsive gene regulatory circuits. By analogy with the construction of electrical circuits, the gene circuit approach treats promoters, repressors, activators, reporter genes, and ribosome binding sites as the nodes of a circuit. While there are inherent flaws in viewing biological components with this paradigm, significant progress has nonetheless been made in creating functional genetic circuits.

Synthetic Circuit Design

The early development of gene circuit design was usually based on a simple mathematical model. To understand and assemble more complex gene networks, more sophisticated computational design tools are being developed. System model optimization in gene circuit design is often carried out through iterative rounds of mathematical design, genetic manipulation, experimental observation, and model refinement. Collectively, these steps comprise the design-build-test cycle described in synthetic biology. As the first step, the development of computer-aided tools provides a basis for efficient gene circuit engineering in synthetic biology. Several software packages have been reported to choose precise genetic components, optimize gene expression, and predict the performance of the resulting system.
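
As an illustration of the kind of simple mathematical model referred to above, the following Python sketch simulates a single gene whose production is repressed according to a Hill function and whose product is lost by first-order decay, then compares the simulated level with the analytic steady state. The model form is a common textbook choice, not one taken from a specific tool mentioned here, and all parameter values and function names are illustrative assumptions.

```python
def production_rate(repressor, alpha=10.0, k=1.0, n=2):
    """Hill-type repression: production falls as repressor level rises.

    alpha = maximal production rate, k = repressor level giving
    half-maximal production, n = Hill coefficient (all illustrative).
    """
    return alpha / (1.0 + (repressor / k) ** n)


def steady_state(repressor, delta=1.0):
    """Analytic steady state of dP/dt = production - delta * P."""
    return production_rate(repressor) / delta


def simulate(repressor, delta=1.0, dt=0.01, t_end=50.0, p0=0.0):
    """Integrate dP/dt = production - delta * P with forward Euler."""
    protein = p0
    for _ in range(int(round(t_end / dt))):
        protein += dt * (production_rate(repressor) - delta * protein)
    return protein


if __name__ == "__main__":
    for repressor in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(repressor, round(simulate(repressor), 3),
              round(steady_state(repressor), 3))
```

Comparing a model's prediction with measurements, and then refining the parameters, is the modeling half of the design-build-test cycle. More complex circuits couple several such equations together.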

Next-Generation Synthetic Circuits

Pioneering work has enabled significant progress in designing biological parts and assembling them into functional synthetic circuits by integrating modeling and experiments. However, the construction of useful next-generation synthetic gene networks to solve societal issues is still an ongoing challenge. Efforts to link synthetic circuit design more closely with endogenous cellular processes have begun, giving rise to a new generation of synthetic circuits. Such complex gene circuits, coupled with more sophisticated integration of fundamental genetic parts, would better mimic natural cellular networks in different organisms. Moreover, they would allow for better integration and control of both innate and heterologous components. Recent studies have begun to emerge in this direction. Examples include a tunable bandpass filter developed in E. coli, which will be useful for studying cellular differentiation and development. In addition, recent developments in stem cell biology have started to uncover the genetic networks responsible for uncontrolled population growth and differentiation. Additionally, analog-to-digital and digital-to-analog converters would enable both the activation of genetic pathways with analog inputs and the conversion of digital representations of cell metabolism back into analog outputs. Thus, by creating new design and computational tools that take biological variability into account from the outset, more reliable synthetic circuits and advanced behaviors can be developed, such as precise control of the cell life cycle for maximum productivity, flexible control of gene expression to minimize metabolic burden, and reversible switching of a cell between two metabolic states. The future is thus bright for merging these next-generation synthetic circuits with strain engineering, with applications ranging from process control for chemical production to gene therapy, programmed microbiome therapy, and “intelligent” plants.

"Microbial Communities"

Up to this point, we have considered strain engineering in single organisms. However, the idea of consortia-based bioprocessing is gaining interest, especially for the processing of biomass into fuels. More generally, in either a natural or an artificial biosystem, a distributed set of organisms may be necessary to achieve complex functions at the macroscopic level. Synthetic biology has the potential to create control systems capable of stabilizing and harnessing microbial consortia. Previous research has established the capacity for cell-cell communication via synthetic regulatory networks and small-molecule signals. In this regard, the development of synthetic gene networks has leveraged well-studied, natural cell-cell communication systems, such as the quorum-sensing phenomenon found in the bacterium Vibrio fischeri. For example, a synthetic cell-cell communication circuit was created using an artificial quorum sensor in E. coli. Secreted acetate was used as the signal molecule, and acetyl phosphate (ACP) interacted with two-component regulators involved in the phosphate starvation response, nitrogen regulation, and chemotaxis. In this case, GFP expression was driven by an ACP-regulated hybrid promoter based on glnAp2. The signal molecule accumulated as a function of cell density, and once its concentration reached a sufficient level, the signal could diffuse across the cell membrane. As a result, a fast transition to high GFP expression was observed in the culture.
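
The density-dependent switch described above can be sketched with a deliberately simple Python model: the signal level is taken to be proportional to cell density, and reporter expression responds to the signal through a steep Hill function, producing a sharp transition near a threshold. Every parameter value, unit, and function name below is an illustrative assumption and is not fitted to the acetate-based circuit.

```python
def signal_level(cell_density, secretion_per_cell=1.0):
    """Signal concentration, assumed proportional to cell density."""
    return secretion_per_cell * cell_density


def gfp_expression(signal, threshold=5.0, hill_n=4, max_gfp=100.0):
    """Hill-type activation: low below the threshold, high above it."""
    return max_gfp * signal ** hill_n / (threshold ** hill_n + signal ** hill_n)


if __name__ == "__main__":
    # Arbitrary density units; expression switches on around the threshold.
    for density in (1.0, 2.5, 5.0, 10.0, 20.0):
        gfp = gfp_expression(signal_level(density))
        print(f"density={density:5.1f}  GFP={gfp:6.2f}")
```

A higher Hill coefficient makes the transition sharper, which corresponds to the fast switch to high GFP expression observed once the culture passes a critical density.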
