Dror Baran
Synthetic Biology and Protein Engineering in Antibody Discovery
Updated: Oct 22, 2020
Synthetic biology is a scientific discipline in which organisms are designed to gain new and improved abilities. These modified organisms have various applications: they can serve as small factories for producing substances such as drugs and chemicals, crops can be engineered for better nutritional value and greater drought and disease tolerance, and engineered organisms can even be used as molecular sensors or catalysts for drug discovery. Synthetic biology is an especially exciting industry because it enables production that is both more eco-friendly and cheaper than traditional chemical-based production.
In recent years, synthetic biology has progressed explosively, mainly due to breakthroughs in protein engineering, recombinant DNA technology (DNA synthesis and next-generation sequencing), and falling computing costs. Together, these technologies have ushered in a new era in which researchers can test billions of molecules in a single experiment, acquire highly accurate data in mere weeks, and draw new findings and understanding from those data.
One of the most well-established fields in which synthetic biology is applied extensively is antibody discovery and engineering. Antibodies are Y-shaped proteins made of two identical “arms,” each composed of a heavy chain and a light chain that together determine the antibody’s unique specificity for its respective antigen (the molecule causing an immune response).

The image on the left is of an ‘IgG’, the most common type of antibody found in blood circulation. The light chains are colored yellow. Each ‘arm’ of the antibody can bind a separate antigen.
Antibodies are produced by our immune system, and their role is to neutralize pathogens and other foreign substances by attaching to an antigen (the specific site on the antigen where the antibody binds is called an antigenic determinant, or epitope). This antibody-antigen binding can physically block the pathogen from entering our cells (e.g., in the case of viruses) and can also tag the foreign molecule, alerting the immune system and recruiting it to eliminate the pathogen.
The ability of antibodies to recognize a specific epitope has made them an important tool for research and diagnostics. Antibodies are used as molecular sensors that can accurately detect even trace amounts of a substance in blood samples. Antibodies are also used to block or activate their cognate signal transduction pathways within our bodies, serving as pharmaceutical drugs against inflammatory disorders, cancer, and many other illnesses. In fact, 73 antibody therapies have been approved by the European Medicines Agency or the Food and Drug Administration, and antibodies make up more than half of the top-selling pharmaceutical products.
Each of us has an extensive antibody repertoire estimated to comprise at least 10^12 unique antibodies. In other words, our immune system maintains billions of different antibody-producing cells, each capable of producing a unique antibody that can potentially bind a different epitope. This repertoire is analogous to an army of immune soldiers that protects our body from pathogens and, in an ongoing arms race, keeps improving after recurring antigen exposures, producing new and better antibodies in a process called affinity maturation.
The ability of our immune system to generate so many different antibodies and to mature them relies on their structure, which has been shaped by evolution over hundreds of millions of years. Most of each antibody chain is highly conserved in structure and amino-acid sequence, while the regions that come in contact with the antigen and determine the antibody’s specificity, named complementarity-determining regions (CDRs), are diverse. This diversity is what allows our antibodies, in theory, to recognize virtually any molecule.

The image on the left is of the antibody variable fragment, the region that directly binds the antigen. The light and heavy domains are colored green and yellow, respectively. The CDRs of the heavy domain are colored and annotated in red. Each antibody binding arm has six CDRs.
How does the immune system create this repertoire of diverse antibodies?
During their development, B cells, the antibody-producing cells, undergo a genetic process called V(D)J recombination. In this process, a full antibody gene is assembled by a semi-random rearrangement of several gene segments, followed by additional processing. In addition, antibodies mature and improve over recurring antigen exposures through a hypermutation process (somatic hypermutation), in which B cells that produce pathogen-recognizing antibodies are mutated and selected by our immune system, generating new B cells. These new B cells then produce antibodies that bind the pathogen better, strengthening the immune system’s defense.
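To make the combinatorial idea concrete, here is a minimal Python sketch of V(D)J-style assembly. The segment names and counts are illustrative placeholders rather than real gene data, and `combinatorial_diversity`, `random_junction`, and `recombine` are hypothetical helpers written for this example. The sketch models only the choice of segments plus a few random junction nucleotides, not the actual cellular machinery.

```python
import random

# Illustrative placeholder pools: names and counts are assumptions, not real gene data.
V_SEGMENTS = [f"V{i}" for i in range(1, 41)]   # 40 hypothetical V segments
D_SEGMENTS = [f"D{i}" for i in range(1, 24)]   # 23 hypothetical D segments
J_SEGMENTS = [f"J{i}" for i in range(1, 7)]    # 6 hypothetical J segments


def combinatorial_diversity(n_v, n_d, n_j):
    """Number of distinct V-D-J combinations from segment choice alone."""
    return n_v * n_d * n_j


def random_junction(rng, max_len):
    """Random nucleotides added at a segment junction (0 to max_len bases)."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_len)))


def recombine(rng, max_junction_len=3):
    """Assemble one toy heavy-chain gene: V + junction + D + junction + J."""
    return (
        rng.choice(V_SEGMENTS),
        random_junction(rng, max_junction_len),
        rng.choice(D_SEGMENTS),
        random_junction(rng, max_junction_len),
        rng.choice(J_SEGMENTS),
    )


if __name__ == "__main__":
    rng = random.Random(0)
    n = combinatorial_diversity(len(V_SEGMENTS), len(D_SEGMENTS), len(J_SEGMENTS))
    print("segment combinations:", n)
    print("one assembled gene:", recombine(rng))
```

In this toy, segment choice alone yields only a few thousand combinations. The random junctions, and the pairing of each heavy chain with a light chain, are what multiply the count much further.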
The remarkable and elegant ways in which our immune system creates antibody repertoires and matures them have been mimicked by synthetic biologists for decades. Scientists have long used molecular biology tools to create relatively simple repertoires by starting from a few DNA templates and introducing random or rational mutations through error-prone amplification and DNA shuffling. Current DNA synthesis technologies and computing capabilities now enable scientists to create DNA libraries of enormous size. Most of these new-age libraries are explicitly designed and therefore contain fewer unwanted sequences and deleterious mutations, allowing researchers to investigate the “functional” antibody sequence space in a much more efficient and affordable manner.
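The classic approach of diversifying a template by error-prone copying can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the template is a made-up sequence, the per-base error rate is an arbitrary parameter, every substitution is treated as equally likely, and `error_prone_copy` and `build_library` are hypothetical names introduced here.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string, swapping each base for a different one with probability error_rate."""
    copied = []
    for base in template:
        if rng.random() < error_rate:
            # Substitute with one of the other bases (uniform choice is an assumption).
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def build_library(template, size, error_rate, seed=0):
    """Generate `size` mutated copies of one template (duplicates are possible)."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]


if __name__ == "__main__":
    template = "GCTAGCTACGATCGATTACG"  # made-up sequence, not a real antibody gene
    for variant in build_library(template, size=5, error_rate=0.05):
        print(variant)
```

A library built this way mutates positions blindly, which is why it tends to contain many unwanted or deleterious variants. An explicitly designed library instead restricts changes to chosen positions and residues.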
Over the last 30 years, synthetic biology has been used to develop genetically modified microorganisms such as bacteria, yeast, and even phages that take up these antibody-encoding DNA libraries, display the encoded antibodies on their surface, and can be isolated according to their ability to bind an antigen of interest. Each of these microorganisms effectively serves as an independent small factory: it maintains its unique antibody sequence, translates the DNA into an antibody protein, and shuttles it onto the organism’s surface, ready to bind antigens. Harnessing these factory-like microorganisms for screening and isolating antibodies has equipped scientists with the ability to easily create and test antibody repertoires more than 10 times larger than that of our own immune system in a single experiment, in a single tube.
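The key idea of display screening is that each clone carries both the antibody and the sequence that encodes it, so isolating a binder also recovers its sequence. The following Python sketch illustrates one selection step under heavy simplification. The "binding assay" is a toy stand-in that counts matches to a made-up motif, and `IDEAL_MOTIF`, `binding_score`, and `select_binders` are hypothetical names, not part of any real screening software.

```python
IDEAL_MOTIF = "ARDYW"  # made-up motif standing in for "what the antigen likes"


def binding_score(cdr):
    """Toy stand-in for a binding assay: count positions matching the made-up motif."""
    return sum(1 for a, b in zip(cdr, IDEAL_MOTIF) if a == b)


def select_binders(library, min_score):
    """Keep the clones whose displayed sequence passes the binding threshold.

    Each clone is represented by its sequence, so the survivors can be
    'sequenced' directly by reading the returned list.
    """
    return [clone for clone in library if binding_score(clone) >= min_score]


if __name__ == "__main__":
    library = ["ARDYW", "ARDAA", "GGGGG", "ARDYA"]
    for clone in select_binders(library, min_score=4):
        print(clone, binding_score(clone))
```

In a real experiment the threshold corresponds to physical steps such as washing away weak binders, and selection is usually repeated over several rounds to enrich the best clones.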
The breakthroughs in DNA synthesis and recombinant microorganisms together with next-generation sequencing have allowed scientists to perform experiments and gather an immense amount of experimental data. This accumulated data has paved the way for computational biology, the new wave of synthetic biology, to come to the forefront. While the computational design of proteins is a relatively young discipline in synthetic biology in comparison with others, major breakthroughs are constantly being made in protein design. In recent years, the computational design discipline has allowed scientists to design antibodies with tailor-made functionalities and desired biochemical properties using advanced algorithms for macromolecular modeling of proteins.
These algorithms can model the complex 3D structures of antibodies and design them accordingly, drawing on the extensive knowledge scientists have gained over decades about the underlying physicochemical principles, such as protein-protein interactions, electrostatic forces, and hydrophobicity. This recent addition to the antibody engineer’s toolbox, in combination with novel DNA synthesis and high-throughput screening platforms, creates a closed learning loop in which researchers can quickly design, build, and test novel antibodies and gain key insights from the millions of data points generated by experimental testing. This leads to more accurate design capabilities that go beyond antibodies and can also be applied to other protein families. With this new learning methodology, the massive amounts of data it generates, falling computing costs, and advances in machine learning, a synthetic biology revolution looks inevitable and is coming soon.
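The design, build, and test cycle can be illustrated with a very small Python sketch. It is not a real design algorithm: a greedy hill-climbing loop stands in for the learning step, the "experiment" is a toy scoring function against a made-up hidden optimum, and `HIDDEN_OPTIMUM`, `assay`, `design`, and `design_build_test` are all hypothetical names introduced for this example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HIDDEN_OPTIMUM = "ARDYW"  # made up; stands in for biology the designer cannot see


def assay(seq):
    """Toy 'experiment': number of positions matching the hidden optimum."""
    return sum(a == b for a, b in zip(seq, HIDDEN_OPTIMUM))


def design(parent, n_variants, rng):
    """Propose variants of the parent, each with one random position replaced."""
    variants = []
    for _ in range(n_variants):
        pos = rng.randrange(len(parent))
        variants.append(parent[:pos] + rng.choice(AMINO_ACIDS) + parent[pos + 1:])
    return variants


def design_build_test(start, rounds, n_variants, seed=0):
    """Run several cycles, carrying the best measured sequence into the next round."""
    rng = random.Random(seed)
    best = start
    history = [assay(best)]
    for _ in range(rounds):
        # Keep the parent among the candidates so the best score never drops.
        candidates = design(best, n_variants, rng) + [best]
        best = max(candidates, key=assay)
        history.append(assay(best))
    return best, history


if __name__ == "__main__":
    best, history = design_build_test("GGGGG", rounds=10, n_variants=50)
    print(best, history)
```

Real pipelines replace the greedy step with structural modeling or machine-learning models trained on the measured data, but the loop has the same shape: each round of testing informs the next round of design.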