Accession Number:



Emergent Rules for Codon Choice Elucidated by Editing Rare Arginine Codons in Escherichia coli

Corporate Author:

Harvard Medical School, Boston, United States

Report Date:



The degeneracy of the genetic code allows nucleic acids to encode amino acid identity as well as noncoding information for gene regulation and genome maintenance. The rare arginine codons AGA and AGG (AGR) present a case study in codon choice, with AGRs encoding important transcriptional and translational properties distinct from the other synonymous alternatives (CGN). We created a strain of Escherichia coli with all 123 instances of AGR codons removed from all essential genes. We readily replaced 110 AGR codons with the synonymous CGU codon, but the remaining 13 recalcitrant AGRs required diversification to identify viable alternatives. Successful replacement codons tended to conserve local ribosomal binding site-like motifs and local mRNA secondary structure, sometimes at the expense of amino acid identity. Based on these observations, we empirically defined metrics for a multidimensional safe replacement zone (SRZ) within which alternative codons are more likely to be viable. To evaluate synonymous and nonsynonymous alternatives to essential AGRs further, we implemented a CRISPR/Cas9-based method to deplete a diversified population of a wild-type allele, allowing us to evaluate exhaustively the fitness impact of all 64 codon alternatives. Using this method, we confirmed the relevance of the SRZ by tracking codon fitness over time in 14 different genes, finding that codons that fall outside the SRZ are rapidly depleted from a growing population. Our unbiased and systematic strategy for identifying unpredicted design flaws in synthetic genomes and for elucidating rules governing codon choice will be crucial for designing genomes exhibiting radically altered genetic codes.
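The naive first step described in the abstract, swapping every in-frame AGR codon for the synonymous CGU, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `replace_agr` and the example sequences are hypothetical, the input is assumed to be a DNA coding sequence already in frame (so CGU is written as CGT), and none of the SRZ metrics (RBS-like motifs, mRNA secondary structure) are modeled.

```python
# Hypothetical illustration: in-frame replacement of rare arginine codons.
# Assumes a DNA coding sequence (T instead of U) whose length is a multiple of 3.

RARE_ARG_CODONS = {"AGA", "AGG"}  # collectively "AGR"


def replace_agr(cds, replacement="CGT"):
    """Return (recoded_sequence, codon_indices_that_were_AGR)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")

    # Split into in-frame codons so out-of-frame AGA/AGG triplets are ignored.
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    positions = [i for i, codon in enumerate(codons) if codon in RARE_ARG_CODONS]
    recoded = "".join(
        replacement if codon in RARE_ARG_CODONS else codon for codon in codons
    )
    return recoded, positions


if __name__ == "__main__":
    recoded, positions = replace_agr("ATGAGACGTAGGTAA")
    print(recoded, positions)
```

As the abstract notes, a blanket substitution of this kind worked for 110 of the 123 codons; the remaining 13 needed diversified alternatives, which a simple lookup like this cannot predict.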

Descriptive Note:

Journal Article

Supplementary Note:

Proceedings of the National Academy of Sciences of the United States of America, 113, 38. Freely available online through the PNAS open access option. Data deposition: The sequences reported in this paper have been deposited in the BioProject database (accession no. PRJNA298327). This article contains supporting information online at



Communities Of Interest:

Modernization Areas:

Distribution Statement:

Approved For Public Release;

Contract Number:


File Size: