Identification of conserved RNA regulatory switches in living cells using RNA secondary structure ensemble mapping and covariation analysis

Strains, growth conditions and in vivo DMS probing
E. coli K-12 MG1655 derivative strains DH5α and TOP10 were streaked on Luria–Bertani (LB) plates and a single colony was picked, inoculated in 4 ml of LB broth and grown overnight at 37 °C with shaking. The day after, the culture was diluted to an optical density at 600 nm (OD600) = 0.05 in 25 ml of LB broth and grown at 37 °C until OD600 ≈ 0.5 (~2 h). For cold shock, 2 ml of this culture was mixed with 2 ml of LB broth prechilled to 0 °C in a water–ice slurry and then incubated at 10 °C for 20 min. For DMS (D186309, Merck) probing, DMS from a fresh 1:4 dilution in ethanol (~2.64 M) was added to the bacteria at a final concentration of 200 mM. Probing was conducted for 2 min at 37 °C or for 30 min at 10 °C (to achieve comparable modification efficiencies) with moderate shaking (800 rpm). Reactions were then quenched by addition of one volume of 1 M DTT, after which bacteria were collected by centrifugation at 17,000g for 1 min. The supernatant was discarded; the pellet was washed twice with 0.5 M DTT and then immediately subjected to RNA extraction.
Human HEK293 cells were cultured in high-glucose DMEM medium (L0104, Biowest), supplemented with 10% FBS (H1138, Merck), 25 U per ml penicillin and 25 μg ml−1 streptomycin at 37 °C and 5% CO2. For ATP depletion experiments, cells were washed twice in PBS and then kept for 20 min in glucose-free DMEM (11966025, Thermo Fisher), supplemented with 10% FBS, 1 mM sodium pyruvate, 25 U per ml penicillin, 25 μg ml−1 streptomycin, 10 mM 2-deoxy-d-glucose (25972-M, Merck) and 10 mM sodium azide (71289, Merck) at 37 °C and 5% CO2. DMS from a fresh 1:4 dilution in ethanol (~2.64 M) was directly added to the cells at a final concentration of 150 mM. Probing was conducted for 2 min at 37 °C. Reactions were then quenched by addition of one volume of 1 M DTT, after which cells were collected by centrifugation at 5,000g for 1 min. Supernatant was discarded and pellets were immediately lysed by direct addition of 1 ml of ice-cold TRIzol reagent (15596018, Thermo Fisher Scientific).
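The DMS volumes implied by the concentrations above can be checked with a short calculation. The sketch below is illustrative only: the density (1.33 g/ml) and molar mass (126.13 g/mol) of DMS are assumed handbook values, the 1:4 dilution is read as one part DMS in four parts total, and the 1,000 μl sample volume is hypothetical.

```python
# Assumed handbook values for neat dimethyl sulfate (not stated in the text).
DMS_DENSITY_G_PER_ML = 1.33
DMS_MOLAR_MASS_G_PER_MOL = 126.13


def stock_volume_ul(sample_volume_ul, stock_molar, final_molar):
    """Volume of stock to add so that the mixture reaches final_molar.

    Solves C_stock * V = C_final * (V_sample + V) for V.
    """
    if final_molar >= stock_molar:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_volume_ul * final_molar / (stock_molar - final_molar)


# Molarity of neat DMS, then of the 1:4 dilution in ethanol (about 2.64 M).
neat_molar = DMS_DENSITY_G_PER_ML * 1000.0 / DMS_MOLAR_MASS_G_PER_MOL
diluted_molar = neat_molar / 4.0

# Hypothetical 1,000 ul sample probed at 200 mM (bacteria) or 150 mM (HEK293).
volume_bacteria_ul = stock_volume_ul(1000.0, diluted_molar, 0.200)
volume_hek_ul = stock_volume_ul(1000.0, diluted_molar, 0.150)
```

The formula accounts for the volume added by the DMS stock itself, which is not negligible at these concentrations.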
RNA extraction
For E. coli, cell pellets were resuspended in 62.5 μl of resuspension buffer (20 mM Tris-HCl pH 8.0, 80 mM NaCl and 10 mM EDTA pH 8.0), supplemented with 100 μg ml−1 final lysozyme (L6876, Merck) and 20 U of SUPERase•In RNase inhibitor (A2696, Thermo Fisher Scientific), by vigorous vortexing. Samples were incubated at room temperature for 1 min, followed by addition of 62.5 μl of lysis buffer (0.5% Tween-20, 0.4% sodium deoxycholate, 2 M NaCl and 10 mM EDTA). Samples were then inverted 5–10 times and incubated at room temperature for 2 min, followed by an additional 2 min on ice. Finally, 1 ml of ice-cold TRIzol reagent was added and samples were vigorously vortexed for 15 s.
Both bacterial and human samples were extracted as per manufacturer instructions. Residual genomic DNA (gDNA) was removed by digestion with TURBO DNase I (AM2239, Thermo Fisher Scientific) at 37 °C for 30 min.
DMS probing of bacterial in vitro refolded RNA
First, 10 μg of total RNA from exponentially growing E. coli was diluted in 89 μl of nuclease-free water, then heat-denatured at 95 °C for 2 min and immediately chilled on ice for 1 min. Next, 10 μl of ice-cold 10× folding buffer (250 mM HEPES pH 7.5 and 2 M KCl) was added and samples were incubated at 37 °C for 15 min. Then, 1 μl of 500 mM MgCl2 (prewarmed at 37 °C) was added and samples were incubated at 37 °C for 15 min to enable tertiary-structure formation. Probing was conducted by adding DMS at a final concentration of 200 mM and incubating the samples at 37 °C for 2 min. Reactions were then quenched by addition of 1 volume 1 M DTT, after which RNA was cleaned up on Monarch spin RNA cleanup columns (10 μg; T2030L, New England Biolabs) as per manufacturer instructions.
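As a sanity check on the refolding conditions, the final buffer composition before DMS addition follows from the listed volumes. The sketch below assumes the three additions (89 μl of RNA, 10 μl of 10× buffer and 1 μl of MgCl2) simply sum to 100 μl; the helper name is illustrative.

```python
def final_conc(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after diluting stock_volume_ul of stock into total_volume_ul."""
    return stock_conc * stock_volume_ul / total_volume_ul


rna_ul, buffer_ul, mgcl2_ul = 89.0, 10.0, 1.0
total_ul = rna_ul + buffer_ul + mgcl2_ul  # 100 ul before DMS is added

hepes_mM = final_conc(250.0, buffer_ul, total_ul)   # 10x buffer: 250 mM HEPES
kcl_mM = final_conc(2000.0, buffer_ul, total_ul)    # 10x buffer: 2 M KCl
mgcl2_mM = final_conc(500.0, mgcl2_ul, total_ul)    # 500 mM MgCl2 stock
```

Under these assumptions the RNA is refolded in 25 mM HEPES, 200 mM KCl and 5 mM MgCl2.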
Extraction and DMS probing of bacterial native deproteinized rRNA
Native deproteinized E. coli rRNA was prepared as previously described22. Briefly, 2 ml of DH5α or TOP10 cells grown to OD600 ≈ 0.5 were collected by centrifugation at 1,000g for 5 min (4 °C) and then resuspended in 1 ml of resuspension buffer (15 mM Tris-HCl pH 8.0, 450 mM sucrose and 8 mM EDTA), supplemented with 100 μg ml−1 final lysozyme. Samples were incubated at 22 °C for 5 min and then on ice for an additional 10 min, after which protoplasts were collected by centrifugation at 5,000g for 5 min (4 °C). The protoplast pellet was then resuspended in 120 μl of protoplast lysis buffer (50 mM HEPES pH 8.0, 200 mM NaCl, 5 mM MgCl2 and 1.5% SDS), supplemented with 0.2 μg μl−1 proteinase K (P2308, Merck) and samples were incubated at 22 °C for 5 min, followed by 5 min on ice. SDS was precipitated by addition of 30 μl of SDS precipitation buffer (50 mM HEPES pH 8.0, 1 M potassium acetate and 5 mM MgCl2), followed by centrifugation at 17,000g for 5 min (4 °C). Supernatant was extracted twice with phenol, chloroform and isoamyl alcohol (25:24:1), pre-equilibrated in RNA folding buffer (50 mM HEPES pH 8.0, 200 mM NaCl and 5 mM MgCl2), and twice with chloroform. Deproteinized samples were then supplemented with 20 U of SUPERase•In RNase inhibitor and equilibrated at 37 °C for 20 min. DMS from a 1:4 dilution in ethanol was added to a final concentration of 200 mM and samples were incubated at 37 °C for 2 min with shaking (800 rpm). Reactions were quenched by the addition of one volume of 1 M DTT and then cleaned up using Monarch spin RNA cleanup columns as per manufacturer instructions.
DMS probing of candidate bacterial RNA thermometers in vitro
T7 templates of cspB, cspG, cspI, cpxP and lpxP, including the 5′ UTR and CDS, were generated by PCR from DH5α gDNA using Q5 high-fidelity 2× master mix (M0492L, New England Biolabs). In vitro transcription reactions were performed using the HiScribe T7 high-yield RNA synthesis kit (E2040L, New England Biolabs) in 20 μl, using 1 μg of an equimolar pool of all templates. Reactions were incubated for 4 h at either 37 °C or 10 °C, after which RNA was probed by directly adding 200 mM final DMS to the reactions and incubating at 37 °C for 2 min or at 10 °C for 30 min. Reactions were then quenched by addition of one volume of 1 M DTT, after which RNA was cleaned up on Monarch spin RNA cleanup columns as per manufacturer instructions. Template DNA was then removed by digestion with TURBO DNase I (AM2239, Thermo Fisher Scientific) at 37 °C for 30 min and RNA samples were again cleaned up on Monarch spin RNA cleanup columns.
Bacterial DMS-MaPseq library preparation
DMS-MaPseq libraries were prepared as previously described4, with minor changes. Before library preparation, highly abundant short RNA species, such as tRNAs, were depleted on Monarch spin RNA cleanup columns by loading a 1:1:1 mixture of total RNA in nuclease-free water, RNA-binding buffer and 100% ethanol. For transcriptome-wide DMS-MaPseq libraries, rRNA depletion was performed on 1.1 μg of total RNA using the RiboCop for bacteria kit (126, Lexogen), with two minor changes to the manufacturer’s protocol: the denaturation temperature was increased to 95 °C and the probe annealing temperature was lowered to 55 °C. Following rRNA depletion, RNA was cleaned up on Monarch spin RNA cleanup columns and eluted in 8 μl of nuclease-free water. For total RNA DMS-MaPseq libraries used for the optimization of folding parameters, 1 μg of total RNA was instead directly used as input for the subsequent step. RNA was supplemented with 2 μl of 100 μM random hexamers, 2 μl of deoxynucleoside triphosphates (dNTPs; 10 mM each) and 4 μl of 5× RT buffer (250 mM Tris-HCl pH 8.3, 375 mM KCl and 15 mM MgCl2). Samples were then incubated at 94 °C for 5.5 min to simultaneously denature and fragment the RNA to a median size of 200 nt and immediately transferred to ice for 1 min. Samples were then supplemented with 1 μl of 0.1 M DTT, 20 U of SUPERase•In RNase inhibitor, 200 U of TGIRT-III enzyme (TGIRT50, InGex) and 25 ng μl−1 actinomycin D (A1410, Merck) and incubated at 25 °C for 10 min, 57 °C for 1 h and 60 °C for 1 h. Addition of actinomycin D increased strand specificity by ~10%. TGIRT-III was degraded by adding 2 μg of proteinase K and incubating at 37 °C for 20 min. Proteinase K was inactivated by the addition of protease inhibitor cocktail (P8340, Merck). cDNA–RNA hybrids were then converted to double-stranded DNA (dsDNA) using the NEBNext Ultra II directional RNA second-strand synthesis module (E7550, New England Biolabs) by incubating at 16 °C for 1 h.
dsDNA was cleaned up with 1.8 volumes of NucleoMag NGS cleanup and size select beads (744970, Macherey Nagel) and used as input for the NEBNext Ultra II DNA library prep kit for Illumina (E7645S, New England Biolabs) as per manufacturer instructions.
5′UTR-MaP library preparation
Before library preparation, ~1.5 μg of poly(A)+ RNA was enriched per sample using oligo d(T)25 magnetic beads (S1419S, New England Biolabs). RNA was directly eluted from the beads by fragmentation in 4 mM MgCl2 for 5.5 min at 94 °C and then cleaned up on Monarch spin RNA cleanup columns. Endogenous 5′-phosphate groups and 2′,3′-cyclic phosphates generated by chemical fragmentation were removed by treatment with 1 U of shrimp alkaline phosphatase (rSAP) (M0371L, New England Biolabs) in a final volume of 20 μl at 37 °C for 30 min, followed by cleanup on Monarch spin RNA cleanup columns. Decapping of 5′-capped RNA fragments was performed by treating the RNA with 5 U of Cap-Clip acid pyrophosphatase (C-CC15011H, CellScript) in a final volume of 20 μl at 37 °C for 1 h, followed by cleanup on Monarch spin RNA cleanup columns. Decapped RNA fragments were then ligated to an RNA adaptor (CUACACGACGCUCUUCCGAUCU) harboring a 5′-biotin–TEG modification. Decapped RNA fragments and the RNA adaptor (1 μl of a 10 μM dilution) were first denatured by incubation at 70 °C for 5 min, after which the samples were snap-cooled on ice for 1 min. Samples were then supplemented with 30 U of high-concentration T4 RNA ligase 1 (single-stranded RNA ligase; M0437M, New England Biolabs) and ligation was performed in a final volume of 20 μl at 25 °C for 2 h in the presence of 12.5% PEG-8000 and 1 mM ATP. Then, 10 min before the end of the incubation, 20 μl of Dynabeads MyOne Streptavidin T1 beads (65601, Thermo Fisher Scientific) were aliquoted in a 2-ml tube, washed twice in 100 μl of 2× binding and wash buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA and 2 M NaCl) and then resuspended in 40 μl of the same buffer. RNA samples were then supplemented with 20 μl of nuclease-free water to dilute the PEG and then transferred to the washed beads. Samples were extensively vortexed and then incubated for 15 min at 22 °C in a thermomixer with constant shaking (1,000 rpm). 
Samples were then placed on the magnet, the supernatant was discarded and beads were washed twice with 500 μl of 1× binding and wash buffer by extensive vortexing. Two additional washes were then performed with 500 μl of nuclease-free water by incubating at 80 °C for 2 min. We found this step to be critical when preparing libraries from DMS-treated samples but not 2A3-treated samples, as DMS modifications introduce positive charges on the RNA that, because of the negative charge of the phosphate backbone, cause the RNA to aggregate. Heat denaturation at this stage allows washing away non-5′-cap-derived fragments. Ligated RNA fragments were eluted by incubating the beads in 50 μl of formamide elution buffer (95% formamide and 10 mM EDTA) at 95 °C for 3 min and then cleaned up on Monarch spin RNA cleanup columns. Eluted RNA fragments were ligated to a 5′-preadenylated and C3 spacer 3′-blocked DNA 3′ adaptor (rApp-AGATCGGAAGAGCACACGTCT-SpC3). RNA fragments and the adaptor (1 μl of a 10 μM dilution) were first denatured by incubation at 70 °C for 5 min, after which the samples were snap-cooled on ice for 1 min. Samples were then supplemented with 200 U of T4 RNA ligase 2, truncated KQ (M0373L, New England Biolabs), and ligation was performed in a final volume of 20 μl at 25 °C for 2 h in the presence of 12.5% PEG-8000. Samples were cleaned up on Monarch spin RNA cleanup columns. For 2A3-treated samples, RNA was eluted in 8 μl of nuclease-free water, supplemented with 2 μl of 10 μM RT primer and 1 μl of 10 mM dNTPs; for DMS-treated samples, RNA was eluted in 9 μl of nuclease-free water, supplemented with 2 μl of 10 μM RT primer and 2 μl of 10 mM dNTPs. RNA was incubated at 70 °C for 5 min and then snap-cooled on ice for 1 min. RT reactions were performed in a final volume of 20 μl. 
For 2A3-treated samples, reactions were supplemented with 4 μl of 5× RT buffer (250 mM Tris-HCl pH 8.0 and 375 mM KCl), 2 μl of 0.1 M DTT, 20 U of SUPERase•In RNase inhibitor, 200 U of SuperScript II RTase (18064022, Thermo Fisher Scientific) and 6 mM final MnCl2 and incubated for 1.5 h at 42 °C, 10 min at 50 °C, 10 min at 55 °C, 10 min at 60 °C and 15 min at 75 °C. For DMS-treated samples, reactions were supplemented with 4 μl of 5× RT buffer (250 mM Tris-HCl pH 8.3, 375 mM KCl and 15 mM MgCl2), 1 μl of 0.1 M DTT, 20 U of SUPERase•In RNase inhibitor and 200 U of TGIRT-III enzyme and incubated for 10 min at 42 °C, 1 h at 57 °C and 1 h at 60 °C. The TGIRT-III–RNA–cDNA complex was destroyed by the addition of 1 μl of 10 M NaOH, followed by incubation at 95 °C for 3 min. Reactions were cleaned up on Monarch spin RNA cleanup columns, using one volume of RNA-binding buffer and one volume of 100% ethanol to only recover fragments ≥ 200 nt. Barcoding was performed by PCR using the NEBNext Ultra II Q5 master mix (M0544X, New England Biolabs) as per manufacturer instructions.
HEK293 total RNA DMS-MaPseq library prep
To mimic the same conditions used for the 5′UTR-MaP library preparation, 100 ng of total RNA per sample was fragmented in 4 mM MgCl2 for 5.5 min at 94 °C and then cleaned up on Monarch spin RNA cleanup columns as per manufacturer instructions. The 2′,3′-cyclic phosphates generated by chemical fragmentation were removed by treatment with 1 U of rSAP in a final volume of 20 μl at 37 °C for 30 min, followed by heat inactivation of the enzyme at 70 °C for 5 min. Reactions were then supplemented with 20 U of T4 polynucleotide kinase (M0201L, New England Biolabs), 1 mM ATP and 5 mM DTT in a final volume of 50 μl and incubated at 37 °C for 1 h. The 5′-phosphorylated RNA fragments were then cleaned up on Monarch spin RNA cleanup columns and subjected to adaptor ligation, RT and PCR as detailed above.
Targeted DMS-MaPseq analysis of CKS2 and TXNL4A
Targeted DMS-MaPseq analysis of CKS2 and TXNL4A 5′ UTRs was performed using total RNA from HEK293 transfected for 24 h with the pEF6 vector carrying the wild-type 5′ UTR sequences as described below and probed with 150 mM DMS for 2 min at 37 °C. RT was carried out using gene-specific RT primers targeting the CDS of EGFP, harboring the reverse-complemented Illumina 3′ adaptor. Here, 3 μg of RNA was supplemented with 2 μl of 10 μM gene-specific RT primer and 2 μl of 10 mM dNTPs. RNA was incubated at 70 °C for 5 min and then snap-cooled on ice for 1 min. Samples were supplemented with 4 μl of 5× RT buffer (250 mM Tris-HCl pH 8.3, 375 mM KCl and 15 mM MgCl2), 1 μl of 0.1 M DTT, 20 U of SUPERase•In RNase inhibitor and 200 U of TGIRT-III enzyme and incubated for 10 min at 50 °C, 1 h at 57 °C and 1 h at 60 °C. RT reactions were performed in a final volume of 20 μl. The TGIRT-III–RNA–cDNA complex was destroyed by the addition of 1 μl of 10 M NaOH, followed by incubation at 95 °C for 3 min. Reactions were cleaned up on Monarch spin RNA cleanup columns as per manufacturer instructions. Addition of the Illumina 5′ adaptor and barcoding were performed simultaneously by PCR, using 0.5 μM of i5 and i7 multiplexing primers, 0.025 μM of gene-specific forward primer harboring the Illumina 5′ adaptor and the NEBNext Ultra II Q5 master mix, as per manufacturer instructions.
Cloning of cspG, cpxP and lpxP constructs and mutagenesis of lpxP
Wild-type cspG, cpxP and lpxP FLAG-tagged, IPTG-inducible constructs, including the 5′ UTR and CDS, were prepared by amplifying the relevant regions from DH5α gDNA and cloning them in pET22b(+) vector (69744, Merck) between the XbaI and EcoRI sites. The exact transcription start site (TSS) was determined from DMS-MaPseq coverage. Similarly, cspG, cpxP and lpxP FLAG-tagged, IPTG-inducible constructs, including the sole CDS, were cloned in pET22b(+) between the NdeI and EcoRI sites. For cpxP, as the identified candidate thermometer encompassed part of the CDS, the CDS was cloned starting at the third in-frame ATG codon by exploiting a naturally occurring, in-frame NdeI site. As RNA cotranscriptional folding can be influenced by the speed of the RNA polymerase, the vector’s T7 promoter was replaced with a tac promoter. The SLalt-stabilized lpxP 5′ UTR mutant was prepared using the Q5 site-directed mutagenesis kit (E0554S, New England Biolabs) as per manufacturer instructions. All cloning steps were performed in NEB 5α competent E. coli cells (C2987H, New England Biolabs). All vectors were verified by Sanger sequencing (Macrogen Europe). The sequences of primers used for cloning and mutagenesis are available in Supplementary Table 9. The vector containing the wild-type lpxP gene (inclusive of 5′ UTR) was deposited to Addgene (plasmid 212594).
Cloning and mutagenesis of CKS2 and TXNL4A 5′ UTRs
Wild-type CKS2 and TXNL4A 5′ UTRs were cloned in a modified pEF6 vector between the BamHI and EcoRI sites. Briefly, a sequence encoding EGFP (frame 1)–STOP–T2A–mCherry (frame 3) was assembled by PCR and cloned between the EcoRI and XbaI sites of the pEF6/V5-His vector (K961020, Thermo Fisher Scientific). CKS2 was reverse-transcribed and amplified from HEK293 total RNA. CKS2 mutants designed to stabilize conformations A or B were prepared using the Q5 site-directed mutagenesis kit as per manufacturer instructions. For TXNL4A, amplification proved much more challenging because of the extreme G+C content. Therefore, the wild-type sequence, the mutant stabilizing conformation B and the mutant disrupting the CUG start codon of the candidate uORF were all prepared by PCR assembly of overlapping oligonucleotides using Q5 high-fidelity 2× master mix. As the candidate uORF of TXNL4A resided on frame 2, one nucleotide (G160) was deleted from a loop region at the end of the 5′ UTR of TXNL4A to align it to the mCherry frame. The sequences of primers used for cloning and mutagenesis are available in Supplementary Table 9.
Western blot analysis of bacterial protein expression at 37 °C versus 10 °C
Sanger-verified vectors were transformed in BL21(DE3) competent E. coli cells (C2627H, New England Biolabs). Two independent colonies were picked and inoculated in 3 ml of LB broth and grown overnight at 37 °C with shaking. The next day, bacteria were diluted to OD600 ≈ 0.05 and grown until OD600 ≈ 0.3. At this point, IPTG was added to a final concentration of 1 mM and cells were incubated with shaking at 37 °C for 30 min. Bacteria were then split into two separate aliquots, pelleted and resuspended in LB broth at 37 °C or 10 °C. Bacteria were then grown with shaking at 37 °C or 10 °C and 2-ml aliquots were collected after 30 min, 1 h or 2 h. Collected bacteria were pelleted and pellets were resuspended in 60 μl of lysis buffer (10 mM Tris-HCl pH 8.0, 10 mM EDTA and 0.1% Triton X-100), supplemented with 1 μg μl−1 lysozyme and 1:100 dilution protease inhibitor cocktail. Samples were then subjected to ten cycles of sonication (5 s on, 5 s off) using a UP200St (Hielscher) ultrasonic processor. Protein concentrations were determined using a Pierce BCA protein assay kit (23225, Thermo Fisher Scientific) as per manufacturer instructions. Then, 30 μg of lysate was resolved on 10% SDS–PAGE gels, followed by transfer to nitrocellulose membrane using the iBlot 2 gel transfer system (IB21001, Thermo Fisher Scientific). Membranes were blocked by incubation for 1 h in 5% (w/v) nonfat dry milk (A0830, PanReac AppliChem ITW Reagents) in PBS, supplemented with 0.001% final Tween-20. Immunoblotting was performed using monoclonal anti-FLAG M2 antibody (F1804, Merck) or anti-LacI (9A5) universal antibody (EG1501, Kerafast) and Immobilon Forte western HRP substrate (WBLUF0100, Merck). PageRuler Plus (26619, Thermo Fisher Scientific) was used as the size standard.
Analysis of lpxP expression in csp knockouts
Wild-type and csp-knockout BW25113 E. coli cells from the KEIO collection53 were first made competent using the Mix&Go! E. coli transformation kit (T3001, Zymo Research) and then transformed with the IPTG-inducible lpxP vector as described above. Two independent colonies were picked, inoculated in 3 ml of LB broth and grown overnight at 37 °C with shaking. The next day, bacteria were diluted to OD600 ≈ 0.05 and grown until OD600 ≈ 0.5. At this point, 1 ml of bacteria was directly mixed with 1 ml of ice-cold LB broth containing 0.02 mM IPTG and incubated at 10 °C for 1 h with moderate shaking (800 rpm). Lysis and western blot analysis were conducted as described above. Knockout of csp genes was validated by PCR on gDNA from the individual clones.
In vitro transcription–translation using the PURE system
In vitro translation analysis of full-length wild-type and SLalt-stabilized mutant or CDS-only lpxP was performed using the PURExpress in vitro protein synthesis kit (E6800S, New England Biolabs). Reactions were conducted in a final volume of 6.25 μl, using 2.5 μl of solution A, 1.875 μl of solution B, 0.1 μl of SUPERase•In RNase inhibitor and ~50 fmol of pET22b(+) template (harboring a T7 promoter instead of a tac promoter). Reactions were incubated at 37 °C for 1.5 h, then immediately mixed with 2× loading dye and resolved on a 12% polyacrylamide gel.
Western blot analysis of CKS2 wild type and conformation-stabilizing mutants
Sanger-verified vectors were transfected in HEK293 cells. Briefly, on the first day, 800,000 cells were plated per well in a six-well plate precoated with 0.001% poly(l-lysine) (P8920, Merck). On the second day, 5 μg of plasmid DNA was transfected using 10 μl of Lipofectamine 2000 transfection reagent (11668019, Thermo Fisher Scientific) in 800 μl of Opti-MEM reduced-serum medium (51985034, Thermo Fisher Scientific). Then, 6 h after transfection, cells were supplemented with 1 ml of complete DMEM, supplemented with 20% FBS but without antibiotics. On the third day, cells were washed twice in PBS and then collected in radioimmunoprecipitation assay buffer (10 mM Tris-HCl pH 7.5, 150 mM NaCl, 0.1% SDS, 0.1% sodium deoxycholate and 1% Triton X-100), supplemented with protease inhibitor cocktail. After discarding membranes by centrifugation at 17,000g for 10 min (4 °C), protein concentrations were determined using a Pierce BCA protein assay kit as per manufacturer instructions. Then, 10 μg of lysate was resolved on 10% SDS–PAGE gels, followed by transfer to nitrocellulose membrane using the iBlot 2 gel transfer system. Membranes were blocked by incubation for 1 h in 5% (w/v) nonfat dry milk in PBS, supplemented with 0.001% final Tween-20. Immunoblotting was performed using anti-EGFP polyclonal antibody (CAB4211, Thermo Fisher Scientific), anti-HA tag polyclonal antibody (PA1-985, Thermo Fisher Scientific), anti-GAPDH monoclonal antibody (60004-1-Ig, Proteintech) and Immobilon Forte western HRP substrate. PageRuler Plus was used as the size standard.
Fluorescence microscopy analysis of CKS2 and TXNL4A wild type and conformation-stabilizing mutants
For fluorescence microscopy analysis, on day one, 50,000 HEK293 cells were plated on 96-well flat-bottom plates precoated with 0.001% poly(l-lysine). On the second day, 75 ng of plasmid DNA was transfected using 0.625 μl of Lipofectamine 2000 transfection reagent in 50 μl of Opti-MEM reduced-serum medium. Then, 6 h after transfection, cells were supplemented with 100 μl of complete DMEM, supplemented with 20% FBS but without antibiotics. On the third day, cells were imaged with a Zeiss Observer Z1 widefield microscope and ×10 objective lens. The fluorescence signal per cell was quantified with Fiji77 version 2.14.0/1.54g. The EGFP channel was used to identify particles by signal thresholding.
Processing of bacterial DMS-MaPseq data
Following sequencing, paired-end reads were clipped of sequencing adaptors using Cutadapt78 version 4.4 (parameters: -A AGATCGGAAG -a AGATCGGAAG -m 100:100 -O 1) and merged using PEAR79 version 0.9.11 (parameters: -n 100 -q 20 -u 0 -e -y 10G -z). Merged reads were then combined with R1 and the reverse-complemented R2 for read pairs that could not be merged. Next, a comprehensive annotation of E. coli transcriptional units with experimentally determined TSSs was built by aggregating 5′ UTR information from RegulonDB33 and transcriptional units from EcoCyc80 and the corresponding sequences were extracted from the E. coli str. K-12 substr. MG1655 genome (GenBank U00096.3). For the analysis of known riboswitches, a reference was built including the sole riboswitch regions ± 50 nt. Reads were then mapped to this reference using the rf-map tool of RNA Framework81 version 2.8.3 and Bowtie2 (ref. 82) version 2.3.5.1 after clipping terminal bases with Phred quality < 20, discarding reads containing internal Ns and trimming the six 5′-most bases to account for possible mispriming artifacts (parameters: -b2 -cq5 20 -ctn -cmn 0 -cl 50 -mp --very-sensitive-local --nofw -b5 6). Alignments in SAM format were sorted and converted to BAM format using SAMtools83 version 1.15.1. BAM alignments were then processed using RNA Framework’s rf-count to generate both RC files (containing per-base mutations and coverage) and MM files (containing a map of mutated positions per read). Aligned reads spanning less than 100 nt of a transcript and reads having more than 10% mutated bases or fewer than two mutations were discarded. Insertions, ambiguously aligned deletions and deletions longer than 1 nt were ignored. Mutations were considered only if both the mutated base and the two surrounding bases had Phred quality > 20; consecutive mutations falling within 3 nt of each other were ignored (parameters: -m -mm -wl 2000 -ds 100 -es -na -ni -md 1 -dc 3 -me 0.1).
DMS-MaPseq data from total RNA, used for the calibration of folding parameters described below, were analyzed with minor changes to the above protocol. Briefly, reads were mapped to a reference composed of only the 16S and 23S rRNA sequences. The minimum length spanned by reads was decreased to 90 nt (as total RNA DMS-MaPseq experiments were sequenced as single-read 100 bp, in contrast to rRNA-depleted DMS-MaPseq experiments that were sequenced as paired-end 150 bp) and reads harboring <2 mutations were also retained (parameters: -m -ds 90 -es -na -ni -dc 3 -ow -me 0.1 -md 1).
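The read-level filters described above can be summarized in a few lines. The following is a minimal sketch of the filtering logic only, not of rf-count itself: the Read class and function names are hypothetical, "within 3 nt" is assumed to be inclusive, and the mutation-count thresholds are assumed to apply after closely spaced mutations have been dropped.

```python
from dataclasses import dataclass, field


@dataclass
class Read:
    start: int                     # 0-based alignment start on the transcript
    end: int                       # alignment end (exclusive)
    mutations: list = field(default_factory=list)  # mutated transcript positions


def drop_clustered(mutations, max_distance=3):
    """Drop every mutation lying within max_distance nt of another mutation."""
    muts = sorted(mutations)
    kept = []
    for i, pos in enumerate(muts):
        near_prev = i > 0 and pos - muts[i - 1] <= max_distance
        near_next = i + 1 < len(muts) and muts[i + 1] - pos <= max_distance
        if not (near_prev or near_next):
            kept.append(pos)
    return kept


def keep_read(read, min_span=100, max_mut_frac=0.1, min_mutations=2):
    """Apply the span, minimum-mutation and maximum-mutation-fraction filters."""
    span = read.end - read.start
    if span < min_span:
        return False
    muts = drop_clustered(read.mutations)
    if len(muts) < min_mutations:
        return False
    return len(muts) / span <= max_mut_frac
```

For the total RNA libraries, the same sketch would be called with min_span=90 and min_mutations=0 to mirror the relaxed settings.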
Processing of human 5′UTR-MaP data
Following sequencing, paired-end reads were clipped of sequencing adaptors using Cutadapt (parameters: -A AGATCGGAAG -a AGATCGGAAG -m 75:75 -O 1) and merged using PEAR (parameters: -n 75 -q 20 -u 0 -e -y 10G -z). Merged reads were then combined with R1 and the reverse-complemented R2 for read pairs that could not be merged. Reads were then mapped to the MANE version 1.2 reference (plus the 18S and 28S rRNA sequences) using the rf-map tool of RNA Framework and Bowtie2 after clipping terminal bases with Phred quality < 20 and discarding reads containing internal Ns (parameters: -b2 -cq5 20 -ctn -cmn 0 -mp --very-sensitive-local -bnr). Alignments in SAM format were sorted and converted to BAM format using SAMtools. BAM alignments were then processed using RNA Framework’s rf-count to generate both RC files (containing per-base mutations and coverage) and MM files (containing a map of mutated positions per read). Aligned reads spanning less than 90 nt of a transcript and reads having more than 10% mutated bases or fewer than two mutations were discarded. Insertions, ambiguously aligned deletions and deletions longer than 1 nt were ignored. Mutations were considered only if both the mutated base and the two surrounding bases had Phred quality > 20 and consecutive mutations falling within 3 nt of each other were ignored (parameters: -m -mm -wl 2000 -ds 100 -es -na -ni -md 1 -dc 3 -me 0.1). DMS-MaPseq data from total RNA, used for the calibration of folding parameters described below, were analyzed with minor changes to the above protocol. Briefly, no merging was performed as samples were sequenced as single reads and reads were mapped to a reference composed of only the 18S and 28S rRNA sequences.
Optimization of folding parameters
RC files from total RNA DMS-MaPseq experiments were processed using RNA Framework’s rf-norm to obtain normalized reactivity profiles (parameters: -sm 4 -nm 3 -rb AC -mm 1 -n 1000). For E. coli secondary structure modeling, optimal slope (4.8) and intercept (−0.8) values were identified through jackknifing by simultaneously optimizing folding of 16S and 23S rRNAs over both in vivo and ex vivo deproteinized DMS-MaPseq data from both DH5α and TOP10 cells using RNA Framework’s rf-jackknife (parameters: -rp -md 600 -x -m), ViennaRNA84 version 2.5.1 and the modified Fowlkes–Mallows index5.
For human secondary structure modeling, optimal slope (4.6) and intercept (−2) values were identified by simultaneously optimizing folding of 28S rRNA over both replicate experiments.
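To illustrate how slope and intercept act on the reactivities, the sketch below assumes the commonly used log-linear pseudo-free-energy form, ΔG = slope × ln(reactivity + 1) + intercept (in kcal/mol), which these two parameters conventionally describe. The text does not spell out the formula, so this is an assumption; the function names are illustrative, and treating missing reactivities as contributing no pseudo-energy is also an assumed convention.

```python
import math


def pseudo_energy(reactivity, slope, intercept):
    """Pseudo-free-energy term (kcal/mol) for pairing a base with this reactivity.

    Assumes the log-linear form slope * ln(reactivity + 1) + intercept.
    Missing values (None or NaN) are assumed to contribute no pseudo-energy.
    """
    if reactivity is None or math.isnan(reactivity):
        return 0.0
    return slope * math.log(reactivity + 1.0) + intercept


def neutral_reactivity(slope, intercept):
    """Reactivity at which the pseudo-energy switches from bonus to penalty."""
    return math.exp(-intercept / slope) - 1.0


# Parameter pairs reported above.
E_COLI = (4.8, -0.8)
HUMAN = (4.6, -2.0)

bonus_unreactive = pseudo_energy(0.0, *E_COLI)        # equals the intercept
penalty_reactive = pseudo_energy(1.0, *E_COLI)        # positive: pairing disfavored
crossover_ecoli = neutral_reactivity(*E_COLI)
crossover_human = neutral_reactivity(*HUMAN)
```

Under this form, the more negative human intercept shifts the crossover to a higher reactivity, so a larger range of reactivities is treated as compatible with base pairing.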
Ensemble deconvolution analysis
Ensemble deconvolution was performed using the DRACO algorithm4. Briefly, DRACO slides a window of a user-defined length along each transcript, retaining only those reads falling entirely within the window’s boundaries. For each window, a graph is then generated from the comutation information, in which each mutated base represents a vertex and two bases observed to comutate within the same read are connected by an edge. The normalized Laplacian of the graph’s adjacency matrix is then subjected to eigendecomposition and eigengap analysis to identify the number of coexisting RNA conformations making up the ensemble. This number is then used to perform a soft partitioning of the graph (graph-cut) to reconstruct the individual reactivity profiles of the different conformations and their relative stoichiometries. In its original implementation, this graph-cut step involved randomly initializing the weight of each vertex for each conformation N times (with N = 50), followed by selection of the set of weights yielding the lowest normalized graph-cut score. This initial set of weights was then iteratively altered by a factor \(\varepsilon = \frac{1}{2C}\), where C is the number of conformations making up the ensemble, until the normalized graph-cut score was minimized. As this procedure was performed only once, the risk was that the identified set of weights would represent only a local minimum of the graph-cut score rather than the true minimum, potentially leading to inconsistent conformation reconstruction across consecutive DRACO runs. Furthermore, the value of ε was typically too large to enable the accurate reweighting of the vertices (for instance, ε = 0.25 with C = 2).
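The windowing and graph-construction steps can be sketched as follows. This is a simplified illustration and not DRACO's implementation: the read representation (start, end, mutated positions), the half-open coordinates and the function names are assumptions, and the spectral analysis and graph-cut steps are omitted.

```python
from collections import Counter
from itertools import combinations


def windows(transcript_len, win_len=100, step=5):
    """Yield half-open (start, end) windows slid along a transcript."""
    last_start = max(transcript_len - win_len, 0)
    for start in range(0, last_start + 1, step):
        yield start, min(start + win_len, transcript_len)


def comutation_graph(reads, win_start, win_end):
    """Build a weighted comutation graph for one window.

    reads: iterable of (read_start, read_end, mutated_positions).
    Returns (vertices, edges): vertices counts reads mutated at each position;
    edges counts reads in which a pair of positions is mutated together.
    """
    vertices, edges = Counter(), Counter()
    for read_start, read_end, muts in reads:
        if read_start < win_start or read_end > win_end:
            continue  # keep only reads falling entirely within the window
        muts = sorted(set(muts))
        vertices.update(muts)
        edges.update(combinations(muts, 2))
    return vertices, edges


example_reads = [
    (0, 100, [10, 40]),
    (0, 100, [10, 40, 70]),
    (5, 120, [10, 40]),   # extends past the first window, so it is skipped
]
example_vertices, example_edges = comutation_graph(example_reads, 0, 100)
```

The edge weights obtained this way would populate the adjacency matrix whose normalized Laplacian is analyzed in the subsequent spectral step.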
To address these issues, we introduced the following improvements in the DRACO algorithm (available as version 1.2 from the repository https://github.com/dincarnato/draco/): (1) the number of random initializations N was increased to 500 (adjustable through the --softClusteringInits parameter); (2) the weight factor ε was lowered to 0.005 (adjustable through the --softClusteringWeightModule parameter); and (3) the entire graph-cut procedure is now repeated multiple times (adjustable through the --softClusteringIters parameter), to ensure convergence toward the true normalized graph-cut score minimum. Before running DRACO, the MM files generated by rf-count were preprocessed using the filterMM utility (available from the repository https://github.com/dincarnato/labtools) to discard reads having <2 A/C mutated bases and regions of extremely high coverage were randomly downsampled to achieve a maximum per-base coverage of 500,000×.
For E. coli, DRACO analysis was performed with a window size of 100 nt, slid in 5-nt increments, requiring a minimum base coverage of 2,000× and a minimum of 2,000 reads after filtering to perform the eigen deconvolution and repeating the graph-cut procedure 30 times (parameters: --absWinLen 100 --absWinOffset 5 --minBaseCoverage 2000 --minFilteredReads 2000 --minPermutations 10 --maxPermutations 50 --firstEigengapShift 0.95 --lookaheadEigengaps 1 --softClusteringIters 30 --softClusteringInits 500 --softClusteringWeightModule 0.005).
For HEK293, the window size was reduced to 90 nt and slid in 1-nt increments (parameters: --absWinLen 90 --absWinOffset 1) to account for the smaller library insert size, whereas all other parameters were left unchanged.
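As a rough illustration of the graph construction and eigengap step summarized above, the following Python sketch builds a comutation adjacency matrix for one window, computes its normalized Laplacian and picks the number of conformations at the largest gap between consecutive eigenvalues. This is a simplified textbook version and not DRACO's implementation: the function names are hypothetical, and the permutation testing and eigengap shift controlled by DRACO's parameters are omitted.

```python
import numpy as np

def comutation_adjacency(reads, n_bases):
    """reads: iterable of 0-based mutated positions per read, within one window."""
    adj = np.zeros((n_bases, n_bases))
    for muts in reads:
        muts = sorted(set(muts))
        for a in range(len(muts)):
            for b in range(a + 1, len(muts)):
                # Two bases mutated in the same read share an (undirected) edge
                adj[muts[a], muts[b]] += 1
                adj[muts[b], muts[a]] += 1
    return adj

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian: I - D^(-1/2) A D^(-1/2)."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def estimate_conformations(adj, max_k=5):
    """Eigengap heuristic: k is the index of the largest gap among the
    smallest eigenvalues (simplified; no permutation-based null model)."""
    keep = adj.sum(axis=1) > 0          # ignore bases that never comutate
    adj = adj[np.ix_(keep, keep)]
    eigvals = np.sort(np.linalg.eigvalsh(normalized_laplacian(adj)))
    if len(eigvals) < 2:
        return 1
    gaps = np.diff(eigvals[: max_k + 1])
    return int(np.argmax(gaps)) + 1
```

In this toy setting, reads whose mutations fall into two mutually exclusive groups of bases yield two near-zero eigenvalues and therefore an estimate of two conformations.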
Evaluation of sequencing depth’s effect on the ensemble deconvolution of known riboswitches
To evaluate the ability of DRACO to detect known riboswitches from in vivo probing data, we used data from TOP10 bacteria and selected four riboswitches belonging to mRNAs having different expression levels in our dataset. A reference was built including only the riboswitch ± 50 nt and reads were preprocessed and mapped as detailed in the previous paragraphs. The resulting MM files were then randomly subsampled using the extract function of the rf-mmtools utility of RNA Framework by setting the value of the -rs parameter to 2, 4, 8, 10, 40 or 80 to subsample 1/2, 1/4, 1/8, 1/10, 1/40 or 1/80 of the reads mapping to each riboswitch, respectively. A total of 20 random subsamplings were performed for each. The resulting MM files were then subjected to DRACO analysis (parameters: --absWinLen 100 --absWinOffset 1 --minBaseCoverage 2000 --minFilteredReads 2000 --minPermutations 10 --maxPermutations 50 --firstEigengapShift 0.95 --lookaheadEigengaps 1 --softClusteringIters 30 --softClusteringInits 500 --softClusteringWeightModule 0.005). If at least one window overlapping the riboswitch was found to populate >1 conformation, the riboswitch was considered detected.
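The subsampling and detection criterion can be sketched in Python as follows. This is only an illustration under stated assumptions: `subsample_reads` is a hypothetical stand-in for the rf-mmtools extraction step, windows are represented as (start, end, number of conformations) tuples with 0-based half-open coordinates, and none of these names come from RNA Framework or DRACO.

```python
import random

def subsample_reads(reads, denominator, seed=None):
    """Randomly keep 1/denominator of the reads (e.g. 2, 4, 8, 10, 40 or 80)."""
    rng = random.Random(seed)
    return rng.sample(reads, len(reads) // denominator)

def riboswitch_detected(windows, ribo_start, ribo_end):
    """True if any window overlapping the riboswitch populates >1 conformation.

    windows: list of (start, end, n_conformations), 0-based half-open.
    """
    return any(
        n_conf > 1 and start < ribo_end and end > ribo_start
        for start, end, n_conf in windows
    )
```

Repeating the subsampling with 20 different seeds per fraction and counting how often `riboswitch_detected` returns True would give a detection rate per sequencing depth.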
Comparison of DH5α versus TOP10 strains, 37 °C versus 10 °C and standard versus ATP-depleted conditions
Correlation between experiments (related to Supplementary Figs. 1c,d, 5a, 10b and 16d) was calculated on the raw mutation frequencies of A/C bases in transcriptional units for which ≥50% of A/C bases had coverage ≥ 10,000× after removing outliers (raw reactivity > 0.1). The number of conformations populated by each base in the covered transcriptome (related to Figs. 1a, 2a and 4c,f) was determined by parsing DRACO’s JSON-formatted output files. As DRACO uses a sliding window approach, consecutive overlapping windows might be found to populate different numbers of conformations; in such cases, overlapping bases were assigned the highest number of conformations. Windows populating different numbers of conformations between 37 °C and 10 °C (related to Fig. 2b) or between standard and ATP-depleted conditions (related to Fig. 4g) were identified as follows. First, windows populating one or two or more conformations were extracted from DRACO’s JSON-formatted output files into BED format and overlapping windows were merged using the mergeBed tool of BEDTools85 version 2.31.0. Any portion of the windows populating one conformation overlapping with the windows populating two or more conformations was removed using BEDTools’ subtractBed. Then, windows populating one or two or more conformations common to both DH5α and TOP10 at either 37 °C or 10 °C or both replicate experiments in HEK293 cultured under standard or ATP-depleted conditions were identified by intersecting the corresponding sets from both experiments using BEDTools intersectBed. Only windows populating the same number of conformations in both DH5α and TOP10 or in both HEK293 replicate experiments were retained. 
Lastly, common windows populating two or more conformations in both DH5α and TOP10 at 37 °C or in both HEK293 replicate experiments under standard conditions and one conformation in both DH5α and TOP10 at 10 °C or in both HEK293 replicate experiments under ATP-depleted conditions (decreased ensemble heterogeneity) or vice versa (increased ensemble heterogeneity), as well as regions populating the same number of conformations in both strains or replicate experiments at both temperatures or culture conditions (no change), were identified by intersecting the window sets determined in the previous step using BEDTools intersectBed. Window coordinates were then intersected with gene coordinates to identify which genes contained windows showing differential ensemble heterogeneity between 37 °C and 10 °C or between standard and ATP-depleted conditions.
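The per-base assignment rule (overlapping windows contribute the highest number of conformations) and the comparison between conditions can be illustrated with a short Python sketch. The function names are hypothetical, windows are assumed to be (start, end, number of conformations) tuples with 0-based half-open coordinates, and collapsing counts into "one" versus "two or more" before comparison is an assumption made here for illustration; the actual analysis was carried out on BED intervals with BEDTools.

```python
def per_base_conformations(windows, length):
    """Assign each base the highest conformation count among the windows
    covering it; 0 marks bases not covered by any window."""
    counts = [0] * length
    for start, end, n_conf in windows:
        for pos in range(start, min(end, length)):
            counts[pos] = max(counts[pos], n_conf)
    return counts

def classify_change(n_control, n_treated):
    """Compare a region between control (37 °C or standard) and treated
    (10 °C or ATP-depleted) conditions; both counts are assumed to be >= 1."""
    control, treated = min(n_control, 2), min(n_treated, 2)  # 1 versus 2+
    if control > treated:
        return "decreased"
    if control < treated:
        return "increased"
    return "no change"
```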
Translation efficiency analysis
Ribosome profiling and RNA-seq data for E. coli cells at 37 °C or shocked at 10 °C for 10 min were obtained from a previous study35 (GSE103421). Reads were aligned to the same transcriptome reference used for DMS-MaPseq analysis, using RNA Framework’s rf-map and Bowtie86 version 1.3.1, allowing a maximum of two mapping positions (parameters: -ca3 CTGTAGGCACCATCAA -bnr -ow -bm 2 -bc 32000 -ba). After discarding all reads mapping to the rRNA operons, read counts for protein-coding genes containing windows showing differential heterogeneity between 37 °C and 10 °C as described above were calculated by intersecting CDS coordinates in BED format with the relevant BAM files using BEDTools intersectBed (parameter: -c). Only windows ≥ 50 nt (half of the window size used for DRACO analysis) were considered. For both Ribo-seq and RNA-seq data, per-gene reads per kilobase per million mapped reads (RPKMs) were calculated as follows:
$${\rm{RPKM}}=\,\frac{C}{{NL}}\,\times \,1,000,000$$
where C is the read count on the gene, N is the total number of reads mapped in the experiment and L is the length of the gene in kilobases. Translation efficiency for each gene (related to Fig. 3e) was then calculated as follows:
$${\rm{Translation}}\;{\rm{efficiency}}=\,\frac{{\rm{RPKM}}_{\text{Ribo-seq}}+0.1}{{\rm{RPKM}}_{\text{RNA-seq}}+0.1}$$
where 0.1 is a pseudocount added to avoid division by zero. Only genes expressed at ≥1 RPKM at both 37 °C and 10 °C were considered.
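The two formulas above translate directly into code. The following Python sketch uses hypothetical function and argument names and simply restates the RPKM and translation efficiency definitions given in the text.

```python
def rpkm(read_count, total_mapped_reads, gene_length_kb):
    """Reads per kilobase per million mapped reads: C / (N * L) * 1,000,000."""
    return read_count / (total_mapped_reads * gene_length_kb) * 1_000_000

def translation_efficiency(rpkm_ribo, rpkm_rna, pseudocount=0.1):
    """Ratio of Ribo-seq to RNA-seq RPKM, each with a pseudocount of 0.1."""
    return (rpkm_ribo + pseudocount) / (rpkm_rna + pseudocount)
```

For example, a 1.5-kb gene with 500 reads in an experiment with 10 million mapped reads has an RPKM of about 33.3.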
For HEK293, ribosome profiling and RNA-seq data were obtained from two previous studies59,87 (GSE112353 and GSE228010). Reads were first aligned to a reference including rRNAs, tRNAs and small nucleolar RNAs, using RNA Framework’s rf-map and Bowtie2 version 2.3.5.1. Unmapped reads were then aligned to the same transcriptome reference used for 5′UTR-MaP analysis and read counts for protein-coding genes containing windows showing differential heterogeneity between standard and ATP-depleted conditions described above were calculated by intersecting CDS coordinates in BED format with the relevant BAM files. Only windows ≥ 45 nt (half of the window size used for DRACO analysis) were considered. Only genes expressed at ≥1 RPKM both under standard and ATP-depleted conditions were considered.
Comparison of regions populating one versus two or more conformations
Eight features were evaluated for regions populating one versus two or more conformations: A+C content, G+C content, median Shannon entropy, median unpaired probability, median reactivity, Gini index, median percentage conservation and Z score (related to Figs. 1b–g and 4d,e and Supplementary Figs. 4, 6 and 7). For all analyses, only regions spanning at least half of the window size used for DRACO analysis (50 nt for E. coli, 45 nt for HEK293) were included. Furthermore, all 2+ regions were retained for this analysis (whether or not they populated the same number of conformations in both strains or replicate experiments), provided that they populated two or more conformations in both strains (or replicate experiments). First, bulk reactivity profiles for both DH5α and TOP10 grown at 37 °C or HEK293 grown under standard conditions were obtained by normalizing the respective RC files as described above using RNA Framework’s rf-norm (parameters: -sm 4 -nm 3 -rb AC -mm 1 -n 1,000) and the resulting normalized XML reactivity files were combined using RNA Framework’s rf-combine. From these XML files, reactivity data for regions populating one or two or more conformations were extracted and used to calculate the median reactivity and Gini index distributions. Combined XML files were then passed to RNA Framework’s rf-fold to compute base-pairing probabilities and Shannon entropies (parameters: -sl 4.8 -in -0.8 -md 600 -dp -sh for E. coli or -sl 4.6 -in -2 -md 600 -dp -sh for HEK293). Unpaired probabilities per base were calculated as follows:
$$1-\,\mathop{\sum }\limits_{j=1}^{J}p(i,j)$$
where p(i,j) is the base-pairing probability between nucleotides i and j over all possible J partners. For unconstrained predictions, the same parameters were used with the addition of the -i parameter to ignore experimental reactivities. Distributions of folding free energy Z scores were calculated on the nucleotide sequences corresponding to the regions populating one or two or more conformations in the absence of any constraint using ViennaRNA. For Z-score calculation, the sequence of each region was shuffled 100 times while preserving dinucleotide frequencies and the corresponding folding free energies were predicted using RNAfold. The Z score for each region was then calculated as follows:
$$Z=\,\frac{\Delta G-\mu }{\sigma }$$
where ∆G is the folding free energy for the original sequence, while μ and σ are the average and s.d., respectively, of the folding free energies across the 100 shuffled sequences.
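The unpaired probability and Z-score definitions above can be sketched in Python as follows. The function names are hypothetical; the base-pairing probability matrix is assumed to be symmetric with a zero diagonal, the shuffled folding free energies are assumed to have been computed beforehand (for example with RNAfold), and the use of the sample s.d. (ddof = 1) is an assumption, as the text does not specify which estimator was used.

```python
import numpy as np

def unpaired_probabilities(bpp):
    """Per-base unpaired probability: 1 minus the summed pairing
    probabilities of base i with all possible partners j."""
    bpp = np.asarray(bpp, dtype=float)
    return 1.0 - bpp.sum(axis=1)

def folding_z_score(native_dg, shuffled_dgs):
    """Z = (dG - mu) / sigma over the shuffled-sequence free energies."""
    shuffled = np.asarray(shuffled_dgs, dtype=float)
    mu = shuffled.mean()
    sigma = shuffled.std(ddof=1)  # assumption: sample standard deviation
    return (native_dg - mu) / sigma
```

A negative Z score indicates that the native sequence folds into a more stable structure than expected from its dinucleotide composition alone.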
The same 1 and 2+ regions, as defined above, were used for all analyses, including translation efficiency (related to Fig. 4h,i and Supplementary Figs. 5h and 10d) and gene ontology analyses. Gene ontology was performed using DAVID88.
Sequence-level conservation analysis
To evaluate the conservation of regions populating one versus two or more conformations, a multiple-sequence alignment was computed using Mugsy89 version 1.2.3 and ten Gram-negative bacteria genomes: E. coli str. K-12 substr. MG1655 (GenBank U00096.3), S. enterica subsp. enterica serovar Typhimurium str. LT2 (GenBank AE006468.2), Shigella flexneri 2a str. 2457T (GenBank AE014073.1), Klebsiella pneumoniae subsp. pneumoniae HS11286 (GenBank CP003200.1), Yersinia pestis CO92 (GenBank AL590842.1), Enterobacter sp. 638 (GenBank CP000653.1), Serratia marcescens strain KS10 (GenBank CP027798.1), Pectobacterium carotovorum strain WPP14 (GenBank CP027798.1), Shigella dysenteriae strain SWHEFF_49 (GenBank CP055055.1) and Enterobacter cloacae isolate 1382 (GenBank OW968328.1). The resulting alignment was parsed to calculate the percentage conservation at each position with respect to the E. coli genome.
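One way to compute a per-position percentage conservation from such an alignment is sketched below in Python. The exact definition used by the authors is not spelled out in the text, so this is an assumption: for every alignment column in which the reference (E. coli) row is not a gap, the percentage of the other genomes carrying the same base is reported. The function name and the representation of the alignment as equal-length gapped strings are hypothetical.

```python
def percent_conservation(reference, others):
    """Percentage of other aligned sequences matching the reference base,
    reported only for columns where the reference is not a gap."""
    conservation = []
    for col, ref_base in enumerate(reference):
        if ref_base == "-":
            continue  # column absent from the reference genome
        matches = sum(1 for seq in others if seq[col].upper() == ref_base.upper())
        conservation.append(100.0 * matches / len(others))
    return conservation
```

Under this definition, gaps in the other genomes count as mismatches, which is a design choice rather than something stated in the text.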
Reactivity profile reconstruction and structure modeling for high-confidence regions
High-confidence structurally heterogeneous regions for which the deconvolved reactivity profiles could be nonambiguously matched between DH5α and TOP10 or between HEK293 replicate experiments (average correlation of reactivity profiles ≥ 0.65) were extracted using RNA Framework’s rf-json2rc by including 20 extra bases on either side of the structure (parameters: -ec 1,000 -mom 0.35 -e 20 -cf 0.1 -i 0.1 -mcm 0.65 -mcr 0.65). The tool processes DRACO’s JSON-formatted output files from two experiments, aggregating those regions showing sufficient agreement between the deconvolved reactivity profiles across the two experiments and yielding two RC files containing the per-base coverage and mutations across the different conformations reconstructed by DRACO for the analyzed RNAs. The resulting RC files were then processed using RNA Framework’s rf-norm to yield normalized reactivity profiles (parameters: -sm 4 -nm 3 -rb AC -mm 1 -n 100). Structure modeling was performed using the consensusFold utility (available from the repository https://github.com/dincarnato/labtools), which leverages RNAalifold90 to aggregate multiple reactivity profiles into a consensus secondary structure (parameters: -sl 4.8 -in -0.8 -md 600 for E. coli or -sl 4.6 -in -2 -md 600 for HEK293). For the modeling of secondary structures under cold shock conditions an additional parameter (-t 10) was specified to set the folding temperature to 10 °C.
Normalization of 5′UTR-MaP reactivity data
As 5′UTR-MaP selectively enriches 5′ UTR regions, which are intrinsically highly structured because of their high G+C content, traditional gene-level normalization of reactivities would lead to notable biases because of the low number of highly reactive bases. To address this issue, we adopted an experiment-level normalization approach. Briefly, values for bases covered across all experiments were sorted and those exceeding the 75th percentile plus 1.5× the interquartile range (IQR) were removed as outliers. After excluding these outliers, the next 10% of the remaining bases common to all experiments (that is, those with the highest remaining values) were averaged, yielding an experiment-level normalization factor. We implemented this approach in the rf-normfactor tool of RNA Framework (parameters: -sm 4 -nm 3 -rb AC -mc 1,000). The resulting normalization factors were then passed to the rf-norm tool using the -nf parameter.
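The normalization factor described above resembles box-plot normalization applied once per experiment. A minimal Python sketch is given below; it is not the rf-normfactor implementation, the function name is hypothetical, and details such as the percentile interpolation and the rounding of the 10% cutoff are assumptions.

```python
import numpy as np

def experiment_normalization_factor(values, top_fraction=0.10):
    """Remove values above Q3 + 1.5 * IQR, then average the highest
    top_fraction of the remaining values."""
    v = np.sort(np.asarray(values, dtype=float))
    q1, q3 = np.percentile(v, [25, 75])
    cutoff = q3 + 1.5 * (q3 - q1)
    kept = v[v <= cutoff]                       # outliers excluded
    n_top = max(1, int(round(len(kept) * top_fraction)))
    return float(kept[-n_top:].mean())
```

Each raw reactivity in the experiment would then be divided by this single factor, rather than by a factor computed separately for each transcript.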
Covariation analysis
To evaluate the conservation of the identified E. coli structures, we implemented the evolutionary conservation analysis module of the DeConStruct framework, built on top of the cm-builder pipeline (available from the repository https://github.com/dincarnato/labtools) we previously introduced4,22 (which exploits Infernal91 version 1.1.3 and R-scape39,92 version 2.0.0.q), to be able to handle full bacterial genomes rather than just individual transcripts. For each predicted structure (filtering out those with a known match in Rfam93) a CM was first built using Infernal’s cmbuild and the sole E. coli sequence. The CM was then used to search a database of 7,598 representative archaeal and bacterial genomes (and associated plasmids when present) from RefSeq to iteratively identify putative homologs. In its original implementation, cm-builder used an E-value-based approach to search in the database. This approach had two main limitations. Firstly, the E value for the identified matches was dependent on the size of the searched database, potentially leading to different results with different database sizes. Secondly, it required the calibration of the CMs using Infernal’s cmcalibrate module, a computationally intensive task, which is not easily scalable to hundreds of candidates. To address these issues, we implemented a bit-score-based search. Briefly, to trick Infernal into thinking that a CM had been calibrated, a fake set of ECMLC, ECMGC, ECMLI and ECMGI field values was introduced into the CM. These fields are only used to determine the E value of a database search but they do not affect the bit score. Then, a decoy database was built by randomly extracting and reversing ~10% of the sequences from the original genome database. Infernal’s cmsearch was then used to search the CM against the decoy database. A noise threshold N was defined by taking the highest bit score returned by this search and by rounding it up to the nearest multiple of 5.
If N < 20, then N was set to 20. The search was then repeated against the original database, retaining only those matches having bit score > N. Matches having <50% canonical base pairs and truncated hits covering <75% of the structure were discarded. The resulting set of candidate homologs was then realigned against the original CM using Infernal’s cmalign. The whole procedure was repeated a maximum of three times. At each iteration, N was increased by 10 and the alignment of candidate homologs was analyzed using R-scape’s average product correction (APC)-corrected G-test statistics and a relaxed E-value threshold of 0.1 (to account for those structures falling within coding regions for which sequence variation might be ‘constrained’ by the underlying amino acid sequence). If the number of significantly covarying base pairs dropped with respect to the previous iteration (except for the first iteration), the procedure was stopped. The final alignment was then polished by discarding sequences with a length that was significantly different from the majority of the sequences in the alignment. This was achieved by converting sequence lengths to Z scores and discarding sequences with abs(Z score) > 2 and length difference > 10% with respect to the average sequence length in the alignment (implemented in the stockholmPolish tool available from the repository https://github.com/dincarnato/labtools). To further select only high-confidence alignments, we performed a stringent filtering by selecting alignments matching three criteria: (1) ≥25% of the helices showing helix-level covariation (R-scape’s Lancaster aggregated E value < 0.05); (2) ≥12.5% of the base pairs showing covariation (R-scape APC-corrected G-test statistic E value < 0.1); and (3) ≥5 base pairs showing covariation.
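The numerical rules in this procedure (noise threshold, length-based polishing and the final stringent filter) can be expressed compactly. The Python sketch below uses hypothetical function names and is not taken from cm-builder or stockholmPolish; the use of the population s.d. for the length Z scores is an assumption.

```python
import math
import statistics

def noise_threshold(max_decoy_bit_score, step=5, floor=20):
    """Round the best decoy bit score up to a multiple of 5, with a floor of 20."""
    return max(floor, math.ceil(max_decoy_bit_score / step) * step)

def polish_lengths(lengths, z_cutoff=2.0, max_rel_diff=0.10):
    """Return indices of sequences to keep: a sequence is dropped only if its
    length has |Z| > 2 AND deviates by > 10% from the mean length."""
    mean = statistics.mean(lengths)
    sd = statistics.pstdev(lengths)  # assumption: population s.d.
    keep = []
    for idx, n in enumerate(lengths):
        z = (n - mean) / sd if sd > 0 else 0.0
        if abs(z) > z_cutoff and abs(n - mean) / mean > max_rel_diff:
            continue
        keep.append(idx)
    return keep

def passes_stringent_filter(n_helices, n_cov_helices, n_pairs, n_cov_pairs):
    """>= 25% covarying helices, >= 12.5% covarying base pairs, >= 5 covarying pairs."""
    return (
        n_cov_helices >= 0.25 * n_helices
        and n_cov_pairs >= 0.125 * n_pairs
        and n_cov_pairs >= 5
    )
```

For instance, a best decoy bit score of 31.2 gives N = 35, whereas any value of 20 or below gives N = 20.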
For human RNA structures, sequences of candidate structural homologs were directly extracted from multiz100way MAF files (https://hgdownload.soe.ucsc.edu/goldenPath/hg38/multiz100way/maf/) using the mafsInRegion tool of the kentUtils (available from the repository https://github.com/ENCODE-DCC/kentUtils) after lifting the identified structurally heterogeneous regions from transcriptome-level to genome-level coordinates using the transcriptome2genome tool (available from the repository https://github.com/dincarnato/labtools). Extracted MAF blocks were concatenated, gaps were removed and the resulting set of sequences was used as database for the cm-builder tool. This part of the analysis was implemented in the dbFromMAF tool (available as part of the DeConStruct pipeline from the repository https://github.com/dincarnato/papers). As this set of sequences represents a higher-confidence set as compared to the set of complete bacterial genomes used for the analysis of E. coli structures, two parameters were relaxed for the construction of alignments; at each iteration, the bit-score noise threshold (N) was increased by 5 (rather than 10) and matches having <35% (rather than 50%) canonical base pairs were discarded. No polishing was performed on the output alignments and filtering was relaxed by selecting all structures having at least three covarying base pairs (R-scape APC-corrected G-test statistic E value < 0.1) and two covarying helices (R-scape’s Lancaster aggregated E value < 0.05).
Design of conformation-stabilizing mutants
The mutant stabilizing the SLalt conformation of lpxP was designed manually by introducing point mutations in the 5′ half of the stem but taking care not to touch any nucleotide in the surroundings of the RBS residing on the 3′ half of the stem. Mutants stabilizing the different conformations of CKS2 and TXNL4A were automatically designed using the rf-mutate tool of RNA Framework. For this purpose, the program was modified to enable specifying a target structure. For example, to stabilize conformation A, this was provided as the target structure, while conformation B was provided as the wild-type structure, so that the program would design mutations minimizing the probability of forming conformation B while simultaneously maximizing the probability of forming conformation A (with a maximum tolerated base-pair distance of 25%). Mutations were designed in such a way that the underlying amino acid sequences of both the uORF and the main ORF were preserved.
Evaluation of energy barriers and fraction changed base pairs between conformations
Transition barriers were estimated on the set of structures predicted from structurally heterogeneous regions whose DRACO-deconvolved reactivity profiles could be nonambiguously matched between DH5α and TOP10 cells as described above. Estimation was performed using DrFindpath, a component of the DrTransformer package94. DrFindpath uses the Findpath heuristic95, which is implemented in the ViennaRNA library. The fraction of changed base pairs between conformations was calculated as follows:
$$F=\,1-\frac{C}{{c}_{1}+{c}_{2}+C}$$
where C is the number of base pairs common to both conformations and c1 and c2 are the numbers of base pairs unique to either conformation.
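The fraction of changed base pairs can be computed from two dot-bracket structures of the same sequence, as in the following Python sketch. The function names are hypothetical, and the parser handles only plain `(`, `)` and `.` characters (no pseudoknot brackets). Note that c1 + c2 + C equals the size of the union of the two base-pair sets.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            pairs.add((stack.pop(), i))
    return pairs

def fraction_changed(structure_a, structure_b):
    """F = 1 - C / (c1 + c2 + C); returns 0.0 if neither structure has pairs."""
    pairs_a, pairs_b = base_pairs(structure_a), base_pairs(structure_b)
    common = len(pairs_a & pairs_b)
    total = len(pairs_a | pairs_b)
    return 1.0 - common / total if total else 0.0
```

Two identical structures give F = 0, whereas two structures sharing no base pairs give F = 1.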
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.



