Mapping cis-regulatory mutations at scale in sorghum enables modulation of gene expression

Sorghum cultivation and protoplast generation

Protoplast isolation from S. bicolor cv. RTx430 was adapted from maize protoplast protocols⁵⁵,⁵⁹,⁶⁰ and optimized for application in etiolated sorghum seedlings. Seeds were sterilized (70% ethanol for 5 min, 30% sodium hypochlorite + 0.1% Tween-20 for 20 min and three washes with double-distilled (dd)H₂O for 5 min each) and germinated for 48 h on damp filter paper in the dark at 28 °C. Generally, larger seeds are more suitable for protoplasting. Once germinated, seedlings with coleoptiles of 2–3 cm were gently transplanted into 36-mm peat plugs and grown in the light (100 μmol m⁻² s⁻¹, 16:8 photoperiod, 28 °C, 80% relative humidity) for 2–3 days to inhibit mesocotyl elongation. As soon as plants began developing a third leaf (that is, the second leaf after the coleoptile leaf) or once they began noticeably greening, they were moved to a darkened growth chamber (28 °C, 60% relative humidity) to develop for 8–10 days. All plants must be maximally etiolated for viable protoplast harvest; thus, any seedlings that were not fully yellow at the time of protoplasting were discarded, as were any that displayed foliar abnormalities.

Plants were harvested when the third leaf was maximally extended, when plant height was 15–25 cm. Turgid, healthy third leaves were detached from plants with a razor blade and their base and tip were removed, leaving a ~10-cm section. Approximately ten sections at a time were bundled and dissected into 0.5-mm strips perpendicular to the vein on clean paper. Any leaf strips that produced excess moisture on the paper when sliced were discarded. Once cut, strips were immediately transferred to 50–75-ml aliquots of fresh digestive solution (10 mM KCl, 0.5 M mannitol, 8 mM MES pH 5.7, 1 mM CaCl₂, 1.5% w/v cellulase, 0.75% w/v macerozyme and 0.1% w/v BSA) in 200-ml beakers covered in foil to fully exclude light (~25–35 leaves per beaker). Digesting strips were vacuum infiltrated (15 inHg, room temperature) for 3–5 min, incubated in a dark shaking incubator (40 rpm, 28 °C) for 6 h, then mixed with an equal volume of W5 solution (154 mM NaCl, 125 mM CaCl₂, 5 mM KCl and 2 mM MES pH 5.7) and incubated (80 rpm, 28 °C) for 1 h. Digestive solution was filtered with 40-μm cell strainers into 50-ml conical tubes and then spun down at 100g for 5 min. Yellow cell pellets were consolidated into one or two 50-ml conical tubes, resuspended in 5–10 ml of W5 solution and then checked for viability using Evans blue dye. Any preparation that had more than 15% stained cells was discarded. Cells were spun down under the same conditions and resuspended in fresh MMG solution (0.4 M mannitol, 15 mM MgCl₂ and 4 mM MES pH 5.7) at a concentration of 2.5 × 10⁶ cells per ml.

To transfect libraries, 1.2 ml of cells (3 × 10⁶ cells total) in 15-ml conical tubes were combined with 100 μg of purified plasmid in 200 μl of H₂O and an equal volume (1.4 ml) of fresh, filtered PEG-CaCl₂ solution (0.2 M mannitol, 0.1 M CaCl₂ and 40% w/v PEG-4000). Tubes were mixed by gently rocking back and forth until the solution appeared well mixed (not ‘streaky’) and left to sit in the dark at room temperature for 15 min. After transfection, the solution was diluted with 4.8 ml of W5 solution and spun down at 100g for 5 min. Transfection solution was removed by pipetting; then, the pellet was resuspended in 4 ml of WI solution (0.5 M mannitol, 4 mM KCl and 4 mM MES pH 5.7) and transferred to a six-well culture plate. After 18 h of dark incubation at room temperature, cells were placed under benchtop light (~25 μmol m⁻² s⁻¹) for 4 h before RNA harvest.

For all experiments, a GFP-expressing pUC19 plasmid control was used to ensure suitable transfection efficiency and cell health. Any seed batch that was not >80% fluorescent when transfected with GFP was discarded. All library experiments were performed with at least two biological replicates, achieved by running different batches of WT seed on different days. During all steps of the process, seedlings and leaf strips were exposed to as little light as possible to minimize deetiolation. As cells are highly sensitive, wide-O pipette tips were used in every instance that cells were pipetted.

Rice cultivation and mesophyll protoplast isolation

A total of 120 to 300 Oryza sativa cv. Kitaake seeds were dehulled and sterilized (75% ethanol for 1 min, 2.5% sodium hypochlorite for 20 min and washed four times with 50 ml of ddH₂O) and planted on 7% ½ MS agar (adjusted to pH 5.7 with KOH) in a sterile flow hood. The seeds were incubated for 7 days in a growth chamber (150 μmol m⁻² s⁻¹, 12:12 photoperiod, 27 °C/25 °C, 80% relative humidity) until the first true leaf was close to being fully developed. The seedlings were left in the chamber for two more days with the light intensity halved until the second leaf stage (that is, to where the leaf was close to full expansion). The second leaves were harvested with their base and tip removed and subsequently sliced into ~0.5-mm strips perpendicular to the vein on a large petri dish. The leaf strips were immediately transferred to another foil-covered petri dish with 0.6 M mannitol and 20 mM MES (pH 5.7) to initiate plasmolysis until the remainder of the ~300 leaves were processed. All leaf strips were subjected to 10 min of additional in-dark plasmolysis. The mannitol was then drained and the leaf strips were resuspended in 200 ml of freshly filtered protoplast digestive solution (10 mM KCl, 0.6 M mannitol, 20 mM MES, 1.5% w/v cellulase (Onozuka R-10), 0.75% w/v macerozyme (Duchefa Biochemie), 0.1% w/v BSA, 10 mM CaCl₂, 5 mM β-mercaptoethanol and 50 μg ml⁻¹ carbenicillin) before transfer into 500-ml flasks covered with foil to exclude light. The leaf strips were then subjected to three rounds of vacuum infiltration (15 inHg for 10 min each) in 500-ml flasks in the dark and transferred to a dark shaking incubator for digestion over 6 h (60 rpm for the first 1.5 h, 70 rpm for the next 3.5 h and 80 rpm for the last 1 h before harvesting). Mesophyll protoplasts were harvested in two sequential steps. In the initial harvest, a volume of W5 equal to that of the digestive solution was added to quench the digestion reaction. The mixture was then filtered through 40-μm cell strainers into 50-ml conical tubes and centrifuged at 4 °C at 200g for 5 min. Green mesophyll pellets were consolidated in W5 as the initial harvest. The second harvest was performed by adding 100 ml of W5 to the leaf strips, swirling and incubating the mixture for 10 min. The mixture was then filtered through 40-μm cell strainers into 50-ml conical tubes and spun down at 200g for 3 min. Both harvests were left to sediment naturally under gravity for 1 h in the dark. The naturally sedimented pellet was isolated and resuspended in 15 ml of MMG. The first and second harvests were pooled, followed by viability staining with Evans blue dye. Any preparation that had more than 15% stained cells was discarded. The same transfection protocol used for sorghum protoplasts was used for rice library transfections. Rice protoplasts were exposed to light 12 h after transfection and harvested for RNA extraction 16 h after transfection.

RNA extraction, reverse transcription and library amplification

RNA was extracted using the Qiagen RNeasy plant mini kit (74904) according to the manufacturer’s protocol, with the exception of the precipitation step, where one volume (instead of 0.5 volumes) was used for RNA precipitation. On-column DNase digestion was performed according to the manufacturer’s protocol to remove DNA contamination and 40 μl of DEPC water was used to elute concentrated RNA. Any samples with an A260/230 value below 1.80 were cleaned up using the Monarch spin RNA cleanup kit (T2030L). Reverse transcription with the Omniscript RT kit was performed using a 19-mer oligo(dT) primer. Six separate reactions were set up with 1 μg of RNA as input. For each sample, cDNA derived from a total of 6 μg of RNA was cleaned up using the ZYMO DNA clean and concentrator-5 kit with a 7:1 ratio of DNA-binding buffer to cDNA and used as template for Illumina PCR amplification. Gene-specific primers with Illumina PCR1 adaptors were used to amplify barcoded fragments using Q5 HiFi polymerase with the minimal number of cycles to obtain a gel-extractable band (typically 15–20 cycles, primers p219–p222; PCR conditions in Supplementary Table 2). These PCR1 amplicons were gel-extracted, cleaned up, amplified to PCR2 products and sequenced on an Illumina Nextseq 2000.

Dual-luciferase assay

For the dual-luciferase assay in sorghum, select mutants of the PsbS and SBPase promoter/5′ UTR region were cloned upstream of a nanoluciferase gene on a pUC19 backbone (strains). This was performed by producing a WT promoter version of both genes and then introducing mutations through around-the-horn PCR and Gibson homology cloning. The annotated 3′ UTR for each gene tested was attached downstream of the enzyme sequence. Plasmids were constructed to harbor a firefly luciferase gene with a switchgrass Ubi2 promoter and pea RbcS E9 terminator for strong expression.

For sorghum protoplast transfections, at least two biological replicates were used, with a technical replicate for each. A total of 1 million protoplasts in 1 ml of MMG solution were transfected with 20 μg of plasmid in 40 μl of ddH₂O in 15-ml conical tubes. Transformation was performed with PEG solution as described above, after which cells were incubated in the dark in 2 ml of WI solution for 16 h and then exposed to diffuse benchtop light (15–20 μmol m⁻² s⁻¹) for 4 h. For rice protoplast transfections, four technical replicates were performed with similar cell and DNA conditions to the sorghum assay. After transformation, cells were incubated for 12 h in 2 ml of WI solution and then exposed to benchtop light for 4 h.

For luminescence measurements in both species, cells were spun down in 15-ml conical tubes (250g, 5 min, room temperature), resuspended in 1 ml of WI buffer, spun down again under the same conditions and then resuspended in 1× passive lysis buffer (Promega). Luciferase and nanoluciferase activity were measured on a Tecan Infinite M1000 Pro using the Promega Nano-Glo dual-luciferase reporter assay system (N1620) according to the manufacturer’s instructions. Specifically, 80 μl of lysate was combined with 80 μl of luciferase substrate, then shaken in the plate reader (orbital, 2-mm diameter, ~300 rpm) and left to sit for 2 min before measuring luminescence. The sample was thoroughly mixed with 80 μl of NanoDLR Stop&Glo reagent, then shaken (orbital, 1-mm diameter, ~600 rpm) for 3 min and left to sit for 12 min before measuring nanoluciferase luminescence. Reported values of nanoluciferase/luciferase intensity were normalized to that of an unmodified WT promoter tested in the relevant biological replicate.
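The normalization described above can be written out as a short sketch. The function name and the example readings are hypothetical; only the ratio-of-ratios logic comes from the text.

```python
def normalized_activity(nanoluc, firefly, wt_nanoluc, wt_firefly):
    """Nanoluciferase/firefly ratio of a mutant, relative to the WT promoter
    measured in the same biological replicate (hypothetical helper)."""
    mutant_ratio = nanoluc / firefly      # reporter signal corrected for transfection
    wt_ratio = wt_nanoluc / wt_firefly    # same correction for the WT control
    return mutant_ratio / wt_ratio


# Illustrative luminescence readings (arbitrary units, not measured data).
fold_change = normalized_activity(nanoluc=200.0, firefly=100.0,
                                  wt_nanoluc=50.0, wt_firefly=50.0)
```

A value above 1 would indicate higher promoter-driven expression than WT in that replicate.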

Promoter and gene expression analysis

For tissue-specific expression of candidate genes (Supplementary Fig. 1e), expression values were compiled from the GeneAtlas version 2 database⁶¹ for S. bicolor version 3.1.1 for mature leaves (leaf middle whorl.vegetative), seedling leaves (lower leaf upper.juvenile), roots (root top.juvenile) and grain (seed dry grain maturity). For single-cell expression analysis, cell-type-resolved RNA-seq profiles from sorghum plants undergoing deetiolation were taken from a previous study³⁶ and queried for target genes. The average expression level of and percentage of cells expressing PsbS, Raf1 and SBPase were plotted using the DotPlot function in Seurat (version 5.3.0)⁶². For single-cell ATACseq analysis, cell-type-resolved chromatin accessibility profiles were obtained from a previous study⁶³, visualized using the Plant Epigenome Browser⁶⁴ and exported for combined visualization in Geneious Prime (version 2025.2.2). The PlantCARE⁶⁵ and JASPAR databases⁶⁶ were used for annotating putative TFBSs and identifying high-effect cREs.

For local sequence alignments in the Poaceae, mVISTA was used⁶⁷. Promoter (defined as 2 kb upstream from the annotated TSS) and 5′ UTR sequences for each gene were retrieved from the Phytozome database for sorghum (S. bicolor RTx430 version 2.1), maize (Zea mays RefGen_V4), miscanthus (Miscanthus sinensis version 7.1), Setaria italica (S. italica version 2.2), brachypodium (Brachypodium distachyon version 3.2) and rice (O. sativa version 7.0). Sequences were aligned using mVISTA with alignment windows of 100 bp at a similarity threshold of 70% and plotted between 50% and 100% sequence similarity. For base-pair-resolved sequence alignments, Clustal Omega was used through the EMBL-EBI portal⁶⁸.

For displaying single-nucleotide variant effects, we generated a position-specific sequence logo that visualizes substitution preference at each genomic position on the basis of measured effect sizes. For each position, we converted the base-specific effect sizes into a probability distribution (P_i at position i) through a softmax transform with temperature parameter β_scale (set to 5). We then computed the information content per position as IC_i = 2 − H(P_i), where the term H(P_i) quantifies the Shannon entropy (bits; maximum IC = 2 for DNA). The plotted letter heights are P_b,i × IC_i, where P_b,i is the softmax-normalized preference for base b at position i. Logos were rendered with logomaker using a colorblind-aware palette.
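The letter-height calculation can be illustrated with a minimal sketch for a single position. It assumes that β_scale multiplies the effect sizes inside the softmax and that the input is a dictionary of per-base effect sizes; both are assumptions about details the text does not specify, and the function name is hypothetical.

```python
import math

BASES = "ACGT"


def logo_heights(effects, beta=5.0):
    """Letter heights P_b,i * IC_i for one position.

    effects: dict base -> measured effect size (assumed input format).
    beta: softmax scale, assumed here to multiply the effects.
    """
    scaled = [beta * effects[b] for b in BASES]
    top = max(scaled)                           # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Shannon entropy in bits; zero-probability terms contribute nothing.
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    ic = 2.0 - entropy                          # information content, maximum 2 bits for DNA
    return {b: p * ic for b, p in zip(BASES, probs)}
```

With equal effect sizes for all four bases the distribution is uniform, the entropy is 2 bits and every letter height is 0; a single strongly preferred base approaches a height of 2.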

Library design

Using gene models from Phytozome S. bicolor RTx430 version 2.1 (psbS gene ID: SbiRTX430.03G398900, raf1 gene ID: SbiRTX430.06G286900, SBPase gene ID: SbiRTX430.03G387300), we designed libraries to span −2,000 bp upstream of the annotated TSS until the beginning of the protein-coding sequence (that is, including the 5′ UTR). The designed libraries include four 500-bp deletions and ten 200-bp deletions for the entire region. In the region 1 kb from the TSS through the UTR, all single-nucleotide substitutions, all 2-bp deletions, all 12-bp deletions and A-to-G transitions within each 5-bp window were designed. In addition, we inserted putative TF motifs every 5 bp in the forward and reverse directions at positions −150 to +50 from the TSS. We chose to insert 40 motifs from each source: (1) imputed TF motifs cataloged in PlantTFDB¹⁸ for sorghum leaf-tissue-expressed TFs identified in the Sorghum Riken Database (http://sorghum.riken.jp/) and (2) DNase I footprints identified in sorghum¹⁹. Details of library design can be found on GitHub (https://github.com/SavageLab/plant_promoter_bashing).
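As an illustration of how the substitution and small-deletion variants could be enumerated, here is a minimal sketch. It is not the authors' design code (which is in the linked repository); the function names are hypothetical, and it assumes that "all" deletions means one deletion starting at every offset.

```python
def single_substitutions(seq):
    """Yield (position, ref, alt, mutated_sequence) for every single-nucleotide substitution."""
    for i, ref in enumerate(seq):
        for alt in "ACGT":
            if alt != ref:
                yield i, ref, alt, seq[:i] + alt + seq[i + 1:]


def tiled_deletions(seq, size):
    """Yield (start, mutated_sequence) for a deletion of `size` bp at every start offset (assumed tiling)."""
    for start in range(len(seq) - size + 1):
        yield start, seq[:start] + seq[start + size:]


# Toy 6-bp sequence standing in for a promoter region (illustrative only).
toy = "ACGTAC"
snvs = list(single_substitutions(toy))
dels = list(tiled_deletions(toy, 2))
```

A region of n bp yields 3n substitutions under this scheme, so a 1-kb region contributes roughly 3,000 single-nucleotide variants before deletions and motif insertions are added.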

Library cloning

We performed all cloning using a pUC19 plasmid backbone containing an ampicillin resistance cassette and the TOP10 E. coli cloning strain. Cells were grown in Luria–Bertani broth supplemented with 0.1 mg ml⁻¹ carbenicillin at 37 °C. All PCRs were performed using a Bio-Rad T100 thermal cycler. All PCR products were purified using the Zymo DNA clean and concentrator-5 kit (D4033) after amplification and before electroporation. All library cloning steps were performed in vitro until the final bottleneck step to avoid biasing the library.

We first created WT plasmids for each gene (Extended Data Fig. 3a). In the case of the synthetic construct, this contained the gene promoter/5′ UTR-driving GFP with a pea rbcS E9 terminator. In the case of the natural gene construct, plasmids contained the full gene including 2 kb of promoter, 5′ UTR, all introns and exons and an additional 1 kb downstream to capture the native 3′ UTR sequence. To do so, we used primers (primers p1–p16) to amplify the promoter or full-length gene from sorghum RTx430 genomic DNA using New England Biolabs Q5 high-fidelity 2× master mix (3 min at 95 °C, followed by 35 cycles of 20 s at 98 °C, 20 s at 67 °C and 2–3 min at 72 °C, with a final extension for 1 min at 72 °C). We performed around-the-horn PCR (primers p223–p224) using KAPA HiFi HotStart ReadyMix (Roche, 09420398001) (3 min at 95 °C, followed by 35 cycles of 20 s at 95 °C, 20 s at 66.4 °C and 15 s at 72 °C, with a final extension for 1 min at 72 °C) to linearize the pUC19 destination plasmid. These fragments were then combined using Gibson assembly (NEBuilder HiFi DNA assembly master mix, E2621L; 60 min at 50 °C). These Gibson reactions were transformed into chemically competent TOP10 cells and individual clones were isolated and sequence-verified.

Next, we cloned a random barcode into each plasmid. To do so, we performed an around-the-horn PCR using primers containing 15 randomized bp (primers p213–p218 for native libraries, p225–p226 for GFP libraries) and then used a selfing Gibson reaction to circularize our plasmids. These randomized plasmid pools then served as the template to introduce the promoter and 5′ UTR variants. An unused subsample of these reactions was transformed into electrocompetent TOP10 cells and titer-plated to confirm that a sufficient number of barcodes was obtained. Illumina sequencing (Illumina iSeq, kit v2) was then performed on the barcoded region to confirm barcode diversity and a count suitable for downstream cloning.

For constructing the large deletions (200-bp and 500-bp deletions), we performed an around-the-horn PCR using primers that miss these specific regions (primers p17–p114) with KAPA HiFi HotStart ReadyMix (Roche, 09420398001) (3 min at 95 °C, followed by 35 cycles of 20 s at 98 °C, 20 s at 65 °C and 5 min at 72 °C, with a final extension for 5 min at 72 °C and a hold at 4 °C) on the respective gene-specific WT plasmid. Linearized DNA was then circularized using Golden Gate reactions with BsaI-HF-v2 (New England Biolabs, R3733S) for Raf1, BsmBI-v2 (New England Biolabs, R0739S) for SBPase and AarI (New England Biolabs, R0745S) for PsbS (for BsaI and AarI, using 30 cycles of 5 min each at 37 °C and 16 °C; for BsmBI, using 30 cycles of 5 min each at 42 °C and 16 °C).

To create the higher resolution variant libraries, we ordered ~270-bp oligopools from Twist Bioscience. As the mutagenized region (>1 kilobase) is larger than the synthesis length limit, we split the oligopool into 7–8 different 200-bp fragments and performed individual Golden Gate assemblies for each fragment and gene combination. Each gene sublibrary was amplified using KAPA HiFi HotStart ReadyMix (Roche, 09420398001) from the PCR-cleaned synthesized pool (using primers p159–p202; 3 min at 95 °C, followed by 35 cycles of 20 s at 98 °C, 15 s at 57 °C and 15 s at 72 °C, with a final extension for 1 min at 72 °C). Corresponding Golden-Gate-compatible linearized vectors were amplified using KAPA HiFi polymerase (using primers p115–p158; 3 min at 95 °C, followed by 35 cycles of 20 s at 98 °C, 15 s at 65 °C and 1 min per kb at 72 °C, with a final extension for 10 min at 72 °C).

We created synthetic construct libraries first with all desired mutations in the plasmid pool and then amplified and ported the entire promoter/5′ UTR region into the barcoded natural gene constructs. Full promoter/UTR library regions were amplified from the synthetic libraries alongside the conjugate barcoded backbone vectors using Q5 2× master mix (primers p203–p212; 3 min at 95 °C, followed by 35 cycles of 20 s at 98 °C, 20 s at 65 °C and 5 min at 72 °C, with a final extension for 5 min at 72 °C and a hold at 4 °C) and circularized using Gibson assembly (NEBuilder HiFi DNA assembly master mix (E2621L); 60 min at 50 °C).

We then PCR-purified the assembly products and electroporated 50 ng into 50 μl of electrocompetent TOP10 cells using 0.1-cm cuvettes (Bio-Rad, 1652089) using a Bio-Rad Micropulser with the EC1 setting (1.8 kV). Cells were grown at 37 °C to an optical density of ~0.5 to limit the number of cell divisions causing library bias, and cryostocks were made by 1:1 dilution with 50% glycerol. Cells were then bottlenecked to around 100,000 clones by plating on LB–carbenicillin agar plates to limit the maximum number of individual barcodes propagated. All library construction steps until the final bottlenecking were performed in vitro to minimize mutant or barcode bias.

Long-read sequencing for mapping barcodes to promoter variants

To link individual barcodes with the promoter variants, we performed long-read sequencing using PacBio Revio. To analyze these reads, we first grouped them by barcode and created a consensus sequence for each barcode using SAMtools version 1.2 (using ‘consensus --config hifi’). We then mapped consensus sequences to the WT reference plasmid using minimap2 version 2.26-r1175 (using ‘--MD -Lax map-hifi’) to identify mutations in each consensus read. All consensus sequences were length-filtered to be no more than 600 bp shorter than the WT plasmid. We observed 100,883 SBPase, 33,126 Raf1 and 47,423 PsbS barcode-to-variant mappings.

Short-read, high-volume sequencing to determine expression levels for barcodes

To determine the expression changes caused by a crDNA mutation, we used deep Illumina sequencing of the plasmid library, as well as cDNA isolated from RNA 16–18 h after protoplast transformation. Illumina reads were processed as follows. We merged paired-end reads using FLASH (version 1.2.11), with a maximum overlap of 150 bases. We then applied a quality filter using vsearch (version 2.28.1) (using ‘--fastq_truncqual 20’, ‘--fastq_maxns 3’ and ‘--fastq_maxee 0.5’). We then counted reads per barcode and calculated a log₂ read ratio (after expression versus before transformation) for each barcode using a pseudocount of 1. We then merged these barcode read ratios with promoter variants identified by long-read sequencing. We then removed any barcodes with mutations in the priming site for barcode sequencing (for SBPase, positions 1280–1335 and 1448–1484; for Raf1, positions 675–710 and 485–525; for PsbS, positions 1360–1393 and 1192–1211, all relative to the reference plasmid sequence), as we detected that mutations here introduced artifactual skew on the log read ratios independent of the crDNA variants observed. For Raf1, we also observed mutations T3575A, C5576T, A5574G and T5575A (relative to the reference plasmid sequence) in almost all reads and redefined these mutations as the new reference sequence.
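The per-barcode read ratio can be sketched as follows. The function name and input format (dictionaries of raw read counts per barcode) are hypothetical, and the sketch assumes raw counts without sequencing-depth normalization, which the text does not specify.

```python
import math


def barcode_log2_ratios(cdna_counts, plasmid_counts, pseudocount=1):
    """log2((cDNA + pseudocount) / (plasmid + pseudocount)) for every barcode seen in either sample."""
    barcodes = set(cdna_counts) | set(plasmid_counts)
    return {
        bc: math.log2((cdna_counts.get(bc, 0) + pseudocount)
                      / (plasmid_counts.get(bc, 0) + pseudocount))
        for bc in barcodes
    }


# Toy counts (illustrative only): barcode -> number of reads.
ratios = barcode_log2_ratios({"AAA": 7, "CCC": 0}, {"AAA": 1, "CCC": 3})
```

The pseudocount keeps the ratio finite for barcodes that are absent from the cDNA sample, at the cost of shrinking ratios for barcodes with very low counts.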

Individual mutant effect inference

The majority of reads contained multiple mutations because of synthesis, cloning and sequencing artifacts. These were particularly pronounced at polynucleotide repeats, where addition or subtraction of one repeat base is observed frequently (Extended Data Fig. 2a). For example, for PsbS, we found that >80% of reads contain an additional A insertion at a 10-bp poly(A) tract 297 bp downstream from the translational stop site. To deconvolute potentially causal individual variant effects, we performed ordinary least squares linear regression using statsmodels version 0.14.5 to infer individual variant effect sizes and derive P values using two-tailed t-tests. Using additive assumptions, this enables inference of whether a particular individual mutation can explain the observed expression effects in addition to co-occurring mutations, including sequencing and cloning artifacts, to pinpoint potential causal mutations (Extended Data Fig. 2c). For example, a mutation that co-occurs in almost all reads, such as the polynucleotide repeat mutations above, cannot explain overexpression in a subset of high-expressing combinations of mutations. We stringently called variants significant and meaningful if their P value passed a 0.05 Bonferroni correction and if both their inferred effect and the mean log read ratio of reads containing that mutation were larger than 1.5. We included this raw log read ratio cutoff to prevent erroneously negatively called passenger mutants from driving a co-occurring candidate mutant’s positive inferred effect without having observed that candidate mutant in hypermorphic reads.
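The additive model can be illustrated with a dependency-free sketch: each barcode's log read ratio is regressed on indicator variables for the mutations it carries. The analysis itself used statsmodels, which also supplies the P values; this sketch solves the normal equations by hand, omits the P-value and Bonferroni step, and uses made-up toy data and hypothetical mutation names.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.
    Assumes X has full column rank (no perfectly collinear mutations)."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Toy data: (set of mutations on a barcode, log2 read ratio). Illustrative only.
reads = [
    ({"polyA_ins"}, 0.0),
    ({"polyA_ins", "m1"}, 2.0),
    ({"m1"}, 2.0),
    (set(), 0.0),
    ({"polyA_ins", "m2"}, -1.0),
    ({"m1", "m2"}, 1.0),
]
mutations = sorted({m for muts, _ in reads for m in muts})
# Design matrix: intercept column plus one 0/1 indicator per mutation.
X = [[1.0] + [1.0 if m in muts else 0.0 for m in mutations] for muts, _ in reads]
y = [ratio for _, ratio in reads]
effects = dict(zip(mutations, ols(X, y)[1:]))


def mean_ratio(mutation):
    """Mean raw log read ratio over all barcodes carrying the mutation."""
    vals = [r for muts, r in reads if mutation in muts]
    return sum(vals) / len(vals)


# Effect-size and raw-ratio cutoffs from the text (P-value filter omitted here).
candidates = [m for m in mutations if effects[m] > 1.5 and mean_ratio(m) > 1.5]
```

In this toy set the frequently co-occurring `polyA_ins` receives an inferred effect of about 0, while `m1` is the only mutation passing both cutoffs.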

Fine-tuning the genomic language model

We developed an automated pipeline to create a gene expression prediction dataset from RNA-seq data in Expression Atlas⁶⁹. The pipeline is available on GitHub (https://github.com/songlab-cal/gpn/tree/main/workflow/make_expression_dataset_from_gxa) and should be applicable to many other species available in Expression Atlas. As input, the user must specify a list of experiment accessions. In our case, we chose sorghum experiments using the reference cultivar BTx623: E-MTAB-4021, E-GEOD-98817, E-MTAB-4203, E-MTAB-4273, E-MTAB-4400, E-CURD-25 and E-MTAB-5956. The pipeline has the following steps: (1) download transcript-level transcript per million (TPM) values; (2) filter samples with variance less than a threshold (default: 1); (3) drop conditions with no biological replicates or those where Pearson correlation between replicates is less than a threshold (default: 0.8); (4) average expression among replicates; (5) apply a log1p transformation, such that the final prediction target is log(1 + TPM); (6) for each transcript, extract the reference genome sequence around the TSS (default: ±256 bp).
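Steps 2–5 of the pipeline can be sketched in a few lines. This is not the code in the linked repository; the function names and input format are hypothetical, and the sketch assumes that variance is computed per sample across transcripts and that every pair of replicates must meet the correlation threshold.

```python
import math
from statistics import mean, pvariance


def pearson(a, b):
    """Pearson correlation of two equal-length lists (neither may be constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)


def build_targets(tpm, condition_of, min_var=1.0, min_corr=0.8):
    """tpm: sample -> list of TPM values (one per transcript).
    condition_of: sample -> condition label.
    Returns condition -> list of log(1 + mean TPM) per transcript."""
    # Step 2: drop low-variance samples.
    kept = {s: v for s, v in tpm.items() if pvariance(v) >= min_var}
    groups = {}
    for sample, values in kept.items():
        groups.setdefault(condition_of[sample], []).append(values)
    targets = {}
    for cond, reps in groups.items():
        # Step 3: require replicates, and require them to agree.
        if len(reps) < 2:
            continue
        pairs = [(a, b) for i, a in enumerate(reps) for b in reps[i + 1:]]
        if any(pearson(a, b) < min_corr for a, b in pairs):
            continue
        # Steps 4 and 5: average replicates, then apply log1p.
        targets[cond] = [math.log1p(mean(col)) for col in zip(*reps)]
    return targets


# Toy TPM table with three transcripts per sample (illustrative only).
tpm = {
    "leaf_1": [0.0, 10.0, 100.0],
    "leaf_2": [0.0, 12.0, 98.0],
    "root_1": [5.0, 50.0, 1.0],   # no replicate, so the condition is dropped
    "stem_1": [1.0, 1.0, 1.0],    # zero variance, so the sample is filtered
    "stem_2": [3.0, 40.0, 7.0],
}
condition_of = {"leaf_1": "leaf", "leaf_2": "leaf", "root_1": "root",
                "stem_1": "stem", "stem_2": "stem"}
targets = build_targets(tpm, condition_of)
```

Only the `leaf` condition survives in this toy example, because filtering `stem_1` leaves `stem` with a single sample.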

This resulted in a total of 26 conditions, as detailed online (https://huggingface.co/datasets/gonzalobenegas/gxa-sorghum-v1/blob/main/labels.txt). GPN-Brassicales⁴⁵ was fine-tuned in a multitask fashion to predict all of the conditions, using a mean squared error loss. To evaluate on MPRA data, the condition ‘E-MTAB-4021_leaf mesophyll’ was used. The model architecture comprised three steps. First, GPN base-pair-level embeddings from the last layer were averaged across spatial positions. Second, an MLP mapped this high-dimensional embedding into preactivations for each output condition. Third, preactivations were transformed into positive predicted gene expression values using softplus activations.
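The three-step prediction head can be sketched without any deep-learning framework. The hidden-layer size, the ReLU hidden activation and all weights below are assumptions made for illustration; only the pooling, MLP and softplus structure comes from the text.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x)); always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def linear(vec, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def expression_head(embeddings, w1, b1, w2, b2):
    """embeddings: list of per-base vectors (sequence length x embedding dim)."""
    length = len(embeddings)
    pooled = [sum(col) / length for col in zip(*embeddings)]   # step 1: mean over positions
    hidden = [max(h, 0.0) for h in linear(pooled, w1, b1)]     # step 2: MLP (ReLU assumed)
    preact = linear(hidden, w2, b2)                            # one preactivation per condition
    return [softplus(p) for p in preact]                       # step 3: positive outputs


# Tiny example: 2 positions, embedding dim 2, 2 output conditions (made-up numbers).
out = expression_head(
    embeddings=[[1.0, 2.0], [3.0, 4.0]],
    w1=[[1.0, 0.0], [0.0, 1.0]], b1=[0.0, 0.0],
    w2=[[1.0, -1.0], [0.0, 0.0]], b2=[0.0, 0.0],
)
```

Because the target is log(1 + TPM), which is never negative, a softplus output keeps predictions in the same range as the labels.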

The model was trained on chromosomes 1–8. Chromosomes 9 and 10 were left for potential validation and testing, although we did not find held-out performance on RNA-seq useful for indicating performance on MPRA variants. The following hyperparameters were used, chosen to be reasonable defaults but not systematically tuned: max epochs, 30; batch size, 128; optimizer, AdamW; learning rate, max 1 × 10⁻³, with 1% warmup ratio and cosine decay down to 1 × 10⁻⁴; weight decay, 0.01.
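A learning-rate schedule of this shape can be sketched as below. The function is hypothetical, assumes linear warmup, and takes the decay floor as 1 × 10⁻⁴ (the natural reading of the stated value); the training framework's own scheduler may differ in detail.

```python
import math


def learning_rate(step, total_steps, max_lr=1e-3, min_lr=1e-4, warmup_ratio=0.01):
    """Linear warmup to max_lr, then cosine decay to min_lr (assumed floor)."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        # Linear ramp that reaches max_lr on the last warmup step.
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

For a run of 1,000 steps this ramps up over the first 10 steps, peaks at 1 × 10⁻³ and ends at the floor value.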