RNA Design with Infrared

A collection of in silico RNA design projects

Infrared framework

In (Yao et al., 2024), we presented Infrared, a framework for bioinformatics problems that can be formulated as a weighted Constraint Satisfaction Problem (CSP) with variables \(\small \mathcal{X}\), domains \(\small \mathcal{D}\), constraints \(\small \mathcal{C}\), and features \(\small \mathcal{F}\). An assignment \(\small x\) is a set of mappings \(\small X_i \mapsto x_i\) from each variable \(\small X_i\in\mathcal{X}\) to a value \(\small x_i\) in its domain \(\small D_i\in\mathcal{D}\). An assignment is valid if it satisfies every constraint in \(\small \mathcal{C}\). Given weights \(\small \alpha=(\alpha_F)_{F\in\mathcal{F}}\), one for each feature \(\small F\) in \(\small\mathcal{F}\), we define the evaluation of an assignment as \(\small E(x,\alpha)=\sum_{F\in\mathcal{F}}\alpha_FF(x)\). The Infrared framework provides solutions for two associated problems:

  1. Optimization. Return a valid assignment \(\small x^*\) that maximizes the evaluation \[\small x^*=\displaystyle\text{argmax}_{\text{valid } x} E(x,\alpha)\]
  2. Sampling. Return a valid assignment \(\small x\) with probability proportional to its Boltzmann weight \[\small \mathbb{P}(x)\propto\exp(E(x,\alpha)).\]
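
As a rough illustration of these two problems, the following Python sketch solves a toy weighted CSP by brute-force enumeration. It does not use the Infrared API or its tree-decomposition algorithms; the function names (`enumerate_valid`, `evaluate`, `optimize`, `boltzmann_sample`) and the toy model are assumptions introduced only for this example.

```python
import itertools
import math
import random


def enumerate_valid(domains, constraints):
    """Yield every assignment (a tuple of values) that satisfies all constraints."""
    for x in itertools.product(*domains):
        if all(c(x) for c in constraints):
            yield x


def evaluate(x, features, alpha):
    """E(x, alpha) = sum over features F of alpha_F * F(x)."""
    return sum(alpha[name] * f(x) for name, f in features.items())


def optimize(domains, constraints, features, alpha):
    """Return a valid assignment with maximal evaluation (exhaustive search)."""
    return max(enumerate_valid(domains, constraints),
               key=lambda x: evaluate(x, features, alpha))


def boltzmann_sample(domains, constraints, features, alpha, rng=random):
    """Draw one valid assignment with probability proportional to exp(E(x, alpha))."""
    valid = list(enumerate_valid(domains, constraints))
    weights = [math.exp(evaluate(x, features, alpha)) for x in valid]
    return rng.choices(valid, weights=weights, k=1)[0]


# Toy model (hypothetical): three binary variables, one constraint X0 != X1,
# and one feature counting the variables set to 1.
domains = [(0, 1)] * 3
constraints = [lambda x: x[0] != x[1]]
features = {"ones": lambda x: sum(x)}
alpha = {"ones": 1.0}

best = optimize(domains, constraints, features, alpha)
sample = boltzmann_sample(domains, constraints, features, alpha, random.Random(0))
```

Exhaustive enumeration is exponential in the number of variables, so this only serves to make the definitions concrete on very small instances.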

The RNA design problem comprises, in principle, two paradigms: positive design aims to optimize the affinity of sequences for the target structure, for example their free energy in that structure; negative design aims for specificity, for example by avoiding folding into alternative structures other than the target. Positive design is usually achieved by sequence sampling and can be formalized as a weighted CSP. Indeed, one can view each position in the RNA as a variable with domain \(\small \{\sf A, C, G, U \}\) and the design objectives as constraints and features. In (Yao et al., 2024), we demonstrated the use of Infrared for RNA design, including negative design through post-sampling optimization for objectives that are hard to express in a weighted CSP (glocal strategy).
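
To make the mapping from RNA design to a weighted CSP concrete, here is a minimal Python sketch in which a target structure in dot-bracket notation yields base-pair constraints and GC content acts as a feature. The helper names (`parse_pairs`, `is_compatible`, `gc_content`) are hypothetical and not part of Infrared; a free-energy feature would additionally require an energy model, which is not shown.

```python
# Base pairs allowed between paired positions (both orientations).
CANONICAL = {"AU", "UA", "CG", "GC", "GU", "UG"}


def parse_pairs(structure):
    """Return the sorted list of base pairs (i, j) of a dot-bracket structure."""
    stack, pairs = [], []
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.append((stack.pop(), j))
    return sorted(pairs)


def is_compatible(seq, structure):
    """Constraint: every pair of the structure is formed by a canonical base pair."""
    return all(seq[i] + seq[j] in CANONICAL for i, j in parse_pairs(structure))


def gc_content(seq):
    """Feature: fraction of G and C nucleotides in the sequence."""
    return sum(nt in "GC" for nt in seq) / len(seq)


target = "((...))"
candidate = "GCAAAGC"
valid = is_compatible(candidate, target)   # constraint check
score = gc_content(candidate)              # feature value
```

In this view, each sequence position is a variable over A, C, G, U, each base pair of the target contributes one binary constraint, and a weight on `gc_content` would bias sampling toward GC-rich or GC-poor sequences.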

Scheme of Infrared for RNA design. Here, the constraints come from the given target structures, and the features control the \(\small\sf GC\) content and the free energies of the sampled sequences with respect to the targets.

RNAPOND

Given a secondary structure to design, a naive approach is to first find a sequence compatible with the target structure, i.e. a sequence obtained by randomly assigning \(\small \{\sf A, C, G, U \}\) to unpaired positions and \(\small \{\sf AU, CG, GU \}\) to paired positions. However, such a sequence often does not adopt the target structure as its minimum free energy (MFE) conformation. The next step is to change the nucleotides at positions that form base pairs present in the MFE structure but not in the target, so that they no longer form one of \(\small \{\sf AU, CG, GU \}\). This step is then repeated until a solution is found. We generalized this idea in the design tool “RNA POsitive and Negative Design” (RNAPOND) (Yao et al., 2021). The design problem is formalized as a CSP with two types of constraints: pairs of positions that must be able to form a base pair, and pairs of positions that must not.
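
The iterative idea described above can be sketched in Python as follows. This is not the RNAPOND implementation: sequences are drawn by simple rejection sampling, the function names are hypothetical, and `fold` is an assumed callable that maps a sequence to its predicted MFE structure in dot-bracket notation. Here it is replaced by a deliberately artificial stub, `toy_fold`, so that the example runs without an external folding library.

```python
import random

CANONICAL = ("AU", "UA", "CG", "GC", "GU", "UG")


def parse_pairs(structure):
    """Return the set of base pairs (i, j) of a dot-bracket structure."""
    stack, pairs = [], set()
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def sample_sequence(required, forbidden, n, rng, max_tries=10_000):
    """Rejection-sample a sequence that pairs at `required` and cannot pair at `forbidden`."""
    for _ in range(max_tries):
        seq = [rng.choice("ACGU") for _ in range(n)]
        for i, j in required:
            seq[i], seq[j] = rng.choice(CANONICAL)
        if all(seq[i] + seq[j] not in CANONICAL for i, j in forbidden):
            return "".join(seq)
    raise RuntimeError("no sequence found; the constraints may be unsatisfiable")


def design(target, fold, rng, max_rounds=100):
    """Resample until the folded structure equals the target, forbidding disruptive pairs."""
    required = parse_pairs(target)
    forbidden = set()
    for _ in range(max_rounds):
        seq = sample_sequence(required, forbidden, len(target), rng)
        mfe_pairs = parse_pairs(fold(seq))
        if mfe_pairs == required:
            return seq
        # Pairs present in the predicted structure but absent from the target.
        forbidden |= mfe_pairs - required
    return None


def toy_fold(seq):
    """Artificial stand-in for an MFE predictor (assumption, not a real energy model)."""
    return "..(.).." if seq[2] + seq[4] in CANONICAL else "((...))"


designed = design("((...))", toy_fold, random.Random(0))
```

With a real MFE predictor in place of `toy_fold`, the set `forbidden` plays the role of the second constraint type, the position pairs that must not form a base pair.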

Workflow of RNAPOND

sRNA-mRNA interaction

The sRNA DsrA and the mRNA rpoS form a well-known example of gene regulation through RNA-RNA interaction. The binding of DsrA to the 5’ UTR of rpoS leads to a conformational change that makes the ribosome binding site (RBS) accessible, thus upregulating translation. Inspired by the DsrA-rpoS system, we demonstrated, as a proof of concept, an artificial DsrA-mRNA system that takes the kinetics of interaction formation into account (Waldl† et al., 2024).

Infrared is used to sample mRNA sequences that are compatible with both conformations, with and without DsrA bound. A post-sampling optimization is then performed with an objective function that combines the following five criteria, expressed in terms of free energy:

  1. the sRNA and the mRNA should bind strongly;
  2. the mRNA should exhibit poor RBS accessibility without the sRNA;
  3. the RBS should be highly accessible in the bound state;
  4. there should be a suitable, accessible seed interaction;
  5. the energy barrier for interaction formation should be low.

Cartoon of the DsrA-rpoS system

References

2024

  1. Journal
    Infrared: a declarative tree decomposition-powered framework for bioinformatics
    Hua-Ting Yao, Bertrand Marchand, Sarah J. Berkemer, Yann Ponty, and 1 more author
    Algorithms for Molecular Biology, 2024
  2. Book Chapter
    Developing Complex RNA Design Applications in the Infrared Framework
    Hua-Ting Yao, Yann Ponty, and Sebastian Will
    In RNA Folding: Methods and Protocols, 2024
  3. Book Chapter
    Sequence design for RNA-RNA interactions
    Maria Waldl, Hua-Ting Yao, and Ivo Hofacker
    In RNA Design: Methods and Protocols, Mar 2024

2021

  1. Conference
    Taming Disruptive Base Pairs to Reconcile Positive and Negative Structural Design of RNA
    In RECOMB 2021 - 25th International Conference on Research in Computational Molecular Biology, Apr 2021