CParty: Hierarchically Constrained Partition Function of RNA Pseudoknots
Résumé
Biologically relevant RNA secondary structures are routinely predicted by efficient dynamic programming algorithms that minimize their free energy. Starting from such algorithms, one can devise partition function algorithms, which enable stochastic perspectives on RNA structure ensembles. As the most prominent example, McCaskill's partition function algorithm is derived from pseudoknot-free energy minimization. While this algorithm became hugely successful for the analysis of pseudoknot-free RNA structure ensembles, as of yet there exists only one pseudoknotted partition function implementation, which covers only simple pseudoknots and comes with a borderline-prohibitive complexity of O(n^5) in the RNA length n. Here, we develop a partition function algorithm corresponding to the hierarchical pseudoknot prediction of HFold, which performs exact optimization in a realistic pseudoknot energy model. In consequence, our algorithm CParty carries over HFold's advantages over classical pseudoknot prediction to characterizing the Boltzmann ensemble at equilibrium. Given an RNA sequence S and a pseudoknot-free structure G, CParty computes the partition function over all possibly pseudoknotted density-2 structures G ∪ G ′ of S that extend the fixed G by a disjoint pseudoknot-free structure G ′ . Thus, CParty follows the common hypothesis of hierarchical pseudoknot formation, where pseudoknots form as tertiary contacts only after a first pseudoknotfree 'core' G and we call the computed partition function hierarchically constrained (by G). Like HFold, the dynamic programming algorithm CParty is very efficient, achieving the low complexity of the pseudoknot-free algorithm, i.e. cubic time and quadratic space. Finally, by computing pseudoknotted ensemble energies, we unveil kinetics features of a therapeutic target in SARS-CoV-2.