Продолжая использовать сайт, вы даете свое согласие на работу с этими файлами.

Nussinov algorithm
Class | Nucleic acid structure prediction |
---|---|
Worst-case performance | |
Worst-case space complexity |
The Nussinov algorithm is a nucleic acid structure prediction algorithm used in computational biology to predict the folding of an RNA molecule that makes use of dynamic programming principles. The algorithm was developed by Ruth Nussinov in the late 1970s.
Background
RNA origami occurs when an RNA molecule "folds" and binds to itself. This folding often determines the function of the RNA molecule. RNA folds at different levels, this algorithm predicts the secondary structure of the RNA.
Algorithm
Scoring
We score a solution by counting the total number of paired bases. Thus, attempting to maximize the score that maximizes the total number of bonds between bases.
Motivation
Consider an RNA sequence whose elements are taken from the set
. Let us imagine we have an optimal solution to the subproblem of folding
to
, and an optimal solution for folding
to
. Now, to align
to
, we have two options:
- Leave
unpaired, and keep the structure of
to
. The score for this alignment will be equal to the score of the alignment of
to
, as no new base pairs were created.
- Pair
with
, where
. The score for this alignment will be the score of the base pairing, plus the score of the best alignment of
to
and
to
.
Algorithm
Consider an RNA sequence of length
such that
.
Construct an matrix
. Initialize
such that
for .
will contain the maximum score for the subsequence
. Now, fill in entries of
up and to the right, so that
where
After this step, we have a matrix where
represents the optimal score of the folding of
.
To determine the structure of the folded RNA by traceback, we first create an empty list of pairs . We initialize with
. Then, we follow one of three scenarios.
- If
, the procedure stops.
- If
, then set
and continue.
- Otherwise, for all
, if
and
are complementary and
, append
to
, then traceback both with
and
.
When the traceback finishes, contains all of the paired bases.
Limitations
The Nussinov algorithm does not account for the three-dimensional shape of RNA, nor predict RNA pseudoknots. Furthermore, in its basic form, it does not account for a minimum stem loop size. However, it is still useful as a fast algorithm for basic prediction of secondary structure.