Linear-Time Algorithms for Computing Maximum-Density Sequence Segments
with Bioinformatics Applications
by Michael H. Goldwasser, Ming-Yang Kao, and Hsueh-I Lu
Abstract: We study an abstract optimization problem arising
from biomolecular sequence analysis. For a sequence A of pairs
(a_{i},w_{i}) for
i = 1, ..., n and
w_{i}>0, a segment
A(i,j) is a consecutive subsequence of A starting with
index i and ending with index j. The width of
A(i,j) is
w(i,j) = sum_{{ i <= k <= j }} w_{k}
and the density is
(sum_{{ i <= k <= j }} a_{k})/w(i,j).
The maximum-density segment problem takes A and two
values L and U as input and asks for a segment of
A with the largest possible density among those of width at
least L and at most U. When U is unbounded, we
provide a relatively simple, O(n)-time algorithm, improving
upon the O(n log L)-time algorithm by Lin, Jiang and
Chao. We then extend this result, providing an O(n)-time
algorithm for the case when both L and U are specified.
- Citation:
- Linear-Time Algorithms for Computing Maximum-Density
Sequence Segments with Bioinformatics Applications
- Michael H. Goldwasser, Ming-Yang Kao, and Hsueh-I Lu
- Journal of Computer and System Sciences, 70(2):128-144, 2005.
- DOI:10.1016/j.jcss.2004.08.001
- Download preprint as:
[
pdf,
pdf.gz,
ps,
ps.gz
]
- A preliminary version originally appeared as:
- Fast Algorithms for Finding Maximum-Density Segments of
a Sequence with Applications to Bioinformatics
- Michael H. Goldwasser, Ming-Yang Kao, and Hsueh-I Lu
- Proceedings of the Second Annual Workshop on Algorithms in
Bioinformatics (WABI),
volume 2452 of Lecture Notes in Computer Science
(Springer-Verlag, Berlin), 2002, pp. 157-171.
- DOI:10.1145/645907.673125
- Download as:
[
abstract,
pdf,
pdf.gz,
ps,
ps.Z,
ps.gz
]
Michael Goldwasser