Fast Algorithms for Finding Maximum-Density Segments of a Sequence
with Applications to Bioinformatics
by Michael H. Goldwasser, Ming-Yang Kao and Hsueh-I Lu
Abstract:
We study an abstract optimization problem arising from biomolecular
sequence analysis. For a sequence
A = < a1, a2, ..., an >
of real numbers, a segment S is a consecutive
subsequence
< ai, ai+1, ..., aj >.
The width of S is j-i+1, while the
density is
(sum{i <= k <= j} ak)/(j-i+1).
The maximum-density segment problem takes A and two
integers L and U as input and asks for a segment of
A with the largest possible density among those of width at
least L and at most U. If U=n (or equivalently,
U = 2L-1), we can solve the problem in
O(n) time, improving upon the
O(n log L)-time algorithm by Lin, Jiang and Chao for
a general sequence A. Furthermore, if U and L
are arbitrary, we solve the problem in
O(n + n log(U-L+1)) time. There has been no
nontrivial result for this case previously. Both results also hold
for a weighted variant of the maximum-density segment problem.
- Citation:
- Fast Algorithms for Finding Maximum-Density Segments of
a Sequence with Applications to Bioinformatics
- Michael H. Goldwasser, Ming-Yang Kao, and Hsueh-I Lu
- Proceedings of the Second Annual Workshop on Algorithms in
Bioinformatics (WABI),
volume 2452 of Lecture Notes in Computer Science
(Springer-Verlag, Berlin), 2002, pp. 157-171.
- DOI:10.1145/645907.673125
- Download as:
[
abstract,
pdf,
pdf.gz,
ps,
ps.Z,
ps.gz
]
- A complete version (with improvements) later appeared as:
- Linear-Time Algorithms for Computing Maximum-Density
Sequence Segments with Bioinformatics Applications
- Michael H. Goldwasser, Ming-Yang Kao, and Hsueh-I Lu
- Journal of Computer and System Sciences, 70(2):128-144, 2005.
- DOI:10.1016/j.jcss.2004.08.001
- Download preprint as:
[
abstract,
pdf,
pdf.gz,
ps,
ps.gz
]
Michael Goldwasser