Published in Volume XXXI, Issue 1, 2021, pages 79–110, doi: 10.7561/SACS.2021.1.79

Authors: L.M.S. Russo, A.P. Francisco

Abstract

We consider the problem of identifying tandem scattered subsequences within a string. Our algorithm identifies a longest subsequence which occurs twice without overlap in a string. This algorithm is based on the Hunt-Szymanski algorithm, therefore its performance improves if the string is not self similar, which occurs naturally on strings over large alphabets. Our algorithm relies on new results for data structures that support dynamic longest increasing sub-sequences. In the process we also obtain improved algorithms for the decremental string comparison problem.

Full Text (PDF)

References

[1] A. Apostolico. Improving the Worst-Case Performance of the Hunt-Szymanski Strategy for the Longest Common Subsequence of Two Strings. Information Processing Letters 23 (2), 63-69, 1986. doi:10.1016/0020-0190(86)90044-X

[2] A. Apostolico, C. Guerra. The Longest Common Subsequence Problem Revisited. Algorithmica 2(1-4), 315-336, 1987. doi:10.1007/BF01840365

[3] B. Chazelle, L.J. Guibas. Fractional Cascading: I. A Data Structuring Technique. Algorithmica 1(1-4), 133-162, 1986. doi:10.1007/BF01840440

[4] A. Chen, T.y Chu, N. Pinsker. The Dynamic Longest Increasing Subsequence Problem. 2013. arXiv:1309.7724

[5] M. Crochemore, C.S. Iliopoulous, Y.J. Pinzon. Speeding-up Hirschberg and Hunt-Szymanski LCS Algorithms. Fundamenta Informaticae 56(1-2), 89-103, 2003.

[6] H.D. Booth. An Overview Over Red-Black and Finger Trees. 1996.

[7] L.J. Guibas, R. Sedgewick. A Dichromatic Framework for Balanced Trees. 19th Annual Symposium on Foundations of Computer Science (SFCS 1978), 8-21, 1978. doi:10.1109/sfcs.1978.3

[8] R. Hinze, R. Paterson. Finger Trees: A Simple General-Purpose Data Structure. Journal of Functional Programming 16(2), 197-217, 2006. doi:10.1017/S0956796805005769

[9] J.W. Hunt, T.G. Szymanski. A Fast Algorithm for Computing Longest Common Subsequences. Communications of the ACM 20(5), 350-353, 1977. doi:10.1145/359581.359603

[10] T. Inoue, S. Inenaga, H. Bannai. Longest Square Subsequence Problem Revisited, 2020. 27th International Symposium on String Processing and Information Retrieval (SPIRE 2020), 147-154, 2020. doi:10.1007/978-3-030-59212-7\_11

[11] Y. Ishida, S. Inenaga, A. Shinohara, M. Takeda. Fully Incremental LCS Computation. 15th International Symposium on Fundamentals of Computation Theory (FCT 2005), 563-574, 2005. doi:10.1007/11537311\_49

[12] G. Jacobson, K.-P. Vo. Heaviest Increasing/Common Subsequence Problems.  Third Annual Symposium on Combinatorial Pattern Matching (CPM 1992), 52-66, 1992. doi:10.1007/3-540-56024-6\_5

[13] J.L.W.V. Jensen. Sur les Fonctions Convexes et les Inegalites Entre les Valeurs Moyennes. Acta Mathematic 30(0), 175-193, 1906. doi:10.1007/bf02418571

[14] S.-R. Kim, K. Park. A Dynamic Edit Distance Table. 11th Annual Symposium on Combinatorial Pattern Matching (CPM 2000), 60-68, 2000. doi:10.1007/3-540-45123-4\_7

[15] A. Kosowski. An Efficient Algorithm for the Longest Tandem Scattered Subsequence Problem. 11th International Symposium on String Processing and Information Retrieval (SPIRE 2004), 93-100, 2004. doi:10.1007/978-3-540-30213-1\_13

[16] G.M. Landau, E.W. Myers, J.P. Schmidt. Incremental String Comparison. SIAM Journal on Computing 27(2), 557-582, 1998. doi:10.1137/S0097539794264810

[17] G.M. Landau, E.W. Myers, M. Ziv-Ukelson. Two Algorithms for LCS Consecutive Suffix Alignment. 15th Annual Symposium on Combinatorial Pattern Matching (CPM 2004), 173-193, 2004. doi:10.1007/978-3-540-27801-6\_13

[18] G.M. Landau, E.W. Myers, M. Ziv-Ukelson. Two Algorithms for LCS Consecutive Suffix Alignment. Journal of Computer and System Sciences 73(7), 1095-1117, 2007. doi:10.1016/j.jcss.2007.03.004

[19] P.A. Pevzner, M.S. Waterman. Matrix Longest Common Subsequence Problem, Duality and Hilbert Bases. 3rd Annual Symposium on Combinatorial Pattern Matching (CPM 1992), 79-89, 1992. doi:10.1007/3-540-56024-6_7

[20] A. Tiskin. Semi-local String Comparison: Algorithmic Techniques and Applications. Mathematics in Computer Science 1(4), 571-603, 2008. oi:10.1007/s11786-007-0033-3

Bibtex

@article{sacscuza:russo21sltss,
  title={Small Longest Tandem Scattered Subsequencess},
  author={L. M. S. Russo, A. P. Francisco},
  journal={Scientific Annals of Computer Science},
  volume={31},
  number={1},
  organization={Alexandru Ioan Cuza University, Ia\c si, Rom\^ania},
  year={2021},
  pages={79–110},
  publisher={Alexandru Ioan Cuza University Press, Ia\c si},
  doi={10.7561/SACS.2021.1.79}
}