Algorithms for Finding a Minimum Repetition Representation of a String [chapter]

Atsuyoshi Nakamura, Tomoya Saito, Ichigaku Takigawa, Hiroshi Mamitsuka, Mineichi Kudo
2010 Lecture Notes in Computer Science  
We refer to such a compact representation as a repetition representation string or RRS, by which a set of disjoint or nested tandem arrays can be compacted.  ...  In this paper, we study the problem of finding a minimum RRS or MRRS, where the size of an RRS is defined to be the sum of its component letter sizes and the sizes needed to describe the repetitions (·  ...  As an evaluation measure for RRSs, we use their sizes, that is, we consider the problem of finding a minimum RRS (MRRS) for a given string.  ... 
doi:10.1007/978-3-642-16321-0_18 fatcat:uoq5maptgnavrpuemfimkxqknm

A Pattern Extraction Algorithm For Abstract Melodic Representations That Allow Partial Overlapping Of Intervallic Categories

Emilios Cambouropoulos, Maxime Crochemore, Costas S. Iliopoulos, Manal Mohamed, Marie-France Sagot
2005 Zenodo  
Iliopoulos is partially supported by a Marie Curie fellowship, Wellcome Foundation, Nato and Royal Society grants. Manal Mohamed is supported by an EPSRC studentship.  ...  Here, we present a method for finding all maximal-pairs and all repetitions with a hole in a given string x, where x may have occurrences of binary don't cares.  ...  For string x = x[1..n] with d binary don't cares, we propose an algorithm for computing special kinds of repetition that we refer to as "maximal-pairs" and "repetitions with a hole".  ... 
doi:10.5281/zenodo.1415007 fatcat:6asi7s3uezhmrlndl5dmub7lzy

Page 1208 of Mathematical Reviews Vol. , Issue 97B [page]

1997 Mathematical Reviews  
Summary: “We consider the problem of finding the repetitive structures of a given string x.  ...  We give a polynomial-time algorithm using membership and equivalence queries that finds the minimum obdd for the target respecting a given ordering.  ... 

All maximal-pairs in step–leap representation of melodic sequence

Emilios Cambouropoulos, Maxime Crochemore, Costas S. Iliopoulos, Manal Mohamed, Marie-France Sagot
2007 Information Sciences  
This paper proposes an efficient pattern extraction algorithm that can be applied on melodic sequences that are represented as strings of abstract intervallic symbols; the melodic representation introduces  ...  We propose an O(n + d(n − d) + z)-time algorithm for computing all maximal-pairs in a given sequence x = x[1..n], where x contains d occurrences of binary don't cares and z is the number of reported maximal-pairs  ...  For string x = x[1..n] with d binary don't cares, we propose an algorithm for computing a special kind of repetition that we refer to as maximal-pair.  ... 
doi:10.1016/j.ins.2006.11.012 fatcat:ikabzouuvzgcngugngooaiibjm

LCS approximation via embedding into locally non-repetitive strings

G.M. Landau, A. Levy, I. Newman
2011 Information and Computation  
The search for efficient algorithms for finding the LCS has been going on for more than three decades.  ...  Our new method (the embedding together with the approximation algorithm) gives a strictly sub-quadratic time algorithm (i.e., of complexity O(n^(2−ε)) for some constant ε) which can find common subsequences  ...  There exist (almost) linear algorithms that for every n long strings A and B, give SLNR-sketch of size O(log^3 n) which enables finding the maximum w and approximating to a factor of 2 the minimum t for  ... 
doi:10.1016/j.ic.2010.12.006 fatcat:iog5phbeinebhat44utgmwf2k4

Content Based Indexing of Music Objects Using Approximate Sequential Patterns

Vikram D, Shashi M
2015 International Journal of Data Mining & Knowledge Management Process  
Hence, extraction of approximate patterns is essential for a MIR system. This paper proposes a novel method of finding approximate repeating patterns for the purpose of MIR.  ...  The MIR involves representation of main melody as a sequence of notes played, extraction of repeating patterns from it and matching of query sequence with frequent repeating sequential patterns constituting  ...  ACKNOWLEDGEMENTS This work was supported by the Council of Scientific & Industrial Research (CSIR) and Andhra University, Visakhapatnam, Andhra Pradesh, India.  ... 
doi:10.5121/ijdkp.2015.5207 fatcat:miicuf2shfepfie3r6gu7bcbge

An Online Algorithm for Lightweight Grammar-Based Compression

Shirou Maruyama, Hiroshi Sakamoto, Masayuki Takeda
2012 Algorithms  
Experimental results by comparison with standard compressors demonstrate that our algorithm is especially effective for highly repetitive texts.  ...  Our algorithm guarantees O(log^2 n)-approximation ratio for the minimum grammar size, where n is an input size, and it runs in input linear time and output linear space.  ...  For a current string S (initially the input string w), this algorithm categorizes all occurrences of pairs in S into one of the classes of repetition, maximal pair, minimal pair, and others.  ... 
doi:10.3390/a5020214 fatcat:fa5hgt5hebf7rcelaqrxouet6i

An Online Algorithm for Lightweight Grammar-Based Compression

Shirou Maruyama, Masayuki Takeda, Masaya Nakahara, Hiroshi Sakamoto
2011 2011 First International Conference on Data Compression, Communications and Processing  
Experimental results by comparison with standard compressors demonstrate that our algorithm is especially effective for highly repetitive text.  ...  Our algorithm guarantees O(log^2 n)-approximation ratio for the minimum grammar size, where n is an input size, and it runs in input linear time and output linear space.  ...  This work was supported by JST PRESTO program and Grants-in-Aid for Young Scientists (A), MEXT (No. 23680016).  ... 
doi:10.1109/ccp.2011.40 dblp:conf/ccp/MaruyamaTNS11 fatcat:hnec7pvt45cqlm7fv6nrijuila

Using Adaptive Automata in Grammar Based Text Compression to Identify Frequent Substrings

Newton Kiyotaka Miura, João José Neto
2019 Zenodo  
Compression techniques allow reduction in the data storage space required by applications dealing with large amount of data by increasing the information entropy of its representation.  ...  This paper presents an adaptive rule-driven device - the adaptive automata - as the device to identify recurring sequences of symbols to be compressed in a grammar-based lossless data compression scheme  ...  The intrinsic hierarchical definition of a CFG allows string-manipulation algorithms to perform operations directly on their compressed representations without the need for a prior decompression [2]  ... 
doi:10.5281/zenodo.3484443 fatcat:i3kjbzkeebbz5hpjf36gupoxg4

Approximating Optimal Bidirectional Macro Schemes [article]

Luís M. S. Russo, Ana D. Correia, Gonzalo Navarro, Alexandre P. Francisco
2020 arXiv   pre-print
We test our algorithm on a number of artificial repetitive texts and verify that it is efficient in practice and outperforms Lempel-Ziv, sometimes by a wide margin.  ...  Optimal bidirectional macro schemes are NP-complete to find, but they may provide much better compression on highly repetitive sequences.  ...  To test the algorithm we devise a method for generating highly repetitive strings whose optimal macro scheme is known.  ... 
arXiv:2003.02336v1 fatcat:ecfjurml2vbxzondih5ilmhbji

Lempel-Ziv Factorization: Simple, Fast, Practical [chapter]

Dominik Kempa, Simon J. Puglisi
2013 2013 Proceedings of the Fifteenth Workshop on Algorithm Engineering and Experiments (ALENEX)  
For decades the Lempel-Ziv (LZ77) factorization has been a cornerstone of data compression and string processing algorithms, and uses for it are still being uncovered.  ...  A common feature of the new algorithms is their avoidance of the longest-common-prefix array, essential to nearly all prior art.  ...  , and for explicating details of their experiments; and to the anonymous referees whose comments materially improved this paper.  ... 
doi:10.1137/1.9781611972931.9 dblp:conf/alenex/KempaP13 fatcat:fuan7givyndyviin6tbu5gcdoy

Towards Compact and Tractable Automaton-Based Representations of Time Granularities [chapter]

Ugo Dal Lago, Angelo Montanari, Gabriele Puppis
2003 Lecture Notes in Computer Science  
We focus our attention on two kinds of optimization problems for automaton-based representations, namely, computing the smallest representation and computing the most tractable representation, that is,  ...  We first introduce and compare these two minimization problems; then, we give a polynomial time algorithm that solves the latter.  ...  of finding the n-th occurrence of a given symbol in a string.  ... 
doi:10.1007/978-3-540-45208-9_7 fatcat:os2ns3gspnem3lnfj7caaghjze

Frequent Itemset Mining for Big Data Using Greatest Common Divisor Technique

Mohamed A. Gawwad, Mona F. Ahmed, Magda B. Fayek
2017 Data Science Journal  
In the Big Data era the need for a customizable algorithm to work with big data sets in a reasonable time becomes a necessity.  ...  In this paper we propose a new algorithm for frequent itemset discovery that could work in distributed manner with big datasets.  ...  Table columns: Transaction ID, String Representation, Partition Number. T1: 2 3 7; Par 1 for all transactions with string representation starting by "2". T2: 3 7 11; Par 2 for all transactions with string representation  ... 
doi:10.5334/dsj-2017-025 fatcat:mbyvkchusfhnln7ffncqsxmvmm

Indexes and Computation over Compressed Structured Data (Dagstuhl Seminar 13232)

Sebastian Maneth, Gonzalo Navarro, Marc Herbstritt
2013 Dagstuhl Reports  
It focuses on algorithms for sequence analysis (string algorithms), but also covers genome rearrangement problems and phylogenetic reconstruction methods.  ...  Extensive experiments show that the new algorithm is superior, and particularly so at the lowest memory levels and for highly repetitive data.  ...  We study the problem of encoding the positions of the top-k elements of an array A[1..n] for a given parameter 1 ≤ k ≤ n.  ... 
doi:10.4230/dagrep.3.6.22 dblp:journals/dagstuhl-reports/ManethN13 fatcat:b35at6erjbe63hvelnqnrt4jle

LCS Approximation via Embedding into Local Non-repetitive Strings [chapter]

Gad M. Landau, Avivit Levy, Ilan Newman
2009 Lecture Notes in Computer Science  
The search for efficient algorithms for finding the LCS has been going on for more than three decades.  ...  Our new method (the embedding together with the approximation algorithm) gives a strictly sub-quadratic time algorithm (i.e., of complexity O(n^(2−ε)) for some constant ε) which can find common subsequences  ...  a factor of 2 the minimum t for which f(A) and f(B) are both (1, w)-non-repetitive.  ... 
doi:10.1007/978-3-642-02441-2_9 fatcat:pzxlq4c5kfbz3mqakucx2awo4m