POSIX Regular Expression Parsing with Derivatives
[chapter]
2014
Lecture Notes in Computer Science
We show how to obtain a POSIX algorithm for the general parsing problem based on Brzozowski's regular expression derivatives. ...
We adapt the POSIX policy to the setting of regular expression parsing. POSIX favors longest left-most parse trees. ...
The idea is to incrementally annotate regular expressions with partial parse tree information during the derivative step. Annotated regular expressions ri are defined in Figure 5. ...
doi:10.1007/978-3-319-07151-0_13
fatcat:vq6gvcdenncnxbn453vz22od7u
Derivative-Based Diagnosis of Regular Expression Ambiguity
[article]
2016
arXiv
pre-print
Regular expressions are often ambiguous. We present a novel method based on Brzozowski's derivatives to aid the user in diagnosing ambiguous regular expressions. ...
We introduce a derivative-based finite state transducer to generate parse trees and minimal counter-examples. ...
Computing Parse Trees via Derivatives Derivatives denote left quotients and they can be computed via a simple syntactic transformation. Definition 6 (Regular Expression Derivatives). ...
arXiv:1604.06644v2
fatcat:neorj6cyzrbsnouzau7dvwrqfe
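As an illustration of the "simple syntactic transformation" mentioned in the snippet above, here is a minimal sketch of Brzozowski derivatives used for plain recognition. It is not the transducer or parse-tree construction from the paper; the class names (`Empty`, `Eps`, `Chr`, `Alt`, `Seq`, `Star`) and the helpers `nullable`, `deriv`, and `matches` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


class Re:
    """Base class for the regular expression syntax tree (illustrative only)."""


@dataclass(frozen=True)
class Empty(Re):  # matches no string at all
    pass


@dataclass(frozen=True)
class Eps(Re):  # matches only the empty string
    pass


@dataclass(frozen=True)
class Chr(Re):  # matches a single character
    c: str


@dataclass(frozen=True)
class Alt(Re):  # alternation: left | right
    left: Re
    right: Re


@dataclass(frozen=True)
class Seq(Re):  # concatenation: left followed by right
    left: Re
    right: Re


@dataclass(frozen=True)
class Star(Re):  # Kleene star: zero or more repetitions of inner
    inner: Re


def nullable(r: Re) -> bool:
    """Return True if r accepts the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Chr)):
        return False
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Seq):
        return nullable(r.left) and nullable(r.right)
    raise TypeError(f"unknown expression: {r!r}")


def deriv(r: Re, c: str) -> Re:
    """Left quotient of r by the character c, computed on the syntax tree."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Chr):
        return Eps() if r.c == c else Empty()
    if isinstance(r, Alt):
        return Alt(deriv(r.left, c), deriv(r.right, c))
    if isinstance(r, Seq):
        first = Seq(deriv(r.left, c), r.right)
        # If the left part can be skipped, c may also be consumed by the right part.
        return Alt(first, deriv(r.right, c)) if nullable(r.left) else first
    if isinstance(r, Star):
        return Seq(deriv(r.inner, c), r)
    raise TypeError(f"unknown expression: {r!r}")


def matches(r: Re, s: str) -> bool:
    """Take one derivative per input character, then test for nullability."""
    for c in s:
        r = deriv(r, c)
    return nullable(r)
```

For example, with `r = Star(Alt(Chr("a"), Seq(Chr("a"), Chr("b"))))`, which encodes the ambiguous expression (a|ab)*, `matches(r, "aab")` accepts the input. The sketch performs no simplification, so derivatives grow with the input; a practical implementation would normalize them.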
The Design of a Verified Derivative-Based Parsing Tool for Regular Expressions
2021
CLEI Electronic Journal
We describe the formalization of Brzozowski and Antimirov derivative based algorithms for regular expression parsing, in the dependently typed language Agda. ...
A tool for regular expression based search in the style of the well known GNU grep has been developed with the certified algorithms. Practical experiments conducted with this tool are reported. ...
Regular Expressions Regular expressions are defined with respect to a given alphabet. ...
doi:10.19153/cleiej.24.3.2
fatcat:ebvo2a2wo5gezadk55tieuxi5i
Certified Derivative-Based Parsing of Regular Expressions
[chapter]
2016
Lecture Notes in Computer Science
We describe the formalization of a certified algorithm for regular expression parsing based on Brzozowski derivatives, in the dependently typed language Idris. ...
A tool for regular expression based search in the style of the well known GNU grep has been developed with the certified algorithm, and practical experiments were conducted with this tool. ...
Regular Expressions Regular expressions are defined with respect to a given alphabet. ...
doi:10.1007/978-3-319-45279-1_7
fatcat:lzvpux3jcvc6tpfu2kvazcea7u
Disambiguation in Regular Expression Matching via Position Automata with Augmented Transitions
[chapter]
2011
Lecture Notes in Computer Science
Keywords: regular expression, automata, pattern matching. This paper offers a new efficient regular expression matching ...
Abstract: This paper offers a new efficient regular expression matching algorithm which follows the leftmost-longest rule specified in the POSIX 1003.2 standard. ...
We say a parse configuration ⟨u, t, v⟩ derives a string uwv if the canonical parse tree t derives the string w. ...
doi:10.1007/978-3-642-18098-9_25
fatcat:qa7ieweuabc2zpghzqyyt4umju
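To make the leftmost-longest rule mentioned above concrete, the following sketch contrasts it with the first-alternative policy of Python's backtracking `re` module. It only covers a top-level choice between literal strings anchored at the start of the input, so it is a toy comparison of the two policies and not the position-automaton algorithm of the paper; `first_alternative` and `longest_alternative` are hypothetical helper names.

```python
import re


def first_alternative(alternatives, text):
    """Perl-style policy: the first alternative that matches wins."""
    pattern = "|".join(re.escape(a) for a in alternatives)
    m = re.match(pattern, text)
    return m.group() if m else None


def longest_alternative(alternatives, text):
    """Leftmost-longest policy, restricted to literal alternatives at position 0."""
    hits = [a for a in alternatives if text.startswith(a)]
    return max(hits, key=len) if hits else None
```

On the alternatives `["a", "ab"]` and the input `"ab"`, `first_alternative` stops at `"a"` while `longest_alternative` selects `"ab"`, which is the choice a leftmost-longest matcher would make for a|ab.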
RE2C: A lexer generator based on lookahead-TDFA
2020
Software Impacts
RE2C is a regular expression compiler: it transforms regular expressions into finite state machines and encodes them as programs in the target language. ...
[5] Angelo Borsotti, Ulya Trofimovich, Efficient POSIX Submatch Extraction on NFA, preprint, 2019, URL: ...
Submatch extraction is a special case of the parsing problem: in addition to solving the recognition problem it has to find the derivation of the input string in the grammar defined by the regular expression ...
doi:10.1016/j.simpa.2020.100027
fatcat:nneqzhzmy5dzfb2u2cfyv53grm
Declarative Cleaning of Inconsistencies in Information Extraction
2016
ACM Transactions on Database Systems
We show that our framework captures the popular cleaning policies, as well as the POSIX semantics for extraction through regular expressions. ...
... always results in a single repair) and whether it increases the expressive power of the extraction language. ...
An example of a spanner representation is a regex formula: a regular expression with embedded capture variables that are viewed as relational attributes. ...
doi:10.1145/2877202
fatcat:rkgfhc7nkjcg7gs7jfau3mngka
Cleaning inconsistencies in information extraction via prioritized repairs
2014
Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems - PODS '14
We show that our framework captures the popular cleaning policies, as well as the POSIX semantics for extraction through regular expressions. ...
Dealing with inconsistencies is hence of crucial importance in IE systems. ...
A regular expression with capture variables, or just variable regex for short, is an expression in the following syntax that extends that of regular expressions: γ := ∅ | ε | σ | γ ∨ γ | γ · γ | γ* | x{γ} ...
doi:10.1145/2594538.2594540
dblp:conf/pods/FaginKRV14
fatcat:hslqgvt4yfcile4h7wtruoz6lq
Type inference for unique pattern matching
2006
ACM Transactions on Programming Languages and Systems
Regular expression patterns provide a natural, declarative way to express constraints on semistructured data and to extract relevant information from it. ...
Since regular expressions can be ambiguous in general, different disambiguation policies have been proposed to get a unique matching strategy. ...
Syntactically, regular (hedge) expression patterns are regular (hedge) expressions annotated with variable binders. ...
doi:10.1145/1133651.1133652
fatcat:jbhrv3ffyndg5hl6krq4wdrko4
Stream Processing using Grammars and Regular Expressions
[article]
2017
arXiv
pre-print
In the first part we develop two linear-time algorithms for regular expression based parsing with Perl-style greedy disambiguation. ...
In the second part we also develop a new linear-time streaming parsing algorithm for parsing expression grammars (PEG) which generalizes the regular grammars of Kleenex. ...
INTRODUCTION
Regular Expression Based Parsing The first part will be concerned with regular expressions, an algebraic formalism with a well-understood theory that is commonly used to express patterns ...
arXiv:1704.08820v1
fatcat:twr3ysgyzzg3nap22n6xvepp5i
A Relational Framework for Information Extraction
2016
SIGMOD record
This task is pervasive in contemporary computational challenges associated with Big Data. ...
Regular expressions that have this feature are called extended regular expressions (xregex for short) [1, 8, 9]. It is known that xregex can recognize non-regular languages, such as {ss | s ∈ Σ*}. ...
Various languages for querying semi-structured and graph databases are based on regular expressions. A simple form of such queries are the regular path queries ...
doi:10.1145/2935694.2935696
fatcat:yxhivm3s4bennonn6zfmhyin5a
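The non-regular language {ss | s ∈ Σ*} from the snippet above can be recognized with a backreference, which is a rough analogue of the variable-reuse feature the snippet describes. The sketch below uses Python's standard `re` module as a stand-in; it is not the xregex formalism of the cited work, and `SQUARE` and `is_square` are names invented for this example.

```python
import re

# (.*) captures a candidate s; \1 then requires the same text to appear again.
# re.DOTALL lets "." also match newlines, so s ranges over arbitrary strings.
SQUARE = re.compile(r"(.*)\1", re.DOTALL)


def is_square(word: str) -> bool:
    """Return True if word has the form ss for some string s."""
    return SQUARE.fullmatch(word) is not None
```

Here `is_square("abab")` holds with s = "ab", while `is_square("aba")` does not. The engine finds s by backtracking over candidate split points, which is one reason such patterns fall outside the automata-based techniques used for ordinary regular expressions.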
Tagged Deterministic Finite Automata with Lookahead
[article]
2019
arXiv
pre-print
This paper extends the work of Laurikari and Kuklewicz on tagged deterministic finite automata (TDFA) in the context of submatch extraction in regular expressions. ...
The proposed algorithm can handle repeated submatch and therefore is applicable to full parsing. ...
One useful extension of traditional regular expressions that cannot be implemented using ordinary DFA is submatch extraction and parsing. ...
arXiv:1907.08837v1
fatcat:gnlgj5jx3vhkfguyzrgkg2gc6q
Regular expression containment
2011
Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '11
Our axiomatization gives rise to a natural computational interpretation of regular expressions as simple types that represent parse trees, and of containment proofs as coercions. ...
We show how to encode regular expression equivalence proofs in Salomaa's, Kozen's and Grabmayer's axiomatizations into our containment system, which equips their axiomatizations with a computational interpretation ...
the applications of regular expressions as types. ...
doi:10.1145/1926385.1926429
dblp:conf/popl/HengleinN11
fatcat:u3xtunpeavap5mbkd5whm3umva
Analysing installation scenarios of Debian packages
[chapter]
2020
Lecture Notes in Computer Science
The Debian distribution includes more than 28 thousand maintainer scripts, almost all of which are written in POSIX shell. ...
These scripts are executed with root privileges at installation, update, and removal of a package, which makes them critical for system maintenance. ...
All of those scripts that are syntactically correct with respect to the POSIX standard (99.9%) are parsed successfully by our parser. ...
doi:10.1007/978-3-030-45237-7_14
fatcat:43ybsru5lnc6plmy5oflpqitaa
Transducers from Rewrite Rules with Backreferences
[article]
1999
arXiv
pre-print
The range of possibilities leaves plenty of room for future research. ... lists the relevant regular expression operators. FSA Utilities offers the possibility to define new regular expression operators. ...
To compare this with a backreference in Perl, suppose that Tacr is a subroutine that converts phrases into acronyms and that Racr is a regular expression matching phrases that can be converted into acronyms ...
arXiv:cs/9904008v1
fatcat:igrpk7ij25b2dndcsc3ny2kcke