GutenTag: A Multi-Term Caching Optimized Tag Query Processor for Key-Value Based NoSQL Storage Systems
[article]
2011
arXiv
pre-print
However, NoSQL systems do not efficiently support services that refer to more than one data object, e.g. term-based search for data objects. ...
To address this issue we propose our architecture based on an inverted index on top of a NoSQL system. ...
However, a keyword-based search that relies solely on a single-term inverted index scales poorly in terms of bandwidth consumption in large distributed systems. ...
arXiv:1105.4452v1
fatcat:lssbrlc4pvgxpbun2hh75st5w4
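The idea in the snippets above, an inverted index layered on a key-value store, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the GutenTag implementation: KeyValueStore, build_index, search and the ids_transferred counter are hypothetical names, and the store is a plain in-memory dictionary standing in for a distributed NoSQL system. The counter is only a crude proxy for bandwidth; it shows that a multi-term query against a single-term index fetches every full posting list before intersecting them.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy in-memory stand-in for a NoSQL key-value store (hypothetical)."""

    def __init__(self):
        self._data = {}
        self.ids_transferred = 0  # crude proxy for bandwidth consumption

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Every lookup "ships" the whole posting list to the caller.
        value = self._data.get(key, frozenset())
        self.ids_transferred += len(value)
        return value


def build_index(store, objects):
    """Write a single-term inverted index (term -> object ids) into the store."""
    postings = defaultdict(set)
    for object_id, terms in objects.items():
        for term in terms:
            postings[term].add(object_id)
    for term, ids in postings.items():
        store.put(term, frozenset(ids))


def search(store, terms):
    """Return ids of objects carrying all terms by intersecting posting lists."""
    unique_terms = sorted(set(terms))
    if not unique_terms:
        return frozenset()
    result = store.get(unique_terms[0])
    for term in unique_terms[1:]:
        result = result & store.get(term)
    return result
```

With three objects tagged {"nosql", "index"}, {"nosql"} and {"index", "cache"}, a query for both "nosql" and "index" returns one object but transfers four ids in this sketch, which is the kind of overhead the abstract attributes to single-term indexes.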
A Scalable Document-based Architecture for Text Analysis
[article]
2016
arXiv
pre-print
Fundamental issues in text analysis include the lack of structure in document datasets, the need for various preprocessing steps (e.g., stem or lemma extraction, part-of-speech tagging, named entities ...
Thus, we propose in this paper a new generic text analysis architecture, where document structure is flexible, many preprocessing techniques are integrated and textual datasets are indexed for efficient ...
DODBMSs are a class of NoSQL systems that aim to store, manage and process data using a semistructured model. DODBMSs encapsulate data in collections of documents [8]. ...
arXiv:1612.06195v1
fatcat:crux7ggde5bptjmkg2ufaa2any
A Scalable Document-Based Architecture for Text Analysis
[chapter]
2016
Lecture Notes in Computer Science
Fundamental issues in text analysis include the lack of structure in document datasets, the need for various preprocessing steps and performance and scaling issues. ...
Thus, we propose in this paper a new generic text analysis architecture, where document structure is flexible, many preprocessing techniques are integrated and textual datasets are indexed for efficient ...
DODBMSs are a class of NoSQL systems that aim to store, manage and process data using a semistructured model. DODBMSs encapsulate data in collections of documents [8]. ...
doi:10.1007/978-3-319-49586-6_33
fatcat:p67tpdebmzgdhfzxfsblf32d7u
A new Nested Graph Model for Data Integration
2018
path expressions), thus directly providing the data structure to be searched within the graph. ...
The previously defined query language is so simple that we cannot extract multiterms and avoid some intermediate characters. ...
order to search the common values * gl = gsm for the left elements * gr = gsm for the right elements * vartheta = binary predicate creating the equivalences *) (* Same function as cross, but keeps the ...
doi:10.6092/unibo/amsdottorato/8348
fatcat:4tfvtyhvhngejolqpooseqrkqu