svcR: An R Package for Support Vector Clustering improved with Geometric Hashing applied to Lexical Pattern Discovery [article]

Nicolas Turenne
2015 arXiv   pre-print
We present a new R package that takes a numerical matrix as input and computes clusters using support vector clustering (SVC). We have implemented an original 2D-grid labeling approach to speed up cluster extraction. In this sense, SVC can be seen as an efficient cluster extraction method when clusters are separable in a 2D map. We also show that this SVC approach, using a Jaccard-radial base kernel, can classify a set of terms into ontological classes reasonably well and help to define regular expression rules for information extraction from documents; our case study concerns a set of terms and documents about developmental and molecular biology.
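
The abstract does not spell out how the 2D-grid labeling works, so the following is only a minimal sketch of one plausible reading, not the package's actual algorithm or API. It assumes the 2D map has already been discretized into cells and that each cell has been marked as inside or outside the SVC contour; connected regions of inside cells are then given a common cluster label by flood fill. The function name `label_grid` and the boolean grid input are hypothetical and chosen for illustration.

```python
from collections import deque


def label_grid(inside):
    """Label 4-connected regions of True cells in a 2D boolean grid.

    `inside` is a list of equal-length rows (assumption: True means the
    cell lies within the cluster contour). Returns (labels, count), where
    labels[r][c] is 0 for outside cells and 1..count for cluster cells.
    """
    rows, cols = len(inside), len(inside[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not inside[r][c] or labels[r][c]:
                continue  # outside the contour, or already labeled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            # Breadth-first flood fill over the four direct neighbours.
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and inside[ni][nj] and not labels[ni][nj]):
                        labels[ni][nj] = count
                        queue.append((ni, nj))
    return labels, count


# Hypothetical 3x4 grid with two separate regions of inside cells.
grid = [
    [True, True, False, False],
    [False, False, False, True],
    [False, False, True, True],
]
labels, count = label_grid(grid)
```

Under this reading, a data point would inherit the label of the grid cell it falls in, so the cost of cluster extraction depends on the grid resolution rather than on pairwise checks between points.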
arXiv:1504.06080v1 fatcat:nz5uj6i23ndgzm7knx3w7xwigq