Giving Text Analytics a Boost

Raphael Polig, Kubilay Atasu, Laura Chiticariu, Christoph Hagleitner, H. Peter Hofstee, Frederick R. Reiss, Huaiyu Zhu, Eva Sitaridi
2014 IEEE Micro  
The amount of textual data has reached a new scale and continues to grow at an unprecedented rate. IBM's SystemT software is a powerful text analytics system, which offers a query-based interface to reveal the valuable information that lies within these mounds of data. However, traditional server architectures are not capable of analyzing the so-called "Big Data" in an efficient way, despite the high memory bandwidth that is available. We show that by using a streaming hardware accelerator
more » ... mented in reconfigurable logic, the throughput rates of the SystemT's information extraction queries can be improved by an order of magnitude. We present how such a system can be deployed by extending SystemT's existing compilation flow and by using a multi-threaded communication interface that can efficiently use the bandwidth of the accelerator.
doi:10.1109/mm.2014.69 fatcat:htqz25kpz5etdktyni7xagcyqy