Pipelined RLS adaptive architecture using relaxed Givens rotations (RGR)
2002 IEEE International Symposium on Circuits and Systems. Proceedings (Cat. No.02CH37353)
In this paper, we develop a new relaxed Givens rotations (RGR)-RLS algorithm and the corresponding RGR-RLS systolic array. The resulting algorithm and architecture offer fine-grain pipelining, nearly the same convergence as the QRD-RLS, good robustness for , and square-root-free computation with little area overhead.

I. Introduction

Recursive least squares (RLS) based adaptive digital filters are widely used in adaptive equalization, beamforming, and image processing.
Historically, gradient descent algorithms such as the least-mean-square (LMS) and delay LMS (DLMS) algorithms have been very cost-effective, but they are not suitable for all applications. The major problem with the LMS/DLMS algorithms is their slow convergence in signal environments with a broad dynamic range. The RLS algorithm converges faster than the LMS and DLMS algorithms, but its computational complexity is an order of magnitude higher. The QR decomposition (QRD)-RLS algorithm [3-4], which uses a triangularization process, is the most promising RLS algorithm since it is known to have good numerical properties and can be mapped to a coarse-grain pipelined systolic array. The QRD-RLS algorithm is hence well suited to VLSI implementation.

The critical period of the QRD-RLS algorithm is limited by the operation time of the recursive loop in the individual cells. In many applications, such as equalization and image restoration, very high throughput is desired, and the QRD-RLS algorithm may not be capable of operating at such throughput. To overcome this drawback, several schemes have been proposed [5, 6]. However, these algorithms have the same fine-grain pipelining difficulty as the QRD-RLS algorithm. Apart from increasing speed, fine-grain pipelining can also be used to reduce power dissipation in low to moderate speed applications. To increase the speed of the QRD-RLS algorithm, the look-ahead technique, which leads to fine-grain pipelining, can be used. However, applying look-ahead to the QRD-RLS algorithm results in large hardware overhead, so this technique is not practical for the QRD-RLS algorithm. More recently, the STAR-RLS algorithm solved the fine-grain pipelining difficulty; however, it diverges for small values of the forgetting factor.
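To make the recursive loop of the QRD-RLS cells concrete, the following is a minimal sketch of a textbook QRD-RLS update based on conventional Givens rotations. It is not the RGR-RLS algorithm proposed in this paper. The function names `qrd_rls_update` and `solve_weights`, the pure-Python list representation, and the initialization of R as a small multiple of the identity are assumptions made for illustration only.

```python
import math


def qrd_rls_update(R, p, x, d, lam=0.99):
    """Fold one sample (x, d) into the upper-triangular R and the vector p, in place.

    Illustrative sketch only: R is a list of lists, lam is the forgetting factor.
    """
    n = len(x)
    x = list(x)  # working copy; its entries are annihilated one by one
    root = math.sqrt(lam)

    # Exponential weighting of the stored triangular factor and right-hand side.
    for i in range(n):
        p[i] *= root
        for j in range(i, n):
            R[i][j] *= root

    # One Givens rotation per row zeroes x[i] against the diagonal entry R[i][i].
    for i in range(n):
        # Square root and division on the path that feeds R[i][i] back
        # into the next time step: this is the recursive loop of the cell.
        r = math.hypot(R[i][i], x[i])
        if r == 0.0:
            continue  # nothing to rotate
        c, s = R[i][i] / r, x[i] / r
        for j in range(i, n):
            rij, xj = R[i][j], x[j]
            R[i][j] = c * rij + s * xj
            x[j] = -s * rij + c * xj
        pi = p[i]
        p[i] = c * pi + s * d
        d = -s * pi + c * d


def solve_weights(R, p):
    """Back substitution for R w = p with upper-triangular R."""
    n = len(p)
    w = [0.0] * n
    for i in reversed(range(n)):
        acc = p[i] - sum(R[i][j] * w[j] for j in range(i + 1, n))
        w[i] = acc / R[i][i]
    return w
```

In this sketch, the new value of each diagonal entry depends on its previous value through a square root and a division, so that computation cannot simply be split across pipeline stages without a technique such as look-ahead. This is one way to read the critical-period limitation described above.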
It is known that a smaller forgetting factor yields faster convergence than a larger one. In , the CORDIC-based QRD-RLS needs ROM/RAM, which consumes a large area in the implemented architecture. Motivated by these unsolved problems, we provide a new algorithm and architecture. This paper is organized as follows. We propose the RGR-RLS and Pipelined RGR-RLS (PRGR-RLS)