Biologically Inspired Visual System Architecture for Object Recognition in Autonomous Systems [article]

Dan Malowany, Hugo Guterman
2020 arXiv pre-print
Recent findings on the sensitivity of convolutional neural networks to additive noise, lighting conditions, and the completeness of the training dataset indicate that this technology still lacks the robustness needed for the autonomous robotics industry. In an attempt to bring computer vision algorithms closer to the capabilities of a human operator, the mechanisms of the human visual system were analyzed in this work. Recent studies show that the mechanisms behind the recognition process in the human brain include continuous generation of predictions based on prior knowledge of the world. These predictions enable rapid generation of contextual hypotheses that bias the outcome of the recognition process. This mechanism is especially advantageous in situations of uncertainty, when visual input is ambiguous. In addition, the human visual system continuously updates its knowledge about the world based on the gaps between its predictions and the visual feedback. Convolutional neural networks are feed-forward in nature and lack such top-down contextual attenuation mechanisms. As a result, although they process massive amounts of visual information during their operation, the information is not transformed into knowledge that can be used to generate contextual predictions and improve their performance. In this work, an architecture was designed that aims to integrate the concepts behind the top-down prediction and learning processes of the human visual system with state-of-the-art bottom-up object recognition models, e.g., deep convolutional neural networks. The work focuses on two mechanisms of the human visual system: anticipation-driven perception and reinforcement-driven learning. Imitating these top-down mechanisms, together with state-of-the-art bottom-up feed-forward algorithms, resulted in an accurate, robust, and continuously improving target recognition model.
arXiv:2002.03472v2 fatcat:k3y6f7turrgjvk47g3m53olbxe
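
The abstract describes two ideas: a top-down contextual prediction that biases the outcome of bottom-up recognition, and a prior that is updated from feedback. The following is a minimal illustrative sketch of those two ideas only, not the architecture from the paper. It assumes the bottom-up model emits one raw score per class and that the context is available as a probability per class. All names (`softmax`, `apply_context`, `update_prior`), the class list, and the numbers are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw bottom-up class scores into probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_context(bottom_up_probs, context_prior):
    """Bias bottom-up probabilities with a top-down prior.

    Assumption: the two sources are combined by an elementwise
    product followed by renormalization (a Bayes-style fusion).
    """
    weighted = [p * q for p, q in zip(bottom_up_probs, context_prior)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("context prior rules out every class")
    return [w / total for w in weighted]


def update_prior(context_prior, confirmed_class, learning_rate=0.1):
    """Shift the prior toward a class confirmed by feedback.

    Assumption: a moving average toward a one-hot target stands in
    for the feedback-driven learning mentioned in the abstract.
    """
    return [
        (1 - learning_rate) * q + learning_rate * (1.0 if i == confirmed_class else 0.0)
        for i, q in enumerate(context_prior)
    ]


if __name__ == "__main__":
    classes = ["car", "boat", "bicycle"]   # hypothetical label set
    scores = [1.0, 1.1, 0.2]               # ambiguous bottom-up scores
    road_prior = [0.6, 0.05, 0.35]         # hypothetical prior for a road scene

    bottom_up = softmax(scores)
    fused = apply_context(bottom_up, road_prior)

    print("bottom-up choice:", classes[bottom_up.index(max(bottom_up))])
    print("with context:    ", classes[fused.index(max(fused))])
    print("updated prior:   ", update_prior(road_prior, confirmed_class=0))
```

With these made-up numbers, the bottom-up scores alone slightly favor "boat", while the road-scene prior shifts the fused decision to "car". The product-and-renormalize rule is one simple way to let context matter most when the visual evidence is ambiguous; it is a stand-in chosen for brevity, not a claim about how the paper combines the two pathways.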