Carry Out Computer Tasks with Gesture using Image Pre-Processing and TensorFlow Framework
Pratik Magar, JSPM NTC
2020
International Journal of Engineering Research and Technology (IJERT)
This paper presents a method for real-time hand gesture recognition and feature extraction using a web camera. For humans, hands are the most frequently used means of communicating and interacting with machines. The mouse and keyboard are the basic input devices for computers, and both require the use of the hands. The most immediate information exchange between man and machine is through visual and aural channels, but this communication is largely one-sided. The mouse remedies this problem to some extent, but it has limitations as well. Although hands are most commonly used for day-to-day physical manipulation tasks, in some cases they are also used for communication. Hand gestures support our daily communication and help us convey messages clearly. Hands are especially important for mute and deaf people, who depend on their hands and gestures to communicate, so hand gestures are vital in sign language. Hand gesture interaction has become a trending technology for human-computer interaction (HCI), and research works are regularly carried out in this area to expedite and enrich interaction with computers. In this project, we attempt to create a real-time human-computer interaction (HCI) system using different hand gestures, i.e., hand signs. We implemented a system that recognizes hand gestures performed in front of the simple web camera of a PC. We used the established OpenCV library together with TensorFlow to implement the various image-processing methods. The operations carried out were: capturing frames, background subtraction using a MOG filter, noise reduction using Gaussian blur, conversion of the captured image to a binary image, and contour extraction through the convex hull method, which removes convexity defects and segments the image. Using these segmented images we built our own dataset, which was used to train the model with TensorFlow. The same methods are then applied to segment the input image, which is passed to the model to obtain the output class label.

Index Terms - HCI, GOS, CNN, GUI, ROI, GPU.

I. INTRODUCTION

This paper describes a real-time system that can recognize a human gesture and perform a specific operation on the computer according to that gesture. We have implemented a simple GUI that guides the user in understanding how the system works. A CNN model is trained on the dataset we created and is used to predict the gesture from input taken by the web camera of the user's PC.

Since the advent of the computer, the user has been forced to conform to the interface dictated by the machine. In the 1960s the keyboard of the punch-card machine and the teletype were considered a big improvement over flipping banks of switches, but the user still had to learn to operate multiple machines to use a computer. When interactive dumb terminals arrived in the 1970s, all the user had to do was learn to type. However, even typing was seen as a burden, and more efficient interfaces were developed. Graphical operating systems of the 1980s, inspired by the "look and feel" of a desktop, introduced the mouse, a simple pointing device for the user. In the 1990s, with increases in computational power, decent speech recognition and pen-based computing became a reality. Some user interfaces have explored these individual modes of communication in a limited sense. All the aforementioned interfaces, with the possible exception of the speech recognizer, are efficient for a trained user, but they are typically inefficient as a human-centric form of communication. For example, when communicating with a friend, we would rather see the person and converse with him or her than stare at a terminal full of characters while typing a message. With recent advances in Human Computer Intelligent Interaction (HCII), it has become more feasible to create interfaces that resemble natural forms of human communication.
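As a minimal sketch of the pre-processing pipeline described in the abstract, the following Python fragment chains background subtraction, Gaussian-blur noise reduction, binary thresholding, and convex hull contour extraction with OpenCV. The function name segment_frame and all parameter values (kernel size, threshold, input dimensions) are our own illustrative assumptions, not the paper's code; OpenCV's built-in MOG2 subtractor stands in for the MOG filter named in the abstract.

```python
import cv2

# MOG2 is OpenCV's built-in successor to the original MOG background filter.
bg_subtractor = cv2.createBackgroundSubtractorMOG2(history=100, detectShadows=False)

def segment_frame(frame):
    """Return a binary hand mask and its convex hull (hull is None if no contour)."""
    # 1. Background subtraction isolates the moving hand from the static scene.
    fg_mask = bg_subtractor.apply(frame)
    # 2. Gaussian blur suppresses sensor noise before thresholding.
    blurred = cv2.GaussianBlur(fg_mask, (5, 5), 0)
    # 3. Convert the blurred mask to a clean binary image.
    _, binary = cv2.threshold(blurred, 127, 255, cv2.THRESH_BINARY)
    # 4. Find contours and keep the largest one, assumed to be the hand.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return binary, None
    hand = max(contours, key=cv2.contourArea)
    # 5. The convex hull smooths convexity defects along the hand outline.
    hull = cv2.convexHull(hand)
    return binary, hull

cap = cv2.VideoCapture(0)  # the PC's web camera
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask, hull = segment_frame(frame)
    if hull is not None:
        cv2.drawContours(frame, [hull], -1, (0, 255, 0), 2)
    cv2.imshow("segmentation", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```

Under this reading of the pipeline, each segmented binary mask would serve as one training example when building the gesture dataset.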
Although it is still impossible to create a ubiquitous interface that can handle all forms of human communication, it is possible to create a small multi-modal subset. Keeping that in mind, a system that takes natural hand gestures as input is feasible to implement. We need the interaction system to be easy to use, so we have chosen the most common hand gestures that humans use in day-to-day life. Each gesture is mapped to a specific task, which the computer performs when that gesture is given as input to the system; a sketch of this mapping follows. The system can be used by any person without being specifically tailored to anyone.
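To illustrate how a predicted class label could drive a computer task, here is a hedged sketch under several assumptions: the model file gesture_model.h5, the label list, the 64x64 grayscale input shape, and the use of the pyautogui library for OS-level actions are all our own illustrative choices; the paper does not specify them.

```python
import cv2
import numpy as np
import tensorflow as tf
import pyautogui  # assumed helper library for issuing keyboard/mouse actions

# Hypothetical trained CNN and gesture classes; adjust to the actual dataset.
model = tf.keras.models.load_model("gesture_model.h5")
LABELS = ["palm", "fist", "thumbs_up", "peace"]

# Hypothetical gesture-to-task table: each class triggers one computer task.
ACTIONS = {
    "palm":      lambda: pyautogui.press("playpause"),    # play/pause media
    "fist":      lambda: pyautogui.hotkey("alt", "tab"),  # switch window
    "thumbs_up": lambda: pyautogui.press("volumeup"),     # raise volume
    "peace":     lambda: pyautogui.screenshot("shot.png"),  # take a screenshot
}

def classify(mask):
    """Resize a binary hand mask to the CNN's input shape and predict a label."""
    x = cv2.resize(mask, (64, 64)).astype("float32") / 255.0
    x = x.reshape(1, 64, 64, 1)  # batch of one grayscale image
    probs = model.predict(x, verbose=0)[0]
    return LABELS[int(np.argmax(probs))]

# 'mask' would come from the segmentation step shown earlier:
# label = classify(mask); ACTIONS[label]()
```

A dispatch table like ACTIONS keeps the recognition model decoupled from the tasks it triggers, so gestures can be remapped without retraining.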
doi:10.17577/ijertv9is090303