Code Analysis and Improvement of Onion Routing Anonymous Systems
International Journal of Future Generation Communication and Networking
With the development of Internet technology, network-based activities such as e-commerce, Internet voting, and e-government have become increasingly frequent, and people are increasingly concerned about the privacy of their identities, activities, and content online. To protect user privacy in network communications, governments, companies, universities, and research institutes are pushing the research and development of onion routing systems (Tor). Tor is the most popular
... anonymous communication system at present, and it is based on second-generation onion routing. Tor offers low latency, encrypted data transmission, and secure channels, and it is widely used for anonymous Web browsing, instant messaging, Secure Shell (SSH) clients, and similar applications. However, the development of onion routing systems is constrained by the complexity of the code. In response, this paper puts forward an overall architecture for the Tor code that covers Application Needs, Code Analysis, and Improvement. It summarizes the overall framework of the Tor code structure; to make that structure clearer, the code is divided into several sub-modules, and function call diagrams and UML diagrams illustrate the main function of each module and the call relations between modules. The paper then analyzes the code for Tor's anonymity architecture and, finally, proposes an improvement to Tor's routing algorithm. The result is a method that makes the Tor code easier to understand. The systematic analysis provided here is also intended to clarify the working principle of onion routing and to promote its further development.
Introduction
In recent years, the development of computer and network technology has brought great convenience to people's lives. As data handling capacity and data communication capability keep improving and demand for all kinds of communication software grows, people pay more attention to the security and anonymity of that software, and there is a growing emphasis on network security and privacy protection.
Existing network security technology is basically concerned only with the data itself and ignores the identity information of the communicating entities. As a result, an attacker can easily obtain the identities of the two communicating parties through traffic analysis or other eavesdropping, which poses a great threat to the privacy of users' personal identities. Anonymous communication systems came into being to meet this demand; their main purpose is to hide the identities of the communicating parties or the relationship between them, and the onion routing system emerged in this context. Tor is a network of virtual tunnels that allows people and groups to improve their privacy and security on the Internet. It is also open source software, which has attracted governments, companies, universities, and research institutions to push the research and development of onion routing systems. However, the source code is large, its call relationships are complex, and no very detailed code analysis material exists. These factors have greatly limited understanding of the source code and thus affect the further development of the Tor software.
Copyright ⓒ 2016 SERSC
The purpose of this paper is to analyze the code structure of Tor. The whole Tor code base is decomposed into several sub-modules, and function call diagrams and UML diagrams are used to analyze the main function of each module and the call relations between modules. The paper also improves the routing algorithm in the Tor code: instead of choosing routing nodes according to bandwidth, it proposes a routing algorithm that selects nodes randomly with a uniform distribution. This increases the cost to an attacker of stealing information and, to some extent, improves the anonymity and security of the Tor network.
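The contrast between bandwidth-based selection and uniform random selection can be illustrated with a minimal Python sketch. The relay names, bandwidth figures, and function names are hypothetical and are not taken from the Tor source; the sketch only shows how the two policies distribute selection probability, under the assumption that each relay advertises a single bandwidth value.

```python
import secrets

# Hypothetical relay list: (nickname, advertised bandwidth in KB/s).
RELAYS = [("relayA", 8000), ("relayB", 1000), ("relayC", 500), ("relayD", 500)]


def pick_weighted(relays):
    """Pick one relay with probability proportional to its bandwidth."""
    total = sum(bw for _, bw in relays)
    target = secrets.randbelow(total)  # uniform integer in [0, total)
    running = 0
    for name, bw in relays:
        running += bw
        if target < running:
            return name
    raise AssertionError("unreachable: target is always below total")


def pick_uniform(relays):
    """Pick one relay with equal probability, ignoring bandwidth."""
    return relays[secrets.randbelow(len(relays))][0]


def selection_probabilities(relays, weighted):
    """Exact per-relay selection probability under the chosen policy."""
    if weighted:
        total = sum(bw for _, bw in relays)
        return {name: bw / total for name, bw in relays}
    return {name: 1 / len(relays) for name, _ in relays}
```

With these made-up figures, a single high-bandwidth relay receives most of the selections under the weighted policy, while the uniform policy gives every relay the same share. An attacker who operates a few fast relays therefore gains less from them under uniform selection, at the price of weaker load balancing.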
After the onion proxy (OP) has downloaded the routing information from the directory server, it chooses onion routers (ORs) to join the circuit according to a predetermined algorithm. The OP and each OR then use a Diffie-Hellman handshake to negotiate a key, from which symmetric keys are generated to establish the communication link and encrypt the communication data. To prevent an attacker from tampering with the data, the Tor proxy and the three Tor routers connect to one another over TLS (Transport Layer Security). While establishing the circuit, the user's OP builds it incrementally, one hop at a time. The code is as follows:
(3) The node selection scheme of private type: nodes selected under this scheme only allow connections to a given host or network. Currently, most ORs choose the middle-type and private-type selection schemes. These schemes allow the exit node to connect to the majority of target hosts while resisting eavesdropping and traffic analysis by an attacker, and thus help ensure the safety of the exit node. The steps are executed as follows:
By source analysis, we found that the three routers are all taken from the candidate set: using the RAND_bytes function of OpenSSL, nodes are selected randomly with priority weighted by bandwidth, which serves load balancing, since high-bandwidth Tor routing nodes can handle more traffic. The benefit of this kind of random selection is strong anonymity and security. Anonymity is high because the routing algorithm has the largest possible candidate set; security is high because the selected nodes are usually in different countries, so the possibility of all of them being compromised at the same time is very low. However, this also reduces the performance of the circuit.
For example, with the entry node in Asia, the middle node in Europe, and the exit node in Africa, latency would undoubtedly be high.
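The hop-by-hop key negotiation described above can be sketched in Python as follows. This is a toy illustration and not Tor's actual handshake: the group parameters `P` and `G`, the `Relay` class, and the function names are assumptions made for the example, and the 127-bit modulus is far too small for real use.

```python
import hashlib
import secrets

# Toy parameters for illustration only: a 127-bit Mersenne prime and a small
# base. Real systems use much larger, vetted groups or elliptic curves.
P = 2**127 - 1
G = 3


def dh_keypair():
    """Return (private exponent, public value) for one side of the exchange."""
    private = secrets.randbelow(P - 3) + 2  # integer in [2, P - 2]
    return private, pow(G, private, P)


def session_key(own_private, peer_public):
    """Hash the shared Diffie-Hellman secret into a 32-byte symmetric key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


class Relay:
    """Hypothetical onion router that answers one handshake per circuit."""

    def __init__(self, name):
        self.name = name
        self.key = None

    def handshake(self, client_public):
        private, public = dh_keypair()
        self.key = session_key(private, client_public)
        return public


def build_circuit(path):
    """Extend the circuit one hop at a time, negotiating a key with each relay."""
    keys = []
    for relay in path:
        private, public = dh_keypair()
        relay_public = relay.handshake(public)
        keys.append(session_key(private, relay_public))
    return keys
```

Calling `build_circuit([Relay("entry"), Relay("middle"), Relay("exit")])` leaves the proxy holding three keys, each shared with exactly one relay. The sketch calls each relay directly for brevity; in an onion routing circuit the handshake with a later hop is relayed through the hops already established.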