Tutorial and critical analysis of phishing websites methods

Rami M. Mohammad, Fadi Thabtah, Lee McCluskey
2015 Computer Science Review  
The Internet has become an essential component of our everyday social and financial activities. The Internet is important not only for individual users but also for organizations, because organizations that offer online trading can achieve a competitive edge by serving clients worldwide. The Internet makes it possible to reach customers all over the globe without any marketplace restrictions and with effective use of e-commerce. As a result, the number of customers who rely on the Internet to perform ... ments is increasing dramatically. Hundreds of millions of dollars are transferred through the Internet every day. This amount of money tempts fraudsters to carry out their fraudulent operations. Hence, Internet users may be vulnerable to different types of web threats, which may cause financial damage, identity theft, loss of private information, brand reputation damage and loss of customers' confidence in e-commerce and online banking. Therefore, the suitability of the Internet for commercial transactions becomes doubtful. Phishing is a form of web threat defined as the art of impersonating the website of an honest enterprise with the aim of obtaining a user's confidential credentials such as usernames, passwords and social security numbers. In this article, the phishing phenomenon is discussed in detail. In addition, we present a survey of the state-of-the-art research on this attack. Moreover, we aim to identify the up-to-date developments in phishing and its precautionary measures, and to provide a comprehensive study and evaluation of this research in order to identify the gap that still predominates in this area. This research focuses mostly on web-based phishing detection methods rather than email-based detection methods.

INTRODUCTION

Although phishing is a relatively new web threat, it has a massive impact on the commercial and online transaction sectors. Phishing websites typically have high visual similarity to the legitimate ones in an attempt to defraud honest people. Social engineering and technical tricks are commonly combined to start a phishing attack. Typically, a phishing attack starts by sending an e-mail that seems authentic to potential victims, urging them to update or validate their information by following a URL link within the e-mail. Predicting and stopping phishing attacks is a critical step toward protecting online transactions.
Several approaches have been proposed to mitigate these attacks. Nonetheless, phishing websites are expected to become more sophisticated in the future. Therefore, a promising solution that is improved constantly is needed to keep pace with this continuous evolution. Anti-phishing measures may take several forms, including legal, educational and technical solutions. To date, there is no complete solution able to capture every phishing attack. The Internet community has put a considerable amount of effort into defensive techniques against phishing. However, the problem is continuously evolving, and ever more complicated deceptive methods to obtain sensitive information and perform e-crimes on the Internet are appearing. Anti-phishing tools, sometimes called phishing-fighting tools, are employed to protect users from posting their information through a forged website. The quality of an anti-phishing tool is reflected in how accurately it recognizes phishing websites, whether it does so within an acceptable timescale, and how good its warning technique is. Designing a phishing website has become much easier and much more sophisticated, and that was the motivation behind looking for an effective anti-phishing technique.

A mixed research methodology has been adopted in our study. Some previous studies suggest applying protection mechanisms without offering clear experimental results; hence, a qualitative methodology best fits such research. On the other hand, some studies take into consideration experimental analysis, data gathering techniques, testing measures and comparison of results; thus, a quantitative methodology is appropriate in such cases.

This article is structured as follows: Section 3 discusses what phishing is and how it started. Section 4 introduces different phishing techniques. Section 5 describes the phishing website life cycle. Section 6 discusses how and why people fall prey to phishing. Section 7 shows some phishing statistics.
Section 8 describes phishing countermeasures. Section 9 presents a detailed discussion of up-to-date anti-phishing techniques. Section 10 compares human-based and automatic protection. Finally, we summarize in Section 11.

THE STORY OF PHISHING

Deceiving users into giving up their passwords or other private information has a long tradition in the cybercrime community. In the early 1990s, with the growing popularity of the Internet, we witnessed the birth of a new type of cybercrime: phishing. A detailed description of phishing was introduced in 1987, and the first recorded attack was in 1995 (James 2005). In the early days of phishing, phishers mainly designed their attacks to deceive English-speaking users. Today, phishers have broadened their attacks to cover users and businesses all over the globe (Sullins 2006). At the beginning, phishers acted individually or in small and simple groups. Usually, phishing is accomplished through the practice of social engineering. An attacker may introduce himself as a humble and respectable person claiming to be new at the job, a helpdesk person or a researcher. One example of a social engineering tactic is urgency: asking the user to submit his information as soon as possible. Another tactic is the threat of terrible consequences if the user refuses to comply, for example warning the user that his account will be closed or the service will be terminated if he does not respond. Other social engineering tactics promise big prizes by showing a message claiming that the user has won a big prize and that he needs to submit his information to receive it. Nowadays, as financial organizations have expanded their online investments, the economic benefit of obtaining online account information has become much larger. Thus, phishing attacks have become more proficient, planned and efficient.
Phishing is an alternate spelling of the word "fishing" (Oxford Dictionaries 1990), and it refers to the bait used by phishers who are waiting for victims to bite (James 2005). Surveys commonly depict early phishers as mischief-makers aiming to collect information to make long-distance phone calls (Watson, Holz and Mueller 2005); this kind of attack was called "Phone Phreaking". This name is the origin of the "ph" that replaces the character 'f' in the word "fishing" (Oxford Dictionaries 1990). Phishing websites are designed to give the impression that they come from a legitimate party, with the aim of deceiving users into divulging their personal information. Phishers may use this information for dishonest purposes, for instance money laundering or illegal online transactions. While phishers focus on individual customers, the organizations that phishers mimic are also victims because their brand and reputation are compromised.

There are several definitions of the term "Phishing". To gain a good understanding of phishing and its attack strategies, several definitions are discussed here. Some definitions hold that phishing demands sociological skills in combination with technical skills, as in the definition from the Anti-Phishing Working Group (APWG, Aaron and Manning 2014): "A criminal mechanism employing both social engineering and technical subterfuge to steal consumers' personal identity data and financial account credentials". Another definition comes from (Ming and Chaobo 2006): "A phishing website is a style of offence that network fishermen tempt victim with pseudo website to surrender important information voluntarily".
A detailed description by (Kirda and Kruegel 2005) defines phishing as: "creating a fake online company to impersonate a legitimate organization; and asking for personal information from unwary consumers depending on social skills and website deceiving methods to trick victims into disclosure of their personal information which is usually used in an illegal transaction". Some definitions assume that the success of phishing websites depends on their ability to mimic a legitimate website, because most Internet users, even those with good expertise in the Internet and information security, tend to judge a website's validity by its look-and-feel, which phishers can orchestrate proficiently. An example of such a definition comes from (James 2005), who defines phishing as: "Attempts to masquerade as a trustworthy entity in an electronic communication to trick recipients' into divulging sensitive information such as bank account numbers, passwords, and credit card details". Since phishing websites ask victims to submit their credentials through a webpage, it is necessary to convince these victims that they are dealing with an honest entity. Accordingly, a good definition was introduced by (Zhang, Hong and Cranor 2007), who state that a phishing website satisfies the following criteria:
• Showing a high visual similarity.
• Containing at least one login form.

We may summarize all the previous definitions in one sentence: "Phishing is the practice of creating a copy of a legitimate website and using social skills to fool a victim into submitting his personal information".

PHISHING TECHNIQUES

Until recently, phishers relied heavily on spoofed emails to start phishing attacks by persuading victims to reply with the desired information. These days, social networking websites are used to spread suspicious links that lure victims into visiting phishing websites.
A report published by (MessageLabs 2009) estimated that one in every 325.2 emails sent through their system each day was a phishing email. Microsoft Research (Florencio and Herley 2007) revealed that 0.4% of email recipients were victimized by phishing emails in 2007. A report published by (Symantec Corporation 2013) found that the number of phishing websites that mimic social networking websites rose by 12% in 2012. If phishers are able to acquire users' social media login information, they can send out phishing emails to all of those users' friends using the breached accounts. An email that appears to originate from a well-known person seems much more trustworthy. Moreover, phishers may send fake emails to your friends from your account telling them that you are facing an emergency, for example: "Help! I'm stuck overseas and my wallet has been stolen. Please send $200 as soon as possible". Nowadays, phishing websites evolve rapidly, perhaps at a faster pace than the countermeasures. Compromised identities and phishing toolkits are widely offered for sale on Internet black markets at low prices (Franklin and Paxson 2007). These days, innovative phishing techniques are becoming more frequent, such as malware and Man-In-The-Middle (MITM) attacks (Keizer 2007).

Phishers use different tactics and strategies in designing phishing websites. These strategies can be categorised into three basic groups:
1. Mimicking attack: In this attack, phishers typically send an email to victims asking them to confirm, update or validate their credentials by clicking on a URL link within the email, which redirects them to a phony webpage. Phishers pay careful attention to the design of the emails sent to the victims, using the same logos as the original website or sometimes using a fake HTTPS protocol. This type of attack undermines customer confidence in electronic trading.
2. Forward attack: This attack starts once a victim clicks on the link shown within an email. He is then redirected to a website asking him to submit his personal information. This information is sent to a hostile server, and the victim is then forwarded to the real website using the MITM technique.
3. Pop-up attack: Another method that uses the MITM technique is urging victims to submit their information by means of a well-designed pop-up window. The phishers persuade the victims that submitting their information through a pop-up window is more secure.

In order to accomplish their job, phishers use a set of intelligent tricks to give victims the impression that they are dealing with a legitimate website. Some of these tricks include using an IP address in the URL, adding a prefix or suffix to a domain name, hiding the true URL shown in the browser address bar, using a fake padlock icon in the address bar and pretending that SSL is enabled. These tricks make it difficult for the naïve user to distinguish a phishing website from a legitimate one. Overall, one principle, if followed by organizations and customers, will guarantee the security of their information: "Organizations and consumers should be aware of phishing and antiphishing methods and take safety measure". Theoretically, this principle is easy, but in practice it is very difficult to implement, since new phishing techniques appear constantly.

PHISHING ATTACK LIFE CYCLE

To combat phishing, we need to thoroughly investigate the nuts and bolts of the phishing attack. In the following, we describe the phishing attack life cycle.
• Planning: Typically, phishers start planning their attack by identifying their victims, the information to be obtained and the technique to use in the attack. The main consideration for phishers in picking their targets is how to achieve the maximum profit at the lowest cost and with the least possible risk.
A phisher might need to gain access to the employee list of an organization, the organization's news from a social networking website or the organization's calendar. Common communication channels such as email, Voice over IP (VoIP) and Instant Messaging (IM) are used to establish communication between the phisher and the potential victims. A classic phishing attack consists of two components: a trustworthy-looking email and a fraudulent webpage. The contents of phishing emails are commonly designed to confuse, upset or excite the recipient. A fraudulent webpage has the look-and-feel of the legitimate webpage that it impersonates, often with a logo, layout and other critical features similar to those of the legitimate company. A survey published in an ACM magazine (Jagatic, et al. 2007) showed that Internet users were 4.5 times more likely to be victims of phishing if they received an invitation to visit a fake URL link from a person they knew. That explains why criminals target social networking websites. Efforts made by webmail providers to filter phishing emails will decrease the extent of the problem and reduce the time needed to stop phishing attacks, since these providers are the first point of contact with phishing emails. However, most webmail providers focus on filtering spam emails; they welcome it when their spam filter happens to catch phishing emails, but they do not add dedicated phishing filters that may consume their resources. The main difference between spam emails and phishing emails is that spam emails are annoying emails sent to advertise goods and services that the user has not requested, whereas phishing emails are sent to obtain your personal information, which will later be used in fraudulent activities. The authors in (Chandrasekaran, Narayanan and Upadhyaya 2006) recommend stopping phishing attacks at this stage. They suggested dividing the email into several parts, such as the subject line, the email attachments and the salutation line in the email body.
Some structural features are then extracted from these parts, and calculations over them produce the final decision on the email's legitimacy.
• Collection: As soon as the victim takes an action that makes him susceptible to information theft, he is urged to submit his credentials through a trustworthy-looking webpage. Normally, the fake website is hosted on a compromised server, which has been exploited by the phisher for this purpose. A recent survey (Aaron and Rasmussen 2010) revealed that 78% of the servers holding phishing websites were either hacked via File Transfer Protocol (FTP) or compromised through vulnerabilities in software applications. Sometimes, phishers may use free cloud applications such as Google spreadsheets to host their fake websites (Seltzer 2011). Nobody is going to block "google.com" or even "spreadsheets.google.com"; thus, not only will naïve users be deceived, but expert users are also less likely to block this website. In general, to reduce the possibility of being caught, phishers exploit servers that have weak security or process loopholes and that operate from countries with insufficient law enforcement resources (APWG 2003).
• Fraud: Finally, once the phisher has achieved his goal, he becomes involved in fraud by impersonating the victim. Sometimes, the information is sold on the Internet black market.

The amount of activity that takes place within the first few hours of a phishing life cycle is the most important aspect of any attack. Once the phishing website has been created and the phishing email has been sent to consumers, the anti-phishing tool should detect and stop the phishing website before the consumer submits his information, as shown in Figure 1. Seconds are important in this situation. Taking down the phishing website is the second line of defence. If we cannot stop the phishing attempt, then the spoofed email could reach the victim's mailbox.
An example of such an attack strategy occurred when eBay customers received an email claiming to be from the real eBay company asking them to update their credentials to avoid having their accounts frozen (BBC News 2005). The email contained a link that seemed to point to the real eBay website. As soon as the user clicked on that link, he was transferred to a webpage that asked for his credentials, including credit card number, expiry date and full name. The phishing website had been designed carefully in an attempt to convince the user that he was dealing with a legitimate website.

Fig. 1 Phishing Websites Lifecycle
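
Two of the URL-level tricks listed under phishing techniques, an IP address used in place of a domain name and a prefix or suffix attached to a domain name, lend themselves to a mechanical check. The following Python fragment is only a minimal sketch under stated assumptions, not a method taken from the surveyed work: the function names are hypothetical, and treating a hyphen in the host name as a sign of an added prefix or suffix is an illustrative simplification. It does not cover the other tricks (hidden address bar, fake padlock icon, pretended SSL), and it ignores obfuscated IP notations such as hexadecimal octets.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip_address(url: str) -> bool:
    """Return True when the host part of the URL is a literal IP address."""
    host = urlsplit(url).hostname
    if host is None:
        # No host could be parsed (for example, the scheme is missing).
        return False
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def host_is_hyphenated(url: str) -> bool:
    """Return True when the host name contains a hyphen.

    Assumption for illustration only: a hyphen is taken as a hint that a
    prefix or suffix was attached to a well-known domain name.
    """
    host = urlsplit(url).hostname or ""
    return "-" in host


def url_trick_flags(url: str) -> dict:
    """Collect both heuristic flags for a single URL."""
    return {
        "ip_address_host": host_is_ip_address(url),
        "hyphenated_host": host_is_hyphenated(url),
    }
```

A flag raised by either check is only a hint that a URL deserves closer inspection; legitimate websites can also use hyphenated names, so such heuristics would normally be combined with other evidence.
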
doi:10.1016/j.cosrev.2015.04.001 fatcat:dcbl7izcufd5rkfq26rvr2dbdm