Discovery of email communication networks from the Enron corpus with a genetic algorithm using social network analysis

Garnett Wilson, Wolfgang Banzhaf
2009 IEEE Congress on Evolutionary Computation
During the legal investigation of Enron Corporation, the U.S. Federal Energy Regulatory Commission (FERC) made public a substantial data set of the company's internal corporate emails. This work presents a genetic algorithm (GA) approach to social network analysis (SNA) using the Enron corpus. Three SNA metrics (degree, density, and proximity prestige) were applied to the detection of networks of high activity and the presence of important actors with respect to email transactions. Quantitative analysis revealed that density and proximity prestige captured networks of high activity more so than degree. Subsequent qualitative analysis revealed that there are trade-offs in the selection of SNA metrics. Examination of the discovered social networks revealed that density and proximity prestige isolated networks involving key actors to a greater extent than degree. In particular, density picked out interesting patterns in terms of email volume, while proximity prestige better isolated key actors at Enron. The roles of the particular actors picked out by the networks as reasons for their prominence are also discussed.
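For readers unfamiliar with the three metrics named in the abstract, the following is a minimal sketch of their common textbook definitions on a small directed email graph. It is not taken from the paper: the function names, the toy graph, and the choice to count degree as in-degree plus out-degree are assumptions made for illustration, and the paper's GA fitness functions may define or aggregate these metrics differently.

```python
from collections import deque

# Assumption: an email network is a set of directed (sender, recipient) ties
# with no self-loops and at most one tie per ordered pair.

def degree(edges, node):
    """In-degree plus out-degree of `node` (one common convention)."""
    return sum(1 for s, r in edges if s == node or r == node)

def density(edges, nodes):
    """Fraction of possible directed ties present: L / (g * (g - 1))."""
    g = len(nodes)
    return len(edges) / (g * (g - 1)) if g > 1 else 0.0

def proximity_prestige(edges, nodes, node):
    """Share of other actors that can reach `node`, divided by their
    mean shortest-path distance to it (0.0 if nobody can reach it)."""
    # Map each node to the set of actors who send to it directly.
    senders = {n: set() for n in nodes}
    for s, r in edges:
        senders[r].add(s)
    # Breadth-first search backwards along ties gives distances d(j, node).
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for sender in senders[current]:
            if sender not in dist:
                dist[sender] = dist[current] + 1
                queue.append(sender)
    reachable = len(dist) - 1  # actors that can reach `node`
    if reachable == 0 or len(nodes) < 2:
        return 0.0
    mean_distance = sum(dist.values()) / reachable
    return (reachable / (len(nodes) - 1)) / mean_distance

# Hypothetical toy network: four actors, four email ties.
nodes = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c"), ("a", "c"), ("d", "a")}

print(degree(edges, "c"))
print(density(edges, nodes))
print(proximity_prestige(edges, nodes, "c"))
```

In this toy network every other actor can reach "c", so its proximity prestige is high even though its degree is modest, which illustrates why the metrics can rank actors differently.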
doi:10.1109/cec.2009.4983357 dblp:conf/cec/WilsonB09 fatcat:cfavhbtt4bbzdl53djcn64gjcm