A Critical Analysis of Databases used in Financial Misconduct Research

Jonathan M. Karpoff, Allison Koester, D. Scott Lee, Gerald S. Martin
2012 Social Science Research Network  
Financial misconduct is revealed to investors through a complex sequence of information events. We hand-collect data for 1,099 SEC enforcement cases to examine biases in four popular databases used in financial misconduct research. We find that initial public announcements of financial misconduct occur months before coverage in these databases. These databases omit most relevant announcements because they collect just one type of event or miss events they purportedly capture; most events they capture are unrelated to financial fraud. Event studies and firm characteristic comparisons show that these database features can lead to economically meaningful biases in corporate finance research.

JEL classifications: G38; K22; K42; M41

We ask, however, that researchers postpone further requests for these data until this working paper is accepted for publication. We also encourage researchers to consider whether the HC data are appropriate for their tests, as the HC data are subject to their own biases. For example, deHaan et al. (2012) find that firm proximity to SEC offices and career considerations of SEC attorneys influence which cases attract SEC attention and therefore fall into the HC sample.

Electronic copy available at: http://ssrn.com/abstract=2112569

We report four primary findings about the severity and economic significance of the four database features. First, the initial dates associated with the events included in each of the four popular databases occur an average of 150 to 1,017 calendar days (depending on the database) after the initial public disclosure of the financial misconduct. As a result, event studies that rely on the dates in these databases understate the initial one-day market-adjusted stock price reaction to news of financial misconduct by 56% to 73% (depending on the database). Second, each database (by design) captures only one type of misconduct-related event (e.g., the GAO and AA databases capture only restatements, the SCAC database captures only securities class action lawsuits, and the AAER database captures only SEC enforcement actions receiving a secondary AAER designation). As a result, each captures only 6% to 36% of the value-relevant announcements associated with the cases of misconduct they identify. Furthermore, the key informational events missed by each database have substantially larger impacts on firm value than the events captured by each database.
Third, between 46% and 98% of the events in these databases are unrelated to charges of financial fraud. We show that the events that are related to financial fraud are systematically different from the other events in each database, indicating a need for substantial and systematic culling for researchers seeking samples of financial fraud. Fourth, these databases omit 17% to 80% of the events they seek to capture during their sample periods. We show that firm and misconduct violation characteristics are statistically and economically different for included versus omitted cases in each database, indicating that many empirical tests are conducted on non-representative samples.

None of the four databases scores consistently well or poorly across these four features. AAERs, for example, suffer less than the other databases from scope limitations (feature #2) and potentially extraneous events (feature #3), but perform very poorly in identifying when investors first learn of the misconduct (feature #1). The SCAC database has relatively few errors of omission (feature #4), but also captures the fewest relevant informational events that pertain to each case of misconduct (feature #2). Table 8 at the end of this paper provides a summary ranking of the four databases' performance along each of the four database features. The fact that no single database consistently
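The first finding above (late database dates understate the one-day market-adjusted reaction) can be illustrated with a minimal Python sketch. It assumes that the market-adjusted return is the firm's return minus the market return on the same day, and that understatement is measured as one minus the ratio of the database-date reaction to the initial-date reaction; the paper's exact computation may differ. The function names and all return values below are hypothetical and are not taken from the paper.

```python
def market_adjusted_return(firm_return: float, market_return: float) -> float:
    """One-day market-adjusted return: firm return minus market return."""
    return firm_return - market_return


def understatement(initial_ar: float, database_ar: float) -> float:
    """Share of the initial-date reaction that is missed when the
    database date is used instead (assumed definition)."""
    if initial_ar == 0:
        raise ValueError("initial-date reaction must be non-zero")
    return 1.0 - database_ar / initial_ar


# Hypothetical one-day returns, chosen only for illustration.
ar_initial = market_adjusted_return(firm_return=-0.20, market_return=0.00)
ar_database = market_adjusted_return(firm_return=-0.07, market_return=0.01)

share_missed = understatement(ar_initial, ar_database)
print(f"Reaction at initial disclosure: {ar_initial:.1%}")
print(f"Reaction at database date:      {ar_database:.1%}")
print(f"Understatement:                 {share_missed:.0%}")
```

With these made-up inputs, the database date captures only part of the price drop observed at the initial disclosure, which is the sense in which a late event date biases an event study toward a smaller measured reaction.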
doi:10.2139/ssrn.2112569 fatcat:h3lqfhus6zbx5h5kmdfa76743e