Automatically classifying emails into activities

Mark Dredze, Tessa Lau, Nicholas Kushmerick
2006 Proceedings of the 11th international conference on Intelligent user interfaces - IUI '06  
Email-based activity management systems promise to give users better tools for managing increasing volumes of email by organizing email according to a user's activities. Current activity management systems do not automatically classify incoming messages by the activity to which they belong, instead relying on simple heuristics (such as message threads) or asking the user to manually classify incoming messages as belonging to an activity. This paper presents several algorithms for automatically recognizing emails as part of an ongoing activity. Our baseline methods are the use of message reply-to threads to determine activity membership and a naïve Bayes classifier. Our SimSubset and SimOverlap algorithms compare the people involved in an activity against the recipients of each incoming message. Our SimContent algorithm uses IRR (a variant of latent semantic indexing) to classify emails into activities using similarity based on message contents. An empirical evaluation shows that each of these methods provides a significant improvement over the baseline methods. In addition, we show that a combined approach that votes the predictions of the individual methods performs better than any individual method alone.
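The abstract does not give formulas for the people-based methods or the voting step, so the following is only an illustrative sketch of the general idea, not the authors' definitions. It assumes a subset-style score (the fraction of a message's recipients already involved in an activity), an overlap-style score (Jaccard overlap between the two sets of people), and a simple plurality vote. The function names `sim_subset`, `sim_overlap`, `best_activity`, and `vote`, and the sample data, are hypothetical.

```python
from collections import Counter


def sim_subset(recipients, activity_people):
    """Assumed subset-style score: share of recipients already in the activity."""
    r = set(recipients)
    if not r:
        return 0.0
    return len(r & set(activity_people)) / len(r)


def sim_overlap(recipients, activity_people):
    """Assumed overlap-style score: Jaccard overlap of the two sets of people."""
    r, a = set(recipients), set(activity_people)
    union = r | a
    return len(r & a) / len(union) if union else 0.0


def best_activity(score_fn, recipients, activities):
    """Pick the activity with the highest score.

    Names are visited in sorted order, so ties go to the alphabetically first one.
    """
    return max(sorted(activities), key=lambda name: score_fn(recipients, activities[name]))


def vote(predictions):
    """Plurality vote over per-method predictions (an assumed combination rule)."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical data: activity name -> people involved so far.
activities = {
    "budget": {"ann", "bob", "cy"},
    "hiring": {"ann", "dee"},
}
recipients = ["ann", "bob"]

predictions = [
    best_activity(sim_subset, recipients, activities),
    best_activity(sim_overlap, recipients, activities),
]
combined = vote(predictions)
```

A content-based predictor such as SimContent would contribute one more entry to `predictions`; the paper's actual voting scheme may weight methods differently from the plurality rule assumed here.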
doi:10.1145/1111449.1111471 dblp:conf/iui/DredzeLK06 fatcat:ltc744bhxjddnp4p7n5gxvg5oq