A File Search Method Based on Intertask Relationships Derived from Access Frequency and RMC Operations on Files [chapter]

Yi Wu, Kenichi Otagiri, Yousuke Watanabe, Haruo Yokota
2011 Lecture Notes in Computer Science  
The tremendous growth in the number of files stored in filesystems makes it increasingly difficult to find desired files. Traditional keyword-based search engines cannot retrieve files that do not contain the search keywords. To tackle this problem, we use file-access logs to derive intertask relationships for file search. Our observations are that 1) files related to the same task are frequently used together, and 2) a set of Rename, Move, and Copy (RMC) operations tends to initiate a new [...] We have implemented a system named SUGOI, which detects two types of task, FI tasks and RMC tasks, from file-access logs. An FI task corresponds to a group of files frequently accessed together. An RMC task is generated by RMC operations. The system then constructs a graph of intertask relationships based on the influence of RMC operations and the similarity between tasks. Using the detected tasks and intertask relationships, our system expands the search results of a keyword-based search engine. Experiments using actual file-access logs indicate that the proposed approach significantly improves search results.
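The first observation in the abstract (files belonging to the same task tend to be accessed together) can be illustrated with a minimal Python sketch. This is not the SUGOI implementation: the session grouping, the pairwise co-access count, the `min_support` threshold, and all function and file names are assumptions made for illustration only. RMC tasks and the intertask graph are not modeled here.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_access_counts(sessions):
    """Count, for each unordered pair of files, how many sessions contain both.

    `sessions` is a hypothetical pre-processing of an access log: each entry
    is the list of files touched within one time window.
    """
    counts = Counter()
    for files in sessions:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(files)), 2):
            counts[(a, b)] += 1
    return counts


def related_files(counts, min_support):
    """Map each file to the files it was co-accessed with at least min_support times."""
    related = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_support:
            related[a].add(b)
            related[b].add(a)
    return related


def expand_results(keyword_hits, related):
    """Append frequently co-accessed files to a list of keyword-search hits."""
    expanded = list(keyword_hits)
    seen = set(keyword_hits)
    for hit in keyword_hits:
        for other in sorted(related.get(hit, ())):
            if other not in seen:
                seen.add(other)
                expanded.append(other)
    return expanded


# Toy access sessions (invented data, not taken from the paper's logs).
sessions = [
    ["report.doc", "figure.png", "data.csv"],
    ["report.doc", "figure.png"],
    ["notes.txt", "data.csv"],
]
counts = co_access_counts(sessions)
related = related_files(counts, min_support=2)
print(expand_results(["report.doc"], related))
```

In this toy run, `figure.png` contains no keyword but is returned alongside `report.doc` because the two files share two sessions. A pairwise count is the simplest stand-in for grouping files that are frequently accessed together; the paper's actual task detection may differ.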
doi:10.1007/978-3-642-23088-2_27 fatcat:44htp4g5vrfslp4ixqex54szlm