I/O-Efficient Algorithms for Problems on Grid-Based Terrains

Lars Arge, Laura Toma, Jeffrey Scott Vitter
2001 ACM Journal of Experimental Algorithmics  
The potential and use of Geographic Information Systems (GIS) is rapidly increasing due to the increasing availability of massive amounts of geospatial data from projects like NASA's Mission to Planet Earth. However, the use of these massive datasets also exposes scalability problems with existing GIS algorithms. These scalability problems are mainly due to the fact that most GIS algorithms have been designed to minimize internal computation time, while I/O communication often is the bottleneck when processing massive amounts of data. In this paper, we consider I/O-efficient algorithms for problems on grid-based terrains. Detailed grid-based terrain data is rapidly becoming available for much of the earth's surface. We describe $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ I/O algorithms for several problems on $\sqrt{N}$ by $\sqrt{N}$ grids for which only $O(N)$ algorithms were previously known. Here $M$ denotes the size of the main memory and $B$ the size of a disk block. We demonstrate the practical merits of our work by comparing the empirical performance of our new algorithm for the flow accumulation problem with that of the previously best known algorithm. Flow accumulation, which models flow of water through a terrain, is one of the most basic hydrologic attributes of a terrain. We present the results of an extensive set of experiments on real-life terrain datasets of different sizes and characteristics. Our experiments show that while our new algorithm scales nicely with dataset size, the previously known algorithm "breaks down" once the size of the dataset becomes bigger than the available main memory. For example, while our algorithm computes the flow accumulation for the Appalachian Mountains in about three hours, the previously known algorithm takes several weeks.
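To give a feel for the gap between the two I/O bounds quoted in the abstract, the following is a minimal Python sketch that evaluates (N/B) log_{M/B} (N/B) and N with constant factors ignored. The function names and the values chosen for N, M, and B are hypothetical illustrations, not figures from the paper, and all three quantities are assumed to be measured in grid cells.

```python
import math


def sort_bound_ios(n, m, b):
    """Evaluate (N/B) * log_{M/B}(N/B), ignoring constant factors.

    Assumption: the logarithmic factor is clamped to at least 1, since
    even a dataset that fits in memory needs one pass over its blocks.
    """
    blocks = n / b
    return blocks * max(1.0, math.log(blocks, m / b))


def linear_bound_ios(n):
    """Evaluate the O(N) bound: roughly one I/O per grid cell."""
    return float(n)


# Hypothetical parameters, all counted in grid cells (not from the paper).
N = 10**9   # cells in the terrain grid
M = 2**26   # cells that fit in main memory
B = 2**12   # cells per disk block

if __name__ == "__main__":
    fast = sort_bound_ios(N, M, B)
    slow = linear_bound_ios(N)
    print(f"(N/B) log_(M/B) (N/B): {fast:,.0f} I/Os")
    print(f"N:                     {slow:,.0f} I/Os")
    print(f"ratio:                 {slow / fast:,.0f}x")
```

Under these assumed parameters the two bounds differ by more than three orders of magnitude, which is consistent in spirit with the hours-versus-weeks difference the abstract reports, although the sketch says nothing about actual running times.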
doi:10.1145/945394.945395 fatcat:wcatskku7jhk5oq4zvs2op4cxe