Legion: Expressing locality and independence with logical regions

Michael Bauer, Sean Treichler, Elliott Slaughter, Alex Aiken
2012 International Conference for High Performance Computing, Networking, Storage and Analysis
Modern parallel architectures have both heterogeneous processors and deep, complex memory hierarchies. We present Legion, a programming model and runtime system for achieving high performance on these machines. Legion is organized around logical regions, which express both locality and independence of program data, and tasks, functions that perform computations on regions. We describe a runtime system that dynamically extracts parallelism from Legion programs, using a distributed, parallel scheduling algorithm that identifies both independent tasks and nested parallelism. Legion also enables explicit, programmer-controlled movement of data through the memory hierarchy and placement of tasks based on locality information via a novel mapping interface. We evaluate our Legion implementation on three applications: fluid flow on a regular grid, a three-level AMR code solving a heat diffusion equation, and a circuit simulation.

    struct Node { float voltage, new_charge, capacitance; };
    struct Wire<rn> { Node@rn in_node, out_node; float current, ... ; };
    struct Circuit {
      region r_all_nodes;  /* contains all nodes for the circuit */
      region r_all_wires;  /* contains all circuit wires */
    };
    struct CircuitPiece {
      region rn_pvt, rn_shr, rn_ghost;  /* private, shared, ghost node regions */
      region rw_pvt;                    /* private wires region */
    };

    void simulate_circuit(Circuit c, float dt) : RWE(c.r_all_nodes, c.r_all_wires)
    {
      // The construction of the colorings is not shown. The colorings wire_owner_map,
      // node_owner_map, and node_neighbor_map have MAX_PIECES colors
      // 0..MAX_PIECES - 1. The coloring node_sharing_map has two colors 0 and 1.
      //
      // Partition of wires into MAX_PIECES pieces
      partition<disjoint> p_wires = c.r_all_wires.partition(wire_owner_map);
      // Partition nodes into two parts for all-private vs. all-shared
      partition<disjoint> p_nodes_pvs = c.r_all_nodes.partition(node_sharing_map);

      // Partition all-private into MAX_PIECES disjoint circuit pieces
      partition<disjoint> p_pvt_nodes = p_nodes_pvs[0].partition(node_owner_map);
      // Partition all-shared into MAX_PIECES disjoint circuit pieces
      partition<disjoint> p_shr_nodes = p_nodes_pvs[1].partition(node_owner_map);
      // Partition all-shared into MAX_PIECES ghost regions, which may be aliased
      partition<aliased> p_ghost_nodes = p_nodes_pvs[1].partition(node_neighbor_map);

      CircuitPiece pieces[MAX_PIECES];
      for (i = 0; i < MAX_PIECES; i++)
        pieces[i] = { rn_pvt: p_pvt_nodes[i], rn_shr: p_shr_nodes[i],
                      rn_ghost: p_ghost_nodes[i], rw_pvt: p_wires[i] };
      for (t = 0; t < TIME_STEPS; t++) {
        spawn (i = 0; i < MAX_PIECES; i++) calc_new_currents(pieces[i]);
        spawn (i = 0; i < MAX_PIECES; i++) distribute_charge(pieces[i], dt);
        spawn (i = 0; i < MAX_PIECES; i++) update_voltages(pieces[i]);
      }
    }
    // ROE = Read-Only-Exclusive
    void calc_new_currents(CircuitPiece piece):
        RWE(piece.rw_pvt), ROE(piece.rn_pvt, piece.rn_shr, piece.rn_ghost) {
      foreach(w : piece.rw_pvt)
        w->current = (w->in_node->voltage - w->out_node->voltage) / w->resistance;
    }
    // RdA = Reduce-Atomic
    void distribute_charge(CircuitPiece piece, float dt):
        ROE(piece.rw_pvt), RdA(piece.rn_pvt, piece.rn_shr, piece.rn_ghost) {
      foreach(w : piece.rw_pvt) {
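
The abstract says the runtime identifies independent tasks from the regions they use and the privileges they declare. The following is a minimal Python sketch of that idea, not Legion's actual algorithm or API: regions are modeled as plain sets of element ids, and every name here (`partition`, `is_disjoint`, `independent`, the three-piece toy circuit) is a hypothetical stand-in chosen for illustration. It also assumes that all atomic reductions use the same commutative operator, so that two RdA accesses never conflict.

```python
from itertools import combinations

# Privilege names follow the listing: RWE = read-write-exclusive,
# ROE = read-only-exclusive, RdA = reduce-atomic.
RWE, ROE, RDA = "RWE", "ROE", "RdA"


def partition(region, coloring):
    """Split a region (a set of ids) into one subregion per color.

    `coloring` maps each color to the set of elements given that color.
    """
    return {color: region & elems for color, elems in coloring.items()}


def is_disjoint(subregions):
    """A partition is disjoint when no element lies in two subregions."""
    return all(a.isdisjoint(b) for a, b in combinations(subregions.values(), 2))


def privileges_conflict(p, q):
    """Simplified rule (assumption): two reads, or two atomic reductions
    with the same operator, never conflict; everything else does."""
    return not ((p == ROE and q == ROE) or (p == RDA and q == RDA))


def independent(task_a, task_b):
    """Tasks are lists of (region, privilege) pairs. Two tasks are treated
    as independent if every pair of overlapping regions is accessed with
    non-conflicting privileges."""
    return all(
        ra.isdisjoint(rb) or not privileges_conflict(pa, pb)
        for ra, pa in task_a
        for rb, pb in task_b
    )


# Toy circuit with three pieces: nodes 0..8, wires 10..15.
all_nodes, all_wires = set(range(9)), set(range(10, 16))
p_wires = partition(all_wires, {0: {10, 11}, 1: {12, 13}, 2: {14, 15}})
p_pvt = partition(all_nodes, {0: {0, 1}, 1: {4, 5}, 2: {7, 8}})
p_shr = partition(all_nodes, {0: {2}, 1: {3}, 2: {6}})
# Ghost regions hold shared nodes owned by neighboring pieces; node 3 is a
# ghost of both piece 0 and piece 2, so this partition is aliased.
p_ghost = partition(all_nodes, {0: {3}, 1: {2, 6}, 2: {3}})


def calc_new_currents(i):
    return [(p_wires[i], RWE), (p_pvt[i], ROE), (p_shr[i], ROE), (p_ghost[i], ROE)]


def distribute_charge(i):
    return [(p_wires[i], ROE), (p_pvt[i], RDA), (p_shr[i], RDA), (p_ghost[i], RDA)]


print(is_disjoint(p_pvt), is_disjoint(p_ghost))
print(independent(calc_new_currents(0), calc_new_currents(1)))
print(independent(distribute_charge(0), distribute_charge(2)))
print(independent(calc_new_currents(0), distribute_charge(0)))
```

In this toy model, the per-piece tasks inside one `spawn` loop come out independent: wire regions are disjoint, and node regions are either only read or only reduced, even where ghost regions overlap. Tasks from consecutive phases on the same piece do not, because one phase writes the wires that the next one reads.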
doi:10.1109/sc.2012.71 dblp:conf/sc/BauerTSA12 fatcat:gwgogopsvncy7k6iy2gb566sgm