def exprs(a, b):
    return [Expression(Eq(a, a + b + 5.)),
            Expression(Eq(a, b - a)),
            Expression(Eq(a, 4 * (b * a))),
            Expression(Eq(a, (6. / b) + (8. * a)))]
def _schedule_expressions(self, clusters):
    """Wrap :class:`Expression` objects, already grouped in
    :class:`Cluster` objects, within nested :class:`Iteration` objects
    (representing loops), according to dimensions and stencils."""
    # Topologically sort Iterations
    ordering = partial_order([i.stencil.dimensions for i in clusters])
    for i, d in enumerate(list(ordering)):
        if d.is_Buffered:
            ordering.insert(i, d.parent)

    # Build the Iteration/Expression tree
    processed = []
    schedule = OrderedDict()
    atomics = ()
    for i in clusters:
        # Build the Expression objects to be inserted within an Iteration tree
        expressions = [Expression(v, np.int32 if i.trace.is_index(k) else self.dtype)
                       for k, v in i.trace.items()]

        if not i.stencil.empty:
            root = None
            entries = i.stencil.entries

            # Reorder based on the globally-established loop ordering
            entries = sorted(entries, key=lambda i: ordering.index(i.dim))

            # Can I reuse any of the previously scheduled Iterations ?
            index = 0
            for j0, j1 in zip(entries, list(schedule)):
                if j0 != j1 or j0.dim in atomics:
                    break
                root = schedule[j1]
                index += 1
            needed = entries[index:]

            # Build and insert the required Iterations
            iters = [Iteration([], j.dim, j.dim.size, offsets=j.ofs) for j in needed]
            body, tree = compose_nodes(iters + [expressions], retrieve=True)
            scheduling = OrderedDict(zip(needed, tree))
            if root is None:
                processed.append(body)
                schedule = scheduling
            else:
                nodes = list(root.nodes) + [body]
                mapper = {root: root._rebuild(nodes, **root.args_frozen)}
                transformer = Transformer(mapper)
                processed = list(transformer.visit(processed))
                schedule = OrderedDict(list(schedule.items())[:index] +
                                       list(scheduling.items()))
                for k, v in list(schedule.items()):
                    schedule[k] = transformer.rebuilt.get(v, v)
        else:
            # No Iterations are needed
            processed.extend(expressions)

        # Track dimensions that cannot be fused at next stage
        atomics = i.atomics

    return List(body=processed)
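The core of the loop-reuse logic above is: walk the dimensions of each incoming cluster against the currently open loop nest, reuse the longest matching prefix of outer loops, and open fresh loops only for the remainder. The following is a minimal, self-contained sketch of that idea using plain strings for dimensions and dicts for loop nodes; the names `common_prefix_reuse`-style helpers and the `{'dim': ..., 'body': ...}` node shape are hypothetical, not part of the real codebase.

```python
def schedule_clusters(clusters):
    """Each cluster is (dims, exprs), dims outermost-first. Reuse the
    outermost loops shared with the previously scheduled cluster; open
    fresh loops for the rest. Sketch only, assuming no atomic dims."""
    roots = []       # top-level loop nests
    open_loops = []  # stack of (dim, node) for the currently open nest
    for dims, exprs in clusters:
        # How many outermost loops can be reused?
        index = 0
        for (d0, _), d1 in zip(open_loops, dims):
            if d0 != d1:
                break
            index += 1
        open_loops = open_loops[:index]
        # Open the remaining, non-reusable loops
        for d in dims[index:]:
            node = {'dim': d, 'body': []}
            if open_loops:
                open_loops[-1][1]['body'].append(node)
            else:
                roots.append(node)
            open_loops.append((d, node))
        # Insert the expressions at the innermost open loop
        if open_loops:
            open_loops[-1][1]['body'].extend(exprs)
        else:
            roots.extend(exprs)
    return roots
```

Two clusters over `['t', 'x']` end up fused in the same `x` loop, while a third over `['t', 'y']` reuses only the outer `t` loop and opens its own `y` loop inside it.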
def copy_arrays(mapper, reverse=False):
    """
    Build an Iteration/Expression tree performing the copy ``k = v``, or
    ``v = k`` if reverse=True, for each (k, v) in mapper. (k, v) are expected
    to be of type :class:`IndexedData`. The loop bounds are inferred from
    the dimensions used in ``k``.
    """
    if not mapper:
        return ()

    # Build the Iteration tree for the copy
    iterations = []
    for k, v in mapper.items():
        handle = []
        indices = k.function.indices
        for i, j in zip(k.shape, indices):
            handle.append(Iteration([], dimension=j, limits=i))
        lhs, rhs = (v, k) if reverse else (k, v)
        handle.append(Expression(Eq(lhs[indices], rhs[indices]),
                                 dtype=k.function.dtype))
        iterations.append(compose_nodes(handle))

    # Maybe some Iterations are mergeable
    iterations = MergeOuterIterations().visit(iterations)

    return iterations
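What `copy_arrays` emits is, conceptually, one nested loop per array pair whose bounds come from the destination's shape, with the assignment direction flipped when `reverse=True`. A minimal executable sketch of that semantics, using dicts keyed by index tuples as stand-in arrays (the name `copy_array` is hypothetical):

```python
import itertools

def copy_array(dst, src, shape, reverse=False):
    """Perform dst[i, j, ...] = src[i, j, ...] over the full index space
    implied by `shape`, or src[...] = dst[...] if reverse=True. This is
    the runtime effect of the loop nest built by copy_arrays."""
    lhs, rhs = (src, dst) if reverse else (dst, src)
    # One nested loop per dimension, bounds inferred from `shape`
    for index in itertools.product(*[range(n) for n in shape]):
        lhs[index] = rhs[index]
```

The `itertools.product` call plays the role of the nested `Iteration` objects: each factor `range(n)` is one loop, outermost first.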
def exprs(a, b, c, d, a_dense, b_dense):
    return [Expression(Eq(a, a + b + 5.)),
            Expression(Eq(a, b*d - a*c)),
            Expression(Eq(b, a + b*b + 3)),
            Expression(Eq(a, a*b*d*c)),
            Expression(Eq(a, 4 * ((b + d) * (a + c)))),
            Expression(Eq(a, (6. / b) + (8. * a))),
            Expression(Eq(a_dense, a_dense + b_dense + 5.))]
def _schedule_expressions(self, clusters, ordering):
    """Wrap :class:`Expression` objects, already grouped in
    :class:`Cluster` objects, within nested :class:`Iteration` objects
    (representing loops), according to dimensions and stencils."""
    processed = []
    schedule = OrderedDict()
    for i in clusters:
        # Build the Expression objects to be inserted within an Iteration tree
        expressions = [Expression(v, np.int32 if i.trace.is_index(k) else self.dtype)
                       for k, v in i.trace.items()]

        if not i.stencil.empty:
            root = None
            entries = i.stencil.entries

            # Can I reuse any of the previously scheduled Iterations ?
            index = 0
            for j0, j1 in zip(entries, list(schedule)):
                if j0 != j1:
                    break
                root = schedule[j1]
                index += 1
            needed = entries[index:]

            # Build and insert the required Iterations
            iters = [Iteration([], j.dim, j.dim.size, offsets=j.ofs) for j in needed]
            body, tree = compose_nodes(iters + [expressions], retrieve=True)
            scheduling = OrderedDict(zip(needed, tree))
            if root is None:
                processed.append(body)
                schedule = scheduling
            else:
                nodes = list(root.nodes) + [body]
                mapper = {root: root._rebuild(nodes, **root.args_frozen)}
                transformer = Transformer(mapper)
                processed = list(transformer.visit(processed))
                schedule = OrderedDict(list(schedule.items())[:index] +
                                       list(scheduling.items()))
                for k, v in list(schedule.items()):
                    schedule[k] = transformer.rebuilt.get(v, v)
        else:
            # No Iterations are needed
            processed.extend(expressions)

    return processed
def test_loops_ompized(fa, fb, fc, fd, t0, t1, t2, t3, exprs, expected, iters):
    scope = [fa, fb, fc, fd, t0, t1, t2, t3]
    node_exprs = [Expression(EVAL(i, *scope)) for i in exprs]
    ast = iters[6](iters[7](node_exprs))

    nodes = transform(ast, mode='openmp').nodes
    assert len(nodes) == 1
    ast = nodes[0]
    iterations = FindNodes(Iteration).visit(ast)
    assert len(iterations) == len(expected)

    # Check for presence of pragma omp
    for i, j in zip(iterations, expected):
        pragmas = i.pragmas
        if j is True:
            assert len(pragmas) == 1
            pragma = pragmas[0]
            assert 'omp for' in pragma.value
        else:
            for k in pragmas:
                assert 'omp for' not in k.value
def make_grid_accesses(node):
    """
    Construct a new Iteration/Expression based on ``node``, in which all
    :class:`interfaces.Indexed` accesses have been converted into YASK grid
    accesses.
    """

    def make_grid_gets(expr):
        mapper = {}
        indexeds = retrieve_indexed(expr)
        data_carriers = [i for i in indexeds if i.base.function.from_YASK]
        for i in data_carriers:
            name = namespace['code-grid-name'](i.base.function.name)
            args = [INT(make_grid_gets(j)) for j in i.indices]
            mapper[i] = make_sharedptr_funcall(namespace['code-grid-get'],
                                               args, name)
        return expr.xreplace(mapper)

    mapper = {}
    for i, e in enumerate(FindNodes(Expression).visit(node)):
        lhs, rhs = e.expr.args

        # RHS translation
        rhs = make_grid_gets(rhs)

        # LHS translation
        if e.output_function.from_YASK:
            name = namespace['code-grid-name'](e.output_function.name)
            args = [rhs] + [INT(make_grid_gets(i)) for i in lhs.indices]
            handle = make_sharedptr_funcall(namespace['code-grid-put'],
                                            args, name)
            processed = Element(c.Statement(ccode(handle)))
        else:
            # Writing to a scalar temporary
            processed = Expression(e.expr.func(lhs, rhs), dtype=e.dtype)

        mapper.update({e: processed})

    return Transformer(mapper).visit(node)
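The subtle point in `make_grid_gets` is the recursion over indices: an indexed access may itself appear inside another access's index expression, so the rewrite must descend into the indices before substituting. A self-contained toy analogue, using nested tuples as a stand-in expression tree (`('idx', name, indices)` for an access, `(op, lhs, rhs)` for binary ops); the `rewrite` function and `get_elem` call shape are illustrative only, not the real YASK API:

```python
def rewrite(expr, grids):
    """Render a toy expression tree as C-like text, converting every
    access to a name in `grids` into a get_elem() call. Indices are
    rewritten recursively, so grid-indexed indices nest correctly."""
    if isinstance(expr, tuple) and expr[0] == 'idx':
        _, name, indices = expr
        args = ', '.join(rewrite(i, grids) for i in indices)
        if name in grids:
            return '%s->get_elem({%s})' % (name, args)
        return '%s[%s]' % (name, args)      # plain (non-grid) access
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        return '(%s %s %s)' % (rewrite(lhs, grids), op, rewrite(rhs, grids))
    return str(expr)                        # leaf symbol or constant
```

The last test case below shows why the recursion matters: an access used as an index is itself translated, mirroring the `INT(make_grid_gets(j))` calls above.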
def _loop_fission(self, state, **kwargs):
    """
    Apply loop fission to innermost :class:`Iteration` objects. This pass
    is not applied if the number of statements in an Iteration's body is
    lower than ``self.thresholds['fission']``.
    """
    processed = []
    for node in state.nodes:
        mapper = {}
        for tree in retrieve_iteration_tree(node):
            if len(tree) <= 1:
                # Heuristically avoided
                continue
            candidate = tree[-1]
            expressions = [e for e in candidate.nodes if e.is_Expression]

            if len(expressions) < self.thresholds['max_fission']:
                # Heuristically avoided
                continue
            if len(expressions) != len(candidate.nodes):
                # Dangerous for correctness
                continue

            functions = list(set.union(*[set(e.functions) for e in expressions]))
            wrapped = [e.expr for e in expressions]

            if not functions or not wrapped:
                # Heuristically avoided
                continue

            # Promote temporaries from scalar to tensors
            handle = functions[0]
            dim = handle.indices[-1]
            size = handle.shape[-1]
            if any(dim != i.indices[-1] for i in functions):
                # Dangerous for correctness
                continue

            wrapped = promote_scalar_expressions(wrapped, (size,), (dim,), True)

            assert len(wrapped) == len(expressions)
            rebuilt = [Expression(s, e.dtype) for s, e in zip(wrapped, expressions)]

            # Group statements
            # TODO: Need a heuristic here to maximize reuse
            args_frozen = candidate.args_frozen
            properties = as_tuple(args_frozen['properties']) + (ELEMENTAL,)
            args_frozen['properties'] = properties
            n = self.thresholds['min_fission']
            fissioned = [Iteration(g, **args_frozen) for g in grouper(rebuilt, n)]

            mapper[candidate] = List(body=fissioned)

        processed.append(Transformer(mapper).visit(node))

    return {'nodes': processed}
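The final step above turns one loop body of N statements into several loops of at most `min_fission` statements each, via `grouper`. A minimal sketch of that chunking, with loops modelled as `('loop', statements)` tuples (the `fission` helper is hypothetical; as in the pass above, this is only legal when there are no loop-carried dependences between the resulting groups):

```python
def grouper(iterable, n):
    """Split `iterable` into consecutive chunks of at most n items."""
    items = list(iterable)
    return [items[i:i + n] for i in range(0, len(items), n)]

def fission(statements, n):
    """Loop fission: one loop running all `statements` becomes several
    loops, each running a chunk of at most n statements, in order."""
    return [('loop', chunk) for chunk in grouper(statements, n)]
```

Each resulting loop keeps the original iteration space (here implicit), so the transformation preserves statement order within and across chunks.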