- Dec 14, 2018
-
-
René Heß authored
-
- Dec 13, 2018
- Dec 12, 2018
-
-
René Heß authored
-
- Nov 23, 2018
-
-
René Heß authored
Introduce different methods for realize_input/output, realize_direct_input/output and setup_input/output. The setup methods cover code generation outside the sumfact kernel function (creating the input array or accumulating the result). realize and realize_direct handle the input/output in the non-fastdg and fastdg code branches, respectively. Separate interface methods make it much easier to find out where each of those methods is applied. Besides that, most interface classes need to provide more than two of those methods anyway.
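As a rough illustration of the split described above, an interface class might look like the following sketch. Only the six method names are taken from the commit message; the class name, the helper `realize_method_names` and the returned strings are hypothetical placeholders, not the project's actual code.

```python
class SumfactInterfaceSketch:
    """Hypothetical interface class; only the method names come from the commit message."""

    def setup_input(self):
        # Code generation outside the sumfact kernel function: create the input array.
        return "outside kernel: create input array"

    def setup_output(self):
        # Code generation outside the sumfact kernel function: accumulate the result.
        return "outside kernel: accumulate result"

    def realize_input(self):
        # Input handling in the non-fastdg code branch.
        return "non-fastdg: input"

    def realize_output(self):
        # Output handling in the non-fastdg code branch.
        return "non-fastdg: output"

    def realize_direct_input(self):
        # Input handling in the fastdg code branch.
        return "fastdg: input"

    def realize_direct_output(self):
        # Output handling in the fastdg code branch.
        return "fastdg: output"


def realize_method_names(fastdg):
    """Pick the realize variant for a code branch (illustrative helper, an assumption)."""
    prefix = "realize_direct" if fastdg else "realize"
    return (prefix + "_input", prefix + "_output")
```

With one method per situation, the call site states which branch it serves, e.g. `getattr(interface, realize_method_names(fastdg=True)[0])()`.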
-
- Nov 22, 2018
-
-
René Heß authored
-
René Heß authored
-
René Heß authored
-
René Heß authored
Non-fastdg: Permutation of the input happens before the sum factorization kernel, when we set up the input. This is done by a method of the corresponding interface class.
Fastdg: In this case the input is always ordered according to x, y, ... This means the permutation needs to happen in the sumfact kernel. Since we want to vectorize sumfact kernels with different input permutations in an upper/lower way, we need to do this permutation in the corresponding interface class. This is done in the realize_direct method; in the vectorized case, the corresponding methods of the scalar sumfact kernels are called.
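To make "permutation of the input" concrete, here is a minimal sketch of reordering the axes of a flat array. It assumes row-major storage and a hypothetical helper name `permute_flat`; the generated code in the project may use a different memory layout and performs this step either at input setup (non-fastdg) or inside the kernel (fastdg).

```python
import itertools


def permute_flat(data, shape, perm):
    """Reorder the axes of a row-major flat array.

    Axis k of the result is axis perm[k] of the input.
    Hypothetical helper for illustration only.
    """
    new_shape = tuple(shape[p] for p in perm)
    out = []
    for new_idx in itertools.product(*(range(n) for n in new_shape)):
        # Map the multi-index of the permuted array back to the original axes.
        old_idx = [0] * len(shape)
        for new_axis, old_axis in enumerate(perm):
            old_idx[old_axis] = new_idx[new_axis]
        # Row-major offset into the original flat array.
        offset = 0
        for i, n in zip(old_idx, shape):
            offset = offset * n + i
        out.append(data[offset])
    return out, new_shape


# A 2x3 array in x,y order, permuted to y,x order (a transposition).
permuted, permuted_shape = permute_flat([0, 1, 2, 3, 4, 5], (2, 3), (1, 0))
```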
-
- Nov 15, 2018
- Nov 14, 2018
-
-
René Heß authored
Note:
  - direct_is_possible true/false could probably be handled in an upper/lower vectorization way.
  - Vectorization of SF kernels should be based on the cost permuted matrix sequence.
-
- Nov 13, 2018
- Nov 09, 2018
- Oct 31, 2018
-
-
Dominic Kempf authored
-
- Oct 30, 2018
-
-
Dominic Kempf authored
Merge branch 'feature/project-renaming' into 'master'. See merge request dominic/dune-perftool!283
-
Dominic Kempf authored
-
Dominic Kempf authored
-
Dominic Kempf authored
Hopefully achieving more robustness w.r.t. submodule changes
-
Dominic Kempf authored
-
Dominic Kempf authored
Merge branch 'autotune-merge' into 'master' ref:dominic/dune-perftool Resolving conflicts of !270. See merge request [dominic/dune-perftool!282] [dominic/dune-perftool!282]: gitlab.dune-project.org/dominic/dune-perftool/merge_requests/282
-
Dominic Kempf authored
Merge branch 'feature/use-custom-geometry-transformation' into 'master' ref:dominic/dune-perftool This computes the determinant and the Jacobian inverse transposed directly within loopy and does not call the corresponding grid functions. Using some simple precomputations, this is faster if number_of_blocks>=2. This also allows straightforward vectorization for unstructured grids. I don't know how the geometry transformation is computed in the sum-factorized case, but there may be some overlap that could be reduced. See merge request [dominic/dune-perftool!276] [dominic/dune-perftool!276]: gitlab.dune-project.org/dominic/dune-perftool/merge_requests/276
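For reference, the two quantities named in this merge request can be written out by hand for a 2x2 Jacobian. The sketch below is plain Python with a hypothetical function name; it only illustrates the arithmetic and is not the loopy code generated by the project, which also covers other dimensions and the precomputations mentioned above.

```python
def det_and_inverse_transposed(jac):
    """Return (det J, J^{-T}) for a 2x2 Jacobian given as nested lists.

    Hypothetical helper for illustration only.
    """
    (a, b), (c, d) = jac
    det = a * d - b * c
    if det == 0:
        raise ValueError("Jacobian is singular")
    # J^{-1} = 1/det * [[d, -b], [-c, a]]; transposing swaps the off-diagonal entries.
    jit = [[d / det, -c / det],
           [-b / det, a / det]]
    return det, jit


det, jit = det_and_inverse_transposed([[1.0, 2.0], [3.0, 4.0]])
```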
-
Dominic Kempf authored
-
Dominic Kempf authored
Merge branch 'feature/skylake-single-precision-transposes' into 'master'. See merge request dominic/dune-perftool!271
-
Dominic Kempf authored
Merge branch 'feature/matrix-inversion' into 'master' ref:dominic/dune-perftool Matrix inversion at code generation time only works to a very limited extent (up to n=4). Instead, we can assemble the tensor in C++ and invert it there (e.g. using Dune::FieldMatrix). This fixes #123.
Still TODO:
  - [ ] Vectorized inversion
See merge request [dominic/dune-perftool!263] [dominic/dune-perftool!263]: gitlab.dune-project.org/dominic/dune-perftool/merge_requests/263 Closes #123
-
Dominic Kempf authored
-
- Oct 26, 2018
-
-
Dominic Kempf authored
Merge branch 'feature/code-generation-hooks' into 'master' ref:dominic/dune-perftool This is a first minimal implementation of what code generation hooks from downstream projects could look like. There are a few more things to think about (feel invited to share ideas):
  - [x] How to document the arguments and return values expected from hooks
  - [x] How to handle multiple hooks registered to the same hook point and their return values (this is quite relevant once you want to do loopy transformations in a hook: it means that you want to "chain" the hooks)
This fixes #129. See merge request [dominic/dune-perftool!277] [dominic/dune-perftool!277]: gitlab.dune-project.org/dominic/dune-perftool/merge_requests/277 Closes #129
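The "chaining" of several hooks registered to the same hook point could work roughly as in the sketch below. The names `register_hook`, `run_hooks` and `demo_point` are hypothetical and chosen for illustration; they are not the API introduced by this merge request.

```python
from collections import defaultdict

# Hook point name -> list of registered callables, in registration order.
_registry = defaultdict(list)


def register_hook(point, func):
    """Register func at the given hook point (hypothetical API)."""
    _registry[point].append(func)


def run_hooks(point, value):
    """Feed value through all hooks of a point; each hook receives the previous result."""
    for func in _registry[point]:
        value = func(value)
    return value


# Two downstream hooks at the same point, each applying one transformation.
register_hook("demo_point", lambda steps: steps + ["tile"])
register_hook("demo_point", lambda steps: steps + ["vectorize"])
result = run_hooks("demo_point", ["kernel"])
```

Passing each return value on to the next hook is one way to let several hooks each apply a transformation to the same object; a hook point without registered hooks returns its input unchanged.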
-
- Oct 25, 2018
-
-
Marcel Koch authored
-
Marcel Koch authored
-
Marcel Koch authored
-