[!305] Performance transformation: Loop reordering in sumfact kernel
Merge branch 'feature/sumfact-loop-reordering' into 'master'

ref:extensions/dune-codegen

Performance transformation through loop nest reordering.

There are two ways to reorder loops in a tensor contraction:

1. Directly accumulate in output variable after setting to zero
2. Accumulating in a large enough temporary

This merge request implements these ways of loop reordering and the possibility to create an autotune target directly from the loopy kernel.

See merge request [extensions/dune-codegen!305]

[extensions/dune-codegen!305]: gitlab.dune-project.org/extensions/dune-codegen/merge_requests/305
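
The two reordering strategies named in the description can be illustrated on a single matrix-matrix contraction, out[i][k] = sum_j A[i][j] * X[j][k]. What follows is a minimal plain-Python sketch, not the code from this merge request: the function names, the list-of-lists layout, and the choice of moving the reduction loop outermost are all assumptions made for illustration. The actual implementation operates on loopy kernels.

```python
# Illustrative sketch only. All names here are hypothetical and do not
# come from dune-codegen; they only mirror the two strategies described
# in the merge request text.

def contract_reference(A, X):
    """Baseline: reduction loop j is innermost, one scalar accumulator."""
    n, m, p = len(A), len(X), len(X[0])
    out = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(p):
            acc = 0.0
            for j in range(m):
                acc += A[i][j] * X[j][k]
            out[i][k] = acc
    return out


def contract_direct(A, X):
    """Strategy 1: set the output to zero, then accumulate directly in it.

    With no scalar accumulator, the reduction loop j can be placed
    anywhere in the nest; here it is moved outermost.
    """
    n, m, p = len(A), len(X), len(X[0])
    out = [[0.0] * p for _ in range(n)]  # zero the output first
    for j in range(m):
        for i in range(n):
            for k in range(p):
                out[i][k] += A[i][j] * X[j][k]
    return out


def contract_temporary(A, X):
    """Strategy 2: accumulate in a temporary with one slot per output entry.

    The temporary must be large enough to hold every (i, k) partial sum,
    because the reordered nest updates all of them in each j iteration.
    """
    n, m, p = len(A), len(X), len(X[0])
    tmp = [[0.0] * p for _ in range(n)]  # the "large enough" temporary
    for j in range(m):
        for i in range(n):
            for k in range(p):
                tmp[i][k] += A[i][j] * X[j][k]
    # Write the finished sums to the output in a separate pass.
    out = [[tmp[i][k] for k in range(p)] for i in range(n)]
    return out
```

All three functions compute the same contraction. They differ only in where the partial sums live and in which loop order that placement permits.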
- bin/donkey_benchmark_execution_wrapper.py (9 additions, 1 deletion)
- python/dune/codegen/logging.conf (7 additions, 1 deletion)
- python/dune/codegen/loopy/transformations/performance.py (8 additions, 0 deletions)
- python/dune/codegen/loopy/transformations/remove_reductions.py (71 additions, 0 deletions)
- python/dune/codegen/options.py (3 additions, 0 deletions)
- python/dune/codegen/pdelab/localoperator.py (4 additions, 0 deletions)
- python/dune/codegen/sumfact/accumulation.py (3 additions, 2 deletions)
- python/dune/codegen/sumfact/autotune.py (230 additions, 13 deletions)
- python/dune/codegen/sumfact/basis.py (3 additions, 2 deletions)
- python/dune/codegen/sumfact/geometry.py (3 additions, 2 deletions)
- python/dune/codegen/sumfact/kernel_benchmark_template0.cc.in (37 additions, 0 deletions)
- python/dune/codegen/sumfact/kernel_benchmark_template1.cc.in (58 additions, 0 deletions)
- python/dune/codegen/sumfact/realization.py (39 additions, 9 deletions)
- python/dune/codegen/sumfact/symbolic.py (16 additions, 8 deletions)
- python/dune/codegen/sumfact/transformations.py (422 additions, 0 deletions)
- test/sumfact/poisson/CMakeLists.txt (21 additions, 0 deletions)
- test/sumfact/poisson/poisson_dg_2d_performance_transformations.mini (28 additions, 0 deletions)
- test/sumfact/poisson/poisson_dg_3d_performance_transformations.mini (29 additions, 0 deletions)
- test/sumfact/poisson/poisson_fastdg_2d_performance_transformations.mini (28 additions, 0 deletions)
- test/sumfact/poisson/poisson_fastdg_3d_performance_transformations.mini (28 additions, 0 deletions)