Professor Tobias Weinzierl's Outputs (79)

Annotation-guided AoS-to-SoA conversions and GPU offloading with data views in C++ (2025)
Journal Article
Weinzierl, T., & Radtke, P. (in press). Annotation-guided AoS-to-SoA conversions and GPU offloading with data views in C++. Concurrency and Computation: Practice and Experience.

The C++ programming language provides classes and structs as fundamental modeling entities. Consequently, C++ code tends to favour array-of-structs (AoS) for encoding data sequences, even though structure-of-arrays (SoA) yields better performance for...

Compiler support for semi-manual AoS-to-SoA conversions with data views (2025)
Presentation / Conference Contribution
Radtke, P., & Weinzierl, T. (2024, September). Compiler support for semi-manual AoS-to-SoA conversions with data views. Presented at PPAM 2024 - 15th International Conference on Parallel Processing & Applied Mathematics, Ostrava, Czech Republic

The C programming language and its cousins such as C++ stipulate the static storage of sets of structured data: Developers have to commit to one, invariant data model -- typically a structure-of-arrays (SoA) or an array-of-structs (AoS) -- unles...

SYCL compute kernels for ExaHyPE (2024)
Presentation / Conference Contribution
Loi, C. M., Bockhorst, H., & Weinzierl, T. (2024, March). SYCL compute kernels for ExaHyPE. Presented at 2024 SIAM Conference on Parallel Processing for Scientific Computing (PP), Baltimore, MD

We discuss three SYCL realisations of a simple Finite Volume scheme over multiple Cartesian patches. The realisation flavours differ in how they map the compute steps onto loops and tasks: We compare an implementation that is exclusively usin...

ExaGRyPE: Numerical general relativity solvers based upon the hyperbolic PDEs solver engine ExaHyPE (2024)
Journal Article
Zhang, H., Li, B., Weinzierl, T., & Barrera-Hinojosa, C. (2025). ExaGRyPE: Numerical general relativity solvers based upon the hyperbolic PDEs solver engine ExaHyPE. Computer Physics Communications, 307, Article 109435. https://doi.org/10.1016/j.cpc.2024.109435

ExaGRyPE describes a suite of solvers and solver ingredients for numerical relativity that are based upon ExaHyPE 2, the second generation of our Exascale Hyperbolic PDE Engine. Numerical relativity simulations are crucial in resolv...

Detrimental task execution patterns in mainstream OpenMP runtimes (2024)
Presentation / Conference Contribution
Weinzierl, T., Tuft, A., & Klemm, M. (2024, September). Detrimental task execution patterns in mainstream OpenMP runtimes. Presented at IWOMP 2024, Perth, Australia

The OpenMP API offers both task-based and data-parallel concepts to scientific computing. While it provides descriptive and prescriptive annotations, it is in many places deliberately unspecific how to implement its annotations. As the predomina...

Grundlagen des parallelen wissenschaftlichen Rechnens: Ein erster Leitfaden zu numerischen Konzepten und Programmiermethoden (2024)
Book
Weinzierl, T. (2024). Grundlagen des parallelen wissenschaftlichen Rechnens: Ein erster Leitfaden zu numerischen Konzepten und Programmiermethoden. Springer. https://doi.org/10.1007/978-3-031-49082-8

New insights in many scientific and technical fields are inconceivable without the use of numerical simulations that run efficiently on modern computers. The faster we obtain new results, the larger and more accurate...

A multiscale optimisation algorithm for shape and material reconstruction from a single X-ray image (2024)
Presentation / Conference Contribution
Westmacott, H., Ivrissimtzis, I., & Weinzierl, T. (2024, January). A multiscale optimisation algorithm for shape and material reconstruction from a single X-ray image. Presented at ICIGP 2024: The 7th International Conference on Image and Graphics Processing, Beijing, China

We produce thickness and bone to soft tissue ratio estimations from a single, 2D medical X-ray image. For this, we simulate the scattering of the rays through a model of the object and embed this simulation into an optimiser which iteratively adjusts...

Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes (2023)
Presentation / Conference Contribution
Wille, M., Weinzierl, T., Brito Gadeschi, G., & Bader, M. (2023, December). Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes. Presented at ISC High Performance 2023, Hamburg

We identify and show how to overcome an OpenMP bottleneck in the administration of GPU memory. It arises for a wave equation solver on dynamically adaptive block-structured Cartesian meshes, which keeps all CPU threads busy and allows all of them to...

A multiresolution Discrete Element Method for triangulated objects with implicit time stepping (2022)
Journal Article
Noble, P., & Weinzierl, T. (2022). A multiresolution Discrete Element Method for triangulated objects with implicit time stepping. SIAM Journal on Scientific Computing, 44(4), A2121-A2149. https://doi.org/10.1137/21m1421842

Simulations of many rigid bodies colliding with each other sometimes yield particularly interesting results if the colliding objects differ significantly in size and are nonspherical. The most expensive part within such a simulation code is the colli...

Spherical accretion of collisional gas in modified gravity I: self-similar solutions and a new cosmological hydrodynamical code (2022)
Journal Article
Zhang, H., Weinzierl, T., Schulz, H., & Li, B. (2022). Spherical accretion of collisional gas in modified gravity I: self-similar solutions and a new cosmological hydrodynamical code. Monthly Notices of the Royal Astronomical Society, 515(2), 2464-2482. https://doi.org/10.1093/mnras/stac1991

The spherical collapse scenario has great importance in cosmology since it captures several crucial aspects of structure formation. The presence of self-similar solutions in the Einstein-de Sitter (EdS) model greatly simplifies its analysis, making i...

Dynamic task fusion for a block-structured finite volume solver over a dynamically adaptive mesh with local time stepping (2022)
Book Chapter
Li, B., Schulz, H., Weinzierl, T., & Zhang, H. (2022). Dynamic task fusion for a block-structured finite volume solver over a dynamically adaptive mesh with local time stepping. In High Performance Computing 37th International Conference, ISC High Performance 2022, Hamburg, Germany, May 29 – June 2, 2022, Proceedings (153-173). Springer Verlag. https://doi.org/10.1007/978-3-031-07312-0_8

Load balancing of generic wave equation solvers over dynamically adaptive meshes with local time stepping is difficult, as the load changes with every time step. Task-based programming promises to mitigate the load balancing problem. We study a Finite V...

Doubt and Redundancy Kill Soft Errors---Towards Detection and Correction of Silent Data Corruption in Task-based Numerical Software (2021)
Presentation / Conference Contribution
Samfass, P., Weinzierl, T., Reinarz, A., & Bader, M. (2021, November). Doubt and Redundancy Kill Soft Errors---Towards Detection and Correction of Silent Data Corruption in Task-based Numerical Software. Presented at Supercomputing 21 - FTXS Workshop - 2021 IEEE/ACM 11th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS), St Louis, MO

Resilient algorithms in high-performance computing are subject to rigorous non-functional constraints. Resiliency must not increase the runtime, memory footprint or I/O demands too significantly. We propose a task-based soft error detection scheme th...

Task inefficiency patterns for a wave equation solver (2021)
Presentation / Conference Contribution
Schulz, H., Brito Gadeschi, G., Rudyy, O., & Weinzierl, T. (2021, December). Task inefficiency patterns for a wave equation solver. Presented at IWOMP 2021, Bristol

Stabilized Asynchronous Fast Adaptive Composite Multigrid using Additive Damping (2020)
Journal Article
Murray, C. D., & Weinzierl, T. (2021). Stabilized Asynchronous Fast Adaptive Composite Multigrid using Additive Damping. Numerical Linear Algebra with Applications, 28(3), Article e2328. https://doi.org/10.1002/nla.2328

Multigrid solvers face multiple challenges on parallel computers. Two fundamental ones read as follows: Multiplicative solvers issue coarse grid solves which exhibit low concurrency and many multigrid implementations suffer from an expensive coarse g...

Delayed approximate matrix assembly in multigrid with dynamic precisions (2020)
Journal Article
Murray, C. D., & Weinzierl, T. (2021). Delayed approximate matrix assembly in multigrid with dynamic precisions. Concurrency and Computation: Practice and Experience, 33(11), Article e5941. https://doi.org/10.1002/cpe.5941

The accurate assembly of the system matrix is an important step in any code that solves partial differential equations on a mesh. We either explicitly set up a matrix, or we work in a matrix‐free environment where we have to be able to quickly return...

Lightweight Task Offloading Exploiting MPI Wait Times for Parallel Adaptive Mesh Refinement (2020)
Journal Article
Samfass, P., Weinzierl, T., Charrier, D. E., & Bader, M. (2020). Lightweight Task Offloading Exploiting MPI Wait Times for Parallel Adaptive Mesh Refinement. Concurrency and Computation: Practice and Experience, 32(24), Article e5916. https://doi.org/10.1002/cpe.5916

Balancing the workload of sophisticated simulations is inherently difficult, since we have to balance both computational workload and memory footprint over meshes that can change any time or yield unpredictable cost per mesh entity, while modern supe...

teaMPI---replication-based resiliency without the (performance) pain (2020)
Presentation / Conference Contribution
Samfass, P., Weinzierl, T., Hazelwood, B., & Bader, M. (2020, December). teaMPI---replication-based resiliency without the (performance) pain. Presented at ISC High Performance, Frankfurt

In an era where we cannot afford to checkpoint frequently, replication is a generic way forward to construct numerical simulations that can continue to run even if hardware parts fail. Yet, replication often is not employed on larger scales, as naïv...

Enclave Tasking for DG Methods on Dynamically Adaptive Meshes (2020)
Journal Article
Charrier, D. E., Hazelwood, B., & Weinzierl, T. (2020). Enclave Tasking for DG Methods on Dynamically Adaptive Meshes. SIAM Journal on Scientific Computing, 42(3), C69-C96. https://doi.org/10.1137/19m1276194

High-order discontinuous Galerkin (DG) methods promise to be an excellent discretization paradigm for hyperbolic differential equation solvers running on supercomputers, since they combine high arithmetic intensity with localized data access, since t...