XcalableMP PGAS Programming Language: From Programming Model to Applications
Format: eBook
Language: English
Published: Singapore : Springer Singapore Pte. Limited, 2020
Edition: 1st ed.
Table of Contents:
- Intro
- Preface
- Contents
- XcalableMP Programming Model and Language
- 1 Introduction
- 1.1 Target Hardware
- 1.2 Execution Model
- 1.3 Data Model
- 1.4 Programming Models
- 1.4.1 Partitioned Global Address Space
- 1.4.2 Global-View Programming Model
- 1.4.3 Local-View Programming Model
- 1.4.4 Mixture of Global View and Local View
- 1.5 Base Languages
- 1.5.1 Array Section in XcalableMP C
- 1.5.2 Array Assignment Statement in XcalableMP C
- 1.6 Interoperability
- 2 Data Mapping
- 2.1 nodes Directive
- 2.2 template Directive
- 2.3 distribute Directive
- 2.3.1 Block Distribution
- 2.3.2 Cyclic Distribution
- 2.3.3 Block-Cyclic Distribution
- 2.3.4 Gblock Distribution
- 2.3.5 Distribution of Multi-Dimensional Templates
- 2.4 align Directive
- 2.5 Dynamic Allocation of Distributed Array
- 2.6 template_fix Construct
- 3 Work Mapping
- 3.1 task and tasks Construct
- 3.1.1 task Construct
- 3.1.2 tasks Construct
- 3.2 loop Construct
- 3.2.1 Reduction Computation
- 3.2.2 Parallelizing Nested Loop
- 3.3 array Construct
- 4 Data Communication
- 4.1 shadow Directive and reflect Construct
- 4.1.1 Declaring Shadow
- 4.1.2 Updating Shadow
- 4.2 gmove Construct
- 4.2.1 Collective Mode
- 4.2.2 In Mode
- 4.2.3 Out Mode
- 4.3 barrier Construct
- 4.4 reduction Construct
- 4.5 bcast Construct
- 4.6 wait_async Construct
- 4.7 reduce_shadow Construct
- 5 Local-View Programming
- 5.1 Introduction
- 5.2 Coarray Declaration
- 5.3 Put Communication
- 5.4 Get Communication
- 5.5 Synchronization
- 5.5.1 Sync All
- 5.5.2 Sync Images
- 5.5.3 Sync Memory
- 6 Procedure Interface
- 7 XMPT Tool Interface
- 7.1 Overview
- 7.2 Specification
- 7.2.1 Initialization
- 7.2.2 Events
- References
- Implementation and Performance Evaluation of Omni Compiler
- 1 Overview
- 2 Implementation
- 2.1 Operation Flow
- 2.2 Example of Code Translation
- 2.2.1 Distributed Array
- 2.2.2 Loop Statement
- 2.2.3 Communication
- 3 Installation
- 3.1 Overview
- 3.2 Get Source Code
- 3.2.1 From GitHub
- 3.2.2 From Our Website
- 3.3 Software Dependency
- 3.4 General Installation
- 3.4.1 Build and Install
- 3.4.2 Set PATH
- 3.5 Optional Installation
- 3.5.1 OpenACC
- 3.5.2 XcalableACC
- 3.5.3 One-Sided Library
- 4 Creation of Execution Binary
- 4.1 Compile
- 4.2 Execution
- 4.2.1 XcalableMP and XcalableACC
- 4.2.2 OpenACC
- 4.3 Cooperation with Profiler
- 4.3.1 Scalasca
- 4.3.2 tlog
- 5 Performance Evaluation
- 5.1 Experimental Environment
- 5.2 EP STREAM Triad
- 5.2.1 Design
- 5.2.2 Implementation
- 5.2.3 Evaluation
- 5.3 High-Performance Linpack
- 5.3.1 Design
- 5.3.2 Implementation
- 5.3.3 Evaluation
- 5.4 Global Fast Fourier Transform
- 5.4.1 Design
- 5.4.2 Implementation
- 5.4.3 Evaluation
- 5.5 RandomAccess
- 5.5.1 Design
- 5.5.2 Implementation
- 5.5.3 Evaluation
- 5.6 Discussion
- 6 Conclusion
- References
- Coarrays in the Context of XcalableMP
- 1 Introduction
- 2 Requirements from Language Specifications
- 2.1 Images Mapped to XMP Nodes
- 2.2 Allocation of Coarrays
- 2.3 Communication
- 2.4 Synchronization
- 2.5 Subarrays and Data Contiguity
- 2.6 Coarray C Language Specifications
- 3 Implementation
- 3.1 Omni XMP Compiler Framework
- 3.2 Allocation and Registration
- 3.2.1 Three Methods of Memory Management
- 3.2.2 Initial Allocation for Static Coarrays
- 3.2.3 Runtime Allocation for Allocatable Coarrays
- 3.3 PUT/GET Communication
- 3.3.1 Determining the Possibility of DMA
- 3.3.2 Buffering Communication Methods
- 3.3.3 Non-blocking PUT Communication
- 3.3.4 Optimization of GET Communication
- 3.4 Runtime Libraries
- 3.4.1 Fortran Wrapper
- 3.4.2 Upper-Layer Runtime (ULR) Library
- 3.4.3 Lower-Layer Runtime (LLR) Library
- 3.4.4 Communication Libraries
- 4 Evaluation
- 4.1 Fundamental Performance
- 4.2 Non-blocking Communication
- 4.3 Application Program
- 4.3.1 Coarray Version of the Himeno Benchmark
- 4.3.2 Measurement Result
- 4.3.3 Productivity
- 5 Related Work
- 6 Conclusion
- References
- XcalableACC: An Integration of XcalableMP and OpenACC
- 1 Introduction
- 1.1 Hardware Model
- 1.2 Programming Model
- 1.2.1 XMP Extensions
- 1.2.2 OpenACC Extensions
- 1.3 Execution Model
- 1.4 Data Model
- 2 XcalableACC Language
- 2.1 Data Mapping
- Example
- 2.2 Work Mapping
- Restriction
- Example 1
- Example 2
- 2.3 Data Communication and Synchronization
- Example
- 2.4 Coarrays
- Restriction
- Example
- 2.5 Handling Multiple Accelerators
- 2.5.1 devices Directive
- Example
- 2.5.2 on_device Clause
- 2.5.3 layout Clause
- Example
- 2.5.4 shadow Clause
- Example
- 2.5.5 barrier_device Construct
- Example
- 3 Omni XcalableACC Compiler
- 4 Performance of Lattice QCD Application
- 4.1 Overview of Lattice QCD
- 4.2 Implementation
- 5 Performance Evaluation
- 5.1 Result
- 5.2 Discussion
- 6 Productivity Improvement
- 6.1 Requirement for Productive Parallel Language
- 6.2 Quantitative Evaluation by Delta Source Lines of Code
- 6.3 Discussion
- References
- Mixed-Language Programming with XcalableMP
- 1 Background
- 2 Translation by Omni Compiler
- 3 Functions for Mixed-Language
- 3.1 Function to Call MPI Program from XMP Program
- 3.2 Function to Call XMP Program from MPI Program
- 3.3 Function to Call XMP Program from Python Program
- 3.3.1 From Parallel Python Program
- 3.3.2 From Sequential Python Program
- 4 Application to Order/Degree Problem
- 4.1 What Is the Order/Degree Problem
- 4.2 Implementation
- 4.3 Evaluation
- 5 Conclusion
- References
- Three-Dimensional Fluid Code with XcalableMP
- 1 Introduction
- 2 Global-View Programming Model
- 2.1 Domain Decomposition Methods
- 2.2 Performance on the K Computer
- 2.2.1 Comparison with Hand-Coded MPI Program
- 2.2.2 Optimization for SIMD
- 2.2.3 Optimization for Allocatable Arrays
- 3 Local-View Programming Model
- 3.1 Communications Using Coarray
- 3.2 Performance on the K Computer
- 4 Summary
- References
- Hybrid-View Programming of Nuclear Fusion Simulation Code in XcalableMP
- 1 Introduction
- 2 Nuclear Fusion Simulation Code
- 2.1 Gyrokinetic PIC Simulation
- 2.2 GTC
- 3 Implementation of GTC-P by Hybrid-View Programming
- 3.1 Hybrid-View Programming Model
- 3.2 Implementation Based on the XMP-localview Model
- 3.3 Implementation Based on the XMP-hybridview Model
- 4 Performance Evaluation
- 4.1 Experimental Setting
- 4.2 Results
- 4.3 Productivity and Performance
- 5 Related Research
- 6 Conclusion
- References
- Parallelization of Atomic Image Reconstruction from X-ray Fluorescence Holograms with XcalableMP
- 1 Introduction
- 2 X-ray Fluorescence Holography
- 2.1 Reconstruction of Atomic Images
- 2.2 Analysis Procedure of XFH
- 3 Parallelization
- 3.1 Parallelization of Reconstruction of Two-Dimensional Atomic Images by OpenMP
- 3.2 Parallelization of Reconstruction of Three-Dimensional Atomic Images by XcalableMP
- 4 Performance Evaluation
- 4.1 Performance Results of Reconstruction of Two-Dimensional Atomic Images
- 4.2 Performance Results of Reconstruction of Three-Dimensional Atomic Images
- 4.3 Comparison of Parallelization with MPI
- 5 Conclusion
- References
- Multi-SPMD Programming Model with YML and XcalableMP
- 1 Introduction
- 2 Background: International Collaborations for the Post-Petascale and Exascale Computing
- 3 Multi-SPMD Programming Model
- 3.1 Overview
- 3.2 YML
- 3.3 OmniRPC-MPI
- 4 Application Development in the mSPMD Programming Environment
- 4.1 Task Generator
- 4.2 Workflow Development
- 4.3 Workflow Execution
- 5 Experiments
- 6 Eigen Solver on the mSPMD Programming Model
- 6.1 Implicitly Restarted Arnoldi Method (IRAM), Multiple Implicitly Restarted Arnoldi Method (MIRAM) and Their Implementations for the mSPMD Programming Model
- 6.2 Experiments
- 7 Fault-Tolerance Features in the mSPMD Programming Model
- 7.1 Overview and Implementation
- 7.2 Experiments
- 8 Runtime Correctness Check for the mSPMD Programming Model
- 8.1 Overview and Implementation
- 8.2 Experiments
- 9 Summary
- References
- XcalableMP 2.0 and Future Directions
- 1 Introduction
- 2 XcalableMP on Fugaku
- 2.1 Performance of XcalableMP Global-View Programming
- 2.2 Performance of XcalableMP Local-View Programming
- 3 Global Task Parallel Programming
- 3.1 OpenMP and XMP Tasklet Directive
- 3.2 A Proposal for Global Task Parallel Programming
- 3.3 Prototype Design of Code Transformation
- 3.4 Preliminary Performance
- 3.5 Communication Optimization for Manycore Clusters
- 4 Retrospectives and Challenges for Future PGAS Models
- 4.1 Low-Level Communication Layer for PGAS Model
- 4.2 XcalableMP as a DSL for Stencil Applications
- 4.3 XcalableMP API: Compiler-Free Approach
- 4.4 Global Task Parallel Programming Model for Accelerators
- References