ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Unknown

High End Computing Technologies for Earth Science Applications: Trends, Challenges, and Innovations (2003)

Yan, Jerry C. ; Biswas, Rupak ; Brooks, Walter F. ; [et al.]

In: CASI

add to mindlist on the mindlist

Details

Publication Date: 2018-06-06

Description: Earth science applications of the future will stress the capabilities of even the highest performance supercomputers in the areas of raw compute power, mass storage management, and software environments. These NASA mission critical problems demand usable multi-petaflops and exabyte-scale systems to fully realize their science goals. With an exciting vision of the technologies needed, NASA has established a comprehensive program of advanced research in computer architecture, software tools, and device technology to ensure that, in partnership with US industry, it can meet these demanding requirements with reliable, cost effective, and usable ultra-scale systems. NASA will exploit, explore, and influence emerging high end computing architectures and technologies to accelerate the next generation of engineering, operations, and discovery processes for NASA Enterprises. This article captures this vision and describes the concepts, accomplishments, and the potential payoff of the key thrusts that will help meet the computational challenges in Earth science applications.

Keywords: Computer Operations and Hardware

Format: application/pdf

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

Overview

2

Unknown

Parallel 3D Mortar Element Method for Adaptive Nonconforming Meshes (2004)

VanderWijngaart, Rob ; Biswas, Rupak ; Mavriplis, Catherine ; [et al.]

In: Other Sources

add to mindlist on the mindlist

Details

Publication Date: 2019-07-18

Description: High order methods are frequently used in computational simulation for their high accuracy. An efficient way to avoid unnecessary computation in smooth regions of the solution is to use adaptive meshes which employ fine grids only in areas where they are needed. Nonconforming spectral elements allow the grid to be flexibly adjusted to satisfy the computational accuracy requirements. The method is suitable for computational simulations of unsteady problems with very disparate length scales or unsteady moving features, such as heat transfer, fluid dynamics or flame combustion. In this work, we select the Mark Element Method (MEM) to handle the non-conforming interfaces between elements. A new technique is introduced to efficiently implement MEM in 3-D nonconforming meshes. By introducing an "intermediate mortar", the proposed method decomposes the projection between 3-D elements and mortars into two steps. In each step, projection matrices derived in 2-D are used. The two-step method avoids explicitly forming/deriving large projection matrices for 3-D meshes, and also helps to simplify the implementation. This new technique can be used for both h- and p-type adaptation. This method is applied to an unsteady 3-D moving heat source problem. With our new MEM implementation, mesh adaptation is able to efficiently refine the grid near the heat source and coarsen the grid once the heat source passes. The savings in computational work resulting from the dynamic mesh adaptation is demonstrated by the reduction of the the number of elements used and CPU time spent. MEM and mesh adaptation, respectively, bring irregularity and dynamics to the computer memory access pattern. Hence, they provide a good way to gauge the performance of computer systems when running scientific applications whose memory access patterns are irregular and unpredictable. We select a 3-D moving heat source problem as the Unstructured Adaptive (UA) grid benchmark, a new component of the NAS Parallel Benchmarks (NPB). In this paper, we present some interesting performance results of ow OpenMP parallel implementation on different architectures such as the SGI Origin2000, SGI Altix, and Cray MTA-2.

Keywords: Fluid Mechanics and Thermodynamics

Type: International Conference on Spectral and High Order Methods; Jun 21, 2004 - Jun 25, 2004; RI; United States

Format: text

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

3

Unknown

A Hierarchical and Distributed Approach for Mapping Large Applications to Heterogeneous Grids using Genetic Algorithms (2003)

Sanyal, Soumya ; Jain, Amit ; Das, Sajal K. ; [et al.]

In: CASI

add to mindlist on the mindlist

Details

Publication Date: 2019-07-13

Description: In this paper, we propose a distributed approach for mapping a single large application to a heterogeneous grid environment. To minimize the execution time of the parallel application, we distribute the mapping overhead to the available nodes of the grid. This approach not only provides a fast mapping of tasks to resources but is also scalable. We adopt a hierarchical grid model and accomplish the job of mapping tasks to this topology using a scheduler tree. Results show that our three-phase algorithm provides high quality mappings, and is fast and scalable.

Keywords: Computer Systems

Type: IEEE 5th International Conference on Cluster Computing; Dec 01, 2003 - Dec 04, 2003; Hong Kong; China

Format: text

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

4

Unknown

Unstructured Adaptive Meshes: Bad for Your Memory? (2003)

Biswas, Rupak ; VanderWijngaart, Rob ; Feng, Hui-Yu

In: CASI

add to mindlist on the mindlist

Details

Publication Date: 2019-07-13

Description: This viewgraph presentation explores the need for a NASA Advanced Supercomputing (NAS) parallel benchmark for problems with irregular dynamical memory access. This benchmark is important and necessary because: 1) Problems with localized error source benefit from adaptive nonuniform meshes; 2) Certain machines perform poorly on such problems; 3) Parallel implementation may provide further performance improvement but is difficult. Some examples of problems which use irregular dynamical memory access include: 1) Heat transfer problem; 2) Heat source term; 3) Spectral element method; 4) Base functions; 5) Elemental discrete equations; 6) Global discrete equations. Nonconforming Mesh and Mortar Element Method are covered in greater detail in this presentation.

Keywords: Computer Operations and Hardware

Type: ADAPT03: Conference on Adaptive Methods for PDEs and Large-Scale Computation; Oct 11, 2003 - Oct 12, 2003; Troy, NY; United States

Format: application/pdf

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

Overview

5

Unknown

Parallel Conjugate Gradient: Effects of Ordering Strategies, Programming Paradigms, and Architectural Platforms (2000)

Oliker, Leonid ; Heber, Gerd ; Biswas, Rupak

In: CASI

add to mindlist on the mindlist

Details

Publication Date: 2019-07-13

Description: The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multi-threaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.

Keywords: Computer Programming and Software

Type: Parallel and Distributed Computing Systems; Aug 08, 2000 - Aug 10, 2000; Las Vegas, NV; United States

Format: application/pdf

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

Overview

6

Unknown

Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems (2000)

Li, Xiaoye ; Heber, Gerd ; Oliker, Leonid ; [et al.]

In: CASI

add to mindlist on the mindlist

Details

Publication Date: 2019-07-13

Description: The ability of computers to solve hitherto intractable problems and simulate complex processes using mathematical models makes them an indispensable part of modern science and engineering. Computer simulations of large-scale realistic applications usually require solving a set of non-linear partial differential equations (PDES) over a finite region. For example, one thrust area in the DOE Grand Challenge projects is to design future accelerators such as the SpaHation Neutron Source (SNS). Our colleagues at SLAC need to model complex RFQ cavities with large aspect ratios. Unstructured grids are currently used to resolve the small features in a large computational domain; dynamic mesh adaptation will be added in the future for additional efficiency. The PDEs for electromagnetics are discretized by the FEM method, which leads to a generalized eigenvalue problem Kx = AMx, where K and M are the stiffness and mass matrices, and are very sparse. In a typical cavity model, the number of degrees of freedom is about one million. For such large eigenproblems, direct solution techniques quickly reach the memory limits. Instead, the most widely-used methods are Krylov subspace methods, such as Lanczos or Jacobi-Davidson. In all the Krylov-based algorithms, sparse matrix-vector multiplication (SPMV) must be performed repeatedly. Therefore, the efficiency of SPMV usually determines the eigensolver speed. SPMV is also one of the most heavily used kernels in large-scale numerical simulations.

Keywords: Computer Systems

Type: Irregular; May 01, 2000; Cancun; Mexico

Format: application/pdf

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

Overview

7

Unknown

Unstructured Adaptive (UA) NAS Parallel Benchmark (2004)

Biswas, Rupak ; Feng, Huiyu ; VanderWijngaart, Rob ; [et al.]

In: CASI

add to mindlist on the mindlist

Details

Publication Date: 2019-07-10

Description: We present a complete specification of a new benchmark for measuring the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. It complements the existing NAS Parallel Benchmark suite. The benchmark involves the solution of a stylized heat transfer problem in a cubic domain, discretized on an adaptively refined, unstructured mesh.

Keywords: Mathematical and Computer Sciences (General)

Format: application/pdf

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

Overview

8

Unknown

Task Assignment Heuristics for Parallel and Distributed CFD Applications (2003)

Lopez-Benitez, Noe ; Djomehri, M. Jahed ; Biswas, Rupak

In: CASI

add to mindlist on the mindlist

Details

Publication Date: 2019-07-10

Description: This paper proposes a task graph (TG) model to represent a single discrete step of multi-block overset grid computational fluid dynamics (CFD) applications. The TG model is then used to not only balance the computational workload across the overset grids but also to reduce inter-grid communication costs. We have developed a set of task assignment heuristics based on the constraints inherent in this class of CFD problems. Two basic assignments, the smallest task first (STF) and the largest task first (LTF), are first presented. They are then systematically costs. To predict the performance of the proposed task assignment heuristics, extensive performance evaluations are conducted on a synthetic TG with tasks defined in terms of the number of grid points in predetermined overlapping grids. A TG derived from a realistic problem with eight million grid points is also used as a test case.

Keywords: Fluid Mechanics and Thermodynamics

Format: application/pdf

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

Overview

9

Unknown

Performance Enhancement Strategies for Multi-Block Overset Grid CFD Applications (2003)

Djomehri, M. Jahed ; Biswas, Rupak

In: CASI

add to mindlist on the mindlist

Details

Publication Date: 2019-07-10

Description: The overset grid methodology has significantly reduced time-to-solution of highfidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement strategies on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machinc. Specifically, the role of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Details of a sophisticated graph partitioning technique for grid grouping are also provided. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.

Keywords: Mathematical and Computer Sciences (General)

Format: application/pdf

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

Overview

10

Unknown

Parallel Computing Strategies for Irregular Algorithms (2002)

Oliker, Leonid ; Biswas, Rupak ; Biegel, Bryan ; [et al.]

In: CASI

add to mindlist on the mindlist

Details

Publication Date: 2019-07-10

Description: Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.

Keywords: Computer Programming and Software

Format: application/pdf

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

NASA TECHNICAL REPORTS

S·F·X

Overview