ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

11

Unknown

Efficient Load Balancing and Data Remapping for Adaptive Grid Calculations (1997)

Biswas, Rupak ; Oliker, Leonid

In: CASI

Publication Date: 2019-07-13

Description: Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We present a novel method to dynamically balance the processor workloads with a global view. This paper presents, for the first time, the implementation and integration of all major components within our dynamic load balancing strategy for adaptive grid calculations. Mesh adaption, repartitioning, processor assignment, and remapping are critical components of the framework that must be accomplished rapidly and efficiently so as not to cause a significant overhead to the numerical simulation. Previous results indicated that mesh repartitioning and data remapping are potential bottlenecks for performing large-scale scientific calculations. We resolve these issues and demonstrate that our framework remains viable on a large number of processors.

Keywords: Computer Systems

Type: NASA-CR-204487 , NAS 1.26:204487 , RIACS-TR-97-03 , Parallel Algorithms and Architectures; Jun 22, 1997 - Jun 25, 1997; Newport, RI; United States

Format: application/pdf

NASA TECHNICAL REPORTS

12

Unknown

Efficient Parallelization of a Dynamic Unstructured Application on the Tera MTA (1999)

Biswas, Rupak ; Oliker, Leonid

In: CASI

Publication Date: 2019-07-13

Description: The success of parallel computing in solving real-life computationally-intensive problems relies on their efficient mapping and execution on large-scale multiprocessor architectures. Many important applications are both unstructured and dynamic in nature, making their efficient parallel implementation a daunting task. This paper presents the parallelization of a dynamic unstructured mesh adaptation algorithm using three popular programming paradigms on three leading supercomputers. We examine an MPI message-passing implementation on the Cray T3E and the SGI Origin2000, a shared-memory implementation using cache coherent nonuniform memory access (CC-NUMA) of the Origin2000, and a multi-threaded version on the newly-released Tera Multi-threaded Architecture (MTA). We compare several critical factors of this parallel code development, including runtime, scalability, programmability, and memory overhead. Our overall results demonstrate that multi-threaded systems offer tremendous potential for quickly and efficiently solving some of the most challenging real-life problems on parallel computers.

Keywords: Computer Systems

Type: Supercomputing; Nov 13, 1999 - Nov 19, 1999; Portland, OR; United States

Format: application/pdf

NASA TECHNICAL REPORTS

13

Unknown

Beyond the NAS Parallel Benchmarks: Measuring Dynamic Program Performance and Grid Computing Applications (2001)

Biegel, Bryan ; VanderWijngaart, Rob F. ; Feng, Huiyu ; [et al.]

In: CASI

Publication Date: 2019-07-13

Description: The contents include: 1) A brief history of NPB; 2) What is (not) being measured by NPB; 3) Irregular dynamic applications (UA Benchmark); and 4) Wide area distributed computing (NAS Grid Benchmarks-NGB). This paper is presented in viewgraph form.

Keywords: Computer Systems

Type: Workshop on the Performance Characterization of Algorithms; Jul 16, 2001 - Jul 17, 2001; Oakland, CA; United States

Format: application/pdf

NASA TECHNICAL REPORTS

14

Unknown

Satisfiability Test with Synchronous Simulated Annealing on the Fujitsu AP1000 Massively-Parallel Multiprocessor (1996)

Sohn, Andrew ; Biswas, Rupak

In: CASI

Publication Date: 2019-07-13

Description: Solving the hard Satisfiability Problem is time consuming even for modest-sized problem instances. Solving the Random L-SAT Problem is especially difficult due to the ratio of clauses to variables. This report presents a parallel synchronous simulated annealing method for solving the Random L-SAT Problem on a large-scale distributed-memory multiprocessor. In particular, we use a parallel synchronous simulated annealing procedure, called Generalized Speculative Computation, which guarantees the same decision sequence as sequential simulated annealing. To demonstrate the performance of the parallel method, we have selected problem instances varying in size from 100-variables/425-clauses to 5000-variables/21,250-clauses. Experimental results on the AP1000 multiprocessor indicate that our approach can satisfy 99.9 percent of the clauses while giving almost a 70-fold speedup on 500 processors.

Keywords: Computer Systems

Type: NASA-CR-200964 , NAS 1.26:200964 , RIACS-TR-96-07 , 10th ACM International Conference on Supercomputing; May 25, 1996 - May 28, 1996; Philadelphia, PA; United States

Format: application/pdf

NASA TECHNICAL REPORTS

15

Unknown

Global Load Balancing with Parallel Mesh Adaption on Distributed-Memory Systems (1996)

Sohn, Andrew ; Biswas, Rupak ; Oliker, Leonid

In: CASI

Publication Date: 2019-07-13

Description: Dynamic mesh adaption on unstructured grids is a powerful tool for efficiently computing unsteady problems to resolve solution features of interest. Unfortunately, this causes load imbalance among processors on a parallel machine. This paper describes the parallel implementation of a tetrahedral mesh adaption scheme and a new global load balancing method. A heuristic remapping algorithm is presented that assigns partitions to processors such that the redistribution cost is minimized. Results indicate that the parallel performance of the mesh adaption code depends on the nature of the adaption region and show a 35.5X speedup on 64 processors of an SP2 when 35% of the mesh is randomly adapted. For large-scale scientific computations, our load balancing strategy gives almost a sixfold reduction in solver execution times over non-balanced loads. Furthermore, our heuristic remapper yields processor assignments that are less than 3% off the optimal solutions but requires only 1% of the computational time.

Keywords: Computer Systems

Type: NASA-CR-202186 , NAS 1.26:202186 , RIACS-TR-96-16 , Supercomputing 1996; Nov 17, 1996 - Nov 22, 1996; Pittsburgh, PA; United States

Format: application/pdf

NASA TECHNICAL REPORTS

16

Unknown

NASA Advanced Computing Environment for Science and Engineering (2015)

Biswas, Rupak

In: CASI

Publication Date: 2019-07-13

Description: An overview regarding NASA Ames Advanced Computing Systems. This presentation will discuss the modeling, simulation, analysis, and decision-making in relation to the Advanced Computing Systems.

Keywords: Computer Systems

Type: ARC-E-DAA-TN22019 , High Performance Computing Day; Apr 06, 2015; Blacksburg, VA; United States

Format: application/pdf

NASA TECHNICAL REPORTS

17

Unknown

Hybrid Systems Diagnosis (2005)

Gupta, Vineet ; McIlraith, Sheila ; Clancy, Dan ; [et al.]

In: CASI

Publication Date: 2019-07-11

Description: This paper reports on an ongoing project to investigate techniques to diagnose complex dynamical systems that are modeled as hybrid systems. In particular, we examine continuous systems with embedded supervisory controllers that experience abrupt, partial or full failure of component devices. We cast the diagnosis problem as a model selection problem. To reduce the space of potential models under consideration, we exploit techniques from qualitative reasoning to conjecture an initial set of qualitative candidate diagnoses, which induce a smaller set of models. We refine these diagnoses using parameter estimation and model fitting techniques. As a motivating case study, we have examined the problem of diagnosing NASA's Sprint AERCam, a small spherical robotic camera unit with 12 thrusters that enable both linear and rotational motion.

Keywords: Computer Systems

Format: application/pdf

NASA TECHNICAL REPORTS

18

Unknown

Design of Unstructured Adaptive (UA) NAS Parallel Benchmark Featuring Irregular, Dynamic Memory Accesses (2001)

Biswas, Rupak ; Feng, Hui-Yu ; Biegel, Bryan ; [et al.]

In: CASI

Publication Date: 2019-07-10

Description: We describe the design of a new method for the measurement of the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. The method involves the solution of a stylized heat transfer problem on an unstructured, adaptive grid. A Spectral Element Method (SEM) with an adaptive, nonconforming mesh is selected to discretize the transport equation. The relatively high order of the SEM lowers the fraction of wall clock time spent on inter-processor communication, which eases the load balancing task and allows us to concentrate on the memory accesses. The benchmark is designed to be three-dimensional. Parallelization and load balance issues of a reference implementation will be described in detail in future reports.

Keywords: Computer Systems

Format: application/pdf

NASA TECHNICAL REPORTS

19

Unknown

Message Passing and Shared Address Space Parallelism on an SMP Cluster (2002)

Singh, Jaswinder P. ; Biswas, Rupak ; Biegel, Bryan ; [et al.]

In: CASI

Publication Date: 2019-07-10

Description: Currently, message passing (MP) and shared address space (SAS) are the two leading parallel programming paradigms. MP has been standardized with MPI, and is the more common and mature approach; however, code development can be extremely difficult, especially for irregularly structured computations. SAS offers substantial ease of programming, but may suffer from performance limitations due to poor spatial locality and high protocol overhead. In this paper, we compare the performance of and the programming effort required for six applications under both programming models on a 32-processor PC-SMP cluster, a platform that is becoming increasingly attractive for high-end scientific computing. Our application suite consists of codes that typically do not exhibit scalable performance under shared-memory programming due to their high communication-to-computation ratios and/or complex communication patterns. Results indicate that SAS can achieve about half the parallel efficiency of MPI for most of our applications, while being competitive for the others. A hybrid MPI+SAS strategy shows only a small performance advantage over pure MPI in some cases. Finally, improved implementations of two MPI collective operations on PC-SMP clusters are presented.

Keywords: Computer Systems

Format: application/pdf

NASA TECHNICAL REPORTS

20

Unknown

Parallel Processing of Adaptive Meshes with Load Balancing (2001)

Harvey, Daniel J. ; Das, Sajal K. ; Biswas, Rupak ; [et al.]

In: CASI

Publication Date: 2019-07-10

Description: Many scientific applications involve grids that lack a uniform underlying structure. These applications are often also dynamic in nature in that the grid structure significantly changes between successive phases of execution. In parallel computing environments, mesh adaptation of unstructured grids through selective refinement/coarsening has proven to be an effective approach. However, achieving load balance while minimizing interprocessor communication and redistribution costs is a difficult problem. Traditional dynamic load balancers are mostly inadequate because they lack a global view of system loads across processors. In this paper, we propose a novel and general-purpose load balancer that utilizes symmetric broadcast networks (SBN) as the underlying communication topology, and compare its performance with a successful global load balancing environment, called PLUM, specifically created to handle adaptive unstructured applications. Our experimental results on an IBM SP2 demonstrate that the SBN-based load balancer achieves lower redistribution costs than PLUM by overlapping processing and data migration.

Keywords: Computer Systems

Format: application/pdf

NASA TECHNICAL REPORTS
