ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

hits 1 - 4 | 4 hits

Statistical Analysis of NAS Parallel Benchmarks and LINPACK Results (1994)

Meuer, Hans-Werner ; Lasinski, T. A. ; Strohmeier, Erich ; [et al.]

In: Other Sources

Publication Date: 2019-07-18

Description: In the last three years, extensive performance data have been reported for parallel machines, based both on the NAS Parallel Benchmarks (NPB) and on LINPACK. In this study we have used the reported benchmark results and performed a number of statistical experiments using factor, cluster, and regression analyses. In addition to the performance results of LINPACK and the eight NAS parallel benchmarks, we have also included the peak performance of the machine and the LINPACK n and n_1/2 values. Some of the results and observations can be summarized as follows: 1) All benchmarks are strongly correlated with peak performance. 2) LINPACK and EP each have a unique signature. 3) The remaining NPB can be grouped into three groups as follows: (CG and IS), (LU and SP), and (MG, FT, and BT). Hence three (or four with EP) benchmarks are sufficient to characterize the overall NPB performance. Our poster presentation will follow a standard poster format and will present the data of our statistical analysis in detail.

Keywords: Computer Systems

Type: Supercomputing 1994; Nov 14, 1994 - Nov 18, 1994; Washington, DC; United States

Format: text

NASA TECHNICAL REPORTS

Six Years of Parallel Computing at NAS (1987 - 1993): What Have we Learned? (1994)

Simon, Horst D. ; Cooper, D. M.

In: Other Sources

Publication Date: 2019-07-17

Description: In the fall of 1987 the age of parallelism at NAS began with the installation of a 32K processor CM-2 from Thinking Machines. In 1987 this was described as an "experiment" in parallel processing. In the six years since, NAS acquired a series of parallel machines and conducted an active research and development effort focused on the use of highly parallel machines for applications in the computational aerosciences. In this time period parallel processing for scientific applications evolved from a fringe research topic into one of the main activities at NAS. In this presentation I will review the history of parallel computing at NAS in the context of the major progress that has been made in the field in general. I will attempt to summarize the lessons we have learned so far and the contributions NAS has made to the state of the art. Based on these insights I will comment on the current state of parallel computing (including the HPCC effort) and try to predict some trends for the next six years.

Keywords: Computer Systems

Type: Cray Research/CUG Board of Directors; Apr 10, 1994 - Apr 13, 1994; United States

Format: text

NASA TECHNICAL REPORTS

HARP: A Dynamic Inertial Spectral Partitioner (1997)

Simon, Horst D. ; Sohn, Andrew ; Biswas, Rupak

In: CASI

Publication Date: 2019-07-13

Description: Partitioning unstructured graphs is central to the parallel solution of computational science and engineering problems. Spectral partitioners, such as recursive spectral bisection (RSB), have proven effective in generating high-quality partitions of realistically-sized meshes. The major problem that hindered their widespread use was their long execution times. This paper presents a new inertial spectral partitioner, called HARP. The main objective of the proposed approach is to quickly partition the meshes at runtime in a manner that works efficiently for real applications in the context of distributed-memory machines. The underlying principle of HARP is to find the eigenvectors of the unpartitioned vertices and then project them onto the eigenvectors of the original mesh. Results for various meshes ranging in size from 1000 to 100,000 vertices indicate that HARP can indeed partition meshes rapidly at runtime. Experimental results show that our largest mesh can be partitioned sequentially in only a few seconds on an SP2, which is several times faster than other spectral partitioners, while maintaining the solution quality of the proven RSB method. A parallel MPI version of HARP has also been implemented on IBM SP2 and Cray T3E. Parallel HARP, running on 64 processors of the SP2 and T3E, can partition a mesh containing more than 100,000 vertices into 64 subgrids in about half a second. These results indicate that graph partitioning can now be truly embedded in dynamically-changing real-world applications.

Keywords: Computer Systems

Type: NASA-CR-204489 , NAS 1.26:204489 , RIACS-TR-97-01 , Parallel Algorithms and Architectures; Jun 22, 1997 - Jun 25, 1997; Newport, RI; United States

Format: application/pdf

NASA TECHNICAL REPORTS

Impact of Load Balancing on Unstructured Adaptive Grid Computations for Distributed-Memory Multiprocessors (1996)

Simon, Horst D. ; Biswas, Rupak ; Sohn, Andrew

In: CASI

Publication Date: 2019-07-13

Description: The computational requirements for an adaptive solution of unsteady problems change as the simulation progresses. This causes workload imbalance among processors on a parallel machine which, in turn, requires significant data movement at runtime. We present a new dynamic load-balancing framework, called JOVE, that balances the workload across all processors with a global view. Whenever the computational mesh is adapted, JOVE is activated to eliminate the load imbalance. JOVE has been implemented on an IBM SP2 distributed-memory machine in MPI for portability. Experimental results for two model meshes demonstrate that mesh adaption with load balancing gives more than a sixfold improvement over one without load balancing. We also show that JOVE gives a 24-fold speedup on 64 processors compared to sequential execution.

Keywords: Computer Systems

Type: NASA-TM-112034 , NAS 1.15:112034 , NAS-96-012 , IEEE Symposium on Parallel and Distributed Processing (SPDP'96); Oct 23, 1996 - Oct 26, 1996; New Orleans, LA; United States

Format: application/pdf

NASA TECHNICAL REPORTS
