ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

feed icon rss

Your email was sent successfully. Check your inbox.

An error occurred while sending the email. Please try again.

Proceed reservation?

Export
Filter
  • Other Sources  (28)
  • Computer Systems  (14)
  • Computer Programming and Software  (12)
  • Computer Operations and Hardware  (2)
  • Chemistry
  • Electronic structure and strongly correlated systems
  • 2010-2014  (5)
  • 1995-1999  (23)
  • 1
    Publication Date: 2019-06-28
    Description: Dynamic mesh adaption on unstructured grids is a powerful tool for computing unsteady flows that require local grid modifications to efficiently resolve solution features. For this work, we consider an edge-based adaption scheme that has shown good single-processor performance on the C90. We report on our experience parallelizing this code for the SP2. Results show a 47.0X speedup on 64 processors when 10 percent of the mesh is randomly refined. Performance deteriorates to 7.7X when the same number of edges are refined in a highly-localized region. This is because almost all the mesh adaption is confined to a single processor. However, this problem can be remedied by repartitioning the mesh immediately after targeting edges for refinement but before the actual adaption takes place. With this change, the speedup improves dramatically to 43.6X.
    Keywords: Computer Programming and Software
    Type: NASA-CR-201396 , RIACS-TR-96-11 , NAS 1.26:201396
    Format: application/pdf
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 2
    Publication Date: 2019-07-18
    Description: Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We describe a novel method to dynamically balance the processor workloads with a global view. Mesh question, repartitioning, processor assignment, and remapping are critical components of the framework that must be accomplished rapidly and efficiently so as not to cause a significant overhead to the numerical simulation. A data redistribution model will also be presented that predicts the remapping cost. This model is required to determine whether the gain from a balanced workload distribution offsets the cost of data movement. Results presented will demonstrate that this is an effective dynamic load balancing strategy which remains viable on a large number of processors.
    Keywords: Computer Programming and Software
    Type: NEC Europe Ltd. Conference; May 04, 1998 - May 08, 1998; Sankt Augustin; Germany
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 3
    Publication Date: 2019-07-18
    Description: This paper examines the applications most commonly run on the supercomputers at the Numerical Aerospace Simulation (NAS) facility. It analyzes the extent to which such applications are fundamentally oriented to vector computers, and whether or not they can be efficiently implemented on hierarchical memory machines, such as systems with cache memories and highly parallel, distributed memory systems.
    Keywords: Computer Programming and Software
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 4
    Publication Date: 2019-07-13
    Description: In this paper, we present a new approach to constructing a "self-avoiding" walk through a triangular mesh. Unlike the popular approach of visiting mesh elements using space-filling curves which is based on a geometric embedding, our approach is combinatorial in the sense that it uses the mesh connectivity only. We present an algorithm for constructing a self-avoiding walk which can be applied to any unstructured triangular mesh. The complexity of the algorithm is O(n x log(n)), where n is the number of triangles in the mesh. We show that for hierarchical adaptive meshes, the algorithm can be easily parallelized by taking advantage of the regularity of the refinement rules. The proposed approach should be very useful in the run-time partitioning and load balancing of adaptive unstructured grids.
    Keywords: Computer Programming and Software
    Type: 39th Symposium on Foundations of Computer Science; Nov 08, 1998 - Nov 11, 1998; Palo Alto, CA; United States
    Format: application/pdf
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 5
    Publication Date: 2019-07-13
    Description: Classical mesh partitioning algorithms were designed for rather static situations, and their straightforward application in a dynamical framework may lead to unsatisfactory results, e.g., excessive data migration among processors. Furthermore, special attention should be paid to their amenability to parallelization. In this paper, a novel parallel method for the dynamic partitioning of adaptive unstructured meshes is described. It is based on a linear representation of the mesh using self-avoiding walks.
    Keywords: Computer Systems
    Type: IPPS''99; Apr 12, 1999 - Apr 16, 1999; San Juan; Puerto Rico
    Format: application/pdf
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 6
    Publication Date: 2019-07-13
    Description: In this paper we study the performance of the Lustre file system using five scientific and engineering applications representative of NASA workload on large-scale supercomputing systems such as NASA s Pleiades. In order to facilitate the collection of Lustre performance metrics, we have developed a software tool that exports a wide variety of client and server-side metrics using SGI's Performance Co-Pilot (PCP), and generates a human readable report on key metrics at the end of a batch job. These performance metrics are (a) amount of data read and written, (b) number of files opened and closed, and (c) remote procedure call (RPC) size distribution (4 KB to 1024 KB, in powers of 2) for I/O operations. RPC size distribution measures the efficiency of the Lustre client and can pinpoint problems such as small write sizes, disk fragmentation, etc. These extracted statistics are useful in determining the I/O pattern of the application and can assist in identifying possible improvements for users applications. Information on the number of file operations enables a scientist to optimize the I/O performance of their applications. Amount of I/O data helps users choose the optimal stripe size and stripe count to enhance I/O performance. In this paper, we demonstrate the usefulness of this tool on Pleiades for five production quality NASA scientific and engineering applications. We compare the latency of read and write operations under Lustre to that with NFS by tracing system calls and signals. We also investigate the read and write policies and study the effect of page cache size on I/O operations. We examine the performance impact of Lustre stripe size and stripe count along with performance evaluation of file per process and single shared file accessed by all the processes for NASA workload using parameterized IOR benchmark.
    Keywords: Computer Systems
    Type: ARC-E-DAA-TN6025 , HiPC 2012; Dec 18, 2012 - Dec 21, 2012; Pune; India
    Format: application/pdf
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 7
    Publication Date: 2019-07-13
    Description: The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing as it promises high potential. In this paper, we examine the feasibility, performance, and scalability of production quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work represents the comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA s Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.
    Keywords: Computer Systems
    Type: ARC-E-DAA-TN5169 , 14th IEEE International Conferenc eon HPCC-2012; Jun 25, 2012; Liverpool; United Kingdom
    Format: application/pdf
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 8
    Publication Date: 2019-07-12
    Description: From its bold start nearly 30 years ago and continuing today, the NASA Advanced Supercomputing (NAS) facility at Ames Research Center has enabled remarkable breakthroughs in the space agency s science and engineering missions. Throughout this time, NAS experts have influenced the state-of-the-art in high-performance computing (HPC) and related technologies such as scientific visualization, system benchmarking, batch scheduling, and grid environments. We highlight the pioneering achievements and innovations originating from and made possible by NAS resources and know-how, from early supercomputing environment design and software development, to long-term simulation and analyses critical to design safe Space Shuttle operations and associated spinoff technologies, to the highly successful Kepler Mission s discovery of new planets now capturing the world s imagination.
    Keywords: Computer Systems
    Type: ARC-E-DAA-TN4714
    Format: application/pdf
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 9
    Publication Date: 2019-07-18
    Description: Dynamic mesh adaptation on unstructured grids is a powerful tool for computing unsteady three-dimensional problems that require grid modifications to efficiently resolve solution features. By locally refining and coarsening the mesh to capture phenomena of interest, such procedures make standard computational methods more cost effective. Highly refined meshes are required to accurately capture shock waves, contact discontinuities, vortices, and shear layers in fluid flow problems. Adaptive meshes have also proved to be useful in several other areas of computational science and engineering like computer vision and graphics, semiconductor device modeling, and structural mechanics. Local mesh adaptation provides the opportunity to obtain solutions that are comparable to those obtained on globally-refined grids but at a much lower cost. Additional information is contained in the original extended abstract.
    Keywords: Computer Programming and Software
    Type: MASCOTS 1998; Jul 19, 1998 - Jul 24, 1998; Montreal; Canada
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 10
    Publication Date: 2019-07-10
    Description: Understanding the interplay between machines and problems is key to obtaining high performance on parallel machines. This paper investigates the interplay between programming paradigms and communication capabilities of parallel machines. In particular, we explicate the communication capabilities of the IBM SP-2 distributed-memory multiprocessor and the SGI PowerCHALLENGEarray symmetric multiprocessor. Two benchmark problems of bitonic sorting and Fast Fourier Transform are selected for experiments. Communication-efficient algorithms are developed to exploit the overlapping capabilities of the machines. Programs are written in Message-Passing Interface for portability and identical codes are used for both machines. Various data sizes and message sizes are used to test the machines' communication capabilities. Experimental results indicate that the communication performance of the multiprocessors are consistent with the size of messages. The SP-2 is sensitive to message size but yields a much higher communication overlapping because of the communication co-processor. The PowerCHALLENGEarray is not highly sensitive to message size and yields a low communication overlapping. Bitonic sorting yields lower performance compared to FFT due to a smaller computation-to-communication ratio.
    Keywords: Computer Systems
    Type: NAS-96-005
    Format: application/pdf
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
Close ⊗
This website uses cookies and the analysis tool Matomo. More information can be found here...