ISSN:
1573-0484
Keywords:
fast Fourier transform
;
radix-2, 3 and 5
;
distributed-memory parallel computer
;
cyclic distribution
;
all-to-all communication
Source:
Springer Online Journal Archives 1860-2000
Topics:
Computer Science
Notes:
Abstract In this paper, we propose high-performance radix-2, 3 and 5 parallel 1-D complex FFT algorithms for distributed-memory parallel computers. We use the four-step or six-step FFT algorithms to implement the radix-2, 3 and 5 parallel 1-D complex FFT algorithms. In our parallel FFT algorithms, since we use cyclic distribution, all-to-all communication takes place only once. Moreover, the input data and output data are both in natural order. We also show that the suitability of a parallel FFT algorithm is machine-dependent because of the differences in the architecture of the processor elements in distributed-memory parallel computers. Experimental results of 2p3q5r point FFTs on distributed-memory parallel computers, HITACHI SR2201 and IBM SP2 are reported. We succeeded to get performances of about 130 GFLOPS on a 1024PE HITACHI SR2201 and about 1.25 GFLOPS on a 32PE IBM SP2.
Type of Medium:
Electronic Resource
URL:
http://dx.doi.org/10.1023/A:1008160021085
Permalink