SHAFFT 1.1.0-alpha
A Scalable High-dimensional Accelerated FFT Library
SHAFFT

SHAFFT is a scalable library for high-dimensional complex-to-complex Fast Fourier Transforms (FFTs) in distributed-memory environments. It implements the slab decomposition method introduced by Dalcin, Mortensen, and Keyes (arXiv:1804.09536), using MPI for communication across a Cartesian process topology.
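
To make the idea of decomposing an axis across processes concrete, the following is a minimal sketch of block partitioning: it splits the indices of one axis into contiguous blocks, one per process. The helper `blockRange` is hypothetical and is not part of the SHAFFT API; SHAFFT's actual partitioning rule may differ (for example in how remainders are assigned).

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helper (not a SHAFFT function): split `n` indices along one
// axis into `parts` contiguous blocks and return {offset, count} for the
// block owned by `rank`. Precondition: parts > 0 and rank < parts.
// Assumption: the first (n % parts) blocks each receive one extra index.
std::pair<std::size_t, std::size_t> blockRange(std::size_t n,
                                               std::size_t parts,
                                               std::size_t rank) {
    const std::size_t base = n / parts;   // minimum block length
    const std::size_t extra = n % parts;  // number of blocks with one more
    const std::size_t count = base + (rank < extra ? 1 : 0);
    const std::size_t offset = rank * base + (rank < extra ? rank : extra);
    return {offset, count};
}
```

With an axis of length 64 split over 4 processes, every rank owns 16 consecutive indices. With length 10 over 4 processes, the block lengths are 3, 3, 2, 2.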

Features

  • N-dimensional distributed FFTs with slab decomposition
  • 1D distributed FFTs via dedicated FFT1D class
  • Flexible process grid topology (1 to N-1 distributed axes)
  • Single and double precision complex transforms
  • Backend-agnostic design with portable buffer API
  • C++, C, and Fortran 2003 interfaces

Supported Backends

Backend   Target   Description
hipFFT    GPU      AMD and NVIDIA GPUs via ROCm/HIP
FFTW      CPU      Multi-threaded CPU execution

Quick Start

Requirements

  • CMake >= 3.21
  • MPI implementation
  • C++17-compatible compiler (GCC >= 10)
  • Backend: FFTW3 (CPU) or ROCm/HIP (GPU)

Build

# FFTW backend (CPU)
cmake -B build -S . \
    -DSHAFFT_ENABLE_FFTW=ON \
    -DCMAKE_INSTALL_PREFIX=/opt/shafft
cmake --build build --target install

# hipFFT backend (GPU)
cmake -B build -S . \
    -DSHAFFT_ENABLE_HIPFFT=ON \
    -DSHAFFT_GPU_AWARE_MPI=OFF \
    -DCMAKE_PREFIX_PATH=/opt/rocm \
    -DCMAKE_INSTALL_PREFIX=/opt/shafft
cmake --build build --target install

Example

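#include <shafft/shafft.hpp>
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    std::vector<int> commDims = {0, 0, 0};  // auto-select
    std::vector<size_t> dims = {64, 64, 32};

    // N-dimensional plan object. The class name FFTND is an assumption; check shafft.hpp.
    shafft::FFTND plan;
    plan.init(commDims, dims, shafft::FFTType::C2C, MPI_COMM_WORLD);
    plan.plan();

    // Allocate backend buffers (assumed to be free functions in namespace shafft) and attach them.
    size_t n = plan.allocSize();
    shafft::complexf *data = nullptr, *work = nullptr;
    shafft::allocBuffer(n, &data);
    shafft::allocBuffer(n, &work);
    plan.setBuffers(data, work);

    // Forward then backward transform, each followed by 1/sqrt(N) scaling.
    plan.execute(shafft::FFTDirection::FORWARD);
    plan.normalize();
    plan.execute(shafft::FFTDirection::BACKWARD);
    plan.normalize();

    plan.release();  // release plan resources before MPI_Finalize
    shafft::freeBuffer(data);
    shafft::freeBuffer(work);
    MPI_Finalize();
    return 0;
}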
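
The plan methods used in the example return an int, and the example ignores those values. A minimal sketch of checking them is shown here. The helper `checkStatus` is hypothetical and not part of SHAFFT, and it assumes that a return value of 0 means success, which this page does not state; consult the User Guide for the actual status convention.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not a SHAFFT function).
// Assumption: a return code of 0 signals success; any other value is an error.
inline void checkStatus(int rc, const char* what) {
    if (rc != 0) {
        throw std::runtime_error(std::string(what) + " failed with status " +
                                 std::to_string(rc));
    }
}
```

Under that assumption, a call such as `checkStatus(plan.plan(), "plan.plan")` would stop at the first failing step instead of continuing with an unusable plan. In an MPI program, calling MPI_Abort in place of throwing may be the more appropriate reaction.
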
Symbols referenced in the example:

  • Plan class (defined at shafft.hpp:51)
    N-dimensional distributed FFT plan with RAII semantics.
  • int init(const std::vector<int>& commDims, const std::vector<size_t>& dimensions, FFTType type, MPI_Comm comm, TransformLayout output = TransformLayout::REDISTRIBUTED) noexcept
    Initialize plan with Cartesian process grid.
  • int plan() noexcept override
    Create backend FFT plans.
  • size_t allocSize() const noexcept override
    Get required buffer size in complex elements.
  • int allocBuffer(size_t count, complexf** buf) noexcept
    Allocate buffer for the current backend.
  • int setBuffers(complexf* data, complexf* work) noexcept
    Attach data and work buffers.
  • int execute(FFTDirection direction) noexcept override
    Execute the FFT.
  • int normalize() noexcept override
    Apply symmetric normalization (1/sqrt(N) per transform).
  • void release() noexcept override
    Release all internal resources.
  • int freeBuffer(complexf* buf) noexcept
    Free buffer allocated with allocBuffer().
  • complexf (defined at shafft_types.hpp:71)
    Single-precision complex type (std::complex<float>).
  • FFTType::C2C
    Single-precision complex-to-complex (float).
  • FFTDirection::FORWARD
    Forward transform (time to frequency domain).
  • FFTDirection::BACKWARD
    Backward/inverse transform (frequency to time domain).
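
The symmetric normalization convention (a factor of 1/sqrt(N) after each transform) means that a forward transform followed by a backward transform reproduces the input, since the two factors multiply to 1/N. The sketch below illustrates this on a single process with a naive O(N^2) DFT. It does not use SHAFFT; all function names are hypothetical, and the sign convention (negative exponent for the forward transform) is an assumption.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive unnormalized DFT, for illustration only.
// sign = -1 is taken as forward, sign = +1 as backward (assumed convention).
std::vector<cplx> naiveDft(const std::vector<cplx>& x, int sign) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<cplx> y(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = sign * 2.0 * pi * static_cast<double>(k * j) /
                                 static_cast<double>(n);
            sum += x[j] * std::polar(1.0, angle);
        }
        y[k] = sum;
    }
    return y;
}

// Scale every element by 1/sqrt(N), where N is the number of elements.
void scaleSymmetric(std::vector<cplx>& x) {
    const double s = 1.0 / std::sqrt(static_cast<double>(x.size()));
    for (auto& v : x) v *= s;
}

// Forward + scale, then backward + scale; returns the input up to rounding.
std::vector<cplx> roundTrip(std::vector<cplx> x) {
    x = naiveDft(x, -1);
    scaleSymmetric(x);
    x = naiveDft(x, +1);
    scaleSymmetric(x);
    return x;
}
```

For a multidimensional transform, N here corresponds to the total number of elements, that is, the product of all dimensions.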

Validation (Optional)

cd build
ctest --output-on-failure

Documentation

Primary documentation:

  • Getting Started - installation and build options
  • User Guide - API usage for C++, C, and Fortran
  • Linking Guide - compile and link instructions
  • Backend Reference - backend-specific configuration
  • Limitations - current constraints

Build local HTML documentation (optional):

doxygen docs/Doxyfile

Then open docs/html/index.html in your browser.

License

MIT License. See LICENSE for details.

Citation

If you use SHAFFT in your research, please cite the underlying method:

@article{dalcin2019fast,
title={Fast parallel multidimensional FFT using advanced MPI},
author={Dalcin, Lisandro and Mortensen, Mikael and Keyes, David E},
journal={Journal of Parallel and Distributed Computing},
volume={128},
pages={137--150},
year={2019},
doi={10.1016/j.jpdc.2019.02.006}
}