Sun's Performance Workshop Fortran includes UltraSPARC FFT routines in the Performance Library. This library is reportedly faster than FFTW in double precision but still slower than djbfft. Sun's mediaLib includes UltraSPARC VIS FFT routines, which are presumably faster than djbfft for low-precision FFTs. (If you're from Sun and have more comprehensive benchmarks, please let me know.)
Compaq's DIGITAL Extended Math Library includes Alpha FFT routines.
IBM's Engineering and Scientific Subroutine Library includes PowerPC FFT routines.
SGI's SGI/Cray Scientific Library includes MIPS R10000 and MIPS R12000 FFT routines. There also appear to be separate FFT routines in the SGI C library and the Cray C library, but I haven't found home pages for these libraries.
The FFTW authors have been working on pfftw, an asm imitation of djbfft.
I've heard about several unpublished asm FFT projects for various chips.