TY - JOUR T1 - Memory-Based Self-Ordering FFT for Efficient I/O Scheduling AU - Seke, Erol AU - Kaya, Zeynep PY - 2024 DA - March Y2 - 2024 DO - 10.18038/estubtda.1401022 JF - Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering JO - Estuscience - Se PB - Eskisehir Technical University WT - DergiPark SN - 2667-4211 SP - 156 EP - 167 VL - 25 IS - 1 LA - en AB - A complex-valued self-ordering radix-2 memory-based Fast Fourier Transform (FFT) architecture suitable for low end Field Programmable Gate Arrays (FPGA) is presented. Employing a self-ordering algorithm within the data flow, both input and output data are kept in normal sequential order, not in digit-reversed-order. This way, with an appropriate scheduling, last stage of the FFT and I/O operations are performed in parallel with no wait states. Self-ordering FFT algorithms are generally designed for software implementations. We designed and implemented one on FPGA (hardware), showing that considerable number of clock cycle savings can be obtained compared to unordered FFT counterparts. The approach is implemented on various FPGAs. The results are compared with similar radix-2 architectures in terms of required clock cycles and resource usage, confirming the advantage of the approach. KW - FFT KW - DFT KW - memory-based FFT KW - self-ordering KW - FPGA KW - radix-2 N2 - A complex-valued self-ordering radix-2 memory-based Fast Fourier Transform (FFT) architecture suitable for low end Field Programmable Gate Arrays (FPGA) is presented. Employing a self-ordering algorithm within the data flow, both input and output data are kept in normal sequential order, not in digit-reversed-order. This way, with an appropriate scheduling, last stage of the FFT and I/O operations are performed in parallel with no wait states. Self-ordering FFT algorithms are generally designed for software implementations. We designed and implemented one on FPGA (hardware), showing that considerable number of clock cycle savings can be obtained compared to unordered FFT counterparts. The approach is implemented on various FPGAs. The results are compared with similar radix-2 architectures in terms of required clock cycles and resource usage, confirming the advantage of the approach. CR - [1] Chang YN, Parhi KK. An efficient pipelined FFT architecture, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing 2003; 50-6: 322–325. CR - [2] Chu E, George A. FFT algorithms and their adaptation to parallel processing, Linear Algebra and its Applications 1998; 284-1-3: 95–124. CR - [3] Garrido M. A survey on pipelined FFT hardware architectures, Journal of Signal Processing Systems 2021; https://doi.org/10.1007/s11265-021-01655-1. CR - [4] Garrido M, Grajal J, Sanchez MA. Gustafsson O, Pipelined radix-2k feedforward FFT architectures, IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2013; 21: 23–32. CR - [5] Garrido M, Moller K, Kumm M. World’s fastest FFT architectures: Breaking the barrier of 100 GS/s, IEEE Transactions on Circuits and Systems I: Regular Papers 2019; 66-4: 1507–1516. CR - [6] Garrido M, Sanchez MA, Lopez-Vallejo ML, Grajal J. A 4096-point radix-4 memory-based FFT using DSP slices, IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2016; 25: 375–379. CR - [7] Huang SJ, Chen SG. A high-throughput radix-16 FFT processor with parallel and normal input/output ordering for IEEE 802.15.3C Systems, IEEE Transactions on Circuits and Systems I: Regular Papers 2012; 59-8: 1752–1765. CR - [8] Jo BG, Sunwoo MH. New continuous-flow mixed-radix (CFMR) FFT processor using novel in-place strategy, IEEE Transactions on Circuits and Systems I: Regular Papers 2005; 52-5: 911–919. CR - [9] Johnson H, Burrus C. An in-order, in-place radix-2 FFT, ICASSP '84. IEEE International Conference on Acoustics, Speech, and Signal Processing 1984. CR - [10] Kaya Z, Seke E, A novel addressing algorithm of radix-2 FFT using single-bank dual-port memory, Circuit World 2020; 48: 64–70. CR - [11] Luo HF, Liu Y-J, Shieh M-D..Efficient memory-addressing algorithms for FFT processor design, IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2015; 23-10: 2162–2172. CR - [12] Ma ZG, Yin XB, Yu F. A novel memory-based FFT architecture for real-valued signals based on a radix-2 decimation-in-frequency algorithm, IEEE Transactions on Circuits and Systems II: Express Briefs 2015; 62-9: 876–880. CR - [13] Ma Y, Wanhammar L, A hardware efficient control of memory addressing for high-performance FFT processors, IEEE Transactions on Signal Processing 2000; 48-3; 917–921. CR - [14] Mao XB, Ma ZG, Yu F, Xing QJ. A continuous-flow memory-based architecture for real-valued FFT, IEEE Transactions on Circuits and Systems II: Express Briefs 2017; 64-11; 1352–1356. CR - [15] Nguyen NH, Khan SA, Kim CH, Kim JM. A high-performance, resource-efficient, reconfigurable parallel-pipelined FFT processor for FPGA platforms, Microprocessors and Microsystems 2018; 60: 96–106. CR - [16] Sorokin H, Takala J. Conflict-free parallel access scheme for mixed-radix FFT supporting I/o permutations, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2011. CR - [17] Takala JH, Jarvinen TS, Sorokin HT. Conflict-free parallel memory access scheme for FFT processors, Proceedings of the 2003 International Symposium on Circuits and Systems, 2003. CR - [18] Tsai PY, Lin CY. A generalized conflict-free memory addressing scheme for continuous-flow parallel-processing FFT processors with rescheduling, IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2011; 19-12: 2290–2302. CR - [19] Valencia D, Alimohammad A. Compact and high‐throughput parameterisable architectures for memory‐based FFT algorithms, IET Circuits, Devices and Systems 2019; 13-5: 696–703. CR - [20] Xiao X, Oruklu E, Saniie J, An efficient FFT engine with reduced addressing logic, IEEE Transactions on Circuits and Systems II: Express Briefs 2008; 55-11: 1149–1153. CR - [21] Xiao X, Oruklu E, Saniie J. Fast memory addressing scheme for radix-4 FFT implementation, IEEE International Conference on Electro/Information Technology 2009. UR - https://doi.org/10.18038/estubtda.1401022 L1 - https://dergipark.org.tr/en/download/article-file/3582257 ER -