# White Paper

**FPGA** 



# Power and Performance Analysis of Finite Impulse Response (FIR) Filters and Fast Fourier Transforms (FFT) on Agilex™ 7 FPGAs

# Authors

Michael Wu FPGA Core Architect

Maanasa Mohanambal Sathianarayanan Technical Marketing Manager

Grace Zgheib FPGA Core Architect

Ilya Ganusov Senior Principal Engineer

**Altera Corporation** 

### **Table of Contents**

- Introduction
- Agilex FPGA DSP Architecture Background
- Benchmarking Methodology
- FIR and FFT Benchmark Selection
- Experimental Setup
- Benchmarking Results
- FIR Results: Agilex 7 FPGA Fabric vs. AMD Versal FPGA Fabric
- FIR Results: Agilex 7 FPGA Fabric vs. AMD Versal AIE
- FFT Results: Agilex 7 FPGA Fabric vs. AMD Versal FPGA Fabric
- FFT Results: Agilex 7 FPGA Fabric vs. AMD Versal AIE
- Conclusion
- References

The power and performance efficiency of digital signal processing (DSP) workloads play a significant role in the evolution of modern-day technology. This paper benchmarks the DSP performance on Agilex™ 7 FPGAs [1] [2] using finite impulse response (FIR) filters and fast Fourier transform (FFT) designs. It also analyzes publicly available results from AMD and compares the power and performance efficiency of several FIR and FFT workloads on Agilex 7 FPGAs and AMD's Versal\* FPGAs and artificial intelligence engines (AIE) [3] [4].

The FIR benchmark results show that, on average, Agilex 7 FPGAs deliver:

- 1.52X better performance per watt compared to the AMD Versal FPGA fabric.
- 2.09X better performance per watt compared to the AMD Versal AI Engine.

The FFT benchmark results show that, on average, Agilex 7 FPGAs deliver:

- 1.65X better performance per watt compared to the AMD Versal FPGA fabric.
- 1.36X better performance per watt compared to the AMD Versal AIE.

### Introduction

An ever-increasing customer demand for improved DSP performance has led FPGA manufacturers to continue to scale up their hardware specifications and add more computing power. This translates into a generation-to-generation performance enhancement while maintaining certain power budget and physical constraint requirements.

Analyzing DSP architecture performance is crucial to ensure Altera® devices perform up to standard and meet customer expectations, especially when used in high-speed real-time processing solutions such as multiple-input multiple-output (MIMO) beamformers, radar, and medical systems. One of the most common ways to evaluate DSP performance is to benchmark FIR filters and FFTs on the FPGA. This paper compares the power and performance of FIR and FFT implementations on an Agilex 7 FPGA against the AMD Versal FPGA fabric and AIE, an array of processors based on a very long instruction word (VLIW) architecture. This analysis highlights the performance advantages of the Agilex 7 FPGAs and will help customers identify how these FPGAs can meet their design-specific requirements.

### Agilex FPGA DSP Architecture Background

The Agilex 7 FPGAs and SoCs [2] carry over the variable-precision DSP architecture from previous FPGAs with hard fixed-point and IEEE 754-compliant floating-point capabilities [5].

Customers can configure the DSP blocks in fixed-point mode to support signal processing with multiple precision options ranging from 9×9 to 54×54. An increased 9×9 multiplier count, with three 9×9 multipliers for every 18×19 multiplier, is supported for specialized use cases. Each DSP block can be configured as four 9×9, two 18×19, or one 27×27 multiply-accumulate block.

The variable-precision DSP supports the single-precision 32-bit arithmetic FP32 floating-point mode, half-precision 16-bit arithmetic FP16 and FP19 floating-point modes, as well as the BFLOAT16 floating-point format to perform floating-point addition, multiplication, multiply-add, and multiply-accumulate operations. With a dedicated 64-bit cascade bus, the user can cascade multiple variable-precision DSP blocks to implement even higher-precision DSP functions efficiently.

#### **Benchmarking Methodology**

The primary goal of this analysis is to provide a fair comparison against competing FPGAs. In this paper, we review AMD's published FIR and FFT implementations on the Versal FPGA fabric, AIE, and AIE-ML, an AIE version optimized for machine learning, and compare those to an implementation on the Agilex 7 FPGA fabric. The experiments are carefully designed to closely match the experimental setup of the published data for the AMD Versal FPGA fabric and to maximize the performance of all the devices, including the AMD Versal AIE.

#### FIR and FFT Benchmark Selection

For the FIR performance analysis, we identify a set of designs with varying sizes and complexity from the public data published by AMD [6] [7] that can be replicated using DSP Builder while covering a wide range of FIR functional parameters. DSP Builder is a digital signal processing design tool that generates hardware description language (HDL) code for DSP algorithms directly from the MathWorks Simulink\* environment for implementation on FPGAs [8]. The tool generates high-quality, synthesizable VHDL/Verilog code from MATLAB\* functions and Simulink models. The generated register transfer level (RTL) code can be used to implement the designs on the FPGA.

The FIR evaluation is designed to cover a broad set of configurations, including various channel counts, tap counts, sampling periods, and filter types. The evaluated set of configurations is not comprehensive, especially when considering all the possible combinations within this domain; however, the obtained results and conclusions provide sufficient insights into the relative performance of the Agilex 7 FPGA with respect to the competing devices when implementing similar designs or designs with comparable functionalities.

Table 1 shows the FIR configurations chosen for the FPGA fabric-to-fabric comparison, and Table 2 shows the configurations implemented for the FPGA fabric vs. AIE comparison. For the fabric-to-fabric comparison, we create designs that achieve the same throughput, which is usually tied to the input sample clock. For example, with a 614 MHz input sample rate, the expected throughput is 614 mega samples per second (MSPS) for a single-rate filter. However, the AIE evaluation requires a different approach because the AI engine resembles a processor-based system where the frequency is fixed at 1 GHz, and the achieved performance is not directly proportional to the operating frequency. For a meaningful comparison, the filters are therefore designed to achieve similar MSPS in the Altera and AMD implementations.
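The relationship between clock frequency, sample period, and throughput can be sketched as follows. This is a minimal Python illustration rather than part of the benchmark flow. The function names are hypothetical, and the sketch assumes that a sample period is expressed in clock cycles per sample, which is how we read the Sample Period column of Table 1.

```python
from fractions import Fraction


def input_rate_msps(clock_mhz, sample_period_cycles):
    """Input sample rate in MSPS for one sample every `sample_period_cycles` clocks."""
    if sample_period_cycles <= 0:
        raise ValueError("sample period must be a positive number of clock cycles")
    return Fraction(clock_mhz) / sample_period_cycles


def output_rate_msps(input_msps, interpolation=1, decimation=1):
    """Output sample rate after resampling by interpolation/decimation."""
    return Fraction(input_msps) * interpolation / decimation


# Single-rate filter clocked at 614 MHz with one new sample per clock.
single_rate = output_rate_msps(input_rate_msps(614, 1))
print(float(single_rate))  # 614.0

# Hypothetical interpolate-by-5 filter with one input sample every 5 clocks.
interp = output_rate_msps(input_rate_msps(614, 5), interpolation=5)
print(float(interp))  # 614.0
```

Exact fractions are used so that rate changes such as 5/2 do not accumulate floating-point error before the final conversion.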

FIR filters are fundamental blocks that are found in many larger designs. Most of these filter designs are small, allowing us to match our test cases closely with the competing designs. All the filters are implemented with complex input data and real coefficient types while targeting the industry-standard 614 MHz frequency [9]. They are also implemented with constant coefficients to match the implementations used in the published AMD results.

Table 3 summarizes the configurations chosen for comparing FFT designs. We sweep across 11 FFT configurations by varying the transform length from 32 to 32K in powers of two, matching the sizes listed in Table 6. The designs are instantiated with the FFT FPGA IP core using DSP Builder for the Agilex 7 FPGA fabric and with the Xilinx LogiCORE\* IP FFT core v9.1 for the AMD Versal FPGA fabric. The AMD Versal FPGA fabric analysis includes a complete software parameter sweep, while the comparison with AMD's AIE and AIE-ML relies on the published data. The designs are implemented using 16-bit input data precision, 16-bit twiddle precision, and full-word-growth intermediate data, given that AIEs can only use 16 bits at a given time, as mentioned in Table 3.
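To illustrate what full-word-growth intermediate data implies for a 16-bit input, the sketch below estimates the output word width for each transform length in the sweep. It assumes the commonly quoted worst-case bound of log2(N) + 1 bits of growth for an N-point FFT. The function name is hypothetical, and the widths actually produced by each vendor's IP core may differ.

```python
def full_growth_output_bits(input_bits, fft_size):
    """Worst-case output width, assuming log2(N) + 1 bits of growth (an assumption)."""
    if fft_size < 2 or fft_size & (fft_size - 1):
        raise ValueError("fft_size must be a power of two")
    stages = fft_size.bit_length() - 1  # exact log2 for a power of two
    return input_bits + stages + 1


# Transform lengths from 32 to 32K in powers of two.
for size in (2 ** k for k in range(5, 16)):
    print(f"{size:>5}-point FFT: {full_growth_output_bits(16, size)} bits")
```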

#### **Experimental Setup**

The Quartus® Prime Software Suite [10] version 23.1 and Xilinx Vivado\* Design Suite [11] version 2022.2 are used in this evaluation, along with their respective power estimator tools – the FPGA Power and Thermal Calculator (PTC) and Xilinx Power Estimator (XPE) version 2022.2 for the FIR analysis, and AMD's Power Design Manager (PDM) version 2023.1 for the FFT analysis. The CAD flows of these tools can be customized to trade off design performance, logic resource consumption, compile time, and memory utilization. The customized settings that produce the best results for one design are not necessarily the best for others. As such, the analysis was done using the default compilation settings for both tools.

To conduct these experiments, we use an Agilex 7 device with a similar speed grade and comparable logic density to AMD's Versal VC1902 device, as it is the device used in the published AMD results. More specifically, the devices used in our experiments are:

- Agilex 7 device AGFA027R25A3I3E
- AMD Versal AIE device XCVC1902-VSVA2197-1LP-I-L
- AMD Versal AIE-ML device XCVE2802-NSVH1369-1LP-I-L

The FIR and FFT designs are implemented on the targeted devices using their respective tool chains, and the performance, power, and resource utilization are measured and compared with a toggle rate of 20%. Only the dynamic power consumption for each design is reported because the power estimation tools from both Altera and AMD report static power for the entire device rather than the static power consumed specifically by the instantiated FIR or FFT design. Many of the FIR and FFT designs in the benchmark suite occupy a small fraction of the devices they are implemented on, and therefore, the static power of unused logic and unrelated IP cores would dominate the overall reported power consumption. In practice, full customer designs tend to utilize most of the FPGA fabric, with dynamic power dominating the static power in both the Altera and AMD Versal fabric/AIE implementations. Therefore, the dynamic power is representative of the total power consumption. In addition, constant overhead can dominate the dynamic power used for small designs. Therefore, we duplicate small designs several times (for example, 10X to 100X) to achieve reasonably high device resource utilization and report the dynamic power per instance. This approach helps reduce the overhead cost associated with small designs. It is important to note that Altera has not verified the power reports from the AMD tools.
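The effect of duplicating a small design before reporting per-instance dynamic power can be sketched as follows. The overhead and per-design power values are hypothetical and chosen only to illustrate the amortization. They are not measurements from this study, and the function name is ours.

```python
def per_instance_dynamic_power(total_dynamic_w, instances):
    """Dynamic power attributed to one copy of a duplicated design."""
    if instances < 1:
        raise ValueError("at least one instance is required")
    return total_dynamic_w / instances


# Hypothetical values: a fixed overhead plus a per-design contribution.
OVERHEAD_W = 0.050
DESIGN_W = 0.010

for copies in (1, 10, 100):
    total = OVERHEAD_W + copies * DESIGN_W
    share = per_instance_dynamic_power(total, copies)
    print(f"{copies:>3} copies: {share:.4f} W per instance")
```

As the number of copies grows, the per-instance figure approaches the contribution of the design itself because the fixed overhead is spread over more instances.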

| Filter | Filter Type   | Coefficient Vector   | Interpolation<br>Rate | Decimation<br>Rate | Number of<br>Channels | Rate Specification   | Sample<br>Period |
|--------|---------------|----------------------|-----------------------|--------------------|-----------------------|----------------------|------------------|
| 1      | Decimation    | Symmetric 102 tap    | 1                     | 5                  | 1                     | Output_Sample_Period | 5                |
| 2      | Decimation    | Symmetric 102 tap    | 1                     | 5                  | 1                     | Output_Sample_Period | 1                |
| 3      | Interpolation | Symmetric 102 tap    | 5                     | 2                  | 8                     | Input_Sample_Period  | 20               |
| 4      | Interpolation | Symmetric 102 tap    | 5                     | 1                  | 1                     | Input_Sample_Period  | 5                |
| 5      | Interpolation | Symmetric 102 tap    | 5                     | 1                  | 1                     | Input_Sample_Period  | 1                |
| 6      | Single Rate   | Symmetric 102 tap    | 1                     | 1                  | 1                     | Input_Sample_Period  | 4                |
| 7      | Decimation    | Nonsymmetric 102 tap | 1                     | 5                  | 1                     | Output_Sample_Period | 20               |
| 8      | Interpolation | Nonsymmetric 102 tap | 5                     | 1                  | 1                     | Input_Sample_Period  | 20               |

Table 1. FIR configurations for the Agilex 7 FPGA Fabric vs. AMD Versal FPGA Fabric Comparison

| Filter | Filter Type   | Coefficient Vector           | Interpolation Rate | Decimation Rate        | Number of AIE | MSPS |
|--------|---------------|------------------------------|--------------------|------------------------|---------------|------|
| 1      | Decimation    | Symmetric 99 tap             | 1                  | 3                      | 1             | 503  |
| 2      | Decimation    | Symmetric 99 tap (half band) | 1                  | 2                      | 1             | 523  |
| 3      | Resampler     | Symmetric 256 tap            | 3                  | 2                      | 1             | 123  |
| 4      | Interpolation | Asymmetric 128 tap           | 2                  | 1                      | 1             | 128  |
| 5      | Interpolation | Symmetric 99 tap (half band) | 2                  | 1                      | 1             | 181  |
| 6      | Single Rate   | Symmetric 128 tap            | 1                  | 1                      | 1             | 145  |

Table 2. FIR configurations for the Agilex 7 FPGA Fabric vs. AMD Versal AIE Comparison

| Design Configurations | Agilex™ 7 FPGA Fabric | AMD Versal\* Fabric | AMD Versal AIE |
|------------------------------|------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|---------------------------------------|
| FFT Sizes | 32 to 32K | 32 to 32K | 32 to 32K |
| IP Source                    | DSPBA 2023.1<br>FFT FPGA IP core                                                                                                   | Xilinx LogiCORE* IP FFT core v9.1                                                           | Vitis DSP 2023.1 QoR<br>Documentation |
| FFT Implementation           | Fabric to fabric: Streaming FFT,<br>channel=1, 1 sample per cycle<br>Fabric vs AIE/AIE-ML: Streaming or<br>parallel FFT, channel=1 | Pipelined-streaming I/O<br>1 sample per cycle, 4 DSPs per CMAC<br>for the highest frequency | Window API,<br>batch = 1              |
| Runtime Configurable Size    | No                                                                                                                                 | No                                                                                          | No                                    |
| Input Data Precision         | 16 bit                                                                                                                             | 16 bit                                                                                      | 16 bit                                |
| Twiddle Precision            | 16 bit                                                                                                                             | 16 bit                                                                                      | 16 bit                                |
| Intermediate Data            | Full-word-growth                                                                                                                   | Unscaled (full-word-growth)                                                                 | 32 bit                                |

Table 3. FFT configurations for the Agilex 7 FPGA Fabric vs. AMD Versal Devices

#### **Benchmarking Results**

The following section compares the FIR filters and FFT designs implemented on the Agilex 7 FPGA, AMD Versal FPGA fabric, and AMD Versal AIE.

#### FIR Results: Agilex 7 FPGA Fabric vs. AMD Versal FPGA Fabric

The following section covers the FIR results from our Altera fabric vs. AMD Versal fabric study [7]. The Altera fabric achieves a higher maximum frequency (Fmax) on every filter configuration, as shown in Figure 1. Note that we report an unrestricted Fmax metric here, which indicates the maximum frequency limit imposed by the FPGA logic and routing and does not account for the internal Fmax limit of the DSP and memory blocks. Figure 2 compares dynamic power with the clock fixed at 614 MHz and shows that the Agilex 7 FPGA fabric achieves lower dynamic power than the AMD fabric on seven of the eight designs (Filter 3 is the exception, as detailed in Table 4). Figure 3 shows that the Agilex 7 FPGA fabric delivers higher performance per watt than the AMD Versal FPGA fabric on seven out of eight FIR designs, with a geomean of 1.52X at 614 MHz.







Figure 2. FIR Dynamic Power Comparison – Fabric vs. Fabric



Figure 3. FIR Performance per Watt Comparison – Fabric vs. Fabric

Table 4 details the performance, power, and performance per watt ratios of the Agilex 7 FPGA fabric vs. AMD fabric results across all FIR filters. The color-coding scheme highlights in green the cases where the Altera results are better than AMD's by more than 10% and in pink the cases where they are worse by more than 10%, while everything within the +/-10% range remains in white. On average, the Altera implementations offer 28% higher performance (Fmax) or 34% lower dynamic power, resulting in a 1.52X average performance per watt improvement over the competing device.

Agilex™ 7 FPGA Fabric / AMD Versal FPGA Fabric (Ratio)

|          | Fmax Ratio | Dynamic Power Ratio | Performance per Watt Ratio |
|----------|------------|---------------------|----------------------------|
| Filter 1 | 1.22       | 0.43                | 2.31                       |
| Filter 2 | 1.34       | 0.88                | 1.14                       |
| Filter 3 | 1.15       | 1.11                | 0.90                       |
| Filter 4 | 1.20       | 0.49                | 2.03                       |
| Filter 5 | 1.45       | 0.71                | 1.42                       |
| Filter 6 | 1.24       | 0.44                | 2.25                       |
| Filter 7 | 1.34       | 0.83                | 1.21                       |
| Filter 8 | 1.36       | 0.64                | 1.57                       |
| Geomean  | 1.28       | 0.66                | 1.52                       |

Table 4. FIR Results: FPGA Fabric vs. AMD FPGA Fabric Ratios
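The geomean row of Table 4 can be reproduced from the per-filter ratios with a few lines of Python. This is a minimal sketch that uses the rounded values printed in the table, so it only confirms the averages to two decimal places. The helper name `geomean` is ours.

```python
import math


def geomean(values):
    """Geometric mean, computed in the log domain."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Per-filter ratios from Table 4 (Agilex 7 FPGA fabric / AMD Versal FPGA fabric).
fmax_ratio = [1.22, 1.34, 1.15, 1.20, 1.45, 1.24, 1.34, 1.36]
power_ratio = [0.43, 0.88, 1.11, 0.49, 0.71, 0.44, 0.83, 0.64]
ppw_ratio = [2.31, 1.14, 0.90, 2.03, 1.42, 2.25, 1.21, 1.57]

print(f"Fmax geomean:          {geomean(fmax_ratio):.2f}")   # 1.28
print(f"Dynamic power geomean: {geomean(power_ratio):.2f}")  # 0.66
print(f"Perf/W geomean:        {geomean(ppw_ratio):.2f}")    # 1.52
```

A geometric mean is the natural average for ratios because it treats a 2X gain and a 0.5X loss symmetrically.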

#### FIR Results: Agilex 7 FPGA Fabric vs. AMD Versal AIE

This section compares the power and performance of FIR designs on the Agilex FPGA fabric vs. AMD Versal AIE.

Figure 4 compares the throughput of various FIR configurations in terms of MSPS and demonstrates that the Agilex 7 FPGA fabric matches or exceeds the maximum throughput of the AMD Versal AIE on every filter (the MSPS ratio in Table 5 is 1.00 for Filter 3 and higher for the others). Figure 5 details the dynamic power consumption for the same configurations and shows that the Agilex 7 FPGA also delivers lower power for all six filter designs. These results demonstrate that Altera's implementations achieve higher MSPS and multiply-accumulate (MAC) operation efficiency despite the AMD Versal AIE running at a 1 GHz frequency. Furthermore, increasing the number of cores in the AI engine does not improve efficiency (i.e., performance per watt) [6].

When comparing power efficiency, the Altera fabric offers more than double the performance per watt, on average, than AMD's AIE, as shown in Figure 6<sup>1</sup>.



Figure 4. FIR Maximum Throughput Comparison – Fabric vs. AIE



Figure 5. FIR Dynamic Power Comparison – Fabric vs. AIE

Figure 6. FIR Performance per Watt Comparison – Fabric vs. AIE





Agilex™ 7 Fabric / AMD Versal\* AIE (Ratio)

|          | MSPS Ratio | Dynamic Power Ratio | Performance per Watt Ratio |
|----------|------------|---------------------|----------------------------|
| Filter 1 | 1.22       | 0.75                | 1.63                       |
| Filter 2 | 1.17       | 0.77                | 1.52                       |
| Filter 3 | 1.00       | 0.79                | 1.27                       |
| Filter 4 | 2.40       | 0.85                | 2.83                       |
| Filter 5 | 1.70       | 0.76                | 2.23                       |
| Filter 6 | 4.23       | 0.98                | 4.34                       |
| Geomean  | 1.70       | 0.81                | 2.09                       |

Table 5. FIR Results: FPGA Fabric vs. AMD Versal AIE Ratios
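The ratios in Table 5 are consistent with the performance per watt ratio being the throughput (MSPS) ratio divided by the dynamic power ratio. The sketch below recomputes that quotient from the rounded table entries. Small differences from the reported column are expected because the published values are rounded, and the helper name is ours.

```python
def perf_per_watt_ratio(throughput_ratio, power_ratio):
    """Relative performance per watt from relative throughput and relative power."""
    return throughput_ratio / power_ratio


# (MSPS ratio, dynamic power ratio, reported performance per watt ratio) from Table 5.
table5 = {
    "Filter 1": (1.22, 0.75, 1.63),
    "Filter 2": (1.17, 0.77, 1.52),
    "Filter 3": (1.00, 0.79, 1.27),
    "Filter 4": (2.40, 0.85, 2.83),
    "Filter 5": (1.70, 0.76, 2.23),
    "Filter 6": (4.23, 0.98, 4.34),
}

for name, (msps, power, reported) in table5.items():
    derived = perf_per_watt_ratio(msps, power)
    print(f"{name}: derived {derived:.2f}, reported {reported:.2f}")
```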

While we did not directly compare to Versal AIE-ML for FIR workloads, AMD's published FIR results on Versal AIE-ML [6] indicate that it would deliver inferior results in terms of both absolute performance and performance per watt: the AIE-ML produces only about 12-15% of the Versal AIE throughput when considering the same number of cores.

#### FFT Results: Agilex 7 FPGA Fabric vs. AMD Versal FPGA Fabric

The following section covers the FFT results from the Agilex 7 FPGA fabric comparison to the AMD Versal FPGA fabric. To make the comparison fair, we use a configuration similar to those reported by AMD in terms of FFT throughput and numeric accuracy. To maximize AMD's Fmax, we use complex multiplication configurations with 4 DSPs per complex multiply-accumulate (CMAC).

The Agilex 7 FPGA fabric demonstrated consistently higher Fmax in all tested designs, as shown in Figure 7. The Agilex 7 FPGA performance advantage also increases with larger FFT sizes due to the performance degradation of the AMD fabric for long 16K-32K FFTs.

Figure 8 illustrates the dynamic power of both devices at the highest achievable Fmax. The Agilex 7 FPGA fabric achieves significantly lower dynamic power than the AMD Versal\* fabric for all FFT sizes, even at higher performance levels. On average, the Agilex 7 FPGA consumes 0.63X the dynamic power of the AMD Versal FPGA fabric.



Figure 7. FFT Maximum Frequency Comparison – Fabric vs. Fabric



Figure 8. FFT Dynamic Power Comparison – Fabric vs Fabric

#### FFT Results: Agilex 7 FPGA Fabric vs. AMD Versal AIE

The following section compares the FFT results between the Agilex FPGA fabric and AMD Versal AIE.

Figure 9 shows the throughput of the FFT configurations in terms of MSPS. For AIE FFT implementations, AMD reports using single-core designs for 32-point to 2048-point FFTs. Starting from 4096-point FFT, AMD reports using multicore AIE implementations [12]. For Agilex FPGAs, we use two different implementations to roughly match or exceed the AIE FFT throughput reported for AMD's implementations. More specifically, we use conventional single-sample-per-cycle FFT implementations for 32-point to 4096-point FFT sizes and switch to parallel FFT implementations for 8192-point and larger sizes.
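The implementation choices described above can be summarized as a small lookup. This is an illustrative Python sketch whose function names and return strings are ours. It only encodes the size thresholds stated in the text and does not model the designs themselves.

```python
def _check_size(fft_size):
    """Accept only the power-of-two transform lengths covered by the study."""
    if not 32 <= fft_size <= 32768 or fft_size & (fft_size - 1):
        raise ValueError("fft_size must be a power of two between 32 and 32768")


def agilex_fft_implementation(fft_size):
    """Agilex fabric implementation style used for a given transform length."""
    _check_size(fft_size)
    return "single-sample-per-cycle" if fft_size <= 4096 else "parallel"


def amd_aie_fft_implementation(fft_size):
    """AIE core usage that AMD reports for a given transform length."""
    _check_size(fft_size)
    return "single-core" if fft_size <= 2048 else "multicore"


for size in (2 ** k for k in range(5, 16)):
    print(size, agilex_fft_implementation(size), amd_aie_fft_implementation(size))
```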

Figure 10 illustrates the dynamic power of both the Agilex FPGA fabric and AMD Versal AIE at peak FFT throughput. These results show that the Agilex 7 FPGA fabric has consistently lower dynamic power than the AMD Versal AIE for all FFT sizes. On average, the AMD Versal AIE uses 14% higher dynamic power while delivering 20% lower throughput than the Agilex 7 FPGA fabric.



Figure 9. FFT MSPS Comparison – Fabric vs AIE



Figure 10. FFT Dynamic Power Comparison – Fabric vs AIE



Figure 11. FFT Performance per Watt Comparison – Agilex FPGA Fabric over Competing Devices

Finally, Figure 11 plots the relative performance per watt of the Agilex 7 FPGA fabric with respect to all the competing devices considered in this study. The Agilex 7 FPGA fabric is consistently better than the AMD Versal FPGA fabric across all FFT sizes, as shown by the red curve. Furthermore, the Altera fabric is better than the AMD Versal AIE in most cases; the exceptions are the mid-range FFT sizes (256 to 2048 points in Table 6), as shown by the yellow curve.

When comparing the average performance per watt across all 11 FFT configurations listed in Table 6, the Agilex 7 FPGA fabric delivers a 1.65X improvement over the AMD Versal FPGA fabric, 1.36X over the AMD Versal AIE, and 1.55X over the AMD Versal AIE-ML. The values marked with • in Table 6 denote the re-purposing of data from AIE, as AMD does not have published data on AIE-ML for those FFT sizes.

Agilex FPGA Performance per Watt Ratio over Competing Devices

| FFT Transform Length | vs. Versal FPGA | vs. Versal AIE | vs. Versal AIE-ML |
|----------------------|-----------------|----------------|-------------------|
| 32                   | 1.80            | 3.46           | 3.46 •            |
| 64                   | 1.93            | 2.50           | 2.50 •            |
| 128                  | 1.57            | 1.25           | 1.58              |
| 256                  | 1.53            | 0.96           | 1.28              |
| 512                  | 1.43            | 0.79           | 1.02              |
| 1024                 | 1.64            | 0.76           | 0.92              |
| 2048                 | 1.68            | 0.76           | 0.80              |
| 4096                 | 1.80            | 1.27           | 0.93              |
| 8192                 | 1.64            | 1.30           | 1.77              |
| 16384                | 1.70            | 1.89           | 2.36              |
| 32768                | 1.48            | 1.96           | 2.50              |
| Geomean              | 1.65            | 1.36           | 1.55              |

Table 6. FFT Results: FPGA Fabric Performance per Watt Ratio Over Competing Devices
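As a cross-check on Table 6, the sketch below recomputes the geomean row from the per-size ratios and lists the transform lengths where a competing device has the higher performance per watt (ratio below 1). It uses the rounded table values, and the variable and helper names are ours.

```python
import math


def geomean(values):
    """Geometric mean, computed in the log domain."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# FFT length -> (vs. Versal FPGA, vs. Versal AIE, vs. Versal AIE-ML), from Table 6.
table6 = {
    32: (1.80, 3.46, 3.46),
    64: (1.93, 2.50, 2.50),
    128: (1.57, 1.25, 1.58),
    256: (1.53, 0.96, 1.28),
    512: (1.43, 0.79, 1.02),
    1024: (1.64, 0.76, 0.92),
    2048: (1.68, 0.76, 0.80),
    4096: (1.80, 1.27, 0.93),
    8192: (1.64, 1.30, 1.77),
    16384: (1.70, 1.89, 2.36),
    32768: (1.48, 1.96, 2.50),
}

labels = ("Versal FPGA", "Versal AIE", "Versal AIE-ML")
for index, label in enumerate(labels):
    column = [ratios[index] for ratios in table6.values()]
    below_one = [size for size, ratios in table6.items() if ratios[index] < 1.0]
    print(f"vs. {label}: geomean {geomean(column):.2f}, ratio < 1 at {below_one}")
```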

#### Conclusion

In this paper, we review AMD's published data on the FIR and FFT implementations on the Versal FPGA fabric and AIE and compare them to implementations on an Agilex 7 FPGA. The experiments are carefully designed to closely match the public data for the AMD Versal FPGA fabric and to maximize the performance across all the devices, including the AMD Versal AIE. To perform accurate comparisons, we run the FIR and FFT designs on the AMD Versal fabric and use the publicly available data for the AMD Versal AIE and AIE-ML.

Across the different FIR and FFT configurations, we show that the Agilex 7 FPGA family consistently achieves higher throughput than the competing AMD Versal devices. The results also show that it delivers significant performance enhancements and stability while maintaining lower power consumption.

More specifically, FIR filters demonstrate a 1.5X higher average performance per watt on the Agilex FPGA fabric than the AMD Versal FPGA fabric. Agilex FPGA fabric also delivers 2.1X higher performance per watt than the AMD Versal AIE.

Similarly, the Agilex 7 FPGA fabric outperforms the AMD Versal FPGA fabric, AIE, and AIE-ML for FFT designs by consistently achieving higher maximum frequency/throughput and lower dynamic power consumption. The Agilex FPGA fabric achieves 1.65X higher performance per watt over the AMD Versal fabric, 1.36X over AMD Versal AIE, and 1.55X over AMD Versal AIE-ML, on average.

Altera has also published results for publicly available OpenCores designs, representing a variety of functions, implemented on a device from the Agilex 7 FPGA family [13]. Agilex 7 FPGAs and SoCs are designed to be the highest performing products in their class, and the comparisons and conclusions drawn from this analysis reinforce the fact that Agilex 7 FPGAs deliver industry-leading advantages for DSP applications.

#### References

- [1] J. Chromczak, M. Wheeler, C. Chiasson, D. How, M. Langhammer, T. Vanderhoek, G. Zgheib and I. Ganusov, "Architectural Enhancements in Agilex™ FPGAs," in Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Seaside, CA, USA, 2020.
- [2] "Agilex™ 7 FPGA and SoC FPGA," [Online]. Available: https://www.intel.com/content/www/us/en/products/details/fpga/agilex/7.html.
- [3] B. Gaide, D. Gaitonde, C. Ravishankar and T. Bauer, "Xilinx Adaptive Compute Acceleration Platform: Versal Architecture," in Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Seaside, CA, USA, 2019.
- [4] "AMD Versal," [Online]. Available: https://www.xilinx.com/products/silicon-devices/acap/versal.html.
- [5] "Variable-Precision DSP in Agilex™ 7 FPGAs and SoCs," [Online]. Available: https://www.intel.com/content/www/us/en/docs/programmable/683458/current/variable-precision-dsp-in-fpgas-and-socs.html.
- [6] "AMD Vitis Libraries Filters," [Online]. Available: https://docs.xilinx.com/r/en-US/Vitis\_Libraries/L2-AIE-DSP-Library-User-Guide.
- [7] "Performance and Resource Utilization for FIR Compiler v7.2," [Online]. Available: https://www.xilinx.com/htmldocs/ip\_docs/pru\_files/fir-compiler.html.
- [8] "DSP Builder," [Online]. Available: https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/dsp-builder.html.
- [9] "Build More Cost-Effective and More Efficient 5G Radios with Agilex™ FPGAs," [Online]. Available: https://www.intel.com/content/dam/www/central-libraries/us/en/documents/build-5g-radios-with-agilex-fpgas-whitepaper.pdf.
- [10] "Quartus® Prime Design Software," [Online]. Available: https://www.intel.com/content/www/us/en/products/details/fpga/development-tools/quartus-prime.html.
- [11] "Vivado Design Suite," [Online]. Available: https://www.xilinx.com/products/design-tools/vivado.html.
- [12] "AMD Vitis Libraries FFT IFFT," [Online]. Available: https://docs.xilinx.com/r/en-US/Vitis\_Libraries/dsp/user\_guide/L2/benchmark.html.
- [13] Z. Weng and K. Tondehal, "Performance Advantages on OpenCores with Agilex™ 7 FPGAs," Altera, [Online]. Available: https://www.intel.com/content/www/us/en/content-details/787066/content-details.html.



<sup>1</sup> Tests measure performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit https://edc.intel.com/content/www/us/en/products/performance/benchmarks/fpga/

Altera technologies' features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Performance varies depending on system configuration. Check with your system manufacturer or retailer or learn more at http://www.intel.com.

Altera reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services.

No product or component can be absolutely secure. Your costs and results may vary.

© Altera Corporation. Altera, the Altera logo, the 'a' logo, and other Altera marks are trademarks of Altera Corporation. \*Other names and brands may be claimed as the property of others.