

## 7.2 An Integrated 802.11a Baseband and MAC Processor

John Thomson, Bevan Baas, Elizabeth M. Cooper, Jeffrey M. Gilbert, George Hsieh, Paul Husted, Aparna Loka Nathan, Jeffrey S. Kuskin, David McCracken, Bill McFarland, Teresa H. Meng, David Nakahira, Samuel Ng, Mahesh Rattehalli, Jeff L. Smith, Ravi Subramanian, Lars Thon, Yi-Hsiu Wang, Robert Yu, Xiaoru Zhang

Atheros Communications, Sunnyvale, CA

The tremendous growth in wireless LANs has generated interest in technologies that provide higher data rates and greater system capacities. The IEEE 802.11a standard, based on coded OFDM modulation, provides nearly five times the data rate and as much as ten times the overall system capacity of current 802.11b wireless LAN systems [1,2,3]. This baseband and MAC processor is part of a two-chip set that forms a complete 802.11a solution [4]. The chip operates up to 54Mb/s in 20MHz channels according to the 802.11a standard, and includes proprietary modes supporting up to 108Mb/s in 40MHz channels.

Figures 7.2.1 and 7.2.2 depict the system and MAC architecture. The protocol control unit (PCU) manages all low-level, timing-critical aspects of 802.11. It formats and sends outgoing frame data to the baseband transmitter and processes incoming frame data from the baseband receiver. The host interface unit (HIU) provides connectivity to the host processor over a PCI bus. The DMA engine manages transfer of frame data and control information between the PCU and HIU.

Timing-critical functions require the MAC to act within microseconds of an event or at precise intervals. Examples include the channel access mechanism, channel state and network-wide timer updates, checksum generation and checking, hardware-level frame retry, and the generation of special frames such as periodic beacons and acknowledgments. Non-timing-critical functions are performed in the driver software executing on the host. These include complex frame exchange sequences (e.g., association and authentication exchanges), frame fragmentation and defragmentation, frame buffering and bridging, and other network management portions of the 802.11 protocol.

The driver software builds transmit frames as a collection of frame data fragments in host memory and passes a pointer to a corresponding list of transmit descriptors to the DMA engine. The DMA engine traverses the list and performs the necessary data fetches from host memory, passing the coalesced frame data to the PCU. The DMA engine provides full scatter/gather capability, including support for arbitrary byte alignment and byte lengths, to avoid multiple data copy operations on the host.
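The descriptor traversal described above can be pictured with a small software model. This is only a sketch: the `TxDescriptor` fields, the use of list indices as "pointers", and the `gather_frame` helper are hypothetical names chosen for illustration, not the chip's actual descriptor format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TxDescriptor:
    """Hypothetical descriptor describing one frame fragment in host memory."""
    addr: int             # byte offset of the fragment (any alignment)
    length: int           # fragment length in bytes (any value)
    next: Optional[int]   # index of the next descriptor, None at end of list


def gather_frame(host_memory: bytes, descriptors: List[TxDescriptor], head: int) -> bytes:
    """Walk the descriptor list from `head` and coalesce the fragments."""
    frame = bytearray()
    index: Optional[int] = head
    while index is not None:
        d = descriptors[index]
        frame += host_memory[d.addr:d.addr + d.length]
        index = d.next
    return bytes(frame)


# Three fragments at unaligned offsets, linked out of storage order.
memory = bytearray(64)
memory[3:9] = b"header"
memory[20:25] = b"-body"
memory[41:46] = b"-tail"

descriptors = [
    TxDescriptor(addr=20, length=5, next=2),
    TxDescriptor(addr=3, length=6, next=0),
    TxDescriptor(addr=41, length=5, next=None),
]

frame = gather_frame(bytes(memory), descriptors, head=1)
print(frame)
```

Because the fragments are gathered in list order rather than memory order, the driver in this model never has to copy them into one contiguous buffer first.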

The PCU encrypts the frame if WEP is enabled and generates the proper checksum value. It follows the 802.11 CSMA/CA access procedure to gain access to the channel and then forwards the frame to the baseband logic. On receive, the PCU extracts the frame type, verifies the checksum, and generates a response frame (typically an acknowledgment) if appropriate. The PCU also passes the received frame to the DMA engine, which interprets a series of descriptors to transfer the frame data to host memory. The MAC provides power-management functions. Acting as a station in a multi-node network, for example, the MAC can be programmed to sleep automatically and awake just before the next beacon is scheduled to arrive. The PCU parses the incoming beacon to determine whether to remain awake for additional frames or to re-enter sleep.
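The checksum generation and verification steps can be illustrated in software. The 802.11 frame check sequence is a 32-bit CRC with the same generator polynomial as IEEE 802.3, which is also what Python's `zlib.crc32` computes. The sketch below is not the PCU's hardware implementation; the helper names are hypothetical, and appending the CRC in little-endian byte order is an assumption that follows common software practice.

```python
import struct
import zlib


def append_fcs(frame_body: bytes) -> bytes:
    """Return the frame with a 4-byte CRC-32 appended (byte order is an assumption)."""
    fcs = zlib.crc32(frame_body) & 0xFFFFFFFF
    return frame_body + struct.pack("<I", fcs)


def fcs_ok(frame: bytes) -> bool:
    """Recompute the CRC over everything except the last 4 bytes and compare."""
    if len(frame) < 4:
        return False
    body, received = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == received


tx_frame = append_fcs(b"example frame body")

# Flip one bit to model a channel error; a CRC detects every single-bit error.
corrupted = bytearray(tx_frame)
corrupted[5] ^= 0x01
rx_frame = bytes(corrupted)

print(fcs_ok(tx_frame), fcs_ok(rx_frame))
```

In the receive path described above, a frame that fails this check would not trigger a response frame.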

The MAC is implemented using dedicated control and datapath logic, and includes registers that allow host software to configure and control its operation. This yields an overall design that is compact and power-efficient and requires no off-chip RAM or program storage, yet is flexible enough to accommodate the vagaries of the 802.11a protocol itself as well as the additional needs of the host operating system and driver software.

The baseband transmitter generates OFDM waveforms according to the 802.11a specification. A 128-point IFFT simplifies external transmit filtering, thereby preserving the guard interval. Dual 9b 160MHz DACs employ a current steering structure and pass baseband data to the analog transceiver. Programmable scaling in the digital domain trades DAC quantization noise against increased probability of clipping.
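The scaling trade-off can be explored with a short numerical experiment. The sketch below rests on assumptions that are not from the paper: the OFDM waveform is approximated by Gaussian samples, the DAC is modeled as an ideal signed 9b quantizer, and the three scale values are arbitrary. Only the 9b resolution comes from the text.

```python
import math
import random

DAC_BITS = 9  # resolution quoted in the text; the rest of the model is assumed


def dac_error_stats(scale, n=20000, seed=1):
    """Return (clip_fraction, signal_to_error_db) for a Gaussian signal whose
    rms value is `scale` times the DAC full-scale amplitude."""
    rng = random.Random(seed)
    half_range = 2 ** (DAC_BITS - 1)      # 256 codes on each side of zero
    lo, hi = -half_range, half_range - 1
    clipped = 0
    signal_power = error_power = 0.0
    for _ in range(n):
        x = scale * rng.gauss(0.0, 1.0)   # ideal sample, full scale = 1.0
        code = round(x * half_range)      # quantize to the nearest DAC code
        if code < lo or code > hi:
            clipped += 1
            code = min(hi, max(lo, code)) # saturate at the converter limits
        error = code / half_range - x     # quantization plus clipping error
        signal_power += x * x
        error_power += error * error
    return clipped / n, 10 * math.log10(signal_power / error_power)


results = {scale: dac_error_stats(scale) for scale in (0.1, 0.25, 0.5)}
for scale, (clip_fraction, ser_db) in results.items():
    print(f"rms = {scale:.2f} of full scale: "
          f"clipped {clip_fraction:.4%}, signal-to-error {ser_db:.1f} dB")
```

In this model a small scale leaves the signal-to-error ratio limited by quantization noise, while a large scale makes clipping dominate, so the best setting lies between the two extremes.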

The baseband receiver (Figure 7.2.3) contains dual 9b 80MHz ADCs that cover a $\pm 500\text{mV}$ input range. The oversampling relative to the channel bandwidth of 17MHz simplifies the anti-aliasing filter design and allows the adjacent channel to be filtered primarily in the digital domain. DC offset and gain in the analog receiver are calibrated. The AGC maximizes signal size at the ADC while providing headroom for adjacent channel interference and the peak-to-average ratio of OFDM symbols. The short preamble in 802.11a demands a quick loop from power measurement to gain adjustment. The receiver gain is composed of RF, IF and baseband gains, as well as an antenna switch that provides additional attenuation.

The ADC outputs pass through lowpass downsampling filters. Signal detection, frequency offset estimation and symbol timing rely on auto-correlation. Figure 7.2.4 shows pilot tracking and channel correction details. The two long training symbols are averaged, filtered, and inverted. For each of the four pilots in a data symbol, the phase with respect to the training symbol pilot phase is computed. A least-squares fit of the four pilot phase differences determines the phase adjustment for each data subcarrier. Pilot phase monitoring tracks frequency estimation error, phase noise and symbol timing drift. Pilot magnitude tracking compares pilot power in the data symbols to the training symbols and monitors gain variations. A 128-point FFT reduces adjacent-channel filtering requirements, preserves the guard interval, and shares hardware with the IFFT. Equalized data is passed to a radix-4, fully parallel, soft decision, traceback architecture Viterbi decoder.
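The least-squares pilot fit can be sketched as follows. The sketch assumes a straight-line model, phase(k) = offset + slope·k over subcarrier index k, which the text does not spell out. The pilot positions (subcarriers −21, −7, 7 and 21) are those defined by 802.11a. Function names are hypothetical, and phase unwrapping is ignored.

```python
PILOT_SUBCARRIERS = (-21, -7, 7, 21)  # 802.11a pilot positions


def fit_pilot_phase(phase_diffs):
    """Least-squares line through the four pilot phase differences (radians).

    Returns (offset, slope) such that phase(k) ~= offset + slope * k.
    """
    n = len(PILOT_SUBCARRIERS)
    mean_k = sum(PILOT_SUBCARRIERS) / n
    mean_p = sum(phase_diffs) / n
    sxx = sum((k - mean_k) ** 2 for k in PILOT_SUBCARRIERS)
    sxy = sum((k - mean_k) * (p - mean_p)
              for k, p in zip(PILOT_SUBCARRIERS, phase_diffs))
    slope = sxy / sxx
    offset = mean_p - slope * mean_k
    return offset, slope


def phase_adjustment(offset, slope, subcarrier):
    """Phase to remove from a given data subcarrier under the line model."""
    return offset + slope * subcarrier


# Synthetic measurement: 0.1 rad common phase plus 2 mrad per subcarrier.
measured = [0.1 + 0.002 * k for k in PILOT_SUBCARRIERS]
offset, slope = fit_pilot_phase(measured)
print(offset, slope, phase_adjustment(offset, slope, 26))
```

Under this model the common term would absorb residual frequency error and phase noise, while the slope across subcarriers would reflect symbol timing drift.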

Along with the companion 5GHz CMOS transceiver, the fabricated chip is fully compliant with the 802.11a standard and exceeds all mandated performance specifications. Figure 7.2.5 compares the SNR required by the implementation to that required by the standard. The 802.11a required SNR is derived from the sensitivities and 10dB noise figure specified by the standard. Also shown are the theoretical SNR limits calculated assuming perfect synchronization and channel estimation.
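The derivation of the 802.11a column in Figure 7.2.5 can be reconstructed as a short calculation. The sensitivities are the minimum receiver sensitivity levels of the 802.11a standard and the 10dB noise figure is stated above. The −174dBm/Hz thermal noise density and the use of the 17MHz channel bandwidth as the noise bandwidth are assumptions made for this sketch.

```python
import math

# Minimum receiver sensitivity (dBm) per data rate (Mb/s) from 802.11a.
SENSITIVITY_DBM = {6: -82, 9: -81, 12: -79, 18: -77,
                   24: -74, 36: -70, 48: -66, 54: -65}

THERMAL_NOISE_DBM_PER_HZ = -174.0   # assumed room-temperature kT
NOISE_BANDWIDTH_HZ = 17e6           # assumed equal to the channel bandwidth
NOISE_FIGURE_DB = 10.0              # noise figure stated in the text


def required_snr_db(rate_mbps):
    """SNR implied by the sensitivity at the assumed receiver noise floor."""
    noise_floor_dbm = (THERMAL_NOISE_DBM_PER_HZ
                       + 10 * math.log10(NOISE_BANDWIDTH_HZ)
                       + NOISE_FIGURE_DB)
    return SENSITIVITY_DBM[rate_mbps] - noise_floor_dbm


for rate in sorted(SENSITIVITY_DBM):
    print(f"{rate:2d} Mb/s: {required_snr_db(rate):.1f} dB")
```

Under these assumptions the results agree with the "Required SNR of 802.11a" column to one decimal place, for example 9.7dB at 6Mb/s and 26.7dB at 54Mb/s.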

Additional data rates up to 108Mb/s are supported by varying the internal clock frequencies and adjusting transmit and receive filtering. The oversampled ADC and DAC designs accommodate the higher data rates. Figure 7.2.6 shows key chip parameters and Figure 7.2.7 shows the die micrograph. Power is reduced by utilizing 73 gated clock trees with independent enables.
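The relation between clock frequency and data rate can be checked with simple arithmetic. The 48 data subcarriers and the 4µs symbol period are 802.11a parameters. The assumption that the proprietary mode keeps the same symbol structure and doubles the clock (halving the symbol period) is an illustration only; the text says just that internal clock frequencies are varied.

```python
from fractions import Fraction

DATA_SUBCARRIERS = 48   # data-carrying subcarriers per 802.11a OFDM symbol
SYMBOL_PERIOD_US = 4    # 3.2us useful part + 0.8us guard interval at 20MHz


def data_rate_mbps(bits_per_subcarrier, coding_rate, clock_scale=1):
    """Data rate in Mb/s; clock_scale models a faster internal clock (assumed)."""
    data_bits_per_symbol = DATA_SUBCARRIERS * bits_per_subcarrier * coding_rate
    symbol_period_us = Fraction(SYMBOL_PERIOD_US) / clock_scale
    return data_bits_per_symbol / symbol_period_us


standard_54 = data_rate_mbps(6, Fraction(3, 4))                 # 64-QAM, rate 3/4
turbo_108 = data_rate_mbps(6, Fraction(3, 4), clock_scale=2)    # doubled clock
print(standard_54, turbo_108)
```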

### Acknowledgments:

The authors acknowledge the contributions of R. Bahr, G. Chesson, W. Cole, P. Hanley, K. Jianto, D. Johnson, L. Khan, C. Lee, S. Montoya, S. Padnos, S. Rabii, A. Tehrani, S. Wong and J. Zheng.

### References:

- [1] 802.11a standard, ISO/IEC 8802-11:1999/Amd 1:2000(E).
- [2] W. Eberle, et al., "A Digital 72Mb/s 64-QAM OFDM Transceiver for 5GHz Wireless LAN in 0.18$\mu$m CMOS", ISSCC Digest of Technical Papers, pp. 336-337, Feb. 2001.
- [3] P. Ryan, et al., "A Single Chip PHY COFDM Modem for IEEE 802.11a with Integrated ADCs and DACs", ISSCC Digest of Technical Papers, pp. 338-339, Feb. 2001.
- [4] D. Su, et al., "A 5GHz CMOS Transceiver for IEEE 802.11a Wireless LAN", ISSCC Digest of Technical Papers, Paper 5.4, Feb. 2002.



Figure 7.2.1: System overview.



Figure 7.2.2: MAC architecture.



Figure 7.2.3: Block diagram of baseband receiver.



Figure 7.2.4: Pilot tracking and channel correction.

| Modulation and Coding Rate | Data Rate (Mb/s) | Required SNR of 802.11a (dB) | Required SNR of Design (dB) | Theoretical SNR (dB) | Implementation Loss (dB) |
|----------------------------|------------------|------------------------------|-----------------------------|----------------------|--------------------------|
| BPSK 1/2                   | 6                | 9.7                          | 5.4                         | 1.0                  | 4.4                      |
| BPSK 3/4                   | 9                | 10.7                         | 5.8                         | 3.5                  | 2.3                      |
| QPSK 1/2                   | 12               | 12.7                         | 7.0                         | 3.8                  | 3.2                      |
| QPSK 3/4                   | 18               | 14.7                         | 9.5                         | 6.5                  | 3.0                      |
| 16-QAM 1/2                 | 24               | 17.7                         | 11.3                        | 8.8                  | 2.5                      |
| 16-QAM 3/4                 | 36               | 21.7                         | 14.9                        | 12.3                 | 2.6                      |
| 64-QAM 2/3                 | 48               | 25.7                         | 18.6                        | 16.8                 | 1.8                      |
| 64-QAM 3/4                 | 54               | 26.7                         | 20.6                        | 19.0                 | 1.6                      |

Figure 7.2.5: Required SNR for 10% packet error rate.

| Parameter                   | Value                                                         |
|-----------------------------|---------------------------------------------------------------|
| Technology                  | Standard 0.25$\mu$m CMOS, 5-layer metal, 2.5V core, 3.3V I/O  |
| Transistor Count            | 4.0M                                                          |
| Die Size                    | 6.8mm x 6.8mm                                                 |
| Package                     | 196-pin BGA                                                   |
| Power at 54Mb/s (Tx/Rx)     |                                                               |
| Core                        | 219 / 203 mW                                                  |
| DACs + supporting circuitry | 68 / 0 mW                                                     |
| ADCs + supporting circuitry | 0 / 211 mW                                                    |
| I/O                         | 25 / 24 mW                                                    |
| PLL                         | 14 / 14 mW                                                    |
| Total                       | 326 / 452 mW                                                  |

Figure 7.2.6: Chip details.



Figure 7.2.7: Die micrograph.