Article · IEEE Journal of Solid-State Circuits · Jan 1, 2008 · Closed access

An 80-Tile Sub-100-W TeraFLOPS Processor in 65-nm CMOS

Intel (United States) · Intel (India)

Indexed in Crossref

Abstract

This paper describes an integrated network-on-chip architecture containing 80 tiles arranged as an 8×10 2-D array of floating-point cores and packet-switched routers, both designed to operate at 4 GHz. Each tile has two pipelined single-precision floating-point multiply accumulators (FPMAC) which feature a single-cycle accumulation loop for high throughput. The on-chip 2-D mesh network provides a bisection bandwidth of 2 Terabits/s. The 15-FO4 design employs mesochronous clocking, fine-grained clock gating, dynamic sleep transistors, and body-bias techniques. In a 65-nm eight-metal CMOS process, the 275 mm² custom design contains 100 M transistors. The fully functional first silicon achieves over 1.0 TFLOPS…
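The tile count, FPMAC count, and design clock quoted in the abstract are enough for a rough peak-throughput estimate. The sketch below is only a back-of-the-envelope check, not the authors' methodology. It assumes each FPMAC completes one multiply-accumulate per cycle and that a multiply-accumulate counts as 2 floating-point operations; neither figure is stated in the abstract. The names `peak_flops` and `min_clock_hz_for` are hypothetical helpers introduced for illustration.

```python
# Rough peak-throughput estimate from the figures quoted in the abstract.
TILES = 80                     # from the abstract
FPMACS_PER_TILE = 2            # from the abstract
DESIGN_CLOCK_HZ = 4_000_000_000  # "designed to operate at 4 GHz"

# Assumption (not in the abstract): one multiply-accumulate per FPMAC per
# cycle, counted as 2 floating-point operations (one multiply, one add).
FLOPS_PER_FPMAC_PER_CYCLE = 2

FLOPS_PER_CYCLE = TILES * FPMACS_PER_TILE * FLOPS_PER_FPMAC_PER_CYCLE


def peak_flops(clock_hz: int) -> int:
    """Peak floating-point operations per second at the given clock."""
    return FLOPS_PER_CYCLE * clock_hz


def min_clock_hz_for(target_flops: int) -> float:
    """Lowest clock at which the peak rate reaches target_flops."""
    return target_flops / FLOPS_PER_CYCLE


if __name__ == "__main__":
    print(f"Peak at design clock: {peak_flops(DESIGN_CLOCK_HZ) / 1e12:.2f} TFLOPS")
    print(f"Clock needed for 1.0 TFLOPS peak: {min_clock_hz_for(10**12) / 1e9:.3f} GHz")
```

Under these assumptions the array performs 320 floating-point operations per cycle, so the 4 GHz design point corresponds to a peak of about 1.28 TFLOPS, and a peak of 1.0 TFLOPS requires a clock of at least 3.125 GHz. Sustained throughput on real workloads would be lower than the peak.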

Citation impact

Total citations: 627
FWCI: 78.14
Percentile: 100%
References: 20

Authors

15

Topics & keywords

Keywords
  • CMOS
  • Chip
  • Computer science
  • Transistor
  • Tile
  • Computer hardware
  • Parallel computing
  • Embedded system
UN Sustainable Development Goals
  • Affordable and clean energy

Funding