Two generations of Many-Core Computational Arrays
If you have a question about this talk, please contact Robert Mullins.

Note unusual time.

The Asynchronous Array of Simple Processors (AsAP) is a programmable and reconfigurable processing system that enables high throughput and high energy efficiency, is well matched to workloads containing many varied DSP tasks, and is well suited to deep-submicron VLSI fabrication technologies. The AsAP platform is composed of a large number of programmable “reduced complexity” processing elements designed to capture the targeted task kernels with very little additional overhead. Each processor contains its own digitally tunable clock oscillator that operates completely independently of the others (GALS), and processors communicate through a reconfigurable full-rate 2-D mesh network. Individual clock oscillators fully halt in 9 cycles when there is no work to do, and restart at full speed in less than one cycle after work becomes available.

A chip containing 36 programmable processors was fabricated in 0.18 um CMOS using standard cells and is fully functional. Each 0.66 mm^2 processor operates at up to 610 MHz at 2.0 V and dissipates an average of 32 mW at 475 MHz and 1.8 V, and 2.4 mW at 116 MHz and 0.9 V, while executing applications [ISSCC06]. Several dozen DSP and general tasks have been coded, including 32- to 1024-point complex FFTs, a k=7 Viterbi decoder, a JPEG encoder, a full-rate HDTV H.264 CAVLC encoder, and a fully compliant IEEE 802.11a/11g wireless LAN baseband transmitter and receiver. Power, throughput, and area results compare very well with existing programmable DSP processors. A recently completed C compiler and automatic mapping tool greatly simplify programming.

A second-generation 65 nm CMOS design contains 167 processors and has many new architectural features, including dedicated FFT, Viterbi, and video motion estimation processors; 16 KB shared memories; and long-distance inter-processor interconnect.
The programmable processors can individually and dynamically change their supply voltage (choosing among VddHi, VddLo, or disconnected) and clock frequency. The chip is fully functional, with early measurements showing the programmable processors operating at up to 1.2 GHz while dissipating 59 mW at 1.3 V. At a supply voltage of 0.675 V, they operate at 66 MHz while dissipating only 608 uW.

This talk is part of the Computer Laboratory Computer Architecture Group Meeting series.
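
The per-processor power and clock figures quoted in the abstract can be compared more directly as average energy per clock cycle (power divided by frequency). The short Python sketch below does only that arithmetic. It is not part of the talk: the labels and function names are illustrative, and it assumes the 59 mW figure for the 65 nm chip was measured at 1.2 GHz, which the abstract implies but does not state outright. Energy per cycle is not the same as energy per operation or per task.

```python
# Operating points quoted in the abstract: (label, power in watts, clock in hertz).
# Labels are illustrative. The 59 mW / 1.2 GHz pairing is an assumption (see lead-in).
OPERATING_POINTS = [
    ("gen1 0.18 um, 1.8 V", 32e-3, 475e6),
    ("gen1 0.18 um, 0.9 V", 2.4e-3, 116e6),
    ("gen2 65 nm, 1.3 V", 59e-3, 1.2e9),
    ("gen2 65 nm, 0.675 V", 608e-6, 66e6),
]


def energy_per_cycle_pj(power_w, freq_hz):
    """Average energy per clock cycle in picojoules (power / frequency)."""
    return power_w / freq_hz * 1e12


def summarize(points=OPERATING_POINTS):
    """Map each operating-point label to its energy per cycle, rounded to 0.1 pJ."""
    return {label: round(energy_per_cycle_pj(p, f), 1) for label, p, f in points}


if __name__ == "__main__":
    for label, pj in summarize().items():
        print(f"{label}: {pj} pJ/cycle")
```

Under these assumptions the first-generation processor comes out at roughly 67 pJ per cycle at 1.8 V and 21 pJ at 0.9 V, and the second-generation processor at roughly 49 pJ per cycle at 1.3 V and 9 pJ at 0.675 V.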