Energy-Efficient Sorting on a Many-Core Platform

Aaron Stillmaker
Lucas Stillmaker
Brent Bohnenstiehl
Bevan M. Baas

VLSI Computation Laboratory
Department of Electrical and Computer Engineering
University of California, Davis

Abstract:

Data centers require significant and growing amounts of power to operate, and 
with increasing numbers of data centers worldwide, power consumption for 
enterprise workloads is a significant concern. Sortingis a key computational 
kernel in large database systems, and thedevelopment of energy efficient sorting 
capabilities would thereforesignificantly reduce data center power usage.  We 
propose highly parallel sorting algorithms and mappings using a modular design 
for a fine-grained many-core system that greatly decreases the amount of energy 
consumed to perform sorts of arbitrarily large data sets. 

The design and implementation of several sorting variants that perform the first 
phase of an external sort on a many-core array will be presented. They are built 
using program kernels operating on independent processors in a many-core array 
with 256 bytes of data memory and fewer than 128 instructions per processor. 
The algorithms employed are simple and the vast majority of processors contain 
identical programs.  The sorts are also simulated on a general many-core array 
with the number of processors scaling between 100 and 10,000, which is used as
a co-processor to a general-purpose processor.

When the results from the first phase of a 164 core array are compared to a 
quicksort implementation on an Intel Core 2 Duo T9600, the highest throughput 
design achieves up to 27x higher throughput per chip area, and the most energy 
efficient sort yields a 330x reduction in energy dissipated per sorted block.
Compared to a radix sort implementation on a Nvidia 9600M GPU,the highest 
throughput design achieves up to 22x higher throughput per chip area, and the 
most energy efficient sort yields a 750x reduction in energy dissipated per 
sorted block.  The proposed sorts are easily programmed, and scaleable to any 
sized 2D mesh processor array while giving a large energy savings without 
penalizing performance.

Reference

Aaron Stillmaker, Lucas Stillmaker, Brent Bohnenstiehl and Bevan M. Baas, "Energy-Efficient Sorting on a Many-Core Platform," Technology and Talent for the 21st Century (TECHCON 2013), Austin, TX, September 2013.

BibTeX Entry

@INPROCEEDINGS{stillmakerTechcon, 
author={Stillmaker, Aaron and Stillmaker, Lucas and Bohnenstiehl, Brent and Baas, Bevan}, 
booktitle={Technology and Talent for the 21st Century (TECHCON 2013)}, 
title={Energy-Efficient Sorting on a Many-Core Platform}, 
year=2013, 
month=sep, 
keywords={external sorting;fine-grained many-core;modular
programing;parallel processing;processor array;streaming sorting;}, 
}

VCL Lab | ECE Dept. | UC Davis

Last update: Mar. 23, 2013