Long Short-Term Memory on a Many-Core Platform

Arthur Hlaing
Master's Thesis
VLSI Computation Laboratory
Department of Electrical and Computer Engineering
University of California, Davis
Technical Report ECE-VCL-2020-2, VLSI Computation Laboratory, University of California, Davis, 2020.

Abstract:

Recurrent neural networks (RNNs) learn from data sequences and have memory that captures information about past inputs. Long short-term memory (LSTM) is a variant of the RNN used in many applications such as speech recognition, handwriting recognition, natural language processing, and image captioning. A complete character-level language model is implemented on the many-core processor array using new LSTM layer designs. The model predicts the next character in a sequence based on the specific characters that have come before it. The output of the many-core implementation is verified against the output of the original published reference. Since no hardware comparison data exist for the whole application, the LSTM portion of the application, composed of two LSTM layers, is compared with six published hardware implementations in terms of latency, throughput, power, and throughput per Watt.
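The character-level model described above can be sketched as two stacked LSTM layers feeding a softmax over the character vocabulary. The following is a minimal illustrative sketch, not the thesis implementation: the sizes (`VOCAB`, `HIDDEN`), random weights, and the `lstm_step` helper are all assumptions for demonstration.

```python
import numpy as np

# Hypothetical sizes for illustration; the thesis model's exact
# dimensions are not restated here.
VOCAB, HIDDEN = 65, 128
rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, b):
    """One LSTM cell step: compute the four gates from [x; h],
    then update the cell state c and hidden state h."""
    z = W @ np.concatenate([x, h]) + b           # pre-activations for 4 gates
    i, f, o, g = np.split(z, 4)
    i, f, o = (1.0 / (1.0 + np.exp(-v)) for v in (i, f, o))  # sigmoid gates
    g = np.tanh(g)                               # candidate cell update
    c = f * c + i * g                            # new cell state
    h = o * np.tanh(c)                           # new hidden state
    return h, c

# Two stacked LSTM layers and an output projection (random weights here).
W1 = rng.standard_normal((4 * HIDDEN, VOCAB + HIDDEN)) * 0.01
W2 = rng.standard_normal((4 * HIDDEN, HIDDEN + HIDDEN)) * 0.01
b1 = np.zeros(4 * HIDDEN)
b2 = np.zeros(4 * HIDDEN)
Wy = rng.standard_normal((VOCAB, HIDDEN)) * 0.01

h1 = c1 = h2 = c2 = np.zeros(HIDDEN)
x = np.zeros(VOCAB)
x[0] = 1.0                                       # one-hot current character

h1, c1 = lstm_step(x, h1, c1, W1, b1)            # layer 1
h2, c2 = lstm_step(h1, h2, c2, W2, b2)           # layer 2
logits = Wy @ h2
probs = np.exp(logits - logits.max())
probs /= probs.sum()                             # distribution over next char
next_char = int(probs.argmax())                  # predicted next character index
```

At each time step the predicted character's one-hot vector would be fed back as the next input, which is how such a model generates text one character at a time.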

Two many-core implementations of the benchmark, which is composed of two LSTM layers, are compared with general-purpose processor, graphics processing unit (GPU), field-programmable gate array (FPGA), and application-specific integrated circuit (ASIC) implementations. The two LSTM layers contain 230,912 weights and biases. The throughput, measured in characters per second, of the many-core implementations is 893×–3843× greater than that of the general-purpose processors, 155× greater than the TSMC 64 nm ASIC, and 471× greater than the FPGA. In throughput per Watt, the many-core implementations are 26×–125× greater than the general-purpose processors, 1.2× greater than the ASIC, and 166× greater than the FPGA.
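The throughput-per-Watt comparisons above are ratios of measured characters per second to measured power. A minimal sketch of the metric, using hypothetical numbers (not the thesis measurements):

```python
def throughput_per_watt(chars_per_sec: float, power_watts: float) -> float:
    """Energy efficiency: characters processed per second per Watt."""
    return chars_per_sec / power_watts

# Hypothetical example values, for illustration only.
a = throughput_per_watt(100_000.0, 2.0)    # platform A: 50,000 chars/s/W
b = throughput_per_watt(10_000.0, 25.0)    # platform B: 400 chars/s/W
advantage = a / b                          # relative efficiency of A over B
```

Note that a platform can win on raw throughput yet lose on throughput per Watt if it draws disproportionately more power, which is why both metrics are reported.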

Thesis

Reference

Arthur Hlaing, "Long Short-Term Memory on a Many-Core Platform," Master's Thesis, Technical Report ECE-VCL-2020-2, VLSI Computation Laboratory, ECE Department, University of California, Davis, March 2020.

BibTeX entry

@mastersthesis{arthur:vcl:mastersthesis,
   author      = {Arthur Hlaing},
   title       = {Long Short-Term Memory on a Many-Core Platform},
   school      = {University of California, Davis},
   year        = 2020,
   address     = {Davis, CA, USA},
   month       = mar,
   note        = {\url{http://vcl.ece.ucdavis.edu/pubs/theses/2020-2.ahlaing/}},
}
