Compact Machine Learning Systems with Reconfigurable Computing

Author:
  1. Morán Costoya, Alejandro
Supervised by:
  1. Josep Lluis Rosselló Sanz (Director)
  2. Vicente José Canals Guinand (Director)

Defence university: Universitat de les Illes Balears

Date of defence: 28 January 2022

Committee:
  1. María Luisa López Vallejo (Chair)
  2. Gabriel Oliver Codina (Secretary)
  3. Luis Parrilla Roure (Committee member)

Type: Thesis

Abstract

• Introduction

Until recently, the standard solution was to send the data captured by sensor nodes to the cloud and wait for the server's response. This server-dependent approach requires a significant amount of data transmission, which in turn results in network congestion. Although transmitting data with sufficient throughput would solve the problem, increasing throughput increases energy consumption. Moreover, there are privacy issues related to sending sensor data directly to the cloud (e.g. sending images containing people's faces). The solution is to enable data inference and analysis capabilities at the edge, close to the data sources; this approach is known as Edge Computing. Compared with straightforward data transmission, energy consumption is potentially reduced because inference is performed by low-power edge devices; latency ceases to be a problem if the device can manage real-time inference; and privacy is no longer an issue because raw data is not transmitted to the cloud: only metadata (e.g. date, location and event duration) and inference results (e.g. detection of chemical compounds in the air, gunshots, or voice-based assistance) are sent. In this context, the thesis focuses on prototyping FPGA pattern recognition inference solutions with potential applications in Edge Computing, either at edge nodes or as part of IoT devices.

• Research content

In this work, the proposed models are much smaller, and the aim is to contribute to the exploration of simplified, highly or fully parallel custom non-von Neumann hardware architectures with potential energy-efficiency benefits. In particular, several FPGA designs implementing the inference process of different Machine Learning models have been proposed and tested on a set of benchmark datasets. The FPGA implementations include two Reservoir Computing models based on low-precision fixed-point arithmetic and a Radial Basis Function Neural Network based on Stochastic Computing.
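As background on the Stochastic Computing encoding behind several of these designs, the sketch below shows the basic unipolar scheme in software: a value in [0, 1] is encoded as the density of ones in a random bitstream, and multiplication reduces to a bitwise AND of two independent streams. This is a minimal illustrative sketch of the general technique, not the thesis's hardware implementation; all function names here are hypothetical.

```python
import random

def to_bitstream(p, n, rng):
    """Encode a probability p in [0, 1] as a unipolar bitstream of length n."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def sc_multiply(x, y, n=8192, seed=0):
    """Estimate x * y by ANDing two independent unipolar bitstreams."""
    rng = random.Random(seed)
    a = to_bitstream(x, n, rng)
    b = to_bitstream(y, n, rng)
    # For independent streams, P(a_i AND b_i) = x * y, so the ones
    # density of the AND stream is a stochastic estimate of the product.
    return sum(ai & bi for ai, bi in zip(a, b)) / n

# The estimate converges to the exact product as the stream length grows,
# at the cost of latency -- the usual accuracy/latency trade-off in SC.
estimate = sc_multiply(0.5, 0.5)
```

In hardware, this multiplier is a single AND gate per bit, which is what makes SC attractive for the compact, low-power circuits targeted in the thesis.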
Additionally, a Convolutional Neural Network based on two different Stochastic Computing variants has been simulated and evaluated at different bit precisions, with both variants trained using a custom Training-Aware Quantization approach.

• Conclusion

In general, the proposed implementations introduce simplifications that reduce both circuit size and power consumption, together with training procedures tailored to low-precision parameters. Although the implementations detailed in the thesis can be considered independent of each other, it can be concluded that in most cases they improve on the state of the art in terms of power consumption without significantly affecting the hit ratio. This makes the proposed designs very attractive for small battery-powered portable devices, since lower power consumption translates into longer battery life. In addition, lower power consumption can also be beneficial for large-scale inference, as it implies a reduction in the CO2 emissions generated.
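To illustrate the general idea behind training models to work with low-precision fixed-point parameters, the sketch below shows "fake quantization": weights are rounded to a fixed-point grid in the forward pass so the network learns to tolerate the reduced precision. This is a generic illustration of the technique, assumed for exposition; it is not the thesis's exact Training-Aware Quantization procedure, and the function name is hypothetical.

```python
def fake_quantize(w, frac_bits=4):
    """Round w to the nearest fixed-point value with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(w * scale) / scale

# During quantization-aware training, the forward pass would use
# fake_quantize(w) while the optimizer updates a full-precision
# shadow copy of w (straight-through estimator).
weights = [0.337, -0.81, 0.062]
quantized = [fake_quantize(w) for w in weights]
```

Training against the quantized forward pass, rather than quantizing after training, is what lets the low-precision FPGA designs keep their hit ratio close to the full-precision baseline.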