Document Type

Dissertation

Degree

Doctor of Philosophy (PhD)

Major/Program

Electrical and Computer Engineering

First Advisor's Name

Gang Quan

First Advisor's Committee Title

Committee chair

Second Advisor's Name

Nezih Pala

Second Advisor's Committee Title

Committee member

Third Advisor's Name

Ou Bai

Third Advisor's Committee Title

Committee member

Fourth Advisor's Name

Kemal Akkaya

Fourth Advisor's Committee Title

Committee member

Fifth Advisor's Name

Deng Pan

Fifth Advisor's Committee Title

Committee member

Keywords

computer and systems architecture, electrical and electronics

Date of Defense

3-28-2022

Abstract

Autonomous vehicle (AV) technology, owing to its tremendous social and economic benefits, is expected to transform the world in the coming decades. However, significant technical challenges must still be overcome before AVs can be deployed safely, reliably, and at scale. Temperature plays a key role in the safety and reliability of an AV, not only because a vehicle is subjected to extreme operating temperatures but also because growing computational demands require more powerful IC chips, which can lead to higher operating temperatures and larger thermal gradients. In particular, artificial intelligence (AI), the underpinning technology for AVs, requires substantially more computation and memory resources; these requirements have grown exponentially in recent years and have further exacerbated the thermal problems. High operating temperatures and large thermal gradients can reduce performance, degrade reliability, and even cause an IC to fail catastrophically. We believe that thermal issues must be addressed as an integral part of the design phase of an AV’s electronic control system (ECS). To this end, first, we study how to map vehicle applications to an ECS with a heterogeneous architecture so as to satisfy peak temperature constraints and optimize latency and system-level reliability. We present a mathematical programming model to bound the peak temperature of the ECS. We also develop an approach based on the genetic algorithm to bound the peak temperature under varying execution time scenarios and to optimize the system-level reliability of the ECS. We present several computationally efficient techniques for system-level mean-time-to-failure (MTTF) computation, which achieve a speed-up of several orders of magnitude over the state-of-the-art method. Second, we study the thermal impacts on AI techniques. Specifically, we study how thermally induced memory bit flips can affect the prediction accuracy of a deep neural network (DNN).
We develop a neuron-level analytical sensitivity estimation framework to quantify this impact and evaluate its effectiveness on popular DNN architectures. Third, we study the problem of incorporating thermal impacts into the mapping of DNN neuron parameters to memory banks in order to improve prediction accuracy. Based on our sensitivity metric, we develop a bin-packing-based approach to map DNN neuron parameters to memory banks with different temperature profiles. We also study the problem of identifying the optimal temperature profiles for memory systems that minimize the thermal impacts. We show that, for state-of-the-art DNNs, thermal-aware mapping of DNN neuron parameters to memory banks significantly improves prediction accuracy in the high-temperature range compared with thermal-unaware mapping.

Identifier

FIDC010501

ORCID

https://orcid.org/0000-0003-1995-6232

Previously Published In

A. S. Bankar, S. Sha, V. Chaturvedi and G. Quan, “Thermal Aware Lifetime Reliability Optimization for Automotive Distributed Computing Applications,” 2020 IEEE 38th International Conference on Computer Design (ICCD), Dec. 2020, pp. 498-505.

Rights Statement

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).