Efficient Acceleration of Deep Learning Inference on Resource-Constrained Edge Devices: A Review
Analog Devices (United States) · University of Missouri · +1 more institution
Abstract
The successful integration of deep neural networks (DNNs), or deep learning (DL), has resulted in breakthroughs in many areas. However, deploying these highly accurate models for data-driven, learned, automatic, and practical machine learning (ML) solutions to end-user applications remains challenging. DL algorithms are often computationally expensive and power-hungry, and they require large amounts of memory to process complex, iterative operations over millions of parameters. Hence, training and inference of DL models are typically performed on high-performance computing (HPC) clusters in the cloud. Data transmission to the cloud results in high latency, round-trip delay, security and privacy concerns, and the inability of…
Citation impact
- FWCI: 39.03
- Percentile: 100%
- References: 475
Authors
- 4
Topics & keywords
- Computer science
- Edge device
- Cloud computing
- Edge computing
- Software deployment
- Deep learning
- Inference
- Artificial intelligence