Accelerating Sparse Deep Neural Networks

Mishra, Asit; Latorre, Jorge Albericio; Pool, Jeff; Stosic, Darko; Stosic, Dusan; Venkatesh, Ganesh; Yu, Chong; Micikevicius, Paulius

arXiv:2104.08378 (cs)

[Submitted on 16 Apr 2021]

Abstract: As neural network model sizes have dramatically increased, so has the interest in various techniques to reduce their parameter counts and accelerate their execution. An active area of research in this field is sparsity: encouraging zero values in parameters, which can then be discarded from storage or computation. While most research focuses on high levels of sparsity, it remains challenging both to maintain model accuracy universally and to achieve significant speedups over modern matrix-math hardware. To make sparsity adoption practical, the NVIDIA Ampere GPU architecture introduces sparsity support in its matrix-math units, Tensor Cores. We present the design and behavior of Sparse Tensor Cores, which exploit a 2:4 (50%) sparsity pattern that leads to twice the math throughput of dense matrix units. We also describe a simple workflow for training networks that both satisfy the 2:4 sparsity pattern requirements and maintain accuracy, verifying it on a wide range of common tasks and model architectures. This workflow makes it easy to prepare accurate models for efficient deployment on Sparse Tensor Cores.
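To make the 2:4 pattern concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that "2:4" means at most two nonzero values in each group of four consecutive values, and it assumes that the values kept are the two with the largest magnitude; that selection rule is an illustrative choice that the abstract does not state. The names `prune_2_4` and `satisfies_2_4` are hypothetical.

```python
from typing import List


def prune_2_4(row: List[float]) -> List[float]:
    """Zero the two smallest-magnitude values in each group of four.

    Assumption: magnitude-based selection is used for illustration only;
    the abstract does not specify how the kept values are chosen.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned: List[float] = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Positions of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def satisfies_2_4(row: List[float]) -> bool:
    """Return True if every group of four has at most two nonzero values."""
    if len(row) % 4 != 0:
        return False
    return all(
        sum(1 for v in row[start:start + 4] if v != 0) <= 2
        for start in range(0, len(row), 4)
    )


weights = [0.1, -0.9, 0.3, 0.7, 1.0, 0.0, -2.0, 0.5]
sparse = prune_2_4(weights)
print(sparse)                 # [0.0, -0.9, 0.0, 0.7, 1.0, 0.0, -2.0, 0.0]
print(satisfies_2_4(sparse))  # True
```

Half of the values in each group are zero after pruning, which corresponds to the 50% sparsity figure quoted in the abstract.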

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Hardware Architecture (cs.AR)
Cite as:	arXiv:2104.08378 [cs.LG]
	(or arXiv:2104.08378v1 [cs.LG] for this version)
	http://doi.org/10.48550/arXiv.2104.08378

Submission history

From: Jeff Pool
[v1] Fri, 16 Apr 2021 21:27:32 UTC (375 KB)
