Abstract: This work presents TD-NAAS, a template-based differentiable approach to the co-design of neural networks and hardware accelerators. Each neural operator is paired with the hardware block that executes it most efficiently; this pairing is called a template. The approach reduces the search space by eliminating hardware design parameters and guarantees the most efficient accelerator. Evaluation results show that our method builds a neural network with higher accuracy and an accelerator with lower latency than existing works.