Abstract: In this work we propose an adaptive and scalable hardware implementation of convolutional neural networks (CNNs). The adaptive hardware model is the result of a design loop that starts with a software implementation relying on standard scanning-window and multiply-accumulate (MAC) operations. This design is developed into a deterministic, hardware-friendly model that introduces timing, fixed-point representation and a pixel-streaming interface. Finally, HDL code is generated and an RTL description of the system is created. Each step is analyzed and validated against pre-set objectives, using the preceding step as a golden reference. The proposed system is capable of selectively executing different data-paths, which allows real-time trade-offs between accuracy on the one hand and execution time and power on the other. This is achieved by implementing the CNN as a sequence of layer blocks. Layer blocks can effectively be considered standalone networks of differing complexity: each layer block branches off into an output that is independent of the block that follows it, so the system can execute partially or fully according to performance requirements. Results show that this reconfigurable model trades accuracy for gains of 50% and 70% in speed and power, respectively.
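The partial-execution idea can be illustrated with a minimal software sketch. It is not the proposed hardware design: the names `LayerBlock`, `run_blocks`, `features`, `branch` and `depth` are hypothetical, and the toy stages stand in for real convolutional layers. It only shows how a chain of blocks, each with its own output branch, can be run to a chosen depth.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]


@dataclass
class LayerBlock:
    # Hypothetical stand-ins for illustration only:
    # `features` produces the data passed on to the next block,
    # `branch` is this block's own output head, independent of later blocks.
    features: Callable[[Vector], Vector]
    branch: Callable[[Vector], float]


def run_blocks(blocks: Sequence[LayerBlock], x: Vector, depth: int) -> float:
    """Run the first `depth` blocks, then read the output branch of the last one run."""
    if not 1 <= depth <= len(blocks):
        raise ValueError("depth must be between 1 and len(blocks)")
    for block in blocks[:depth]:
        x = block.features(x)
    # Blocks beyond `depth` are never evaluated, which is where the
    # time and power savings of partial execution would come from.
    return blocks[depth - 1].branch(x)


def double(v: Vector) -> Vector:
    """Toy feature stage (assumption): scales every element by two."""
    return [2.0 * a for a in v]


# Two toy blocks that share the same feature stage and use `sum` as the branch.
toy_blocks = [LayerBlock(double, sum), LayerBlock(double, sum)]

shallow = run_blocks(toy_blocks, [1.0, 2.0], depth=1)  # partial execution
full = run_blocks(toy_blocks, [1.0, 2.0], depth=2)     # full execution
```

In this sketch the `depth` argument plays the role of the performance requirement: a smaller value skips the later blocks entirely and returns the earlier branch output.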