Abstract: Many circuit analysis workloads incorporate complex execution logic under dynamic control flow, such as branch-and-bound techniques, on-the-fly pruning, and recursive decomposition strategies. Parallelizing these workloads can benefit from exploiting dynamic task graph parallelism across arbitrary decision-making points at runtime. A recent research paper, AsyncTask, introduced a new programming model that supports the dynamic construction of a computational task graph. Unlike traditional construct-and-run programming models, AsyncTask gives programmers great flexibility to parallelize large-scale circuit analysis workloads that are extremely sparse, irregular, and control-flow intensive. To leverage the power of dynamic task parallelism, AsyncTask users are responsible for creating tasks in a valid topological order. This paper conducts an experimental study of how different topological orders of tasks affect runtime on large-scale static timing analysis workloads using AsyncTask. Our results highlight the need for a new technique that derives a valid topological order yielding better runtime performance than heuristic-based sorting algorithms for large-scale real-world circuit analysis applications.
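To illustrate what "different valid topological orders" means, the following minimal Python sketch applies Kahn's algorithm to a small dependency graph with two different tie-breaking rules. It is not AsyncTask's API and not the method studied in this paper; the function names, the `priority` parameter, and the five-task graph are all hypothetical and chosen only for illustration.

```python
import heapq

def topological_order(num_tasks, edges, priority):
    """Kahn's algorithm. Among tasks whose dependencies are all satisfied,
    the task with the smallest priority(task) value is emitted first."""
    successors = [[] for _ in range(num_tasks)]
    indegree = [0] * num_tasks
    for u, v in edges:  # edge (u, v): task u must be created before task v
        successors[u].append(v)
        indegree[v] += 1

    ready = [(priority(t), t) for t in range(num_tasks) if indegree[t] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, task = heapq.heappop(ready)
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, (priority(nxt), nxt))

    if len(order) != num_tasks:
        raise ValueError("dependency graph contains a cycle")
    return order

def is_valid_order(order, edges):
    """Check that every task appears after all of its dependencies."""
    position = {task: i for i, task in enumerate(order)}
    return all(position[u] < position[v] for u, v in edges)

# Hypothetical 5-task graph: 0 -> 2, 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4
EDGES = [(0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]

order_a = topological_order(5, EDGES, priority=lambda t: t)   # smallest id first
order_b = topological_order(5, EDGES, priority=lambda t: -t)  # largest id first
```

Both `order_a` and `order_b` satisfy every dependency, yet they create the tasks in different sequences. The tie-breaking rule is the kind of heuristic choice whose runtime impact this study examines.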