Abstract: Ubiquitous computing involves a large number of devices connected via networks. This requires packet processing services that guarantee privacy, security, and high quality. We study how to provide ubiquitous computing with stable and satisfactory services by improving packet processing performance. As applications become increasingly complex, allocating tasks among multiple cores in a pipelined architecture becomes both important and difficult. To map tasks onto a pipelined architecture and maximize overall throughput, we propose a task allocation scheme that incorporates profiling and global thread refinement. The scheme relies on a performance model that determines system throughput by accounting for multithreading, memory access, and the effect of communication between stages. We evaluate the technique by implementing representative network processing applications on the Intel IXP architecture. Experimental results show that our scheme generates mappings of realistic applications that balance the pipeline stages and achieve high throughput. Furthermore, it outperforms other methods even when the number of PEs is reduced.