Draft submission for review (MLS Group Project 24/25)

Published: 09 Apr 2025, Last Modified: 09 Apr 2025 | OpenReview Archive Direct Upload | Everyone | CC0 1.0
Abstract: This paper presents a comparative analysis of CPU and GPU performance for information retrieval tasks. We focus on two areas: query-to-vector search using exact k-nearest neighbours (kNN) and approximate nearest neighbours (ANN), and model serving systems, where we test the effectiveness of queueing and batching for efficient real-world deployment. We evaluate both architectures across a range of retrieval workloads. Our findings highlight the GPU's advantages in parallelism and high-throughput computation, especially for dense retrieval and large-scale model serving, while CPUs demonstrate competitive performance for low-latency, memory-efficient tasks.