Abstract: Resource provisioning is vital in large-scale, geographically distributed, and hierarchically organized infrastructures, and it is also one of the hardest challenges in their management. The goal is to allocate infrastructure resources to jobs optimally, meeting the jobs' Service Level Objectives (SLOs) while retaining high resource utilization across the entire resource pool. In this context, accurate workload profiling is crucial to achieving optimal resource management because it gives the system more context about each job. However, existing approaches either make static guesses or rely on runtime profiling, which may be delayed by sandbox testing, and they fall short of providing fast and accurate information. We aim to overcome these challenges with a novel profiling approach and methodology, the PolarisProfiler. We discard the consistency assumptions and adopt a broader, less influenced perspective. We use a priori available, static metadata to enable generic and immediate job profiling based on historic execution traces. The PolarisProfiler proposes a novel dynamic profiling model, a generic workload profile generator, and a metadata-based profile classifier. We illustrate the practical feasibility of our approach by evaluating the PolarisProfiler in a case study that targets machine learning workloads, using a publicly available dataset from Alibaba. We offer a reference implementation of our profiling methodology that combines a density-based hierarchical clustering technique with an interpretable decision-tree model for the classifier. We test the PolarisProfiler on job duration estimation. Despite relying solely on static, a priori metadata, we obtain convincing results compared to the state of the art, yielding an estimation error rate of 5% for 80% of the profiled jobs.
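As a reading aid for the headline figure, the following minimal Python sketch shows one way an "error rate for a given share of jobs" could be computed. It assumes the error is the per-job relative error of the estimated duration and that the 80% figure is a nearest-rank percentile over those errors; the abstract does not confirm either definition. The function names and the duration values are hypothetical and invented purely for illustration; they are not results from the case study.

```python
import math


def relative_errors(estimated, actual):
    """Per-job relative error |est - act| / act (an assumed error definition)."""
    return [abs(e - a) / a for e, a in zip(estimated, actual)]


def error_at_coverage(errors, coverage):
    """Smallest error bound that holds for at least `coverage` of the jobs."""
    ranked = sorted(errors)
    # Nearest-rank percentile: number of jobs that must fall inside the bound.
    k = max(1, math.ceil(coverage * len(ranked)))
    return ranked[k - 1]


# Hypothetical job durations in seconds, made up for this example only.
actual = [100.0, 200.0, 400.0, 800.0, 1000.0]
estimated = [102.0, 196.0, 410.0, 760.0, 1500.0]

errors = relative_errors(estimated, actual)
bound = error_at_coverage(errors, 0.80)
print(f"80% of jobs have a relative error of at most {bound:.1%}")
```

Reporting the error at a fixed coverage level, rather than the mean, keeps a few badly estimated jobs (such as the last one in the toy data) from dominating the summary.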