Empirical Training Time Prediction for LLM Fine-Tuning Using Scaling Laws

Published: 21 May 2025, Last Modified: 17 Jun 2025 · MLArchSys 2025 Oral · CC BY 4.0
Presentation: In-Person
Keywords: LLMs, Fine-Tuning, Time Prediction, Scaling Laws, Benchmarking, Gradient Boosting, GPU, Parallelism
Presenter Full Name: Chianing Wang
TL;DR: A two-stage approach that pairs a Scaling Laws Smart Tuning (SLST) sampling algorithm with a scaling-law-plus-Gradient-Boosting model to accurately predict LLM fine-tuning time and cost.
Presenter Email: johnny.wang@toyota.com
Abstract: This paper presents a methodology to efficiently estimate the training time and associated computational cost of fine-tuning Large Language Models (LLMs). Our approach introduces a novel two-stage methodology: a Scaling Laws Smart Tuning (SLST) algorithm for efficient sample data collection, and a time prediction model that combines scaling laws with Gradient Boosting. The scaling laws capture broad training trends with respect to model parameters and dataset sizes, while the Gradient Boosting models reduce residual errors by capturing complex nonlinear relationships directly from the data. Through this integrated approach, we achieve high accuracy in training time predictions, which can significantly enhance resource planning and infrastructure decision-making. Our results demonstrate the effectiveness of the methodology, which balances interpretability and predictive accuracy, and highlight its scalability across various computational and parallelism environments.
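To make the two-stage idea concrete, below is a minimal sketch (not the paper's implementation): a power-law scaling trend fitted in log space, followed by a Gradient Boosting model on the residuals. The power-law form T ≈ a·N^α·D^β, the feature set (GPU count as a stand-in for the parallelism configuration), and all toy numbers are illustrative assumptions; the SLST sampling step is not shown.

```python
# Minimal sketch of a scaling-law trend plus Gradient Boosting residual model.
# Functional form, features, and data are assumptions, not the paper's code.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Toy benchmark samples: model size N (billions of params), dataset size D
# (billions of tokens), GPU count, and measured fine-tuning time T (hours).
N = np.array([0.5, 1.3, 2.7, 7.0, 13.0])
D = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
gpus = np.array([1, 1, 2, 4, 8])
T = np.array([0.8, 2.5, 7.0, 30.0, 110.0])

# Stage 1: scaling-law trend, assumed form T ~ a * N^alpha * D^beta.
# In log space this is linear: log T = log a + alpha*log N + beta*log D.
A = np.column_stack([np.ones_like(N), np.log(N), np.log(D)])
coef, *_ = np.linalg.lstsq(A, np.log(T), rcond=None)
log_a, alpha, beta = coef
trend = np.exp(log_a) * N**alpha * D**beta

# Stage 2: Gradient Boosting on the residuals, with GPU count added as an
# extra (assumed) feature describing the compute/parallelism setup.
X = np.column_stack([N, D, gpus])
residual_model = GradientBoostingRegressor(n_estimators=200, max_depth=2)
residual_model.fit(X, T - trend)

def predict_time(n, d, g):
    """Predicted fine-tuning time = scaling-law trend + learned residual."""
    base = np.exp(log_a) * n**alpha * d**beta
    return float(base + residual_model.predict(np.array([[n, d, g]]))[0])

print(f"Predicted hours (7B params, 8B tokens, 4 GPUs): {predict_time(7.0, 8.0, 4):.1f}")
```

The split mirrors the interpretability/accuracy balance described in the abstract: the fitted exponents stay readable as a scaling law, while the boosted trees absorb nonlinear effects the power law misses.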
Presenter Bio: Chianing Wang is a Principal Researcher at Toyota InfoTech Labs USA and a PhD candidate in Computer Science and Engineering at Santa Clara University. His research primarily focuses on AI infrastructure, GPU parallelism, and optimizing distributed computing frameworks for large-scale machine learning workloads. Chianing’s work aims to enhance computational efficiency and scalability in next-generation AI systems.
Paper Checklist Guidelines: I certify that all co-authors have validated the presented results and conclusions, and have read and commit to adhering to the Paper Checklist Guidelines, Call for Papers and Publication Ethics.
YouTube Link: https://youtu.be/MEjHJ2aaoS0
YouTube Link Poster: No Poster
Dataset Release: I certify that all co-authors commit to release the dataset and necessary scripts to reproduce the presented results.
Google Slides: https://docs.google.com/presentation/d/1HlGeKqSzuBrlQhY9GLXOSx3e3y3RhveZ/edit?usp=sharing&ouid=106046921967000875591&rtpof=true&sd=true
Poster: No
Workshop Registration: Yes, the presenter has registered for the workshop.
YouTube Link Short: TBD
Submission Number: 13