Global Safe Sequential Learning via Efficient Knowledge Transfer

TMLR Paper3336 Authors

13 Sept 2024 (modified: 20 Nov 2024) | Under review for TMLR | CC BY 4.0
Abstract: Sequential learning methods, such as active learning and Bayesian optimization, aim to select the most informative data for task learning. In many applications, however, data selection is constrained by unknown safety conditions, motivating the development of safe learning approaches. A promising line of safe learning methods uses Gaussian processes to model safety conditions, restricting data selection to areas with high safety confidence. However, these methods are limited to local exploration around an initial seed dataset, as safety confidence centers around observed data points. As a consequence, task exploration is slowed down and safe regions disconnected from the initial seed dataset remain unexplored. In this paper, we propose safe transfer sequential learning to accelerate task learning and to expand the explorable safe region. By leveraging abundant offline data from a related source task, our approach guides exploration in the target task more effectively. We also provide a theoretical analysis to explain why single-task methods cannot cope with disconnected regions. Finally, we introduce a computationally efficient approximation of our method that reduces runtime through pre-computations. Our experiments demonstrate that this approach, compared to state-of-the-art methods, learns tasks with lower data consumption and enhances global exploration across multiple disjoint safe regions, while maintaining comparable computational efficiency.
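The locality issue described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a generic single-task, zero-mean GP with a lower-confidence-bound safety rule, and the kernel, lengthscale, threshold, `beta`, and the toy safety function `toy_safety` are all assumptions chosen for illustration only.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.3):
    """Unit-variance squared-exponential kernel between 1-D arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, z_train, x_pool, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_pool."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_pool)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, z_train))
    v = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    # Prior variance is 1 for the unit-variance kernel above.
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 0.0, None)
    return mean, np.sqrt(var)

def safe_set(x_train, z_train, x_pool, threshold=0.0, beta=2.0):
    """Mark pool points whose lower confidence bound clears the threshold."""
    mean, std = gp_posterior(x_train, z_train, x_pool)
    return mean - beta * std >= threshold

def toy_safety(x):
    """Hypothetical safety function: safe (>= 0) near 0 and again near |x| = 2."""
    return np.cos(np.pi * x)

# Seed data lies only in the central safe region.
x_seed = np.array([-0.2, 0.0, 0.2])
x_pool = np.linspace(-2.0, 2.0, 201)
mask = safe_set(x_seed, toy_safety(x_seed), x_pool)

far = np.abs(x_pool) > 1.6  # truly safe, but disconnected from the seed
print("certified safe near the seed:", bool(mask[np.abs(x_pool) < 0.1].all()))
print("certified safe in the far region:", bool(mask[far].any()))
```

In this toy setting the far region is truly safe, yet the GP reverts to its prior there, so the confidence bound never certifies it. This is the kind of gap that, per the abstract, source-task data is meant to close.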
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=BDNhQE7Hg9
Changes Since Last Submission:
## Rebuttal change:
We revise the content and reorganize our sections, as described below. In addition to the main file, we attach **diff_latexdiff.pdf**, a modification track generated by the latexdiff tool, and **diff_highlight.pdf**, which marks content-wise modifications in color. In diff_highlight.pdf, sentences and paragraphs that are only restructured are ignored (latexdiff shows swapped text as updates). Note that in diff_highlight.pdf the algorithms are revised but not colored, to avoid LaTeX rendering errors.

In summary:
- We significantly reorganize the sections and paragraphs to improve the flow, particularly sections 2 to 7.
- We collect all modeling assumptions and describe them in the background section 3.
- We modify the algorithms to clarify inputs, assumptions, computed quantities and outputs. We also separate alg. 1 and alg. 2, so that alg. 1 is now conventional safe AL and alg. 2 is safe AL with transfer GPs without source pre-computation (alg. 3 is safe transfer AL with pre-computation).
- We reduce the burden of mathematical expressions and add more descriptive statements and intuition; this mainly affects sections 3 to 6.
- We move the original appendix figure 6 (the one illustrating that transfer learning identifies new safe regions) to our theoretical section (section 5, figure 3) to strengthen our argument. We add corollary 5.3 to help clarify our theoretical result.
- We add reference lines to figures 2 & 3 to improve clarity (the functions and data are the same; theoretical illustration, section 5).
- We restructure the presentation of our experiments, and we add table 2 as an overview of all our datasets. The experiment subsections are now grouped according to different empirical metrics; all the datasets, methods, metrics and results are identical.
- In the appendix, we add a section that describes in detail the extension of multitask GPs to multiple source tasks.
------
## Resubmit change:
We are glad that you allow us to resubmit our paper. We propose a novel algorithm for safe active transfer learning that allows for faster exploration than the previous state of the art. During our previous submission, reviewers considered this as "relevant to the TMLR audience" and noted that it "brings new ideas for transferring safety constraints". The major criticism was the low dimensionality (up to 3) as well as the shortage of real-world experiments. In this resubmission, we have addressed these concerns by adding a 13-dimensional real-world problem, on which our framework once more shows superiority compared to the relevant benchmarks. The results can be found in section 5.2, table 3 and figures 4-5, with further details in appendix D, appendix fig. 9 and appendix table 4.

Additionally, we addressed further reviewer remarks:
- We clarified that dependency of safety functions does NOT affect our theoretical arguments (see modified Remark 4.1), as Rev 87Xf was worried that our independence assumption could be limiting.
- We explained how the discretization in constrained acquisition optimization impacts the complexity (modified Remark 3.2), in response to Rev 87Xf.
- We extended the conclusion with a discussion of sparse GPs for safe AL/BO (this is itself an open line of research, not trivially applicable to our framework).

We also fix typos and slightly adjust notation to improve readability. We further attach an edit-track PDF in the supplementary material (anonymized, generated with the latexdiff tool).
Assigned Action Editor: ~Branislav_Kveton1
Submission Number: 3336