Adaptive Decision-Making for Optimization of Safety-Critical Systems: The ARTEO Algorithm

TMLR Paper1180 Authors

22 May 2023 (modified: 17 Sept 2024). Rejected by TMLR. License: CC BY 4.0

Abstract: Real-time decision-making in uncertain environments with safety constraints is a common problem in many business and industrial applications. In these problems, the general structure of the problem and some of the underlying relationships among the decision variables are often known, while other relationships are unknown but measurable subject to a certain level of noise. In this work, we develop the ARTEO algorithm by formulating such real-time decision-making problems as constrained mathematical programming problems, where we combine known structures involved in the objective function and constraint formulations with learned Gaussian process (GP) regression models. We then utilize the uncertainty estimates of the GPs to (i) enforce the resulting safety constraints within a confidence interval and (ii) make the cumulative uncertainty expressed in the decision variable space a part of the objective function to drive exploration for further learning, subject to the safety constraints. We demonstrate the safety and efficiency of our approach with two case studies: optimization of electric motor current and real-time bidding problems. We further evaluate the performance of ARTEO compared to other methods that rely entirely on GP-based safe exploration and optimization. The results indicate that ARTEO benefits from the incorporation of prior knowledge into the optimization problems and leads to lower cumulative regret while ensuring the satisfaction of the safety constraints.
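The two uses of GP uncertainty described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the RBF kernel, the grid search over candidates, and the names `select_decision`, `known_cost`, `limit`, `beta`, and `z` are all assumptions made for illustration. The sketch discards candidates whose upper confidence bound violates a safety limit, then rewards posterior uncertainty in the objective so that exploration happens only within the estimated safe set.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    prior_var = rbf_kernel(x_query, x_query).diagonal()
    var = prior_var - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def select_decision(x_train, y_train, candidates, known_cost,
                    limit, beta=2.0, z=1.0):
    """Pick the candidate minimizing cost minus an uncertainty bonus,
    restricted to candidates that look safe under the GP.

    known_cost(x, mean) is a hypothetical stand-in for the known part
    of the objective; limit, beta and z are illustrative parameters.
    """
    mean, std = gp_posterior(x_train, y_train, candidates)
    # (i) Safety: the upper confidence bound must stay below the limit.
    safe = mean + beta * std <= limit
    if not safe.any():
        return None  # no candidate can be certified safe
    # (ii) Exploration: reward uncertainty with weight z.
    objective = known_cost(candidates, mean) - z * std
    objective = np.where(safe, objective, np.inf)
    return candidates[np.argmin(objective)]

# Usage: the unknown quantity was measured at two points; track a target of 0.8.
x_train = np.array([0.0, 0.5])
y_train = np.array([0.0, 0.5])
candidates = np.linspace(0.0, 3.0, 31)
cost = lambda x, m: (m - 0.8) ** 2
x_next = select_decision(x_train, y_train, candidates, cost, limit=1.0)
```

Setting `z` to zero reduces the sketch to pure exploitation under the safety constraint, while a larger `z` pushes the decision toward less-explored but still certifiably safe candidates.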

Submission Length: Long submission (more than 12 pages of main content)

Changes Since Last Submission:

In response to the valuable feedback from reviewers, we have made focused revisions to Sections 2 and 3 of our manuscript. In summary, the amendments in this revision are as follows:

Problem formulation: Clarified the time-dependent behaviour of $C_t$ and redefined its inputs to align with the total number of outputs from relevant functions. Introduced a vector function $v(x)$ to clarify inputs to $C_t$ and safety constraints.
Time dependence and assumptions: Resolved contradictions regarding what is known about the time dependence of $C_t$ and $g_{a,t}$, emphasizing uniform application of assumptions.
Generality of problem formulation: Added discussions in Appendix B about extending our approach to scenarios with multiple black-box functions and safety constraints.
Related work expansion and clarification: Broadened the discussion on Safe Bayesian Optimization and Time-Varying Bayesian Optimization. Clarified how our approach differs with regard to the retention of past data.
Clarification on oracle usage: Revised the term "oracle" to reflect practical aspects of obtaining noisy observations from experiments.

These amendments aim to enhance the clarity, methodology, and justification of our problem formulation, and respond directly to the reviewers' concerns for improved overall manuscript quality.

We sincerely hope that these changes adequately address the insightful concerns raised by the reviewers and improve the quality of our manuscript.

Assigned Action Editor: Pascal Poupart

Submission Number: 1180
