['145a146,154', '> Section: Limitations', '> Our work provides a significant step towards understanding the computational sample complexity of learning margin halfspaces with Massart noise. However, it is important to acknowledge certain limitations and assumptions:', '> \\begin{itemize}', '>     \\item \\textbf{Margin Assumption:} A core assumption of our theoretical results (Theorem 1.3 and Theorem 2.1) is the existence of a $\\gamma$-margin halfspace. While this is a standard assumption in many learning settings, extending our near-optimal results to general halfspaces (i.e., without the margin assumption) remains an open and challenging problem. Our current approach, while potentially adaptable, would yield a suboptimal dependence on the dimension $d$.', '>     \\item \\textbf{Massart Noise Rate:} Our analysis is predicated on the $\\eta$-Massart noise condition with $\\eta < 1/2$. While this is a common and practical noise model, the behavior and algorithmic efficiency in scenarios with higher noise rates (e.g., adversarial noise or $\\eta \\ge 1/2$) are not covered by our current framework.', '>     \\item \\textbf{Computational Complexity:} While our algorithm achieves a sample complexity of $\\tilde{O}(1/(\\epsilon^2 \\gamma^2))$ and runs in polynomial time, specifically $\\tilde{O}(dn/\\epsilon)$ or $O(dNT)$ depending on the step, the implicit constants and specific polynomial dependencies might be large for extremely high-dimensional settings or very small $\\epsilon$ or $\\gamma$. Further fine-grained analysis of the constant factors could be a direction for future work.', '>     \\item \\textbf{Theoretical Nature:} This paper is theoretical and does not include empirical evaluations. 
While our results provide strong theoretical guarantees, practical performance and robustness to real-world data characteristics (e.g., non-uniform distributions, feature correlations not captured by the unit ball assumption) would need to be verified through experiments.', '> \\end{itemize}', '> ', '230a240,256', '> Section: Broader Impacts', '> Our work is theoretical in nature, providing foundational algorithmic results for learning under noise. As such, it does not have immediate direct societal applications or risks in its current form. However, as with any advancement in machine learning theory, there are potential broader impacts once these algorithms are integrated into real-world systems.', '> ', '> \\textbf{Positive Impacts:}', '> \\begin{itemize}', '>     \\item \\textbf{Improved Robustness:} By providing more efficient and robust algorithms for learning halfspaces under Massart noise, our work contributes to building more reliable and accurate machine learning models, especially in scenarios where label noise is prevalent. This could lead to better performance in critical applications where data quality is a concern.', '>     \\item \\textbf{Foundation for Future Research:} The theoretical insights and techniques developed in this paper could serve as a foundation for future research in robust learning, potentially leading to more general algorithms that are less sensitive to various forms of noise and adversarial attacks.', '> \\end{itemize}', '> ', '> \\textbf{Potential Negative Impacts and Ethical Considerations:}', '> \\begin{itemize}', '>     \\item \\textbf{Fairness and Bias Amplification:} While our work does not directly address fairness, robust learning algorithms, when applied to sensitive domains (e.g., credit scoring, hiring, criminal justice), could inadvertently amplify existing biases in the data if not carefully designed and audited. 
If the noise distribution itself is biased or if the underlying data reflects societal inequalities, a more efficient learner might propagate these issues more effectively.', '>     \\item \\textbf{Misuse in Surveillance and Profiling:} Halfspaces are fundamental building blocks for classification. Improved efficiency in learning them, even under noise, could theoretically be misused in applications like surveillance, profiling, or targeted manipulation, where accurate classification of individuals might raise privacy concerns.', '>     \\item \\textbf{Data Privacy:} Our theoretical framework assumes access to data samples. In practical deployments, the collection and use of such data must adhere to strict privacy regulations and ethical guidelines. Our work does not introduce new privacy risks but relies on the responsible handling of data in any potential application.', '> \\end{itemize}', '> We emphasize that these are potential downstream impacts, and the responsibility for ethical deployment lies with practitioners who adapt and apply these theoretical advancements.', '> ', '344c370', '< Caption: The answer NA means that the abstract and introduction do not include the claims made in the paper. • The abstract and/or introduction should clearly state the claims made, including the contributions made in the paper and important assumptions and limitations. A No or NA answer to this question will not be perceived well by the reviewers. • The claims made should match theoretical and experimental results, and reflect how much the results can be expected to generalize to other settings. • It is fine to include aspirational goals as motivation as long as it is clear that these goals are not attained by the paper. 2. Limitations Question: Does the paper discuss the limitations of the work performed by the authors? 
Answer: [Yes] Justification: The limitations are clearly stated in the statements of each theorem and are discussed in the introduction of the paper.', '---', '> Caption: The answer NA means that the abstract and introduction do not include the claims made in the paper. • The abstract and/or introduction should clearly state the claims made, including the contributions made in the paper and important assumptions and limitations. A No or NA answer to this question will not be perceived well by the reviewers. • The claims made should match theoretical and experimental results, and reflect how much the results can be expected to generalize to other settings. • It is fine to include aspirational goals as motivation as long as it is clear that these goals are not attained by the paper. 2. Limitations Question: Does the paper discuss the limitations of the work performed by the authors? Answer: [Yes] Justification: The limitations of our work, including key assumptions and areas for future extension, are explicitly discussed in a dedicated "Limitations" section.', '349c375', '< Caption: Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [NA] Justification: The paper is theoretical in nature and does not include experiments. Guidelines: • The answer NA means that paper does not include experiments requiring code. • Please see the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details. • While we encourage the release of code and data, we understand that this might not be possible, so "No" is an acceptable answer. Papers cannot be rejected simply for not including code, unless this is central to the contribution (e.g., for a new open-source benchmark). • The instructions should contain the exact command and environment needed to run to reproduce the results. 
See the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details. • The authors should provide instructions on data access and preparation, including how to access the raw data, preprocessed data, intermediate data, and generated data, etc. • The authors should provide scripts to reproduce all experimental results for the new proposed method and baselines. If only a subset of experiments are reproducible, they should state which ones are omitted from the script and why. • At submission time, to preserve anonymity, the authors should release anonymized versions (if applicable). • Providing as much information as possible in supplemental material (appended to the paper) is recommended, but including URLs to data and code is permitted. 6. Experimental Setting/Details Question: Does the paper specify all the training and test details (e.g., data splits, hyper-The factors of variability that the error bars are capturing should be clearly stated (for example, train/test split, initialization, random drawing of some parameter, or overall run with given experimental conditions). • The method for calculating the error bars should be explained (closed form formula, call to a library function, bootstrap, etc.) • The assumptions made should be given (e.g., Normally distributed errors). • It should be clear whether the error bar is the standard deviation or the standard error of the mean. • It is OK to report 1-sigma error bars, but one should state it. The authors should preferably report a 2-sigma error bar than state that they have a 96% CI, if the hypothesis of Normality of errors is not verified. • For asymmetric distributions, the authors should be careful not to show in tables or figures symmetric error bars that would yield results that are out of range (e.g. negative error rates). 
• If error bars are reported in tables or plots, The authors should explain in the text how they were calculated and reference the corresponding figures or tables in the text. 8. Experiments Compute Resources Question: For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [NA] Justification: The paper is theoretical in nature and does not include experiments. Guidelines: • The answer NA means that the paper does not include experiments. • The paper should indicate the type of compute workers CPU or GPU, internal cluster, or cloud provider, including relevant memory and storage. • The paper should provide the amount of compute required for each of the individual experimental runs as well as estimate the total compute. • The paper should disclose whether the full research project required more compute than the experiments reported in the paper (e.g., preliminary or failed experiments that didn\'t make it into the paper). The answer NA means that the authors have not reviewed the NeurIPS Code of Ethics. • If the authors answer No, they should explain the special circumstances that require a deviation from the Code of Ethics. • The authors should make sure to preserve anonymity (e.g., if there is a special consideration due to laws or regulations in their jurisdiction). 10. Broader Impacts Question: Does the paper discuss both potential positive societal impacts and negative societal impacts of the work performed? Answer: [NA] Justification: The work is theoretical and we do not see any major or immediate implications on society.', '---', '> Caption: Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [NA] Justification: The paper is theoretical in nature and does not include experiments. 
Guidelines: • The answer NA means that paper does not include experiments requiring code. • Please see the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details. • While we encourage the release of code and data, we understand that this might not be possible, so "No" is an acceptable answer. Papers cannot be rejected simply for not including code, unless this is central to the contribution (e.g., for a new open-source benchmark). • The instructions should contain the exact command and environment needed to run to reproduce the results. See the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details. • The authors should provide instructions on data access and preparation, including how to access the raw data, preprocessed data, intermediate data, and generated data, etc. • The authors should provide scripts to reproduce all experimental results for the new proposed method and baselines. If only a subset of experiments are reproducible, they should state which ones are omitted from the script and why. • At submission time, to preserve anonymity, the authors should release anonymized versions (if applicable). • Providing as much information as possible in supplemental material (appended to the paper) is recommended, but including URLs to data and code is permitted. 6. Experimental Setting/Details Question: Does the paper specify all the training and test details (e.g., data splits, hyper-The factors of variability that the error bars are capturing should be clearly stated (for example, train/test split, initialization, random drawing of some parameter, or overall run with given experimental conditions). • The method for calculating the error bars should be explained (closed form formula, call to a library function, bootstrap, etc.) • The assumptions made should be given (e.g., Normally distributed errors). 
• It should be clear whether the error bar is the standard deviation or the standard error of the mean. • It is OK to report 1-sigma error bars, but one should state it. The authors should preferably report a 2-sigma error bar than state that they have a 96% CI, if the hypothesis of Normality of errors is not verified. • For asymmetric distributions, the authors should be careful not to show in tables or figures symmetric error bars that would yield results that are out of range (e.g. negative error rates). • If error bars are reported in tables or plots, The authors should explain in the text how they were calculated and reference the corresponding figures or tables in the text. 8. Experiments Compute Resources Question: For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [NA] Justification: The paper is theoretical in nature and does not include experiments. Guidelines: • The answer NA means that the paper does not include experiments. • The paper should indicate the type of compute workers CPU or GPU, internal cluster, or cloud provider, including relevant memory and storage. • The paper should provide the amount of compute required for each of the individual experimental runs as well as estimate the total compute. • The paper should disclose whether the full research project required more compute than the experiments reported in the paper (e.g., preliminary or failed experiments that didn\'t make it into the paper). The answer NA means that the authors have not reviewed the NeurIPS Code of Ethics. • If the authors answer No, they should explain the special circumstances that require a deviation from the Code of Ethics. • The authors should make sure to preserve anonymity (e.g., if there is a special consideration due to laws or regulations in their jurisdiction). 10. 
Broader Impacts Question: Does the paper discuss both potential positive societal impacts and negative societal impacts of the work performed? Answer: [Yes] Justification: We have included a dedicated "Broader Impacts" section to discuss the potential positive contributions and possible negative societal implications and ethical considerations of our theoretical work, acknowledging that foundational research can have downstream effects.', '471d496', '< ']
