{
    "negative": [
        "The authors present an interesting intuition for aligning latent spaces across modalities, but the method itself appears underdeveloped. The loss function combines several heuristic terms whose interactions are not well analyzed, and training stability seems to be an issue according to the authors’ own results. Several hyperparameters are tuned per dataset, which undermines claims of generality. Furthermore, the evaluation benchmarks are outdated, omitting more recent multimodal datasets where the method’s limitations might be more apparent.",
        "The paper proposes a novel regularization framework for training deep networks under distribution shift, but the technical contribution is ultimately limited. The core idea reduces to a reweighting of gradients using an uncertainty proxy that is closely related to existing variance-based methods, yet the authors do not clearly articulate what is fundamentally new. Several key claims, such as improved robustness to covariate shift, are only supported by small-scale experiments on CIFAR variants, and the reported gains are within standard deviation of strong baselines. Moreover, the theoretical section relies on assumptions (e.g., bounded Hessians and oracle access to uncertainty estimates) that are unrealistic in modern deep learning settings, making the practical relevance of the analysis questionable.",
        "This submission introduces a transformer-based architecture for graph reasoning, but similar hybrid designs have already been explored extensively. The novelty claim hinges on a specific message-passing schedule, yet no comparison is made to closely related graph transformers using alternative schedules. Additionally, the computational cost is significantly higher than baselines, and the authors do not report wall-clock times or memory usage. Without a compelling accuracy–efficiency trade-off or a stronger theoretical justification, it is hard to see the impact of this work.",
        "The theoretical contribution of the paper is a convergence proof for a non-convex optimization scheme, but the proof largely follows known techniques and does not yield new insights. Key lemmas are adaptations of existing results, and the final theorem holds only under restrictive learning rate schedules that are not used in practice. As a result, the gap between theory and experiments remains large. I do not believe this theoretical analysis alone is sufficient to warrant acceptance.",
        "Although the paper is well written, the experimental evidence does not convincingly support the stated conclusions. Many baselines are either weak or improperly tuned, and in some cases the proposed model underperforms standard methods when evaluated under slightly different metrics. The authors claim state-of-the-art results, but this relies on selective reporting of metrics and datasets. The absence of statistical significance testing further weakens the empirical claims.",
        "While the motivation of bridging representation learning and causal invariance is appealing, the proposed method lacks sufficient empirical validation to justify acceptance. The experiments are restricted to synthetic causal graphs and a single real-world dataset, and ablations are superficial. In particular, it is unclear whether performance improvements stem from the causal objective or simply from additional regularization. The writing is generally clear, but important implementation details (e.g., how interventions are simulated during training) are deferred to the appendix, making reproduction difficult. Overall, the paper feels incremental relative to prior work on invariant risk minimization.",
        "While the idea of combining contrastive learning with supervised objectives is not new, this paper does not sufficiently differentiate itself from prior art. The architectural changes are minimal, and the improvements are marginal. The authors also fail to discuss negative results or scenarios where the method fails, which raises concerns about overfitting to selected benchmarks. The contribution feels more like a workshop paper than a full conference submission.",
        "This work aims to improve sample efficiency in reinforcement learning via a learned world model, but the empirical gains are modest and inconsistent. In some environments, the method performs worse than model-free baselines, and the authors attribute this to implementation issues without further investigation. The comparison set omits several strong recent model-based methods, making it difficult to assess the true contribution. Overall, the paper does not yet meet the bar for a top-tier conference.",
        "In summary, the paper addresses a relevant problem and is competently executed, but it lacks a strong, clearly articulated contribution. Both the methodological novelty and empirical impact are limited, and several important baselines and analyses are missing. I would encourage the authors to strengthen the experimental section and clarify the novelty before resubmission.",
        "The proposed fairness-aware training objective is motivated by an important problem, but the evaluation is narrow and lacks depth. The authors focus on a single fairness metric and do not explore trade-offs with accuracy or other notions of fairness. Additionally, the method assumes access to sensitive attributes during training, which may not be realistic in many applications. These limitations significantly reduce the applicability of the approach."
    ],
    "positive": [
        "The authors introduce a novel framework for causal representation learning that meaningfully advances prior work. Unlike earlier approaches, the proposed method scales to high-dimensional inputs and is compatible with standard deep architectures. The theoretical analysis, while necessarily abstract, provides useful intuition and is well aligned with the empirical findings. Overall, this is a strong contribution that I expect will influence future research in this area.",
        "This submission makes a strong case for revisiting contrastive learning in supervised settings. The proposed modifications are simple yet effective, and the authors convincingly show that they lead to better representations and downstream performance. The experiments are carefully controlled, and the paper includes insightful analysis of representation quality. I believe this work will be of broad interest to the community.",
        "I found this paper to be a significant and timely contribution. The authors propose a new training paradigm that unifies several existing ideas under a coherent framework, leading to improved performance and stability. The experimental section is comprehensive, including strong baselines, multiple datasets, and rigorous statistical analysis. The paper is also very well written, making it accessible to a broad audience.",
        "The authors address fairness in machine learning with a method that is both principled and practical. The proposed objective is grounded in a clear theoretical framework, and the empirical evaluation spans multiple datasets and fairness metrics. I appreciate the discussion of limitations and ethical considerations, which demonstrates a mature and responsible approach to the problem. This paper sets a high standard for future work in this area.",
        "This paper presents a compelling and well-motivated approach to improving generalization under distribution shift by explicitly modeling uncertainty during training. The method is technically sound, and the authors provide a clear derivation connecting their objective to robustness guarantees. Empirically, the approach is validated across a diverse set of benchmarks, including both vision and language tasks, where it consistently outperforms strong baselines. I particularly appreciated the thorough ablation studies, which convincingly demonstrate the contribution of each component.",
        "This paper presents a novel and effective approach to model-based reinforcement learning that significantly improves sample efficiency. The learned world model is well designed, and the integration with planning is technically sound. Extensive experiments across a range of environments show consistent and sometimes dramatic improvements over both model-free and model-based baselines. The analysis of failure modes further adds credibility to the results.",
        "In summary, this is an excellent paper that combines novelty, technical depth, and strong empirical results. The problem is well motivated, the solution is elegant, and the evaluation is thorough. I strongly recommend acceptance, as this work will likely have a lasting impact on the field.",
        "The theoretical contribution of this work is substantial and goes beyond incremental analysis. The convergence results provide new insights into a class of non-convex optimization problems that are widely used in practice. Importantly, the authors also demonstrate that the theory has practical implications by aligning their assumptions with realistic training settings. This balance between theory and practice is commendable.",
        "This is a well-executed paper that combines architectural innovation with careful experimental evaluation. The proposed graph transformer variant is elegant and addresses known limitations of existing models in capturing long-range dependencies. The empirical results are strong, showing consistent improvements on multiple graph benchmarks, and the authors are transparent about computational costs. The clarity of presentation and reproducibility details further strengthen the submission.",
        "The paper tackles an important problem in multimodal learning and proposes a solution that is both intuitive and effective. The loss formulation is thoughtfully designed, and the authors provide insightful analysis into why the alignment strategy works. Experiments on challenging multimodal benchmarks show clear gains over prior methods, and the qualitative results are particularly convincing. This work represents a meaningful step forward for the field."
    ]
}