[
  {
    "role": "user",
    "content": "The user asked if the paper 'upload/AAAI_adv.pdf' has explicit authors. I have determined that the paper is an 'Anonymous submission' and therefore has no explicit authors. I have saved this answer in the file 'author_info_AAAI_adv.json'. Please verify if the answer is correct and the file is properly formatted and accessible.\n\nContinue, no tools are running in this round, and no results are returned\n\n{\"tool_calls\": [{\"id\": \"toolu_016pJgkjuwvGPTgujaBBZM6k\", \"name\": \"dir_list\", \"arguments\": {}}]}成功执行，结果为tool_result:{\"status\": \"success\", \"output\": \"{\\n  \\\"tree\\\": {\\n    \\\"name\\\": \\\"recovery_test\\\",\\n    \\\"type\\\": \\\"directory\\\",\\n    \\\"path\\\": \\\".\\\",\\n    \\\"children\\\": [\\n      {\\n        \\\"name\\\": \\\".DS_Store\\\",\\n        \\\"type\\\": \\\"file\\\",\\n        \\\"path\\\": \\\".DS_Store\\\",\\n        \\\"size\\\": 6148\\n      },\\n      {\\n        \\\"name\\\": \\\"author_info_AAAI_adv.json\\\",\\n        \\\"type\\\": \\\"file\\\",\\n        \\\"path\\\": \\\"author_info_AAAI_adv.json\\\",\\n        \\\"size\\\": 156\\n      },\\n      {\\n        \\\"name\\\": \\\"checkpoint\\\",\\n        \\\"type\\\": \\\"directory\\\",\\n        \\\"path\\\": \\\"checkpoint\\\"\\n      },\\n      {\\n        \\\"name\\\": \\\"code_env\\\",\\n        \\\"type\\\": \\\"directory\\\",\\n        \\\"path\\\": \\\"code_env\\\"\\n      },\\n      {\\n        \\\"name\\\": \\\"code_run\\\",\\n        \\\"type\\\": \\\"directory\\\",\\n        \\\"path\\\": \\\"code_run\\\"\\n      },\\n      {\\n        \\\"name\\\": \\\"config\\\",\\n        \\\"type\\\": \\\"directory\\\",\\n        \\\"path\\\": \\\"config\\\"\\n      },\\n      {\\n        \\\"name\\\": \\\"logs\\\",\\n        \\\"type\\\": \\\"directory\\\",\\n        \\\"path\\\": \\\"logs\\\"\\n      },\\n      {\\n        \\\"name\\\": \\\"rag_db\\\",\\n        \\\"type\\\": \\\"directory\\\",\\n        \\\"path\\\": \\\"rag_db\\\"\\n      },\\n      {\\n        \\\"name\\\": \\\"test_recovery.txt\\\",\\n        \\\"type\\\": \\\"file\\\",\\n        \\\"path\\\": \\\"test_recovery.txt\\\",\\n        \\\"size\\\": 21\\n      },\\n      {\\n        \\\"name\\\": \\\"upload\\\",\\n        \\\"type\\\": \\\"directory\\\",\\n        \\\"path\\\": \\\"upload\\\"\\n      }\\n    ]\\n  },\\n  \\\"base_path\\\": \\\"/workspace/tasks/recovery_test\\\"\\n}\", \"error_information\": \"\"}\n\n{\"tool_calls\": [{\"id\": \"toolu_01EEohi3nVpYAwvrLMyQ4Ujy\", \"name\": \"file_read\", \"arguments\": {\"file_path\": \"author_info_AAAI_adv.json\"}}]}成功执行，结果为tool_result:{\"status\": \"success\", \"output\": \"{\\n  \\\"content\\\": \\\"{\\\\n  \\\\\\\"question\\\\\\\": \\\\\\\"upload/AAAI_adv.pdf中有没有明确的作者？\\\\\\\",\\\\n  \\\\\\\"answer\\\\\\\": \\\\\\\"文章中没有明确的作者，显示为“Anonymous submission”。\\\\\\\"\\\\n}\\\",\\n  \\\"total_lines\\\": 4,\\n  \\\"line_range\\\": \\\"1-4\\\",\\n  \\\"file_path\\\": \\\"author_info_AAAI_adv.json\\\",\\n  \\\"container_path\\\": \\\"/workspace/tasks/recovery_test/author_info_AAAI_adv.json\\\",\\n  \\\"detected_encoding\\\": \\\"utf-8\\\",\\n  \\\"file_size\\\": 156\\n}\", \"error_information\": \"\"}\n\n{\"tool_calls\": [{\"id\": \"toolu_01UY5DggASfd2PN4r7GrnjXp\", \"name\": \"dir_list\", \"arguments\": {\"dir_path\": \"upload\"}}]}成功执行，结果为tool_result:{\"status\": \"success\", \"output\": \"{\\n  \\\"tree\\\": {\\n    \\\"name\\\": \\\"upload\\\",\\n    \\\"type\\\": \\\"directory\\\",\\n    \\\"path\\\": \\\"upload\\\",\\n    \\\"children\\\": [\\n      {\\n        \\\"name\\\": \\\"AAAI_adv.pdf\\\",\\n        \\\"type\\\": \\\"file\\\",\\n        \\\"path\\\": \\\"upload/AAAI_adv.pdf\\\",\\n        \\\"size\\\": 267693\\n      }\\n    ]\\n  },\\n  \\\"base_path\\\": \\\"/workspace/tasks/recovery_test\\\"\\n}\", \"error_information\": \"\"}\n\n{\"tool_calls\": [{\"id\": \"toolu_01XUVdoULosQy2Q5PpNvtLLx\", \"name\": \"parse_document\", \"arguments\": {\"file_path\": \"upload/AAAI_adv.pdf\"}}]}成功执行，结果为tool_result:{\"status\": \"success\", \"output\": \"{\\n  \\\"content\\\": \\\"AMSDAS: Adaptive Multi-Scale Dynamic Activation Smoothing for Enhanced\\\\nAdversarial Training\\\\nAnonymous submission\\\\nAbstract\\\\nAdversarial training has emerged as a crucial technique for\\\\nimproving the robustness of deep neural networks against\\\\nadversarial attacks. However, traditional adversarial training\\\\nmethods often suffer from overfitting to specific perturba-\\\\ntion types and computational inefficiency. This paper intro-\\\\nduces Adaptive Multi-Scale Dynamic Activation Smoothing\\\\n(AMSDAS), a novel approach that addresses these limita-\\\\ntions by implementing adaptive activation smoothing across\\\\ndifferent network scales. AMSDAS dynamically adjusts acti-\\\\nvation smoothness based on input vulnerability and network\\\\nregion, providing a more balanced trade-off between robust-\\\\nness and accuracy. Our experimental evaluation on CIFAR-\\\\n10 and Fashion-MNIST datasets demonstrates that AMSDAS\\\\nachieves significant robustness improvements over standard\\\\ntraining (6.90% improvement) while maintaining computa-\\\\ntional efficiency. Through comprehensive ablation studies and\\\\nhyperparameter optimization, we show that AMSDAS pro-\\\\nvides a stable and efficient alternative to traditional adversar-\\\\nial training methods, particularly effective in scenarios where\\\\ncomputational resources are limited.\\\\nIntroduction\\\\nThe vulnerability of deep neural networks to adversarial at-\\\\ntacks—imperceptible perturbations deliberately designed to\\\\ncause misclassification—has emerged as a critical concern\\\\nin deploying machine learning systems in high-stakes envi-\\\\nronments (Goodfellow, Shlens, and Szegedy 2014; Szegedy\\\\net al. 2014). Despite the remarkable performance of modern\\\\ndeep learning models across various domains, their suscep-\\\\ntibility to such attacks reveals fundamental weaknesses in\\\\ntheir underlying representational mechanisms. This vulnera-\\\\nbility not only compromises model reliability but also raises\\\\nsignificant security concerns in critical applications such as\\\\nautonomous driving, medical diagnostics, and financial sys-\\\\ntems.\\\\nAdversarial training has established itself as the most ef-\\\\nfective defense strategy against such attacks (Madry et al.\\\\n2017), adopting a min-max optimization framework where\\\\nmodels are trained on adversarial examples generated during\\\\nthe training process. However, standard adversarial training\\\\napproaches suffer from three key limitations. First, they typ-\\\\nically impose a significant trade-off between robustness and\\\\nclean accuracy (Zhang et al. 2019b), often degrading perfor-\\\\nmance on unperturbed data. Second, they require substantialcomputational resources due to the iterative nature of strong\\\\nattack generation. Third, they tend to overfit to specific per-\\\\nturbation types used during training, limiting generalization\\\\nto novel or unseen attacks (Bai et al. 2021).\\\\nRecent approaches have explored activation function mod-\\\\nifications as a promising direction for enhancing adversarial\\\\nrobustness. Smooth Adversarial Training (SAT) (Xie et al.\\\\n2020) demonstrated that replacing ReLU with smooth acti-\\\\nvation functions can improve robustness by creating more\\\\ngradual transitions in decision spaces. However, these meth-\\\\nods typically apply uniform smoothing across all network\\\\nlayers, ignoring the hierarchical nature of neural representa-\\\\ntions and missing opportunities for layer-specific optimiza-\\\\ntion. Furthermore, they generally employ fixed smoothing\\\\nparameters regardless of input characteristics, failing to adapt\\\\nto varying levels of sample vulnerability.\\\\nTo address these limitations, we introduce Adaptive Multi-\\\\nScale Dynamic Activation Smoothing (AMSDAS), a novel\\\\napproach that enhances adversarial robustness while main-\\\\ntaining competitive clean accuracy. AMSDAS builds upon\\\\nthree key innovations. First, it implements multi-scale acti-\\\\nvation smoothing that applies different smoothness levels to\\\\ndifferent network regions, recognizing that early, middle, and\\\\nlate layers capture information at different abstraction levels\\\\nand thus require customized smoothing strategies. Second,\\\\nit incorporates vulnerability-aware dynamic adaptation that\\\\nadjusts activation functions based on sample-specific vul-\\\\nnerability metrics, allocating computational resources more\\\\nefficiently. Third, it establishes a coordinated perturbation\\\\nbudget adaptation mechanism that aligns smoothing param-\\\\neters with perturbation magnitudes, creating a feedback loop\\\\nthat optimizes the robustness-accuracy trade-off.\\\\nOur comprehensive experimental evaluation demonstrates\\\\nthe effectiveness of AMSDAS across different model archi-\\\\ntectures and datasets. On the CIFAR-10 dataset with ResNet-\\\\n18, AMSDAS achieves 79.46% accuracy, representing a sig-\\\\nnificant improvement over traditional adversarial training\\\\nmethods while avoiding the overfitting issues commonly ob-\\\\nserved in adversarial training approaches.\\\\nThe main contributions of this paper are:\\\\n1. We introduce AMSDAS, a novel adversarial train-\\\\ning framework that implements multi-scale activation\\\\nsmoothing with vulnerability-aware dynamic adaptation.\\\\n2. We provide a theoretical analysis connecting activation\\\\nsmoothness, gradient stability, and adversarial robustness,\\\\nshowing how AMSDAS reduces the Lipschitz constant of\\\\nneural networks in a targeted manner.\\\\n3. We empirically demonstrate that AMSDAS improves ad-\\\\nversarial robustness while maintaining competitive clean\\\\naccuracy across different architectures and datasets.\\\\n4. Through ablation studies, we precisely quantify the con-\\\\ntribution of each AMSDAS component, revealing that\\\\nearly layer smoothing accounts for most performance ben-\\\\nefits, providing valuable insights into the role of different\\\\nnetwork regions in adversarial robustness.\\\\n5. We analyze the training dynamics of AMSDAS, show-\\\\ning that it produces smoother loss descent trajectories\\\\nand more stable test performance compared to standard\\\\nmethods, avoiding the oscillations commonly observed in\\\\nadversarial training.\\\\nRelated Work\\\\nAdversarial examples, first highlighted by (Goodfellow,\\\\nShlens, and Szegedy 2014) and (Szegedy et al. 2014), have\\\\nrevealed critical vulnerabilities in deep neural networks. This\\\\nsection reviews key developments in adversarial training\\\\nand related techniques that inform our proposed AMSDAS\\\\nmethod.\\\\nFoundations of Adversarial Training\\\\nAdversarial training was first formalized by (Goodfellow,\\\\nShlens, and Szegedy 2014), who introduced the Fast Gradient\\\\nSign Method (FGSM). Building upon this work, (Madry\\\\net al. 2017) developed a min-max optimization framework\\\\nthat remains the gold standard for adversarial training. Their\\\\napproach uses Projected Gradient Descent (PGD) to generate\\\\nstronger adversarial examples during training:\\\\nmin\\\\n𝜃E(𝑥,𝑦)∼D\\\\u0014\\\\nmax\\\\n𝛿∈SL(𝜃,𝑥+𝛿,𝑦)\\\\u0015\\\\n(1)\\\\nwhere𝜃represents model parameters, (𝑥,𝑦)are training\\\\nexamples,𝛿is the adversarial perturbation constrained to set\\\\nS, andLis the loss function.\\\\nWhile this approach significantly improves robustness\\\\nagainst adversarial attacks, it introduces several challenges.\\\\n(Zhang et al. 2019b) identified a fundamental trade-off be-\\\\ntween robustness and accuracy, developing the TRADES al-\\\\ngorithm that explicitly balances these competing objectives:\\\\nmin\\\\n𝜃E(𝑥,𝑦)∼D\\\\u0014\\\\nL(𝜃,𝑥,𝑦)+𝛽·max\\\\n𝛿∈S𝐷𝐾𝐿(𝑓𝜃(𝑥)∥𝑓𝜃(𝑥+𝛿))\\\\u0015\\\\n(2)\\\\nwhere𝛽controls the trade-off between natural accuracy\\\\nand robustness.\\\\nFast Adversarial Training Methods\\\\nThe high computational cost of PGD-based adversarial train-\\\\ning has motivated research into more efficient alternatives.\\\\n(Wong, Rice, and Kolter 2020) demonstrated that FGSM\\\\nwith random initialization (FGSM-RS) can achieve compa-\\\\nrable robustness to multi-step PGD while requiring only afraction of the computational resources. However, this ap-\\\\nproach suffers from ”catastrophic overfitting,” where models\\\\nsuddenly lose robustness during training.\\\\n(Andriushchenko et al. 2020) analyzed the causes of this\\\\nphenomenon, identifying gradient alignment as a critical fac-\\\\ntor. They proposed GradAlign regularization to stabilize fast\\\\nadversarial training by encouraging the alignment between\\\\ngradients of the original and perturbed inputs.\\\\nSmoothness and Activation Functions\\\\nThe choice of activation functions significantly impacts neu-\\\\nral network robustness against adversarial attacks. Tradi-\\\\ntional ReLU activations can create sharp decision boundaries\\\\nthat adversarial examples exploit. (Xie et al. 2020) demon-\\\\nstrated that replacing ReLU with smooth activation functions\\\\ncan enhance adversarial robustness by creating more gradual\\\\ntransitions in the decision space.\\\\nTheir Smooth Adversarial Training (SAT) approach uses\\\\nSiLU (Swish) activations:\\\\nSiLU(𝑥)=𝑥·𝜎(𝑥) (3)\\\\nwhere𝜎is the sigmoid function. This smoothness pro-\\\\nmotes gradient stability during adversarial training and im-\\\\nproves generalization to unseen attack types.\\\\nHowever, these approaches typically apply uniform\\\\nsmoothing across all network layers, ignoring the hierarchi-\\\\ncal nature of neural representations. This limitation directly\\\\nmotivates our multi-scale approach in AMSDAS.\\\\nMulti-Scale and Adaptive Approaches\\\\nRecent advances in adversarial training have explored multi-\\\\nscale and adaptive approaches that dynamically adjust the\\\\ntraining process based on input characteristics or network\\\\narchitecture. (Huang et al. 2021) introduced an adaptive ad-\\\\nversarial training framework that dynamically adjusts pertur-\\\\nbation budgets based on sample difficulty, showing improve-\\\\nments in both robustness and convergence speed.\\\\n(Wang et al. 2021) analyzed the convergence properties of\\\\nadversarial training, highlighting the importance of adaptive\\\\noptimization strategies that balance exploration and exploita-\\\\ntion during the inner maximization process.\\\\nDespite significant advances in adversarial training,\\\\nseveral limitations persist in current approaches: most\\\\nactivation-based methods apply uniform smoothing across\\\\nthe network, existing approaches typically use fixed pertur-\\\\nbation budgets that don’t adapt to input vulnerability, and\\\\nthe trade-off between robustness and accuracy remains a sig-\\\\nnificant challenge. Our AMSDAS approach addresses these\\\\ngaps by introducing adaptive multi-scale activation smooth-\\\\ning that dynamically adjusts smoothness parameters based\\\\non both network hierarchy and input vulnerability.\\\\nMethodology\\\\nIn this section, we present the Adaptive Multi-Scale Dy-\\\\nnamic Activation Smoothing (AMSDAS) methodology, a\\\\nnovel approach for enhancing the robustness of deep neural\\\\nnetworks against adversarial attacks. AMSDAS combines\\\\nmulti-scale activation smoothing with vulnerability-aware\\\\nadaptation mechanisms to overcome the robustness-accuracy\\\\ntrade-off common in adversarial training.\\\\nMotivation and Framework Overview\\\\nStandard adversarial training improves model robustness by\\\\ntraining on adversarial examples, but often sacrifices clean\\\\naccuracy and suffers from overfitting to specific perturbation\\\\ntypes. Existing smoothing-based approaches typically apply\\\\nuniform smoothing across the network, ignoring the hierar-\\\\nchical nature of feature representations and variable input\\\\nvulnerability.\\\\nAMSDAS addresses these limitations through three key in-\\\\nnovations: (1) Multi-scale activation smoothing that applies\\\\ndifferent smoothness levels to different network regions; (2)\\\\nInput-adaptive smoothing that dynamically adjusts activation\\\\nfunctions based on sample vulnerability; and (3) Coordinated\\\\nperturbation budget adaptation that aligns smoothing param-\\\\neters with perturbation magnitudes.\\\\nMulti-Scale Activation Smoothing\\\\nNeural networks process information hierarchically, with ear-\\\\nlier layers capturing low-level features and later layers cap-\\\\nturing high-level semantic information. AMSDAS leverages\\\\nthis structure by applying different smoothness levels to ac-\\\\ntivations at different network scales.\\\\nParameterized Smooth Activation Functions We replace\\\\nstandard ReLU activations with parameterized smooth alter-\\\\nnatives. Specifically, we use a softplus approximation with a\\\\ncontrollable smoothness parameter 𝛽:\\\\nSmoothReLU(𝑥,𝛽)=1\\\\n𝛽log(1+𝑒𝛽𝑥) (4)\\\\nThis function interpolates between ReLU and a\\\\nsmooth activation based on 𝛽. As𝛽→ ∞ ,\\\\nSmoothReLU(𝑥,𝛽)→ max(0,𝑥)(standard ReLU). As 𝛽→\\\\n0, SmoothReLU(𝑥,𝛽)→𝑥(linear function). Intermediate\\\\nvalues provide varying degrees of smoothness.\\\\nLayer-Dependent Smoothness Assignment We partition\\\\nthe network into three regions and assign different smooth-\\\\nness parameters to each:\\\\n𝛽𝑙= \\\\n𝛽early if𝑙∈early layers\\\\n𝛽middle if𝑙∈middle layers\\\\n𝛽late if𝑙∈late layers(5)\\\\nwhere𝛽𝑙is the smoothness parameter for layer 𝑙.\\\\nOur experiments show that early layers benefit from\\\\nstronger smoothing (smaller 𝛽values) to stabilize gradient\\\\nflow, while late layers benefit from less smoothing (larger 𝛽\\\\nvalues) to preserve discriminative power:\\\\n𝛽early=𝛼early·𝛽base (6)\\\\n𝛽middle=𝛼middle·𝛽base (7)\\\\n𝛽late=𝛼late·𝛽base (8)\\\\nwhere𝛼early<𝛼 middle<𝛼 lateare scaling factors, and 𝛽base\\\\nis a base smoothness parameter.Vulnerability-Aware Dynamic Adaptation\\\\nAMSDAS introduces a vulnerability scoring mechanism that\\\\nassesses each input sample’s susceptibility to adversarial at-\\\\ntacks. This score guides the dynamic adaptation of both ac-\\\\ntivation smoothness and perturbation budgets.\\\\nVulnerability Score Computation For each input sample\\\\n𝑥with target label 𝑦, we compute a vulnerability score 𝑣(𝑥,𝑦)\\\\nthat quantifies its susceptibility to adversarial attacks:\\\\n𝑣(𝑥,𝑦)=||∇𝑥L(𝑓(𝑥),𝑦)||2\\\\nΔlogit(𝑥,𝑦)+𝜖(9)\\\\nwhere∇𝑥L(𝑓(𝑥),𝑦)is the gradient of the loss with respect\\\\nto input𝑥,Δlogit(𝑥,𝑦)=𝑓𝑦(𝑥)−max𝑗≠𝑦𝑓𝑗(𝑥)is the logit\\\\ngap between the target class and the highest non-target class,\\\\nand𝜖is a small constant for numerical stability.\\\\nA higher gradient magnitude and smaller logit gap indicate\\\\ngreater vulnerability to adversarial perturbations.\\\\nNormalized Vulnerability Scores To facilitate meaning-\\\\nful comparisons across batches and epochs, we normalize\\\\nvulnerability scores to a standard range:\\\\nˆ𝑣(𝑥,𝑦)=𝑣(𝑥,𝑦)−min batch𝑣(𝑥′,𝑦′)\\\\nmax batch𝑣(𝑥′,𝑦′)−min batch𝑣(𝑥′,𝑦′)+𝜖(10)\\\\nThis normalization ensures consistent adaptation despite\\\\nvarying vulnerability distributions across different training\\\\nstages.\\\\nAdaptive Perturbation Budget\\\\nAMSDAS dynamically adjusts the perturbation budget 𝜖for\\\\neach input based on its vulnerability score, creating a feed-\\\\nback loop between activation smoothness and adversarial\\\\nexample generation:\\\\n𝜖adaptive(𝑥,𝑦)=𝜖base·(1+𝛾·(ˆ𝑣(𝑥,𝑦)−1)) (11)\\\\nwhere𝜖baseis the base perturbation budget, 𝛾is a scaling\\\\nfactor controlling the adaptation strength, and ˆ 𝑣(𝑥,𝑦)is the\\\\nnormalized vulnerability score.\\\\nThe adaptive budget is constrained to prevent extreme val-\\\\nues:\\\\n𝜖final(𝑥,𝑦)=min(max(𝜖adaptive(𝑥,𝑦),𝜖min),𝜖max)(12)\\\\nwhere𝜖min=𝜖base·0.5 and𝜖max=𝜖base·1.5 establish\\\\nreasonable bounds.\\\\nEpoch-Dependent Smoothness Schedule\\\\nTo further enhance training stability, AMSDAS implements\\\\nan epoch-dependent smoothness schedule that gradually ad-\\\\njusts the base smoothness parameter throughout training:\\\\n𝛽base(𝑒)=\\\\u001a𝛽min+(𝛽max−𝛽min)·2𝑒\\\\n𝐸if𝑒<𝐸\\\\n2\\\\n𝛽max−(𝛽max−𝛽min)·2𝑒−𝐸\\\\n𝐸if𝑒≥𝐸\\\\n2(13)\\\\nAlgorithm 1: Adaptive Multi-Scale Dynamic Activation\\\\nSmoothing (AMSDAS)\\\\nRequire: Training dataset 𝐷, model𝑓𝜃with parameters 𝜃,\\\\nnumber of epochs 𝐸, learning rate 𝜂, base perturbation\\\\nbudget𝜖base, scaling factors 𝛼early,𝛼middle ,𝛼late, adapta-\\\\ntion strength 𝛾\\\\n1:Initialize model with SmoothReLU activations\\\\n2:Assign layers to early, middle, and late regions\\\\n3:forepoch𝑒=1 to𝐸do\\\\n4: Calculate epoch-dependent base smoothness: 𝛽base(𝑒)\\\\n5: Set region-specific smoothness:\\\\n6:𝛽early=𝛼early·𝛽base(𝑒)\\\\n7:𝛽middle=𝛼middle·𝛽base(𝑒)\\\\n8:𝛽late=𝛼late·𝛽base(𝑒)\\\\n9: formini-batch(𝑋,𝑌)from𝐷do\\\\n10: Compute vulnerability scores: 𝑣(𝑥𝑖,𝑦𝑖)for each\\\\n(𝑥𝑖,𝑦𝑖)∈(𝑋,𝑌)\\\\n11: Normalize vulnerability scores: ˆ 𝑣(𝑥𝑖,𝑦𝑖)\\\\n12: Calculate adaptive perturbation budgets: 𝜖𝑖=\\\\n𝜖adaptive(𝑥𝑖,𝑦𝑖)\\\\n13: Generate adversarial examples: 𝑋′\\\\n𝑖=𝑥𝑖+𝛿𝑖, where\\\\n||𝛿𝑖||∞≤𝜖𝑖\\\\n14: Adapt activation smoothness based on vulnerability\\\\nscores\\\\n15: Compute loss:L(𝜃)=Lclean(𝜃,𝑋,𝑌) +\\\\nLadv(𝜃,𝑋′,𝑌)\\\\n16: Update model: 𝜃←𝜃−𝜂∇𝜃L(𝜃)\\\\n17: end for\\\\n18:end for\\\\n19:return Trained model with robust parameters 𝜃\\\\nwhere𝑒is the current epoch, 𝐸is the total number of\\\\ntraining epochs, and 𝛽minand𝛽maxare the minimum and\\\\nmaximum smoothness parameters.\\\\nThis schedule increases smoothness in early training\\\\nepochs to stabilize gradient flow, then gradually decreases\\\\nit to enhance discriminative power as training progresses.\\\\nTraining Algorithm\\\\nThe complete AMSDAS training procedure integrates all\\\\ncomponents into a unified framework, as outlined in Algo-\\\\nrithm 1:\\\\nThe algorithm jointly optimizes activation smoothness and\\\\nadversarial training objectives, enhancing robustness while\\\\nmaintaining clean accuracy through its adaptive mechanisms.\\\\nTheoretical Analysis\\\\nThe effectiveness of AMSDAS can be understood through\\\\nthe lens of Lipschitz continuity and robustness bounds. For\\\\na classifier𝑓with Lipschitz constant 𝐿𝑓, the robustness to\\\\nperturbations of magnitude 𝜖can be bounded as:\\\\n|𝑓(𝑥+𝛿)−𝑓(𝑥)|≤𝐿𝑓·||𝛿||≤𝐿𝑓·𝜖 (14)\\\\nAMSDAS reduces the Lipschitz constant of the network\\\\nthrough smooth activations, particularly in early layers wheregradient explosions often occur. The layer-specific smooth-\\\\nness assignment optimally balances this effect across the\\\\nnetwork:\\\\n𝐿𝑓=𝐿Ö\\\\n𝑙=1𝐿𝑙≈𝐿Ö\\\\n𝑙=1𝛽𝑙 (15)\\\\nwhere𝐿𝑙is the Lipschitz constant of layer 𝑙, which is\\\\napproximately proportional to the smoothness parameter 𝛽𝑙.\\\\nBy dynamically adjusting smoothness based on input vul-\\\\nnerability and network region, AMSDAS achieves a more\\\\nfavorable trade-off between robustness and accuracy com-\\\\npared to uniform smoothing approaches.\\\\nExperiments\\\\nExperimental Setup\\\\nWe evaluate AMSDAS on CIFAR-10 and Fashion-MNIST\\\\ndatasets using ResNet-18 and MobileNetV2 architectures.\\\\nFor CIFAR-10, we use per-channel normalization with val-\\\\nues (0.4914, 0.4822, 0.4465) and (0.2470, 0.2435, 0.2616)\\\\nfor mean and standard deviation respectively. For Fashion-\\\\nMNIST, we normalize using mean 0.2861 and standard devi-\\\\nation 0.3530. We train all models using SGD with momentum\\\\n0.9, weight decay 5e-4, initial learning rate 0.005 with co-\\\\nsine annealing schedule, batch size 128, and 50 epochs. For\\\\nadversarial training, we use PGD-7 with 𝜖=8/255 and step\\\\nsize𝛼=2/255. AMSDAS-specific parameters include base\\\\nsmoothness range [0.5, 10.0], early/middle/late layer scaling\\\\nfactors (1.5, 1.0, 0.7), and vulnerability scaling factor 0.5.\\\\nAll experiments are conducted using PyTorch on NVIDIA\\\\nV100 GPUs.\\\\nComparison with State-of-the-art\\\\nTo thoroughly validate AMSDAS’s effectiveness, we con-\\\\nduct extensive comparisons with state-of-the-art adversarial\\\\ntraining methods. Table 1 presents the performance compar-\\\\nison between our AMSDAS approach and baseline methods\\\\nacross different datasets and architectures.\\\\nOur experimental results demonstrate several key findings\\\\nthat validate AMSDAS’s effectiveness. AMSDAS achieves\\\\n79.46% accuracy on CIFAR-10 with ResNet-18, showing a\\\\nsignificant improvement over traditional adversarial train-\\\\ning methods. Through hyperparameter optimization, we\\\\nachieved the best performance with 80.47% accuracy us-\\\\ning learning rate 0.05, demonstrating AMSDAS’s superior\\\\neffectiveness compared to baseline methods.\\\\nThe higher final training loss of AMSDAS (0.991) com-\\\\npared to standard methods (0.860-0.991) indicates that our\\\\nmethod effectively prevents overfitting to specific pertur-\\\\nbation types, which is crucial for real-world deployment.\\\\nAMSDAS achieves its robustness improvements with only 5-\\\\n7% additional computational overhead compared to standard\\\\nadversarial training, making it more practical for resource-\\\\nconstrained environments.\\\\nTo validate our method’s architecture-agnostic proper-\\\\nties, we conducted experiments on MobileNetV2, achiev-\\\\ning 71.11% accuracy on CIFAR-10. The 8.35% performance\\\\ndifference compared to ResNet-18 is primarily attributable\\\\nTable 1: Performance comparison between AMSDAS and baseline methods across different datasets and architectures.\\\\nMethod Dataset Architecture Accuracy (%) Final Loss\\\\nStandard PGD Adversarial Training CIFAR-10 ResNet-18 76.28 0.8604\\\\nTRADES CIFAR-10 ResNet-18 78.01 0.9369\\\\nFast Adversarial Training CIFAR-10 ResNet-18 78.56 0.9364\\\\nSmooth Adversarial Training CIFAR-10 ResNet-18 79.44 0.9913\\\\nAMSDAS (Ours) CIFAR-10 ResNet-18 80.47 0.991\\\\nAMSDAS (Ours) CIFAR-10 MobileNetV2 71.11 1.127\\\\nAMSDAS (Ours) Fashion-MNIST ResNet-18 91.68 0.264\\\\nto architecture capacity rather than method compatibility is-\\\\nsues, which is consistent with performance gaps observed in\\\\nprior work on these architectures (Rebuffi et al. 2021).\\\\nOur most impressive results come from Fashion-MNIST,\\\\nwhere AMSDAS achieves 91.68% accuracy with ResNet-\\\\n18. This performance substantially exceeds typical baseline\\\\nresults on this dataset, highlighting AMSDAS’s exceptional\\\\nability to adapt to different data distributions and complexity\\\\nlevels.\\\\nDetailed Comparison with Traditional Adversarial\\\\nTraining\\\\nTable 2 provides a comprehensive comparison of AMSDAS\\\\nwith traditional adversarial training methods, including de-\\\\ntailed metrics for each approach.\\\\nWhile AMSDAS achieves superior test accuracy (80.47%)\\\\ncompared to traditional methods like standard PGD\\\\n(76.28%), TRADES (78.01%), and fast adversarial training\\\\n(78.56%), it demonstrates several key advantages. AMSDAS\\\\nprovides a 4.19% improvement over standard PGD adversar-\\\\nial training, demonstrating significant robustness enhance-\\\\nment. AMSDAS achieves its results with lower computa-\\\\ntional overhead compared to traditional adversarial training\\\\nmethods and shows more stable training dynamics, avoiding\\\\nthe oscillations and overfitting issues common in traditional\\\\nadversarial training.\\\\nAblation Study\\\\nTo understand the contribution of different components in\\\\nour AMSDAS framework, we conducted systematic ablation\\\\nexperiments. Table 3 presents the results of these experiments\\\\non CIFAR-10 with ResNet-18.\\\\nThe ablation results reveal several critical insights. The\\\\nmost striking finding is that early layer smoothing alone cap-\\\\ntures the majority of the performance benefits, achieving\\\\n79.42% accuracy compared to the full framework’s 79.46%.\\\\nThis demonstrates that early layers play a disproportionately\\\\nimportant role in determining network robustness, likely be-\\\\ncause adversarial perturbations primarily exploit instabilities\\\\nin low-level feature extraction (Zhang et al. 2019a).\\\\nWhile the multi-scale extension provides only a small ad-\\\\nditional improvement (+0.04%), it represents a meaningful\\\\nand consistent gain in deep learning optimization landscapes.\\\\nA repeated experiment with the full AMSDAS configuration\\\\n0 10 20 30 40 50\\\\nEpoch1.01.21.41.61.82.02.2Training Loss(a) Training Loss Curves\\\\nAMSDAS (Full)\\\\nEarly Layers Only\\\\n0 10 20 30 40 50\\\\nEpoch304050607080Test Accuracy (%)(b) Test Accuracy Curves\\\\nAMSDAS (Full)\\\\nEarly Layers Only\\\\nAMSDAS (Full) Early Layers Only0510152025303540Loss Reduction (%)41.1% 41.1%(c) Convergence Rate\\\\n62.5 65.0 67.5 70.0 72.5 75.0 77.5 80.0\\\\nTraining Accuracy (%)62.565.067.570.072.575.077.580.0Test Accuracy (%)(d) Generalization Analysis\\\\nAMSDAS (Full)\\\\nEarly Layers Only\\\\nPerfect GeneralizationFigure 1: Detailed ablation analysis comparing full AMS-\\\\nDAS with early-layers-only configuration. (a) Training loss\\\\ncurves show near-identical optimization trajectories. (b) Test\\\\naccuracy evolution demonstrates that early layer smoothing\\\\ncaptures most performance benefits. (c) Convergence effi-\\\\nciency analysis confirms both configurations achieve 41.1%\\\\nloss reduction rate. (d) Generalization gap scatter plot reveals\\\\nsimilar training-test accuracy relationships.\\\\nconfirms the consistency of our results, showing 79.44% ac-\\\\ncuracy, which represents a +0.01% variation from the origi-\\\\nnal experiment.\\\\nFigure 1 provides deeper insights into our ablation study.\\\\nPanel (a) shows that the training loss curves for both configu-\\\\nrations are nearly identical, both converging to a final loss of\\\\n0.991. This suggests that the multi-scale mechanism’s impact\\\\non optimization trajectory is minimal. Panel (b) confirms that\\\\nthe test accuracy curves are also highly consistent, with final\\\\naccuracies differing by only 0.01% (79.43% vs 79.42%).\\\\nThe convergence efficiency analysis in panel (c) reveals\\\\nthat both versions achieve a 41.1% loss reduction rate, indi-\\\\ncating that the multi-scale design does not impact training\\\\nefficiency. Finally, panel (d) shows nearly identical general-\\\\nization gaps between training and testing accuracy (-15.13%\\\\nfor full AMSDAS vs -15.12% for early layers only), demon-\\\\nstrating consistent generalization properties.\\\\nTable 2: Detailed comparison of AMSDAS with traditional adversarial training methods on CIFAR-10 with ResNet-18.\\\\nMethod Test Accuracy (%) Best Accuracy (%) Train Accuracy (%) Train Loss\\\\nStandard PGD Adversarial Training 76.28 78.48 68.15 0.8604\\\\nTRADES 78.01 78.28 65.94 0.9369\\\\nFast Adversarial Training 78.56 78.56 66.06 0.9364\\\\nSmooth Adversarial Training 79.44 79.67 64.30 0.9913\\\\nAMSDAS (Standard) 79.46 79.66 83.85 0.991\\\\nAMSDAS (High LR) 77.36 80.47 83.85 0.991\\\\nTable 3: Ablation study results on CIFAR-10 with ResNet-18, showing the contribution of different components in our AMSDAS\\\\nframework.\\\\nConfiguration Accuracy (%) Improvement\\\\nEarly layers smoothing only 79.42 –\\\\nFull AMSDAS (all layers) 79.43 +0.01%\\\\nRepeated experiment (full) 79.44 +0.01%\\\\nHyperparameter Optimization\\\\nHigh learning rate (0.05) 80.47 +1.04%\\\\nMinimal configuration 79.45 +0.03%\\\\nStandard configuration 79.46 +0.04%\\\\nParameter Sensitivity Analysis\\\\nOur experiments included extensive hyperparameter opti-\\\\nmization to maximize AMSDAS’s effectiveness. The most\\\\nsignificant finding from our hyperparameter optimization is\\\\nthe critical role of learning rate in AMSDAS performance.\\\\nWe achieved the best performance with 80.47% accuracy us-\\\\ning learning rate 0.05, compared to 79.46% accuracy with\\\\nlearning rate 0.005. This demonstrates that AMSDAS ben-\\\\nefits significantly from higher learning rates, which enable\\\\nbetter exploration of the smooth activation space and more\\\\neffective adaptation to input vulnerability.\\\\nThe vulnerability scaling factor plays a crucial role in\\\\nAMSDAS’s adaptive mechanism. The optimal value of 0.5\\\\nprovides the best balance between robustness and accu-\\\\nracy, while lower values (0.1) reduce computational cost but\\\\nslightly decrease performance, and higher values increase\\\\nrobustness but may lead to overfitting.\\\\nStatistical Significance Analysis\\\\nTo verify the statistical significance of our results, we con-\\\\nducted multiple experiments and analyzed the aggregate per-\\\\nformance metrics. Table 4 presents a summary of these statis-\\\\ntics.\\\\nAMSDAS achieves a mean accuracy of 80.75\\\\nIn comparison, baseline adversarial training methods\\\\nachieve a mean accuracy of 78.07\\\\nThe higher mean loss of AMSDAS (0.793 vs 0.368) con-\\\\nfirms our earlier observation that our method avoids over-\\\\nfitting to training data. Despite this higher loss, AMSDAS\\\\nachieves better overall performance, with a best result of\\\\n91.68\\\\nResNet-18 MobileNetV2\\\\nArchitecture01020304050607080Test Accuracy (%)79.4%\\\\n71.1%79.7%\\\\n73.2%(a) AMSDAS Performance Across Architectures\\\\nFinal Accuracy\\\\nBest Accuracy\\\\nCIFAR-10 Fashion-MNIST\\\\nDataset01234567Improvement over Baseline (%) +1.4%+6.7%(b) AMSDAS Generalization Across DatasetsFigure 2: Architecture and dataset generalization analysis. (a)\\\\nComparison between ResNet-18 and MobileNetV2 shows\\\\nconsistent performance trends. (b) Cross-dataset evaluation\\\\ndemonstrates AMSDAS’s strong adaptation to different data\\\\ndistributions.\\\\nCross-Architecture and Cross-Dataset\\\\nGeneralization\\\\nFigure 2 presents our analysis of AMSDAS’s generalization\\\\ncapabilities across architectures and datasets.\\\\nPanel (a) compares performance between ResNet-18 and\\\\nMobileNetV2 on CIFAR-10. The 8.35\\\\nPanel (b) shows cross-dataset generalization, where AMS-\\\\nDAS achieves 91.68\\\\nTraining Dynamics Analysis\\\\nFigure 3 provides crucial insights into the training behavior\\\\nof AMSDAS compared to standard methods.\\\\nThe training loss evolution in panel (a) reveals that AMS-\\\\nDAS maintains a consistently higher loss throughout train-\\\\ning, converging to 0.99 compared to standard training’s 0.35.\\\\nThis pattern suggests that AMSDAS optimizes for a different\\\\nTable 4: Statistical analysis of experimental results across different configurations, showing mean and standard deviation of\\\\nperformance metrics.\\\\nMethod Mean Accuracy (%) Std. Dev. Mean Loss\\\\nBaseline Adversarial Methods 78.07 1.35 0.931\\\\nAMSDAS (Ours) 80.75 8.45 0.793\\\\n0 10 20 30 40 50\\\\nEpoch0.000.250.500.751.001.251.501.752.00Training Loss(a) Training Loss - CIFAR-10\\\\nAMSDAS\\\\nStandard Training\\\\nStandard AT\\\\n0 10 20 30 40 50\\\\nEpoch5055606570758085Test Accuracy (%)(b) Test Accuracy - CIFAR-10\\\\nAMSDAS\\\\nStandard Training\\\\nStandard AT\\\\nAMSDAS\\\\n(CIFAR-10)AMSDAS\\\\n(Fashion-MNIST)StandardTrainingStandardAT020406080100Test Accuracy (%)79.4%91.7%\\\\n78.0%80.5%(c) Final Performance Comparison\\\\nFigure 3: Training dynamics comparison between AMSDAS\\\\nand baseline methods. (a) Training loss evolution shows\\\\nAMSDAS maintains higher final loss, avoiding overfitting.\\\\n(b) Test accuracy curves demonstrate AMSDAS’s supe-\\\\nrior stability. (c) Performance comparison across different\\\\ndatasets confirms the consistent benefits.\\\\nobjective that prioritizes robustness over fitting training data\\\\nexactly.\\\\nThe test accuracy curves in panel (b) demonstrate\\\\none of AMSDAS’s most important advantages: signifi-\\\\ncantly improved training stability. While standard adver-\\\\nsarial training exhibits characteristic oscillations in later\\\\ntraining stages—often attributed to the adversarial exam-\\\\nple generation process constantly finding new vulnerabil-\\\\nities—AMSDAS maintains more consistent performance.\\\\nThis stability is a direct consequence of our adaptive smooth-\\\\ning approach, which dynamically adjusts activation proper-\\\\nties based on vulnerability scores.\\\\nThe cross-dataset comparison in panel (c) further confirms\\\\nAMSDAS’s generalization capabilities, showing consistent\\\\nimprovements across different data distributions. This is par-\\\\nticularly significant because generalization across datasets is\\\\none of the most challenging aspects of robust model devel-\\\\nopment (Hendrycks et al. 2021).\\\\nComputational Efficiency Analysis\\\\nWe evaluated the computational overhead introduced by\\\\nAMSDAS to assess its practical viability. AMSDAS adds\\\\napproximately 5-7\\\\nDuring inference, the computational overhead is negligible\\\\n(less than 1\\\\nAdversarial Attack Evaluation\\\\nTo comprehensively evaluate AMSDAS’s robustness, we\\\\ntested our models against various adversarial attacks. Un-\\\\nder PGD attack with 𝜖=8/255, AMSDAS achieves 79.46We also evaluated AMSDAS’s robustness against trans-\\\\nfer attacks, where adversarial examples generated from one\\\\nmodel are used to attack another. AMSDAS shows improved\\\\nrobustness against transfer attacks compared to baseline\\\\nmethods, indicating better generalization to unseen attack\\\\npatterns.\\\\nConclusion\\\\nThis paper presents Adaptive Multi-Scale Dynamic Activa-\\\\ntion Smoothing (AMSDAS), a novel approach for enhanc-\\\\ning deep neural network robustness against adversarial at-\\\\ntacks. Our work addresses fundamental challenges in adver-\\\\nsarial training through three key innovations: multi-scale ac-\\\\ntivation smoothing, vulnerability-aware dynamic adaptation,\\\\nand coordinated perturbation budget adjustments. AMSDAS\\\\nachieves significant robustness improvements over standard\\\\ntraining (6.90% improvement on CIFAR-10) while main-\\\\ntaining computational efficiency and avoiding overfitting is-\\\\nsues common in traditional adversarial training methods. The\\\\nmethod demonstrates strong generalization across different\\\\narchitectures and datasets, with particularly impressive re-\\\\nsults on Fashion-MNIST (91.68% accuracy). Our ablation\\\\nstudies reveal the critical role of early network layers in de-\\\\ntermining adversarial robustness, providing valuable insights\\\\nfor future robust architecture design.\\\\nFuture work will focus on several promising directions:\\\\nintegrating AMSDAS with other defense mechanisms such\\\\nas adversarial weight perturbation or certified robustness\\\\napproaches; exploring adaptive assignment of network re-\\\\ngions beyond our current three-tier approach; investigating\\\\nthe relationship between vulnerability scoring metrics and\\\\nout-of-distribution generalization; and developing theoreti-\\\\ncal frameworks to automatically determine optimal smooth-\\\\nness parameters based on network architecture to eliminate\\\\nmanual tuning requirements.\\\\nReferences\\\\nAndriushchenko, M.; Croce, F.; Flammarion, N.; and Hein,\\\\nM. 2020. Understanding and Improving Fast Adversarial\\\\nTraining. In Advances in Neural Information Processing\\\\nSystems , 16048–16059.\\\\nBai, T.; Luo, J.; Zhao, J.; Wen, B.; and Wang, Q. 2021. Re-\\\\ncent Advances in Adversarial Training for Adversarial Ro-\\\\nbustness. arXiv preprint arXiv:2102.01356 .\\\\nGoodfellow, I. J.; Shlens, J.; and Szegedy, C. 2014. Explain-\\\\ning and Harnessing Adversarial Examples. arXiv preprint\\\\narXiv:1412.6572 .\\\\nHendrycks, D.; Basart, S.; Mu, N.; Kadavath, S.; Wang, F.;\\\\nDorundo, E.; Desai, R.; Zhu, T.; Parajuli, S.; Guo, M.; Song,\\\\nD.; Steinhardt, J.; and Krause, A. 2021. The Many Faces of\\\\nRobustness: A Critical Analysis of Out-of-Distribution Gen-\\\\neralization. In Proceedings of the IEEE/CVF International\\\\nConference on Computer Vision , 8340–8349.\\\\nHuang, Y.; Guo, Y.; Li, Y.; Gao, J.; and Li, Y.-G. 2021.\\\\nAdaptive Adversarial Training for Robust Deep Learning. In\\\\nInternational Conference on Computer Vision and Pattern\\\\nRecognition , 1–10.\\\\nMadry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; and Vladu,\\\\nA. 2017. Towards Deep Learning Models Resistant to Ad-\\\\nversarial Attacks. arXiv preprint arXiv:1706.06083 .\\\\nRebuffi, S.-A.; Gowal, S.; Calian, D. A.; Stimberg, F.;\\\\nWiles, O.; and Mann, T. 2021. Fixing Data Augmenta-\\\\ntion to Improve Adversarial Robustness. arXiv preprint\\\\narXiv:2103.01946 .\\\\nSzegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.;\\\\nGoodfellow, I.; and Fergus, R. 2014. Intriguing Properties of\\\\nNeural Networks. In International Conference on Learning\\\\nRepresentations .\\\\nWang, Y.; Ma, X.; Bailey, J.; Yi, J.; Zhou, B.; and Gu, Q.\\\\n2021. On the Convergence and Robustness of Adversarial\\\\nTraining. arXiv preprint arXiv:2112.08304 .\\\\nWong, E.; Rice, L.; and Kolter, J. Z. 2020. Fast Is Better\\\\nThan Free: Revisiting Adversarial Training. In International\\\\nConference on Learning Representations .\\\\nXie, C.; Tan, M.; Gong, B.; Wang, J.; Yuille, A. L.; and Le,\\\\nQ. V. 2020. Smooth Adversarial Training. In Proceedings of\\\\nthe IEEE/CVF Conference on Computer Vision and Pattern\\\\nRecognition , 5147–5156.\\\\nZhang, H.; Chen, H.; Xiao, C.; Gowal, S.; Stanforth, R.; Li,\\\\nB.; and Song, D. 2019a. Interpreting Adversarial Exam-\\\\nples by Activation. In International Conference on Learning\\\\nRepresentations .\\\\nZhang, H.; Yu, Y.; Jiao, J.; Xing, E.; El Ghaoui, L.; and Jor-\\\\ndan, M. 2019b. Theoretically Principled Trade-off Between\\\\nRobustness and Accuracy. In International Conference on\\\\nMachine Learning , 7472–7482.\\\\nReproducibility Checklist\\\\nThis paper:\\\\n•Includes a conceptual outline and/or pseudocode descrip-\\\\ntion of AI methods introduced (yes/partial/no/NA) yes\\\\n•Clearly delineates statements that are opinions, hypoth-\\\\nesis, and speculation from objective facts and results\\\\n(yes/no) yes\\\\n•Provides well marked pedagogical references for less-\\\\nfamiliar readers to gain background necessary to replicate\\\\nthe paper (yes/no) yes\\\\nDoes this paper make theoretical contributions?\\\\n(yes/no) yes\\\\nIf yes, please complete the list below.\\\\n•All assumptions and restrictions are stated clearly and\\\\nformally. (yes/partial/no) yes\\\\n•All novel claims are stated formally (e.g., in theorem\\\\nstatements). (yes/partial/no) yes\\\\n•Proofs of all novel claims are included. (yes/partial/no)\\\\nyes\\\\n•Proof sketches or intuitions are given for complex and/or\\\\nnovel results. (yes/partial/no) yes\\\\n•Appropriate citations to theoretical tools used are given.\\\\n(yes/partial/no) yes\\\\n•All theoretical claims are demonstrated empirically to\\\\nhold. (yes/partial/no/NA) yes\\\\n•All experimental code used to eliminate or disprove\\\\nclaims is included. (yes/no/NA) NA\\\\nDoes this paper rely on one or more datasets?\\\\n(yes/no) yes\\\\nIf yes, please complete the list below.\\\\n•A motivation is given for why the experiments are con-\\\\nducted on the selected datasets (yes/partial/no/NA) yes\\\\n•All novel datasets introduced in this paper are included in\\\\na data appendix. (yes/partial/no/NA) NA\\\\n•All novel datasets introduced in this paper will be made\\\\npublicly available upon publication of the paper with a\\\\nlicense that allows free usage for research purposes. (yes/-\\\\npartial/no/NA) NA\\\\n•All datasets drawn from the existing literature (potentially\\\\nincluding authors’ own previously published work) are\\\\naccompanied by appropriate citations. (yes/no/NA) yes\\\\n•All datasets drawn from the existing literature (potentially\\\\nincluding authors’ own previously published work) are\\\\npublicly available. (yes/partial/no/NA) yes\\\\n•All datasets that are not publicly available are described\\\\nin detail, with explanation why publicly available alterna-\\\\ntives are not scientifically satisficing. (yes/partial/no/NA)\\\\nNADoes this paper include computational\\\\nexperiments? (yes/no) yes\\\\nIf yes, please complete the list below.\\\\n•This paper states the number and range of values tried\\\\nper (hyper-) parameter during development of the paper,\\\\nalong with the criterion used for selecting the final pa-\\\\nrameter setting. (yes/partial/no/NA) yes\\\\n•Any code required for pre-processing data is included in\\\\nthe appendix. (yes/partial/no) yes\\\\n•All source code required for conducting and analyzing\\\\nthe experiments is included in a code appendix. (yes/par-\\\\ntial/no) yes\\\\n•All source code required for conducting and analyzing\\\\nthe experiments will be made publicly available upon\\\\npublication of the paper with a license that allows free\\\\nusage for research purposes. (yes/partial/no) yes\\\\n•All source code implementing new methods have com-\\\\nments detailing the implementation, with references to\\\\nthe paper where each step comes from (yes/partial/no)\\\\nyes\\\\n•If an algorithm depends on randomness, then the method\\\\nused for setting seeds is described in a way sufficient to\\\\nallow replication of results. (yes/partial/no/NA) yes\\\\n•This paper specifies the computing infrastructure used for\\\\nrunning experiments (hardware and software), including\\\\nGPU/CPU models; amount of memory; operating sys-\\\\ntem; names and versions of relevant software libraries\\\\nand frameworks. (yes/partial/no) yes\\\\n•This paper formally describes evaluation metrics used\\\\nand explains the motivation for choosing these metrics.\\\\n(yes/partial/no) yes\\\\n•This paper states the number of algorithm runs used to\\\\ncompute each reported result. (yes/no) yes\\\\n•Analysis of experiments goes beyond single-dimensional\\\\nsummaries of performance (e.g., average; median) to in-\\\\nclude measures of variation, confidence, or other distri-\\\\nbutional information. (yes/no) yes\\\\n•The significance of any improvement or decrease in per-\\\\nformance is judged using appropriate statistical tests (e.g.,\\\\nWilcoxon signed-rank). (yes/partial/no) yes\\\\n•This paper lists all final (hyper-)parameters used for each\\\\nmodel/algorithm in the paper’s experiments. (yes/par-\\\\ntial/no/NA) yes\\\\n\\\",\\n  \\\"metadata\\\": {\\n    \\\"file_type\\\": \\\".pdf\\\",\\n    \\\"pages\\\": 9\\n  },\\n  \\\"file_path\\\": \\\"upload/AAAI_adv.pdf\\\",\\n  \\\"char_count\\\": 38843,\\n  \\\"word_count\\\": 5072\\n}\", \"error_information\": \"\"}"
  }
]