Reproducibility Study on Curriculum by Smoothing

06 Dec 2020 (modified: 05 May 2023) · ML Reproducibility Challenge 2020 Blind Submission · Readers: Everyone
Abstract: Curriculum learning is a training strategy for deep neural networks in which the training data are ordered by difficulty: during the initial phase of training, the network is exposed to easy examples, and the difficulty is gradually increased. The primary motivation is faster convergence and better generalization on the task. During the network's early training phase, the propagated information can contain distorted artifacts due to noise, which can disturb training. In the original paper, the authors propose a curriculum-based technique that smooths the feature embeddings of a CNN using anti-aliasing (low-pass) filters. Their main idea is to control the high-frequency information propagated during training by convolving the output feature map of each CNN layer with a Gaussian kernel. As the variance of the Gaussian kernel decreases over the course of training, the amount of high-frequency information available within the network for inference increases, and the network can progressively learn to represent the data better.

The authors target three experimental directions: 1) better task performance: how does the model's accuracy vary when trained with and without curriculum learning? 2) better feature extraction: how does performance vary when the trained model is used to extract features from other datasets to train a weak classifier, and on vision tasks that require pretraining? 3) generative models: how does curriculum-based smoothing help in vision tasks that utilize CNNs, such as generative models?

The authors tested their hypothesis on several CNN architectures, targeting image classification, feature extraction, and transfer of the learned model to a different task. They conducted experiments on four network architectures: VGG-16, ResNet-18, Wide-ResNet-50, and ResNeXt-50, with and without Curriculum by Smoothing (CBS). They further evaluated their models on zero-shot domain adaptation and generative models to test the generality of their solution.

In our reproducibility plan, we will run the same set of experiments, reproduce the observations, and contrast our findings with those documented in the paper. As outlined in the paper's quantitative results, the comparison reveals improvements over CNN models trained without CBS. We plan to verify this observation and provide insights on the reproducibility aspect of the proposed technique.
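The central mechanism, annealing the variance of a Gaussian kernel that smooths each layer's feature maps, can be illustrated with a short sketch. The following is a minimal PyTorch sketch under our own assumptions: the module name `GaussianSmoothing`, the kernel size of 3, and the geometric decay schedule are illustrative choices, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gaussian_kernel2d(sigma: float, kernel_size: int = 3) -> torch.Tensor:
    """Build a normalized 2-D Gaussian kernel of shape (k, k)."""
    coords = torch.arange(kernel_size, dtype=torch.float32) - (kernel_size - 1) / 2.0
    g = torch.exp(-coords ** 2 / (2.0 * sigma ** 2))
    g = g / g.sum()
    return torch.outer(g, g)

class GaussianSmoothing(nn.Module):
    """Depthwise Gaussian blur applied to a CNN feature map.

    sigma is meant to be annealed toward zero during training so that
    progressively more high-frequency information passes through.
    """
    def __init__(self, channels: int, kernel_size: int = 3, sigma: float = 1.0):
        super().__init__()
        self.channels = channels
        self.kernel_size = kernel_size
        self.set_sigma(sigma)

    def set_sigma(self, sigma: float) -> None:
        # One identical kernel per channel -> depthwise convolution.
        kernel = gaussian_kernel2d(sigma, self.kernel_size)
        kernel = kernel.expand(self.channels, 1,
                               self.kernel_size, self.kernel_size).clone()
        if hasattr(self, "kernel"):  # keep the buffer on its current device
            kernel = kernel.to(self.kernel.device)
        self.register_buffer("kernel", kernel)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.conv2d(x, self.kernel, padding=self.kernel_size // 2,
                        groups=self.channels)

# Hypothetical usage: place one module after each convolutional layer and
# decay sigma every few epochs (decay factor and floor are assumptions).
smooth = GaussianSmoothing(channels=64, sigma=1.0)
feature_map = torch.randn(8, 64, 32, 32)   # dummy activations
blurred = smooth(feature_map)
smooth.set_sigma(max(1.0 * 0.9, 1e-3))     # anneal sigma between epochs
```

The depthwise (grouped) convolution blurs each channel independently with the same kernel, which matches the idea of low-pass filtering feature maps without mixing channel information; recomputing the buffer via `set_sigma` keeps the schedule outside the forward pass.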