# Food-101 Dataset Notes

## Transforms

### Variation 1
```python
from torchvision import transforms
# ImageNetPolicy comes from the AutoAugment repo's autoaugment.py module, not torchvision
from autoaugment import ImageNetPolicy

train_transforms = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    ImageNetPolicy(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

test_transforms = transforms.Compose([
    transforms.Resize(255),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
```
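For context, these transforms can be dropped straight into torchvision's built-in Food-101 dataset (`torchvision.datasets.Food101`, available in recent torchvision releases, which uses the official 750/250 per-class train/test split). A minimal sketch continuing from the transforms above; the data root, batch size, and worker count are placeholders:

```python
import torch
from torchvision import datasets

# Downloads the Food-101 archive on first use and applies the transforms defined above
train_data = datasets.Food101('./data', split='train', transform=train_transforms, download=True)
test_data = datasets.Food101('./data', split='test', transform=test_transforms, download=True)

train_loader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True, num_workers=4)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=64, shuffle=False, num_workers=4)
```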
### Variation 2
From: https://nbviewer.jupyter.org/github/shubhajitml/food-101/blob/master/food-101-pytorch.ipynb
```python
# self.imgenet_mean / self.imgenet_std are attributes of the source notebook's class,
# presumably the standard ImageNet statistics [0.485, 0.456, 0.406] / [0.229, 0.224, 0.225]
train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(self.imgenet_mean, self.imgenet_std)])

valid_tfms = transforms.Compose([
    # note: the notebook applies CenterCrop(224) directly, without a Resize first
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(self.imgenet_mean, self.imgenet_std)])
```
### Variation 3
From: https://github.com/dashimaki360/food101/blob/master/src/train.py
```python
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(input_size),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Resize(input_size),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}
```
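A minimal sketch of how such a dict is typically consumed, continuing from the snippet above. It assumes `input_size` has been set (e.g. `input_size = 224`) and that the Food-101 images have been copied into `train/` and `test/` subfolders in `ImageFolder` layout; the data path, batch size, and worker count are placeholders:

```python
import torch
from torchvision import datasets

data_dir = './data/food-101'  # placeholder path: expects train/ and test/ folders of class subdirectories

image_datasets = {
    split: datasets.ImageFolder(f'{data_dir}/{split}', data_transforms[split])
    for split in ['train', 'test']
}
dataloaders = {
    split: torch.utils.data.DataLoader(
        image_datasets[split],
        batch_size=64,
        shuffle=(split == 'train'),
        num_workers=4,
    )
    for split in ['train', 'test']
}
```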
## SOTA

| Method | Top-1 | Top-5 | Publication |
|---|---|---|---|
| HoG | 8.85 | - | ECCV2014 |
| SURF BoW-1024 | 33.47 | - | ECCV2014 |
| SURF IFV-64 | 44.79 | - | ECCV2014 |
| SURF IFV-64 + Color BoW-64 | 49.40 | - | ECCV2014 |
| IFV | 38.88 | - | ECCV2014 |
| RF | 37.72 | - | ECCV2014 |
| RCF | 28.46 | - | ECCV2014 |
| MLDS | 42.63 | - | ECCV2014 |
| RFDC | 50.76 | - | ECCV2014 |
| SELC | 55.89 | - | CVIU2016 |
| AlexNet-CNN | 56.40 | - | ECCV2014 |
| DCNN-FOOD | 70.41 | - | ICME2015 |
| DeepFood | 77.4 | 93.7 | COST2016 |
| Inception V3 | 88.28 | 96.88 | ECCVW2016 |
| ResNet-200 | 88.38 | 97.85 | CVPR2016 |
| WRN | 88.72 | 97.92 | BMVC2016 |
| ResNext-101 | 85.4 | 96.5 | Proposed |
| WISeR | 90.27 | 98.71 | UNIUD2016 |
| DenseNet-161 | 93.26 | 99.01 | Proposed |
## Model training and SoTA results
From: https://github.com/pyligent/food101-image-classification
Deep convolutional neural network models have achieved remarkable results on image classification problems. For the Food-101 dataset, the current SoTA results are:

- InceptionV3: 88.28% / 96.88% (Top-1 / Top-5)
- ResNet200: 90.14% (Top-1)
- WISeR: 90.27% / 98.71% (Top-1 / Top-5)

My results: starting from a pre-trained ResNet50, the network was first trained at an image size of 224x224 for 16 epochs, then at 512x512 for a further 16 epochs.

- Top-1 accuracy: 89.63%
- Top-5 accuracy: 98.04%
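A rough PyTorch sketch of that two-stage, progressive-resizing fine-tune is below. The source repo builds this with fastai, so everything here (optimizer, learning rates, batch sizes, augmentation) is an illustrative assumption rather than the author's exact recipe; only the pre-trained ResNet50 backbone and the two 16-epoch stages at 224 and 512 come from the notes above.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms, models

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Pre-trained ResNet50 with its head replaced for Food-101's 101 classes
model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 101)
model = model.to(device)
criterion = nn.CrossEntropyLoss()

def make_loader(size, split, batch_size):
    """Food-101 loader for a given image size (augmentation kept deliberately simple)."""
    if split == 'train':
        tfms = [transforms.RandomResizedCrop(size), transforms.RandomHorizontalFlip()]
    else:
        tfms = [transforms.Resize(int(size * 1.14)), transforms.CenterCrop(size)]
    tfms += [transforms.ToTensor(),
             transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]
    ds = datasets.Food101('./data', split=split, transform=transforms.Compose(tfms), download=True)
    return torch.utils.data.DataLoader(ds, batch_size=batch_size,
                                       shuffle=(split == 'train'), num_workers=4)

def train_stage(size, epochs, lr, batch_size):
    """One progressive-resizing stage: fine-tune the whole network at this image size."""
    loader = make_loader(size, 'train', batch_size)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

# Stage 1: 16 epochs at 224x224, then stage 2: 16 more epochs at 512x512
train_stage(224, epochs=16, lr=1e-3, batch_size=64)
train_stage(512, epochs=16, lr=1e-4, batch_size=16)
```

Evaluating top-1 / top-5 accuracy on the test split would reuse `make_loader(size, 'test', batch_size)` with `model.eval()` and `torch.no_grad()`.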