{"metadata":{"accelerator":"GPU","colab":{"provenance":[]},"kaggle":{"accelerator":"nvidiaTeslaT4","dataSources":[{"sourceId":7136932,"sourceType":"datasetVersion","datasetId":4118400},{"sourceId":7140103,"sourceType":"datasetVersion","datasetId":4120898},{"sourceId":7406372,"sourceType":"datasetVersion","datasetId":4307280}],"dockerImageVersionId":30636,"isInternetEnabled":true,"language":"python","sourceType":"notebook","isGpuEnabled":true},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.10.12"}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"markdown","source":"### Data preparations","metadata":{"id":"nHKaApd7yxuZ"}},{"cell_type":"code","source":"!pip install neptune lightning pytorchvideo","metadata":{"id":"G6WJsfm8yzBZ","outputId":"cbfed9d0-06ef-4aa9-d89c-52d81fe30225","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"import neptune\n\nrun = neptune.init_run(\n    project=\"\",\n    api_token=\"\",\n)  # your credentials\n","metadata":{"id":"vi3a5yNHSnQG","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"from pytorchvideo.data import LabeledVideoDataset, make_clip_sampler, labeled_video_dataset\nfrom pytorchvideo.transforms import (\n    RandomResizedCrop,\n    RandomShortSideScale,\n    ApplyTransformToKey,\n    Normalize,\n)\nfrom torchvision.transforms import (\n    Compose,\n    Lambda,\n    RandomHorizontalFlip,\n    RandomResizedCrop,\n    RandomCrop,\n)\n\nimport pandas as pd\nimport numpy as np\nimport os\nimport shutil\nfrom torch.utils.data import DataLoader\nfrom pytorch_lightning.callbacks import ModelCheckpoint, LearningRateMonitor\n\nimport torch\nimport torchvision\nimport torch.nn as nn\nfrom pytorch_lightning import LightningModule, 
seed_everything, Trainer\nfrom torch.optim.lr_scheduler import CosineAnnealingLR\nimport torchmetrics\nfrom sklearn.metrics import classification_report\nimport time","metadata":{"id":"04mrmwwoyxub","outputId":"020e538c-4dea-4842-8ca7-86c688b8fb77","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"mean = np.array([0.9133, 0.2737, 0.2737])\nstd  = np.array([0.1576, 0.2508, 0.2508])\n\nsolver = \"kaggle\" \n\nRRCropScale = (5E-2, 4E-1)\nRRCropScale_test = (1E-1, 2E-1)\n\nepochs = 400\nbatch_size = 50\nnum_workers = 16\n\nns = {'train': 0.6, 'val': 0.2,  'test': 0.2}\nseed = 0 # random seeds are 42, 0, 17, 9, 3\nrun['seed'] = seed\ntarget_cell = 'rbc'\ntarget_label = f'high_{target_cell}'\nnum_classes = 2\n\ndataset_path = \"/kaggle/input/cells-classification/dataset\"\ndataframe_path = \"/kaggle/input/cells-classification/\"\n\nuse_curriculum = True\n\n# Ensure that all operations are deterministic on GPU (if used) for reproducibility\ntorch.backends.cudnn.deterministic = True\ntorch.backends.cudnn.benchmark = False","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"DEVICE = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"video_transforms = Compose([\n    ApplyTransformToKey(key='video',\n    transform=Compose([\n        RandomResizedCrop(64,scale=RRCropScale,antialias=True),\n        RandomHorizontalFlip(p=0.5),\n        Lambda(lambda x: x / 255.0),\n        Normalize(mean,std),\n    ]),\n    ),\n])\n\nvideo_transforms_test = Compose([\n    ApplyTransformToKey(key='video',\n    transform=Compose([\n        RandomResizedCrop(64,scale=RRCropScale_test,antialias=True),\n        #RandomHorizontalFlip(p=0.5),\n        Lambda(lambda x: x / 255.0),\n        Normalize(mean,std),\n    ]),\n    
),\n])","metadata":{"id":"TvTfp6DMyxud","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"seed_everything(seed)\ndataframe = pd.read_csv(dataframe_path + \"DataFrame.csv\", index_col=0)\ndataframe = dataframe.sample(frac=1, random_state=seed)\ntrain_size = int(dataframe.shape[0] * ns['train'])\ntrain_data = dataframe[0:train_size]\ntest_data = dataframe[train_size:]\ntest_size = int(dataframe.shape[0] * ns['test'])\nval_data = test_data[0:test_size]\ntest_data = test_data[test_size:]\ntrain_data, val_data, test_data","metadata":{"id":"qD43cH8J_tX4","outputId":"daad7246-c4f5-4ad2-8b1e-3e3f05fbecf3","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"class CustomDataset(LabeledVideoDataset):\n    def __init__(self, dataset_path, dataframe, target_name, transforms, \n                 clip_sampler_type='random', clip_duration=0.3):\n      df = dataframe.reset_index()\n      paths = []\n      for i, file_name in enumerate(df['files']):\n          temp_dict = df.iloc[i].to_dict()\n          temp_dict['label'] = df[target_name][i]\n          temp_dict.pop('files')\n          temp_dict.pop('index')\n          paths.append((f\"{dataset_path}/{file_name}\", temp_dict))\n      super().__init__(labeled_video_paths=paths,\n                       clip_sampler=make_clip_sampler(clip_sampler_type, clip_duration),\n                       transform=transforms, decode_audio=False)","metadata":{"id":"bm40twbHybAo","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"if solver == \"kaggle\":\n    os.chdir(\"/kaggle/working/\")","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"checkpoint_callback = ModelCheckpoint(save_weights_only=True, \n                                      mode=\"min\", \n                                      monitor=\"val/loss\",\n                                      dirpath=\"checkpoints\", \n                                  
    filename=\"file\") \nlr_monitor = LearningRateMonitor(logging_interval=\"epoch\")","metadata":{"id":"Z8boZL_OrZSG","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"class CurriculumTrainer():\n    def set_difficulty(self, dataframe, target_cells, target_name, alpha, beta):\n        # Absolute distance of each sample's target value from the column mean\n        df = dataframe.copy(deep=True)\n        df['distances'] = (df[target_cells] - df[target_cells].mean()).abs()\n\n        # Normalizing blur and distances\n        df['blur'] = (df['blur'] - df['blur'].min()) / (df['blur'].max() - df['blur'].min())\n        df['distances'] = (df['distances'] - df['distances'].min()) / (df['distances'].max() - df['distances'].min())\n\n        # Calculating and normalizing difficulty\n        df['difficulty'] = alpha * df['blur'] + beta * df['distances']\n        df['difficulty'] = (df['difficulty'] - df['difficulty'].min()) / (df['difficulty'].max() - df['difficulty'].min())\n        return df\n\n    def evaluate_competence(self, max_epochs, current_epoch, c0, p):\n        return min(1, (current_epoch*((1-c0**p)/max_epochs)+c0**p)**(1/p))\n\n    def fit(self, model, dataframe, target_cells, target_name, max_epochs, c0, p, alpha, beta):\n        dataframe = self.set_difficulty(dataframe, target_cells, target_name, alpha, beta)\n        self.competence = c0\n        seed_everything(seed)\n        for epoch in range(1, max_epochs+1):\n            selected_data = dataframe[dataframe.difficulty <= self.competence]\n            dataset = CustomDataset(dataset_path=dataset_path, dataframe=selected_data,\n                                    target_name=target_name,\n                                    transforms=video_transforms)\n            loader = DataLoader(dataset, batch_size=model.batch_size, num_workers=model.numworkers, \n                                pin_memory=True, drop_last=False)\n            print(f\"-----------------------\\nEpoch {epoch}, competence 
= {self.competence}, dataset size = {dataset.num_videos}\")\n            run[\"model/competence\"].append(self.competence)\n            run[\"train/dataset_size\"].append(dataset.num_videos)\n            self.trainer = Trainer(max_epochs=1,\n                                   precision='16-mixed',\n                                   accumulate_grad_batches=2,\n                                   enable_progress_bar=True,\n                                   enable_model_summary=False,\n                                   num_sanity_val_steps=0,\n                                   callbacks=[lr_monitor, checkpoint_callback])\n            self.trainer.fit(model, loader)\n            self.competence = self.evaluate_competence(max_epochs, epoch, c0, p)\n\n    def validate(self, model):\n        self.trainer.validate(model)\n\n    def test(self, model):\n        self.trainer.test(model)","metadata":{"id":"Fa66LLF-XWKh","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"### Model","metadata":{"id":"SgbT8X8Tyxui"}},{"cell_type":"code","source":"class TestModel(LightningModule):\n    def __init__(self):\n        super(TestModel, self).__init__()\n        # model architecture\n        self.video_model = torch.hub.load(\"facebookresearch/pytorchvideo\", \"efficient_x3d_xs\", pretrained=True)\n        self.relu = nn.ReLU()\n        self.linear = nn.Linear(400, num_classes)\n\n        self.lr = 1e-3\n        self.batch_size = batch_size\n        self.numworkers = num_workers\n        # evaluation metric\n        self.metric = torchmetrics.Accuracy(task='multiclass', num_classes=num_classes)\n        # loss function\n        self.criterion = nn.CrossEntropyLoss()\n        # helpers\n        self.target_label = target_label\n        self.training_step_outputs = []\n        self.validation_step_outputs = []\n        self.testing_step_outputs = []\n\n    def forward(self, x):\n        x = self.video_model(x)\n        x = self.relu(x)\n        x = 
self.linear(x)\n        return x\n\n    def configure_optimizers(self):\n        opt = torch.optim.AdamW(params=self.parameters(), lr=self.lr)\n        scheduler = CosineAnnealingLR(opt, T_max=10, eta_min=1e-6, last_epoch=-1)\n        return {'optimizer': opt, 'lr_scheduler': scheduler}\n\n    def training_step(self, batch, batch_idx):\n        video, label = batch['video'], batch['label']\n        out = self.forward(video)\n        loss = self.criterion(out, label)\n        metric = self.metric(out, label.to(torch.int64))\n        self.training_step_outputs.append({'loss': loss, 'metric': metric})\n        return {'loss': loss, 'metric': metric}\n\n    def on_train_epoch_end(self):\n        outputs = self.training_step_outputs\n        loss = torch.stack([x['loss'] for x in outputs]).mean().cpu().detach().numpy().round(2)\n        metric = torch.stack([x['metric'] for x in outputs]).mean().cpu().detach().numpy().round(2)\n        self.training_step_outputs = []\n        self.log('train/loss', loss)\n        self.log('train/metric', metric)\n        run[\"train/loss\"].append(loss)\n        run[\"train/metric\"].append(metric)\n\n    def val_dataloader(self):\n        dataset = CustomDataset(dataset_path=dataset_path, dataframe=val_data,\n                              target_name=self.target_label,\n                              clip_sampler_type='random',\n                              transforms=video_transforms_test)\n        loader = DataLoader(dataset, batch_size=self.batch_size, num_workers=self.numworkers,\n                            pin_memory=True, drop_last=False)\n        return loader\n\n    def validation_step(self, batch, batch_idx):\n        video, label = batch['video'], batch['label']\n        out = self.forward(video)\n        loss = self.criterion(out, label)\n        metric = self.metric(out, label.to(torch.int64))\n        self.validation_step_outputs.append({'loss': loss, 'metric': metric})\n        return {'loss': loss, 'metric': metric}\n\n 
   def on_validation_epoch_end(self):\n        outputs = self.validation_step_outputs\n        loss = torch.stack([x['loss'] for x in outputs]).mean().cpu().detach().numpy().round(2)\n        metric = torch.stack([x['metric'] for x in outputs]).mean().cpu().detach().numpy().round(2)\n        self.validation_step_outputs = []\n        self.log('val/loss', loss)\n        self.log('val/metric', metric)\n        run[\"val/loss\"].append(loss)\n        run[\"val/metric\"].append(metric)\n        print({'loss': loss, 'metric': metric})\n\n    def test_dataloader(self):\n        dataset = CustomDataset(dataset_path=dataset_path, dataframe=test_data,\n                              target_name=self.target_label,\n                              clip_sampler_type='random',\n                              transforms=video_transforms_test)\n        loader = DataLoader(dataset, batch_size=self.batch_size, \n                            num_workers=self.numworkers, pin_memory=True, drop_last=False)\n        return loader\n\n    def test_step(self, batch, batch_idx):\n        video, label = batch['video'], batch['label']\n        out = self.forward(video)\n        self.testing_step_outputs.append({'label': label, 'pred': out})\n        return {'label': label, 'pred': out}\n\n    def on_test_epoch_end(self):\n        outputs = self.testing_step_outputs\n        label = torch.cat([x['label'] for x in outputs]).cpu().detach().numpy()\n        pred = torch.cat([x['pred'].argmax(dim=1) for x in outputs]).cpu().detach().numpy()\n        self.testing_step_outputs = []\n        print(classification_report(label, pred))","metadata":{"id":"XEILRhilyxuj","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"model = TestModel()","metadata":{"id":"BDS3zCEpyxuj","outputId":"a1571cb6-1004-460d-a3d1-6e09b39e7b35","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"params = {\n    \"n_epochs\": epochs,\n    \"c0\": 0.05,\n    \"p\": 2,\n    
# In difficulty function: alpha * df['blur'] + beta * df['distances']\n    \"alpha\": 0.5, \n    \"beta\":  0.5, \n}\nrun[\"model/parameters\"] = params\nrun[\"model/architecture\"] = \"efficient_x3d_xs\"\nrun[\"model/difficulty_func\"] = \"alpha * df['blur'] + beta * df['distances']\"","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"if use_curriculum:\n    trainer = CurriculumTrainer()\n    start = time.time()\n    trainer.fit(model, train_data, target_cell, target_label,\n                params[\"n_epochs\"], params[\"c0\"], params[\"p\"],\n                params[\"alpha\"], params[\"beta\"])\n    stop = time.time()\nelse:\n    seed_everything(seed)\n    trainer = Trainer(max_epochs=params[\"n_epochs\"],\n                      precision='16-mixed',\n                      accumulate_grad_batches=2,\n                      enable_progress_bar=True,\n                      num_sanity_val_steps=0,\n                      callbacks=[lr_monitor, checkpoint_callback])\n    start = time.time()\n    seed_everything(seed)\n    dataset = CustomDataset(dataset_path=dataset_path, dataframe=train_data,\n                          target_name=model.target_label,\n                          clip_sampler_type='random',  \n                          transforms=video_transforms)\n    loader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers,\n                        pin_memory=True, drop_last=False)\n    trainer.fit(model, loader)\n    stop = time.time()\n\n\nprint(f\"Elapsed time: {(stop - start)/60} min\")\nrun['elapsed_time'] = stop - start","metadata":{"id":"X8lHfORtSnQL","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"model = 
TestModel.load_from_checkpoint(checkpoint_callback.best_model_path)","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"trainer.validate(model)","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"trainer.test(model)","metadata":{"id":"aR99tqb2lyR1","trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"### Additional settings and initializations related to multi-view","metadata":{}},{"cell_type":"code","source":"import torch.nn.functional as F\n\nimport matplotlib.pyplot as plt\nimport seaborn as sns\n\nfrom sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay, f1_score, precision_score, recall_score, accuracy_score, roc_auc_score","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"CE_metrics = torchmetrics.classification.MulticlassCalibrationError(num_classes=num_classes, n_bins=32, norm='l1')","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# Multi-view mode (the use of 'num_views' augmented copies of a sample for predictions)\nmultiView = {'isMultiView': True, 'num_views': 15, 'num_views_x': 10}\n\ntr=(1/num_classes) # threshold between certain and uncertain predictions (if hard estimation)\n\nnSamples = 8\nnumBins = 20","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"### Additional functions related to multi-view","metadata":{}},{"cell_type":"code","source":"# Hard and soft certainty estimation of the predictions\ndef certain_predictions(x, labels,probv, pred, tr=(1/num_classes)):\n    # one hot encoding for labels\n    yoh = F.one_hot(labels,num_classes=num_classes)\n    \n    certainties = torch.zeros_like(yoh)\n    certainty_maxH_s = probv\n    certainty_maxY_s = torch.zeros_like(certainty_maxH_s)\n\n    certainty_maxH_h = (certainty_maxH_s > tr) * 1\n    certainty_maxY_h = 
(certainty_maxY_s > tr) * 1\n    cert_attr = {\n        'certainties':certainties,\n        'certainty_maxH_s':certainty_maxH_s,\n        'certainty_maxH_h':certainty_maxH_h,\n        'certainty_maxY_s':certainty_maxY_s,\n        'certainty_maxY_h':certainty_maxY_h,\n    }\n    return cert_attr","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"def accuracy_mv(labels, pred, certainty_h):\n    with torch.no_grad():\n        n_samples = labels.shape[0]\n        n_correct = (pred == labels).sum().item()\n\n        n_samples_cer = (certainty_h == 1).sum().item()\n        n_correct_cer = torch.logical_and((pred == labels),\n                                      (certainty_h == 1)).sum().item()\n\n        acc_without_u = (n_correct / n_samples)\n        if n_samples_cer < 1.0:\n            acc_with_u = torch.tensor(0.0)\n        else:\n            acc_with_u = n_correct_cer / n_samples_cer\n\n        accuracy_mv_attr = {\n            'pred':pred,\n            'certainty_h':certainty_h,\n            'n_samples':n_samples,\n            'n_correct':n_correct,\n            'n_samples_cer':n_samples_cer,\n            'n_correct_cer':n_correct_cer,\n            'acc_without_u':acc_without_u,\n            'acc_with_u':acc_with_u,\n        }\n    return accuracy_mv_attr","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"def test_vis(model, loader):\n    test_pred = []\n    test_prob = []\n    test_prob_comp = []\n    test_cert_comp = []\n    test_labe = []\n    test_inde = []\n    test_cert_maxH_s = []\n    test_cert_maxH_h = []\n    test_cert_maxY_s = []\n    test_cert_maxY_h = []\n    \n    with torch.no_grad():\n        for batch in loader: \n            inputs = batch['video'].to(DEVICE)\n            labels = batch['label'].to(DEVICE)\n            indexes= batch['video_index'].to(DEVICE)\n            x = model(inputs).to(DEVICE)\n            \n            # probabilities, 
max_probabilities, predicted classes\n            prob = nn.functional.softmax(x,1)\n            probv, pred = torch.max(prob,1)\n            \n            test_pred.append(pred)\n            test_prob.append(probv)\n            test_prob_comp.append(prob)\n            test_labe.append(labels)\n            test_inde.append(indexes)\n            \n            # certain predictions\n            cert_attr = certain_predictions(x,labels,probv,pred,tr)\n            test_cert_comp.append(cert_attr['certainties'])\n            test_cert_maxH_s.append(cert_attr['certainty_maxH_s'])\n            test_cert_maxH_h.append(cert_attr['certainty_maxH_h'])\n            test_cert_maxY_s.append(cert_attr['certainty_maxY_s'])\n            test_cert_maxY_h.append(cert_attr['certainty_maxY_h'])\n\n    test_labels = torch.cat(test_labe,dim=0)\n    test_indexes= torch.cat(test_inde,dim=0)\n    \n    test_predictions = torch.cat(test_pred,dim=0)\n    test_probabilities=torch.cat(test_prob,dim=0)\n    test_prob_complete=torch.cat(test_prob_comp,dim=0)\n    test_cert_complete=torch.cat(test_cert_comp,dim=0)\n\n    test_certainties_maxH_s = torch.cat(test_cert_maxH_s,dim=0)\n    test_certainties_maxH_h = torch.cat(test_cert_maxH_h,dim=0)\n    test_certainties_maxY_s = torch.cat(test_cert_maxY_s,dim=0)\n    test_certainties_maxY_h = torch.cat(test_cert_maxY_h,dim=0)\n\n    ECE = CE_metrics(test_prob_complete, test_labels)\n    acc = torch.sum(test_labels == test_predictions) / torch.sum(test_labels == test_labels)\n    \n    test_attr = {\n        'test_labels':test_labels,\n        'test_indexes':test_indexes,\n        'test_predictions':test_predictions,\n        'test_probabilities':test_probabilities,\n        'accuracy':acc, \n        'test_certainties_maxH_s':test_certainties_maxH_s,\n        'test_certainties_maxH_h':test_certainties_maxH_h,\n        'test_certainties_maxY_s':test_certainties_maxY_s,\n        'test_certainties_maxY_h':test_certainties_maxY_h,\n        'ECE':ECE\n    
}\n    print('please, wait a bit more')\n            \n    return test_attr ","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"### Single-view predictions with the test set","metadata":{}},{"cell_type":"code","source":"seed_everything(seed)\ndataset = CustomDataset(dataset_path=dataset_path, dataframe=test_data,\n                              target_name=target_label,clip_sampler_type='random',\n                              transforms=video_transforms_test)\nloader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers, pin_memory=True, drop_last=False)","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"test_attr = test_vis(model.to(DEVICE), loader)\nECE = test_attr['ECE'].item()\ntest_acc = test_attr['accuracy'].item()\nprint('accuracy', test_acc)\nprint('ECE = ', ECE)\ndel test_attr","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"### Multi-view predictions with the test set","metadata":{}},{"cell_type":"code","source":"test_labels_mv = []\ntest_indexes_mv= []\ntest_predictions_mv = []\ntest_probabilities_mv = []\ntest_certainties_maxH_s_mv = []\ntest_certainties_maxH_h_mv = []\ntest_certainties_maxY_s_mv = []\ntest_certainties_maxY_h_mv = []\n\nfor i in range(multiView['num_views']):\n    test_attr = test_vis(model.to(DEVICE),loader)\n\n    test_labels_mv.append(test_attr['test_labels'].to('cpu').numpy())\n    test_indexes_mv.append(test_attr['test_indexes'].to('cpu').numpy())\n    \n    test_predictions_mv.append(test_attr['test_predictions'].to('cpu').numpy())\n    test_probabilities_mv.append(test_attr['test_probabilities'].to('cpu').numpy())\n    test_certainties_maxH_s_mv.append(test_attr['test_certainties_maxH_s'].to('cpu').numpy())\n    test_certainties_maxH_h_mv.append(test_attr['test_certainties_maxH_h'].to('cpu').numpy())\n    
test_certainties_maxY_s_mv.append(test_attr['test_certainties_maxY_s'].to('cpu').numpy())\n    test_certainties_maxY_h_mv.append(test_attr['test_certainties_maxY_h'].to('cpu').numpy())\n\n    del test_attr\n\nlabels_mv = torch.as_tensor(test_labels_mv)\nindexes_mv= torch.as_tensor(test_indexes_mv)\n\npredictions_mv = torch.as_tensor(test_predictions_mv)\nprobabilities_mv=torch.as_tensor(test_probabilities_mv)\n\ncertainties_maxH_s_mv = torch.as_tensor(test_certainties_maxH_s_mv)\ncertainties_maxH_h_mv = torch.as_tensor(test_certainties_maxH_h_mv)\n\ncertainties_maxY_s_mv = torch.as_tensor(test_certainties_maxY_s_mv)\ncertainties_maxY_h_mv = torch.as_tensor(test_certainties_maxY_h_mv)\n\n# Aligning: flatten the per-view arrays into 1-D tensors\nlabels_mv = labels_mv.ravel()\nindexes_mv= indexes_mv.ravel()\n\npredictions_mv = predictions_mv.ravel()\nprobabilities_mv = probabilities_mv.ravel()\n\ncertainties_maxH_s_mv = certainties_maxH_s_mv.ravel()\ncertainties_maxH_h_mv = certainties_maxH_h_mv.ravel()\n\ncertainties_maxY_s_mv = certainties_maxY_s_mv.ravel()\ncertainties_maxY_h_mv = certainties_maxY_h_mv.ravel()\n\n# Create dictionary and df\ndata_dict = {'labels_mv':labels_mv,\n             'indexes_mv':indexes_mv,\n             'predictions_mv':predictions_mv, \n}\ndata_df = pd.DataFrame(data_dict)\n# Calculate the mode across the views (MV) for each video index\nindex_uniq = data_df.indexes_mv.unique()\n\nmode = np.zeros((index_uniq.shape[0], data_df.drop(['indexes_mv'], axis = 1).shape[1]))\nfor i in range(index_uniq.shape[0]):\n    # .mode() can return several rows on ties; keep the first one\n    mode[i,:] = data_df[data_df.indexes_mv == index_uniq[i]].drop(['indexes_mv'], axis = 1).mode().iloc[0]\n\ndata_al = pd.DataFrame(mode, index=list(index_uniq), columns=data_df.drop(['indexes_mv'], axis=1).columns)\nprint('Mode values:\\n', data_al[:50])\n\nmode_labels = torch.tensor(data_al['labels_mv'])\nmode_predictions = torch.tensor(data_al['predictions_mv'])\n\nmode_labels, mode_predictions","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"cm = 
confusion_matrix(mode_labels.to('cpu'), mode_predictions.to('cpu'))\nprint('Accuracy of mode-based MV predictions (see the confusion matrix below):', accuracy_mv(mode_labels,mode_predictions,mode_predictions)['acc_without_u'])\ncm_display = ConfusionMatrixDisplay(cm,display_labels=[0,1]).plot()","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"test_accuracy_summary = {\n    'accuracy': test_acc,\n    'MV-mode': accuracy_mv(mode_labels,mode_predictions,mode_predictions)['acc_without_u'],\n}\nprint('ECE:',ECE)\ntest_accuracy_summary","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"run['checkpoint'].upload('checkpoints/file.ckpt')\nrun.stop()","metadata":{"trusted":true},"execution_count":null,"outputs":[]}]}